KarrLab/datanator_query_python

View on GitHub

Showing 190 of 190 total issues

Identical blocks of code found in 2 locations. Consider refactoring.
Open

        if limit > 0:
            pipeline = [{"$match": query}, {"$limit": limit}, {"$skip": skip}, lookup, {"$project": projection}]
        else:
            pipeline = [{"$match": query}, {"$skip": skip}, lookup, {"$project": projection}]
Severity: Major
Found in datanator_query_python/query/query_sabiork_old.py and 1 other location - About 2 hrs to fix
datanator_query_python/query/query_sabiork_old.py on lines 344..347

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 60.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

def base26_dublet_for_bits_56_to_64(a):
    b0 = a[7]
    b1 = a[8] & 0x01
    h = b0 | b1 << 8
    return d26[h]
Severity: Major
Found in datanator_query_python/util/base26.py and 1 other location - About 2 hrs to fix
datanator_query_python/util/base26.py on lines 12..16

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 60.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

def base26_triplet_1(a):
    b0 = a[0]
    b1 = a[1] & 0x3f
    h = b0 | b1 << 8
    return t26[h]
Severity: Major
Found in datanator_query_python/util/base26.py and 1 other location - About 2 hrs to fix
datanator_query_python/util/base26.py on lines 49..53

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 60.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

class RxnManager:

    def rxn_manager(self):
        return query_sabiork_old.QuerySabioOld(username=config.AtlasConfig.USERNAME, 
        password=config.AtlasConfig.PASSWORD, MongoDB=config.AtlasConfig.SERVER,
Severity: Major
Found in datanator_query_python/config/query_manager.py and 1 other location - About 2 hrs to fix
datanator_query_python/config/query_manager.py on lines 46..52

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 58.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

class TaxonManager:

    def txn_manager(self):
        return query_taxon_tree.QueryTaxonTree(username=config.AtlasConfig.USERNAME, 
        password=config.AtlasConfig.PASSWORD, MongoDB=config.AtlasConfig.SERVER,
Severity: Major
Found in datanator_query_python/config/query_manager.py and 1 other location - About 2 hrs to fix
datanator_query_python/config/query_manager.py on lines 37..43

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 58.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Function parse_router has a Cognitive Complexity of 17 (exceeds 5 allowed). Consider refactoring.
Open

    def parse_router(self, lines=0):
        """Parse router lines to get API performance values.

        Args:
            lines(:obj:`int`, optional): Number of lines to parse. If 0, parse all lines.
Severity: Minor
Found in datanator_query_python/util/parse_heroku_logs.py - About 2 hrs to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"

Further reading

Function get_kinlawid_by_rxn has a Cognitive Complexity of 17 (exceeds 5 allowed). Consider refactoring.
Open

    def get_kinlawid_by_rxn(self, substrates, products):
        ''' Find the kinlaw_id defined in sabio_rk using 
            rxn participants' inchi string

            Args:
Severity: Minor
Found in datanator_query_python/query/query_sabiork.py - About 2 hrs to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"

Further reading

Function get_equivalent_protein_with_anchor has a Cognitive Complexity of 17 (exceeds 5 allowed). Consider refactoring.
Open

    def get_equivalent_protein_with_anchor(self, _id, max_distance, max_depth=float('inf')):
        '''
            Get replacement abundance value by taxonomic distance
            with the same kegg_orthology number.

Severity: Minor
Found in datanator_query_python/query/query_protein.py - About 2 hrs to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"

Further reading

Function get_equivalent_kegg_with_anchor_obsolete has a Cognitive Complexity of 16 (exceeds 5 allowed). Consider refactoring.
Open

    def get_equivalent_kegg_with_anchor_obsolete(self, ko, anchor, max_distance, max_depth=float('inf')):
        '''
            Get replacement abundance value by taxonomic distance
            with the same kegg_orthology number.

Severity: Minor
Found in datanator_query_python/query/query_protein.py - About 2 hrs to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"

Further reading

Similar blocks of code found in 3 locations. Consider refactoring.
Open

        pipeline = [
             {'$match': {'tax_id': {'$in': src_tax_ids}}},
             {'$addFields': {"__order": {'$indexOfArray': [src_tax_ids, "$tax_id"]}}},
             {'$sort': {"__order": 1}},
             {"$project": projection}
Severity: Major
Found in datanator_query_python/query/query_taxon_tree.py and 2 other locations - About 1 hr to fix
datanator_query_python/query/query_kegg_orthology.py on lines 91..95
datanator_query_python/query/query_kegg_orthology.py on lines 114..118

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 49.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Identical blocks of code found in 2 locations. Consider refactoring.
Open

        for key, val in obj.items():
            if isinstance(val, list):
                add_to_set[key] = {"$each": val}
            else:
                _set[key] = val
Severity: Major
Found in datanator_query_python/util/mongo_util.py and 1 other location - About 1 hr to fix
datanator_query_python/util/mongo_util.py on lines 154..158

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 49.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Function get_kinlaw_by_rxn_ortho has a Cognitive Complexity of 15 (exceeds 5 allowed). Consider refactoring.
Open

    def get_kinlaw_by_rxn_ortho(self, substrates, products, dof=0,
                          projection={'kinlaw_id': 1, '_id': 0, "enzymes": 1},
                          bound='loose', skip=0, limit=0):
        ''' Find the kinlaw_id defined in sabio_rk using 
            rxn participants' inchikey
Severity: Minor
Found in datanator_query_python/query/query_sabiork_old.py - About 1 hr to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"

Further reading

Similar blocks of code found in 3 locations. Consider refactoring.
Open

        pipeline = [
             {'$match': {'kegg_orthology_id': {'$in': kegg_ids}}},
             {'$addFields': {"__order": {'$indexOfArray': [kegg_ids, "$kegg_orthology_id" ]}}},
             {'$sort': {"__order": 1}},
             {"$project": projection}
Severity: Major
Found in datanator_query_python/query/query_kegg_orthology.py and 2 other locations - About 1 hr to fix
datanator_query_python/query/query_kegg_orthology.py on lines 114..118
datanator_query_python/query/query_taxon_tree.py on lines 333..337

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 49.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Identical blocks of code found in 2 locations. Consider refactoring.
Open

        for key, val in obj.items():
            if isinstance(val, list):
                add_to_set[key] = {"$each": val}
            else:
                _set[key] = val
Severity: Major
Found in datanator_query_python/util/mongo_util.py and 1 other location - About 1 hr to fix
datanator_query_python/util/mongo_util.py on lines 118..122

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 49.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 3 locations. Consider refactoring.
Open

        pipeline = [
             {'$match': {'orthodb_id': {'$in': orthodb_ids}}},
             {'$addFields': {"__order": {'$indexOfArray': [orthodb_ids, "$orthodb_id" ]}}},
             {'$sort': {"__order": 1}},
             {"$project": projection}
Severity: Major
Found in datanator_query_python/query/query_kegg_orthology.py and 2 other locations - About 1 hr to fix
datanator_query_python/query/query_kegg_orthology.py on lines 91..95
datanator_query_python/query/query_taxon_tree.py on lines 333..337

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 49.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Function extract_values has a Cognitive Complexity of 14 (exceeds 5 allowed). Consider refactoring.
Open

    def extract_values(self, obj, key):
        """Pull all values of specified key from nested JSON.
        """
        arr = []

Severity: Minor
Found in datanator_query_python/util/file_util.py - About 1 hr to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"

Further reading

Function get_equivalent_protein has a Cognitive Complexity of 14 (exceeds 5 allowed). Consider refactoring.
Open

    def get_equivalent_protein(self, _id, max_distance, max_depth=float('inf')):
        '''
            Get replacement abundance value by taxonomic distance
            with the same kegg_orthology number.

Severity: Minor
Found in datanator_query_python/query/query_protein.py - About 1 hr to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"

Further reading

Identical blocks of code found in 2 locations. Consider refactoring.
Open

        if op == "update":
            self.client[db][col].update_one(query,
                                            _update,
                                            upsert=True)
        else:
Severity: Major
Found in datanator_query_python/util/mongo_util.py and 1 other location - About 1 hr to fix
datanator_query_python/util/mongo_util.py on lines 163..168

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 45.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

        for bucket_all in r_all['top_kos']['buckets']:
            ko_all.add(bucket_all['top_ko']['hits']['hits'][0]['_source'].get(agg_field))
Severity: Major
Found in datanator_query_python/query/full_text_search.py and 1 other location - About 1 hr to fix
datanator_query_python/query/full_text_search.py on lines 414..415

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 45.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

        for bucket_abundance in r['aggregations']['top_kos']['buckets']:
            ko_abundance.add(bucket_abundance['top_ko']['hits']['hits'][0]['_source'][agg_field])
Severity: Major
Found in datanator_query_python/query/full_text_search.py and 1 other location - About 1 hr to fix
datanator_query_python/query/full_text_search.py on lines 304..305

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 45.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Severity
Category
Status
Source
Language