KarrLab/bpforms

Showing 397 of 397 total issues

Function diff has a Cognitive Complexity of 53 (exceeds 5 allowed). Consider refactoring.
Open

    def diff(self, other):
        """ Determine the semantic difference between two biopolymer forms

        Args:
            other (:obj:`BpForm`): another biopolymer form
Severity: Minor
Found in bpforms/core.py - About 1 day to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"
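As a rough illustration of the rules above, the sketch below writes the same logic twice: once with nested conditionals inside a loop, and once with the per-item logic moved into a helper that returns early. The function names are hypothetical and are not taken from bpforms; exact scores depend on the analyzer, but the second form has fewer nested breaks in the linear flow.

```python
def classify_nested(values):
    """Nested version: every branch sits inside the loop and inside other branches."""
    result = []
    for v in values:
        if v is not None:
            if v >= 0:
                if v % 2 == 0:
                    result.append("even")
                else:
                    result.append("odd")
            else:
                result.append("negative")
    return result


def classify_one(v):
    """Per-item logic with early returns, so no branch is nested in another."""
    if v < 0:
        return "negative"
    if v % 2 == 0:
        return "even"
    return "odd"


def classify_flat(values):
    """Same behavior as classify_nested, with the nesting factored out."""
    return [classify_one(v) for v in values if v is not None]
```

Both versions return the same list for the same input, so the refactoring can be checked with an ordinary equality test before the nested version is removed.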

Similar blocks of code found in 4 locations. Consider refactoring.
Open

class CanonicalDnaForm(BpForm):
    """ Canonical DNA form """

    DEFAULT_FASTA_CODE = 'N'

Severity: Major
Found in bpforms/alphabet/dna.py and 3 other locations - About 1 day to fix
bpforms/alphabet/dna.py on lines 725..743
bpforms/alphabet/rna.py on lines 1136..1154
bpforms/alphabet/rna.py on lines 1158..1176

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).
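One common way to remove this kind of duplication, when several classes differ only in a few constants, is to keep the shared behavior in a single base class and let each subclass override only the data. The sketch below uses hypothetical stand-in classes (`Form`, `DnaLikeForm`, `RnaLikeForm`) and a made-up `to_fasta` method; it is not the actual bpforms code, whose class bodies are not shown in this report.

```python
class Form:
    """Hypothetical base class holding the behavior shared by all forms."""

    ALPHABET = ""             # overridden by each subclass
    DEFAULT_FASTA_CODE = "N"  # shared default, defined exactly once

    def __init__(self, seq=""):
        self.seq = seq

    def to_fasta(self):
        """Replace any code outside the alphabet with the default code."""
        return "".join(
            code if code in self.ALPHABET else self.DEFAULT_FASTA_CODE
            for code in self.seq
        )


class DnaLikeForm(Form):
    """DNA-like form (illustrative only)."""
    ALPHABET = "ACGT"


class RnaLikeForm(Form):
    """RNA-like form (illustrative only)."""
    ALPHABET = "ACGU"
```

With this layout a fix to `to_fasta` is made in one place, so the subclasses cannot drift apart.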

Tuning

This issue has a mass of 124.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.
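To build intuition for how a size-based threshold behaves, the sketch below counts the nodes of a snippet's syntax tree with Python's `ast` module. This is only a rough proxy and an assumption made for illustration: it is not the engine's actual mass computation, and `approximate_mass` and `is_analyzed` are hypothetical names.

```python
import ast


def approximate_mass(source):
    """Count AST nodes in a snippet as a rough stand-in for 'mass'."""
    return sum(1 for _ in ast.walk(ast.parse(source)))


def is_analyzed(source, threshold):
    """A block is considered only if its approximate mass reaches the threshold."""
    return approximate_mass(source) >= threshold


small = "x = 1"
renamed = "y = 2"  # same structure as `small`, different names and literals
larger = "x = 1\nif x > 0:\n    x = x + 1"
```

Under this proxy, `small` and `renamed` have the same size because they share a structure, and `larger` passes thresholds that `small` does not. That matches the qualitative behavior described above: raising the threshold excludes smaller blocks from the comparison.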

Similar blocks of code found in 4 locations. Consider refactoring.
Open

class RnaForm(BpForm):
    """ RNA form """

    DEFAULT_FASTA_CODE = 'N'

Severity: Major
Found in bpforms/alphabet/rna.py and 3 other locations - About 1 day to fix
bpforms/alphabet/dna.py on lines 725..743
bpforms/alphabet/dna.py on lines 747..765
bpforms/alphabet/rna.py on lines 1158..1176

This issue has a mass of 124.

Similar blocks of code found in 4 locations. Consider refactoring.
Open

class CanonicalRnaForm(BpForm):
    """ Canonical RNA form """

    DEFAULT_FASTA_CODE = 'N'

Severity: Major
Found in bpforms/alphabet/rna.py and 3 other locations - About 1 day to fix
bpforms/alphabet/dna.py on lines 725..743
bpforms/alphabet/dna.py on lines 747..765
bpforms/alphabet/rna.py on lines 1136..1154

This issue has a mass of 124.

Similar blocks of code found in 4 locations. Consider refactoring.
Open

class DnaForm(BpForm):
    """ DNA form """

    DEFAULT_FASTA_CODE = 'N'

Severity: Major
Found in bpforms/alphabet/dna.py and 3 other locations - About 1 day to fix
bpforms/alphabet/dna.py on lines 747..765
bpforms/alphabet/rna.py on lines 1136..1154
bpforms/alphabet/rna.py on lines 1158..1176

This issue has a mass of 124.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    sorted_x_links = sorted(all_x_links, key=lambda x: (
        x['l_polymer_col'] == x['r_polymer_col'] and x['l_pos'] == x['r_pos'],
        x['l_polymer_col'], x['r_polymer_col'],
        x['l_pos'], x['r_pos'],
        x['l_polymer_row'], x['l_track'],
Severity: Major
Found in bpforms/util.py and 1 other location - About 7 hrs to fix
bpforms/util.py on lines 578..583

This issue has a mass of 118.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    sorted_x_links = sorted(all_x_links, key=lambda x: (
        x['l_polymer_row'] == x['r_polymer_row'] and x['l_track'] == x['r_track'],
        x['l_polymer_row'], x['r_polymer_row'],
        x['l_track'], x['r_track'],
        x['l_polymer_col'], x['l_pos'],
Severity: Major
Found in bpforms/util.py and 1 other location - About 7 hrs to fix
bpforms/util.py on lines 608..613
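The two `sorted_x_links` calls shown in this report use the same key shape with the roles of the column/position and row/track fields swapped. One possible refactoring, sketched here under that assumption, is a small key factory. `make_x_link_key` is a hypothetical helper, and it covers only the fields visible in the truncated snippets; any further key components in the real code would need to be added.

```python
def make_x_link_key(major, minor, tie_major, tie_minor):
    """Build a sort key for crosslink dicts.

    `major` and `minor` name the field pair compared between the left (`l_`)
    and right (`r_`) ends; `tie_major` and `tie_minor` break remaining ties.
    """
    def key(x):
        return (
            x['l_' + major] == x['r_' + major] and x['l_' + minor] == x['r_' + minor],
            x['l_' + major], x['r_' + major],
            x['l_' + minor], x['r_' + minor],
            x['l_' + tie_major], x['l_' + tie_minor],
        )
    return key


# One key per ordering, built from the same helper.
by_col = make_x_link_key('polymer_col', 'pos', 'polymer_row', 'track')
by_row = make_x_link_key('polymer_row', 'track', 'polymer_col', 'pos')
```

Each call site would then read `sorted(all_x_links, key=by_col)` or `sorted(all_x_links, key=by_row)`, and the key layout would be defined once.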

This issue has a mass of 118.

Function export_ontos_to_obo has a Cognitive Complexity of 49 (exceeds 5 allowed). Consider refactoring.
Open

def export_ontos_to_obo(alphabets=None, filename=None, _max_monomers=None, _max_xlinks=None):
    """ Exports alphabets of residues and ontology of crosslinks to OBO format

    Args:
        alphabets (:obj:`list` of :obj:`core.Alphabet`, optional): alphabets to export        
Severity: Minor
Found in bpforms/util.py - About 7 hrs to fix

Monomer has 51 functions (exceeds 20 allowed). Consider refactoring.
Open

class Monomer(object):
    """ A monomeric form in a biopolymer

    Attributes:
        id (:obj:`str`): id
Severity: Major
Found in bpforms/core.py - About 7 hrs to fix

    Function is_c_terminus has a Cognitive Complexity of 45 (exceeds 5 allowed). Consider refactoring.
    Open

        def is_c_terminus(self, mol, atom, residue=True, convert_to_aa=False):
            """ Determine if an atom is a C-terminus
    
            Args:
                mol (:obj:`openbabel.OBMol`): molecule
    Severity: Minor
    Found in bpforms/alphabet/protein.py - About 6 hrs to fix

    Similar blocks of code found in 2 locations. Consider refactoring.
    Open

                elif modification['monomer'].startswith('CHEBI:'):
                    if include_annotations:
                        monomer = bpforms.Monomer().from_dict(
                            monomers[modification['residue']].to_dict(
                                alphabet=bpforms.protein_alphabet),
    Severity: Major
    Found in examples/pro.py and 1 other location - About 6 hrs to fix
    examples/pro.py on lines 624..636

    This issue has a mass of 108.

    Similar blocks of code found in 2 locations. Consider refactoring.
    Open

                if modification['monomer'] == 'PR:000026291':
                    if include_annotations:
                        monomer = bpforms.Monomer().from_dict(
                            monomers[modification['residue']].to_dict(
                                alphabet=bpforms.protein_alphabet),
    Severity: Major
    Found in examples/pro.py and 1 other location - About 6 hrs to fix
    examples/pro.py on lines 638..650

    This issue has a mass of 108.

    Function run_trna has a Cognitive Complexity of 41 (exceeds 5 allowed). Consider refactoring.
    Open

    def run_trna(session, modomics_short_code_to_monomer, monomer_codes, out_filename):
        response = session.get(URL, params={
            'RNA_type': 'tRNA',
            'RNA_subtype': 'all',
            'organism': 'all species',
    Severity: Minor
    Found in examples/modomics.py - About 6 hrs to fix

    Function build_modomics has a Cognitive Complexity of 41 (exceeds 5 allowed). Consider refactoring.
    Open

        def build_modomics(self, alphabet, session, ph=None, major_tautomer=False, dearomatize=False):
            """ Build alphabet from MODOMICS
    
            Args:
                alphabet (:obj:`Alphabet`): alphabet
    Severity: Minor
    Found in bpforms/alphabet/rna.py - About 6 hrs to fix

    Identical blocks of code found in 2 locations. Consider refactoring.
    Open

                if pdb_monomer.id not in ['A3A', 'DNR']:
                    monomer = smiles_to_monomer.get(can_smiles, None)
                    if monomer is not None:
                        for identifier in monomer.identifiers:
                            if identifier.ns == 'pdb-ccd':
    Severity: Major
    Found in bpforms/alphabet/dna.py and 1 other location - About 6 hrs to fix
    bpforms/alphabet/rna.py on lines 783..793

    This issue has a mass of 100.

    Identical blocks of code found in 2 locations. Consider refactoring.
    Open

                if pdb_monomer.id not in ['A5O', 'AP7', 'CAR', 'GAO', 'UAR']:
                    monomer = smiles_to_monomer.get(can_smiles, None)
                    if monomer is not None:
                        for identifier in monomer.identifiers:
                            if identifier.ns == 'pdb-ccd':
    Severity: Major
    Found in bpforms/alphabet/rna.py and 1 other location - About 6 hrs to fix
    bpforms/alphabet/dna.py on lines 633..643
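The DNA and RNA blocks reported here differ, in the visible lines, only in the list of excluded PDB ids. A possible way to share them is to pass that list as a parameter. The sketch below is an assumption-laden illustration: `find_known_monomer`, `identifiers_in_ns`, and the stub types are hypothetical, and the code that follows the `identifier.ns == 'pdb-ccd'` check in the original is truncated in this report, so it is not reproduced.

```python
from collections import namedtuple

# Minimal stand-ins so the sketch runs on its own (not the bpforms classes).
StubIdentifier = namedtuple("StubIdentifier", ["ns", "id"])
StubMonomer = namedtuple("StubMonomer", ["identifiers"])

# The only visible difference between the two blocks.
DNA_EXCLUDED_IDS = frozenset(['A3A', 'DNR'])
RNA_EXCLUDED_IDS = frozenset(['A5O', 'AP7', 'CAR', 'GAO', 'UAR'])


def find_known_monomer(pdb_id, can_smiles, smiles_to_monomer, excluded_ids):
    """Return the monomer registered for a SMILES string, or None.

    Entries whose PDB id is in `excluded_ids` are never looked up.
    """
    if pdb_id in excluded_ids:
        return None
    return smiles_to_monomer.get(can_smiles)


def identifiers_in_ns(monomer, ns='pdb-ccd'):
    """List the identifiers of `monomer` that belong to namespace `ns`."""
    return [identifier for identifier in monomer.identifiers if identifier.ns == ns]
```

Each alphabet module would then call `find_known_monomer` with its own exclusion set instead of repeating the lookup.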

    This issue has a mass of 100.

    BpForm has 45 functions (exceeds 20 allowed). Consider refactoring.
    Open

    class BpForm(object):
        """ Biopolymer form
    
        Attributes:
            seq (:obj:`MonomerSequence`): sequence of monomeric forms of the biopolymer
    Severity: Minor
    Found in bpforms/core.py - About 6 hrs to fix

      Function parse_pdb_ccd_entry has a Cognitive Complexity of 39 (exceeds 5 allowed). Consider refactoring.
      Open

      def parse_pdb_ccd_entry(xml_file, valid_types):
          """ Parse an entry of the PDB CCD
      
          Args:
              xml_file (:obj:`io.BufferedReader`): XML file
      Severity: Minor
      Found in bpforms/alphabet/core.py - About 5 hrs to fix

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

                  if i_left_p is not None and i_left_o is not None and \
                          ('HOP3' in atoms or mol.GetAtom(i_left_o).GetFormalCharge() == -1):
                      pdb_monomer.l_bond_atoms = [Atom(Monomer, element='P', position=i_left_p)]
                      pdb_monomer.l_displaced_atoms = [
                          Atom(Monomer, element='O', position=i_left_o, charge=-1)]
      Severity: Major
      Found in bpforms/alphabet/rna.py and 1 other location - About 5 hrs to fix
      bpforms/alphabet/dna.py on lines 586..590

      This issue has a mass of 92.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

                  if i_left_p is not None and i_left_o is not None and \
                          ('HOP3' in atoms or mol.GetAtom(i_left_o).GetFormalCharge() == -1):
                      pdb_monomer.l_bond_atoms = [Atom(Monomer, element='P', position=i_left_p)]
                      pdb_monomer.l_displaced_atoms = [
                          Atom(Monomer, element='O', position=i_left_o, charge=-1)]
      Severity: Major
      Found in bpforms/alphabet/dna.py and 1 other location - About 5 hrs to fix
      bpforms/alphabet/rna.py on lines 736..740
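Because this five-line block is identical in the DNA and RNA modules, it could be moved into one shared function that both call. The sketch below is a minimal version under stated assumptions: `set_left_bond_atoms` and the `make_atom` callback are hypothetical names, and the stub classes stand in for the Open Babel molecule and the bpforms `Atom` class so that the example runs on its own.

```python
from types import SimpleNamespace


def set_left_bond_atoms(pdb_monomer, mol, atoms, i_left_p, i_left_o, make_atom):
    """Set the left bond and displaced atoms when the phosphate pattern matches.

    Returns True if the monomer was updated, False otherwise.
    """
    if i_left_p is None or i_left_o is None:
        return False
    # Same condition as the duplicated block; GetAtom is only called
    # when 'HOP3' is absent from `atoms`.
    if 'HOP3' not in atoms and mol.GetAtom(i_left_o).GetFormalCharge() != -1:
        return False
    pdb_monomer.l_bond_atoms = [make_atom(element='P', position=i_left_p)]
    pdb_monomer.l_displaced_atoms = [
        make_atom(element='O', position=i_left_o, charge=-1)]
    return True


class StubMol:
    """Stand-in exposing only the two molecule methods the helper uses."""

    def __init__(self, charges):
        self._charges = charges

    def GetAtom(self, index):
        return SimpleNamespace(GetFormalCharge=lambda: self._charges[index])


def stub_atom(**kwargs):
    """Stand-in atom factory that simply records its keyword arguments."""
    return dict(kwargs)
```

In the real modules, `make_atom` would presumably wrap the existing constructor, for example `lambda **kw: Atom(Monomer, **kw)`; that wiring is an assumption and is not shown in the report.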

      This issue has a mass of 92.
