KarrLab/bpforms

Showing 397 of 397 total issues

Function get_structure has a Cognitive Complexity of 84 (exceeds 5 allowed). Consider refactoring.
Open

    def get_structure(self, include_all_hydrogens=False):
        """ Get an Open Babel molecule of the structure

        Args:
            include_all_hydrogens (:obj:`bool`, optional): if :obj:`True`, explicitly include all
Severity: Minor
Found in bpforms/core.py - About 1 day to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"
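As a rough illustration of the last two rules, the sketch below contrasts a nested function with a flattened equivalent. The function names and the record layout are hypothetical and are not taken from bpforms. Exact scores depend on the analyzer, so none are quoted here.

```python
def classify_nested(record):
    """Nested version: every extra level of nesting adds to the reading burden."""
    if record is not None:
        if 'taxid' in record:
            if record['taxid'] != 'unidentified':
                return 'usable'
            else:
                return 'unidentified'
        else:
            return 'missing'
    else:
        return 'empty'


def classify_flat(record):
    """Same behavior with guard clauses: the branches remain, the nesting does not."""
    if record is None:
        return 'empty'
    if 'taxid' not in record:
        return 'missing'
    if record['taxid'] == 'unidentified':
        return 'unidentified'
    return 'usable'
```

Both functions return the same result for every input; only the second keeps all conditions at a single indentation level.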

File pro.py has 715 lines of code (exceeds 250 allowed). Consider refactoring.
Open

""" Generate BpForms for all of the proteins in PRO, verify
them, and calculate their properties

:Author: Jonathan Karr <karr@mssm.edu>
:Date: 2019-06-24
Severity: Major
Found in examples/pro.py - About 1 day to fix

    Function validate_bpform_bonds has a Cognitive Complexity of 82 (exceeds 5 allowed). Consider refactoring.
    Open

    def validate_bpform_bonds(form_type):
        """ Validate bonds in alphabet
    
        Args:
            form_type (:obj:`type`): type of BpForm
    Severity: Minor
    Found in bpforms/util.py - About 1 day to fix

    Function load_pdb_from_ftp has a Cognitive Complexity of 80 (exceeds 5 allowed). Consider refactoring.
    Open

    def load_pdb_from_ftp(max_entries):
        """ read the PDB database and parse the entries
    
        """
        # get amino acid set (canonical and non-canonical) from bpforms
    Severity: Minor
    Found in examples/pdb_analysis.py - About 1 day to fix

    Similar blocks of code found in 2 locations. Consider refactoring.
    Open

                try:
                    dimer_structure = dimer_form.get_structure()[0]
                    if dimer_form.get_formula() != OpenBabelUtils.get_formula(dimer_structure):
                        errors.append('Dimer of {} has incorrect formula: {} != {}'.format(
                            monomer.id, str(dimer_form.get_formula()), str(OpenBabelUtils.get_formula(dimer_structure))))
    Severity: Major
    Found in bpforms/util.py and 1 other location - About 1 day to fix
    bpforms/util.py on lines 260..273

    Duplicated Code

    Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

    Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

    When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).
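For the monomer and dimer formula checks flagged in bpforms/util.py, one possible way to give the comparison a single representation is a small helper. This is only a sketch: `check_formula` and its parameters are hypothetical, the formulas are passed in as plain values rather than computed with `get_formula()`, and the report shows only the first lines of each `try` block, so the exception handling is left out.

```python
def check_formula(label, monomer_id, expected, actual, errors):
    """Append an error message if the expected and actual formulas differ.

    `expected` stands in for form.get_formula() and `actual` for
    OpenBabelUtils.get_formula(structure); both are assumptions here.
    """
    if expected != actual:
        errors.append('{} of {} has incorrect formula: {} != {}'.format(
            label, monomer_id, str(expected), str(actual)))


errors = []
check_formula('Monomeric form', 'A', 'C5H5N5', 'C5H5N5', errors)
check_formula('Dimer', 'A', 'C10H10N10', 'C10H9N10', errors)
```

Only the mismatching dimer call adds a message, and the message text follows the format strings shown in the two reported blocks.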

    Tuning

    This issue has a mass of 170.

    We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

    The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

    If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

    See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

    Identical blocks of code found in 2 locations. Consider refactoring.
    Open

                for i_o in range(1, 4):
                    id_o = 'OP' + str(i_o)
                    id_h = 'H' + id_o
                    if id_o in atoms and id_h in atoms:
                        atom_o = mol.GetAtom(atoms[id_o]['position'])
    Severity: Major
    Found in bpforms/alphabet/rna.py and 1 other location - About 1 day to fix
    bpforms/alphabet/dna.py on lines 491..503
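Because the same loop appears in both rna.py and dna.py, the lookup of phosphate oxygen and hydrogen ids could live in one shared function. The sketch below covers only the id matching visible in the excerpt. The name `iter_phosphate_oxygen_ids` is hypothetical, and the Open Babel calls that follow in the real code are not reproduced.

```python
def iter_phosphate_oxygen_ids(atoms):
    """Yield (oxygen id, hydrogen id) pairs that are both present in `atoms`."""
    for i_o in range(1, 4):
        id_o = 'OP' + str(i_o)
        id_h = 'H' + id_o
        if id_o in atoms and id_h in atoms:
            yield id_o, id_h


# Example atom table: OP2 has no matching HOP2, so it is skipped.
atoms = {
    'OP1': {'position': 3},
    'HOP1': {'position': 4},
    'OP2': {'position': 5},
}
pairs = list(iter_phosphate_oxygen_ids(atoms))
```

Each alphabet module could then iterate over the pairs and apply its own Open Babel operations.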

    This issue has a mass of 170.

    Identical blocks of code found in 2 locations. Consider refactoring.
    Open

                for i_o in range(1, 4):
                    id_o = 'OP' + str(i_o)
                    id_h = 'H' + id_o
                    if id_o in atoms and id_h in atoms:
                        atom_o = mol.GetAtom(atoms[id_o]['position'])
    Severity: Major
    Found in bpforms/alphabet/dna.py and 1 other location - About 1 day to fix
    bpforms/alphabet/rna.py on lines 641..653

    This issue has a mass of 170.

    Similar blocks of code found in 2 locations. Consider refactoring.
    Open

            try:
                monomer_structure = monomer_form.get_structure()[0]
                if monomer_form.get_formula() != OpenBabelUtils.get_formula(monomer_structure):
                    errors.append('Monomeric form of {} has incorrect formula: {} != {}'.format(
                        monomer.id, str(monomer_form.get_formula()), str(OpenBabelUtils.get_formula(monomer_structure))))
    Severity: Major
    Found in bpforms/util.py and 1 other location - About 1 day to fix
    bpforms/util.py on lines 277..290

    This issue has a mass of 170.

    File dna.py has 622 lines of code (exceeds 250 allowed). Consider refactoring.
    Open

    """ Alphabet and BpForm to represent modified DNA
    
    :Author: Jonathan Karr <karr@mssm.edu>
    :Date: 2019-02-05
    :Copyright: 2019, Karr Lab
    Severity: Major
    Found in bpforms/alphabet/dna.py - About 1 day to fix

      Function parse_protein has a Cognitive Complexity of 69 (exceeds 5 allowed). Consider refactoring.
      Open

      def parse_protein(protein):
          """ Parse the modification information from a term for a modified protein
      
          Args:
              protein (:obj:`dict`): term for a modified protein
      Severity: Minor
      Found in examples/pro.py - About 1 day to fix

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

                  for atom in openbabel.OBMolAtomIter(mol):
                      if atom.GetAtomicNum() == 15 and atom.GetIsotope() == 1:
                          i_left_p = atom.GetIdx()
                          atom.SetIsotope(0)
                      elif atom.GetAtomicNum() == 8 and atom.GetIsotope() == 1:
      Severity: Major
      Found in bpforms/alphabet/rna.py and 1 other location - About 1 day to fix
      bpforms/alphabet/dna.py on lines 538..547

      This issue has a mass of 155.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

                  for atom in openbabel.OBMolAtomIter(mol):
                      if atom.GetAtomicNum() == 15 and atom.GetIsotope() == 1:
                          i_left_p = atom.GetIdx()
                          atom.SetIsotope(0)
                      elif atom.GetAtomicNum() == 8 and atom.GetIsotope() == 1:
      Severity: Major
      Found in bpforms/alphabet/dna.py and 1 other location - About 1 day to fix
      bpforms/alphabet/rna.py on lines 688..697

      This issue has a mass of 155.

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

      class CanonicalProteinForm(BpForm):
          """ Canonical protein form """
      
          DEFAULT_FASTA_CODE = 'X'
      
      
      Severity: Major
      Found in bpforms/alphabet/protein.py and 1 other location - About 1 day to fix
      bpforms/alphabet/protein.py on lines 922..942
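When two classes differ only in a few details, a common parent can hold the shared parts. The sketch below is an assumption about how that might look: `_BaseProteinForm` is a hypothetical name, `BpForm` is a stand-in so the example runs on its own, and only the `DEFAULT_FASTA_CODE` attribute visible in the excerpt is modeled.

```python
class BpForm(object):
    """Stand-in for the real BpForm base class (assumption for this sketch)."""
    DEFAULT_FASTA_CODE = None


class _BaseProteinForm(BpForm):
    """Hypothetical parent that holds what both protein forms share."""
    DEFAULT_FASTA_CODE = 'X'


class ProteinForm(_BaseProteinForm):
    """ Protein form """


class CanonicalProteinForm(_BaseProteinForm):
    """ Canonical protein form """
```

Whatever actually differs between the two forms would stay in the subclasses.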

      This issue has a mass of 151.

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

      class ProteinForm(BpForm):
          """ Protein form """
      
          DEFAULT_FASTA_CODE = 'X'
      
      
      Severity: Major
      Found in bpforms/alphabet/protein.py and 1 other location - About 1 day to fix
      bpforms/alphabet/protein.py on lines 946..966

      This issue has a mass of 151.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

                      if not engineered:
                          # for now, only get entries that are exclusively from one organism
                          if len(entry.org_taxid) == 1:
                              # does not include those whose taxid is 'unidentified' or 'synthetic'
                              taxid = list(entry.org_taxid)[0]
      Severity: Major
      Found in examples/pdb_analysis.py and 1 other location - About 1 day to fix
      examples/pdb_analysis.py on lines 312..327
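The single-organism check that appears twice in pdb_analysis.py could be pulled into one function. This is a minimal sketch under stated assumptions: `get_single_taxid` is a hypothetical name, `entry` is modeled with `SimpleNamespace`, and the later filtering of 'unidentified' or 'synthetic' taxids mentioned in the comment is not shown in the excerpt and is therefore omitted.

```python
from types import SimpleNamespace


def get_single_taxid(entry):
    """Return the taxid if the entry comes from exactly one organism, else None."""
    if len(entry.org_taxid) == 1:
        return list(entry.org_taxid)[0]
    return None


# Stand-in entries; the real entry objects come from the PDB parser.
single = SimpleNamespace(org_taxid={'9606'})
mixed = SimpleNamespace(org_taxid={'9606', '10090'})
```

Both call sites could then test the returned value instead of repeating the nested `if` statements.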

      This issue has a mass of 138.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

                  if not engineered:
                      # for now, only get entries that are exclusively from one organism
                      if len(entry.org_taxid) == 1:
                          # does not include those whose taxid is 'unidentified' or 'synthetic'
                          taxid = list(entry.org_taxid)[0]
      Severity: Major
      Found in examples/pdb_analysis.py and 1 other location - About 1 day to fix
      examples/pdb_analysis.py on lines 427..442

      This issue has a mass of 138.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

                      with gzip.open(f, 'rb') as gz:
      
                          het_set = set()
      
                          line = gz.readline().decode('utf-8').strip()
      Severity: Major
      Found in examples/pdb_analysis.py and 1 other location - About 1 day to fix
      examples/pdb_analysis.py on lines 135..152
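The two flagged blocks differ mainly in how the gzip stream is opened (`gzip.open` versus `gzip.GzipFile(fileobj=...)`). Since `gzip.open` accepts either a filename or an existing file object, one reader could serve both call sites. The sketch below assumes that only line-by-line decoding is shared; `iter_gzip_lines` is a hypothetical name and the per-line parsing from the original is not reproduced.

```python
import gzip
import io


def iter_gzip_lines(source):
    """Yield decoded, stripped lines from a gzip file path or binary file object."""
    with gzip.open(source, 'rb') as gz:
        for raw_line in gz:
            yield raw_line.decode('utf-8').strip()


# In-memory example standing in for a downloaded PDB entry.
payload = io.BytesIO(gzip.compress(b'HET    ABC\nHET    XYZ\n'))
lines = list(iter_gzip_lines(payload))
```

The same function could be given a local file path or a buffer filled from an FTP download.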

      This issue has a mass of 129.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

                  with gzip.GzipFile(fileobj=f, mode='rb') as gz:
      
                      het_set = set()
      
                      line = gz.readline().decode('utf-8').strip()
      Severity: Major
      Found in examples/pdb_analysis.py and 1 other location - About 1 day to fix
      examples/pdb_analysis.py on lines 191..208

      This issue has a mass of 129.

      Function build_dnamod has a Cognitive Complexity of 54 (exceeds 5 allowed). Consider refactoring.
      Open

          def build_dnamod(self, alphabet, ph=None, major_tautomer=False, dearomatize=False):
              """ Build monomeric forms from DNAmod
      
              Args:
                  alphabet (:obj:`Alphabet`): alphabet
      Severity: Minor
      Found in bpforms/alphabet/dna.py - About 1 day to fix

      File pdb_analysis.py has 513 lines of code (exceeds 250 allowed). Consider refactoring.
      Open

      """ Parse PDB database and analyze non-canonical amino acid
      
      :Author: Mike Zheng <xzheng20@colby.edu>
      :Date: 2019-07-24
      :Copyright: 2019, Karr Lab
      Severity: Major
      Found in examples/pdb_analysis.py - About 1 day to fix