KarrLab/wc_kb_gen

Showing 32 of 32 total issues

Function gen_tus has a Cognitive Complexity of 52 (exceeds 5 allowed). Consider refactoring.
Open

    def gen_tus(self):
        """ Creates transcription units with 5'/3' UTRs, polycistronic mRNAs, and other types of RNA (tRNA, rRNA, sRNA) """

        options = self.options
        five_prime_len = options.get('five_prime_len') # 7 bp default (E. coli, wikipedia)
Severity: Minor
Found in wc_kb_gen/random/genome.py - About 1 day to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"
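
The nesting rule above can be illustrated with a small sketch. The function names and the classification logic are hypothetical and not taken from wc_kb_gen; the point is only that both versions behave identically, while the second replaces nested branches with guard clauses and so would typically score lower.

```python
def classify_nested(rna_type, length):
    """Nested form: every extra level of if/else is a nested break in flow."""
    if rna_type is not None:
        if length > 0:
            if rna_type == "mRNA":
                return "coding"
            else:
                return "non-coding"
        else:
            return "empty"
    else:
        return "unknown"


def classify_flat(rna_type, length):
    """Same behavior with guard clauses: early returns keep the flow linear."""
    if rna_type is None:
        return "unknown"
    if length <= 0:
        return "empty"
    return "coding" if rna_type == "mRNA" else "non-coding"
```

Applied to a long function such as gen_tus, the same idea usually means returning or continuing early and moving inner loops into small helper functions.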

Further reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

        for prot in cell.species_types.get(__type=wc_kb.prokaryote.ProteinSpeciesType):
            prot_specie = prot.species.get_or_create(compartment=cytosol)
            conc = round(abs(random.normal(loc=mean_protein_copy_number,scale=15))) / scipy.constants.Avogadro / mean_volume
            cell.concentrations.get_or_create(id='CONC({})'.format(prot_specie.id), species=prot_specie, value=conc, units=unit_registry.parse_units('M'))
Severity: Major
Found in wc_kb_gen/random/genome.py and 1 other location - About 7 hrs to fix
wc_kb_gen/random/genome.py on lines 437..440
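
One possible way to remove this duplication is to move the shared arithmetic into a helper that both the protein loop and the RNA loop call. The sketch below is an assumption-laden illustration, not the project's code: the helper names are hypothetical, `random.gauss` from the standard library stands in for the `random.normal` call in the excerpt, and Avogadro's constant is written out instead of imported from `scipy.constants`.

```python
import random

AVOGADRO = 6.02214076e23  # mol^-1 (exact SI value); stand-in for scipy.constants.Avogadro


def copy_number_to_molar(copy_number, volume_l):
    """Convert a per-cell copy number to a molar concentration."""
    return copy_number / AVOGADRO / volume_l


def sample_concentration(mean_copy_number, volume_l, scale=15, rng=random):
    """Draw a non-negative integer copy number and convert it to molar units.

    `rng` is any object with a gauss(mu, sigma) method (hypothetical parameter,
    added here so the draw can be seeded in tests).
    """
    copies = round(abs(rng.gauss(mean_copy_number, scale)))
    return copy_number_to_molar(copies, volume_l)
```

With such a helper, each loop body would shrink to creating the species and passing `sample_concentration(...)` as the `value` argument, so the two loops differ only in the species type and the mean copy number.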

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 111.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

        for rna in cell.species_types.get(__type=wc_kb.prokaryote.RnaSpeciesType):
            rna_specie = rna.species.get_or_create(compartment=cytosol)
            conc = round(abs(random.normal(loc=mean_rna_copy_number,scale=15))) / scipy.constants.Avogadro / mean_volume
            cell.concentrations.get_or_create(id='CONC({})'.format(rna_specie.id), species=rna_specie, value=conc, units=unit_registry.parse_units('M'))
Severity: Major
Found in wc_kb_gen/random/genome.py and 1 other location - About 7 hrs to fix
wc_kb_gen/random/genome.py on lines 442..445

This issue has a mass of 111.

Function gen_genome has a Cognitive Complexity of 38 (exceeds 5 allowed). Consider refactoring.
Open

    def gen_genome(self):
        '''Construct knowledge base components and generate the DNA sequence'''

        # get options
        options = self.options
Severity: Minor
Found in wc_kb_gen/random/genome.py - About 5 hrs to fix

File genome.py has 372 lines of code (exceeds 250 allowed). Consider refactoring.
Open

"""
:Author: Ashwin Srinivasan <ashwins@mit.edu>
:Author: Bilal Shaikh <bilal.shaikh@columbia.edu>
:Author: Balazs Szigeti <balazs.szigeti@mssm.edu>
:Date: 2018-06-06
Severity: Minor
Found in wc_kb_gen/random/genome.py - About 4 hrs to fix

    Function gen_rnas_proteins has a Cognitive Complexity of 22 (exceeds 5 allowed). Consider refactoring.
    Open

        def gen_rnas_proteins(self):
            """ Creates RNA and protein objects corresponding to genes on chromosome. """
    
            cell = self.knowledge_base.cell
            options = self.options
    Severity: Minor
    Found in wc_kb_gen/random/genome.py - About 3 hrs to fix

    Function clean_and_validate_options has 65 lines of code (exceeds 25 allowed). Consider refactoring.
    Open

        def clean_and_validate_options(self):
            """ Apply default options and validate options """
    
            # Default options are loosely based on Escherichia coli K-12
            # Nucleic Acids Research 41:D605-12 2013
    Severity: Major
    Found in wc_kb_gen/random/genome.py - About 2 hrs to fix

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

              if self.knowledge_base.cell.parameters.get_one(id='mean_volume') is not None:
                  mean_volume = self.knowledge_base.cell.parameters.get_one(id='mean_volume').value
              else:
                  mean_volume = 0.000000000000000067
                  print('"mean_volume" parameter is missing, using Mycoplasma pneumoniae value (6.7E-17L).')
      Severity: Major
      Found in wc_kb_gen/random/genome.py and 1 other location - About 2 hrs to fix
      wc_kb_gen/random/complex.py on lines 46..50
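
Since the same fallback appears in both genome.py and complex.py, a shared helper is one way to keep the default in a single place. The following is a minimal sketch under stated assumptions: `get_mean_volume` and `DEFAULT_MEAN_VOLUME_L` are hypothetical names, and the only thing assumed about the parameters collection is the `get_one(id=...)` call and `.value` attribute visible in the excerpt. `SimpleNamespace` objects stand in for the real knowledge base so the example runs on its own.

```python
from types import SimpleNamespace

# Fallback quoted in the excerpt's message (Mycoplasma pneumoniae, 6.7E-17 L)
DEFAULT_MEAN_VOLUME_L = 6.7e-17


def get_mean_volume(parameters, default=DEFAULT_MEAN_VOLUME_L):
    """Return the 'mean_volume' parameter value, or the default if it is absent."""
    param = parameters.get_one(id='mean_volume')  # looked up once, not twice
    if param is None:
        print('"mean_volume" parameter is missing, '
              'using Mycoplasma pneumoniae value (6.7E-17L).')
        return default
    return param.value


# Stand-ins for cell.parameters, for illustration only
missing = SimpleNamespace(get_one=lambda id: None)
present = SimpleNamespace(get_one=lambda id: SimpleNamespace(value=1e-15))
```

Both call sites could then reduce to a single line that passes `self.knowledge_base.cell.parameters` to the helper.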

      This issue has a mass of 54.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

              if self.knowledge_base.cell.parameters.get_one(id='mean_volume') is not None:
                  mean_volume = self.knowledge_base.cell.parameters.get_one(id='mean_volume').value
              else:
                  mean_volume = 0.000000000000000067
                  print('"mean_volume" parameter is missing, using Mycoplasma pneumoniae value (6.7E-17L).')
      Severity: Major
      Found in wc_kb_gen/random/complex.py and 1 other location - About 2 hrs to fix
      wc_kb_gen/random/genome.py on lines 431..435

      This issue has a mass of 54.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

              with open(seq_path, 'w') as file:
                  writer = Bio.SeqIO.FastaIO.FastaWriter(
                      file, wrap=70, record2title=lambda record: record.id)
                  writer.write_file(dna_seqs)
      Severity: Major
      Found in wc_kb_gen/random/genome.py and 1 other location - About 2 hrs to fix
      wc_kb_gen/random/genome.py on lines 478..481
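
The two identical write blocks could be folded into one function that both call sites share. The sketch below deliberately avoids Biopython so that it runs without extra dependencies: it is a simplified stand-in for the `FastaWriter` call in the excerpt, taking `(record_id, sequence)` string pairs rather than Biopython records. The function names are hypothetical.

```python
def format_fasta(records, wrap=70):
    """Render (record_id, sequence) pairs as FASTA text, wrapping sequence lines."""
    lines = []
    for record_id, sequence in records:
        lines.append('>{}'.format(record_id))  # title line is just the record id
        lines.extend(sequence[i:i + wrap] for i in range(0, len(sequence), wrap))
    return ''.join(line + '\n' for line in lines)


def write_fasta(seq_path, records, wrap=70):
    """Write the records to seq_path; the single place where the file is opened."""
    with open(seq_path, 'w') as file:
        file.write(format_fasta(records, wrap))
```

In the real module the body of `write_fasta` would presumably keep the existing Biopython writer; what matters for the duplication report is that the open-and-write logic lives in one function.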

      This issue has a mass of 50.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

                  with open(seq_path, 'w') as file:
                      writer = Bio.SeqIO.FastaIO.FastaWriter(
                          file, wrap=70, record2title=lambda record: record.id)
                      writer.write_file(dna_seqs)
      Severity: Major
      Found in wc_kb_gen/random/genome.py and 1 other location - About 2 hrs to fix
      wc_kb_gen/random/genome.py on lines 280..283

      This issue has a mass of 50.

      Similar blocks of code found in 4 locations. Consider refactoring.
      Open

                          tu = self.knowledge_base.cell.loci.get_or_create(
                              id='tu_{}_{}'.format(i_chr + 1, i_gene + 1), __type=wc_kb.prokaryote.TranscriptionUnitLocus)
      Severity: Major
      Found in wc_kb_gen/random/genome.py and 3 other locations - About 1 hr to fix
      wc_kb_gen/random/genome.py on lines 243..244
      wc_kb_gen/random/genome.py on lines 316..317
      wc_kb_gen/random/genome.py on lines 353..354
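
The four flagged call sites differ only in the id prefix (`tu` or `gene`) and the locus type, so one option is a small helper that builds the id and performs the lookup. This is a sketch with hypothetical helper names; it assumes only the `get_or_create(id=..., __type=...)` call shape shown in the excerpt.

```python
def locus_id(prefix, i_chr, i_gene):
    """Build a 1-based locus id such as 'tu_1_3' from 0-based loop indices."""
    return '{}_{}_{}'.format(prefix, i_chr + 1, i_gene + 1)


def get_or_create_locus(loci, prefix, i_chr, i_gene, locus_type):
    """Look up or create a locus; `loci` is expected to behave like cell.loci."""
    # The keyword is passed through a dict so the helper also works if it is
    # later moved into a class body, where a bare `__type=` would be name-mangled.
    return loci.get_or_create(id=locus_id(prefix, i_chr, i_gene),
                              **{'__type': locus_type})
```

Each call site would then name only what varies, for example the prefix `'tu'` together with `wc_kb.prokaryote.TranscriptionUnitLocus`.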

      This issue has a mass of 45.

      Similar blocks of code found in 4 locations. Consider refactoring.
      Open

                              tu = self.knowledge_base.cell.loci.get_or_create(
                                  id='tu_{}_{}'.format(i_chr + 1, i_gene + 1), __type=wc_kb.prokaryote.TranscriptionUnitLocus)
      Severity: Major
      Found in wc_kb_gen/random/genome.py and 3 other locations - About 1 hr to fix
      wc_kb_gen/random/genome.py on lines 243..244
      wc_kb_gen/random/genome.py on lines 353..354
      wc_kb_gen/random/genome.py on lines 364..365

      This issue has a mass of 45.

      Similar blocks of code found in 4 locations. Consider refactoring.
      Open

                      gene = self.knowledge_base.cell.loci.get_or_create(
                          id='gene_{}_{}'.format(i_chr + 1, i_gene + 1), __type=wc_kb.prokaryote.GeneLocus)
      Severity: Major
      Found in wc_kb_gen/random/genome.py and 3 other locations - About 1 hr to fix
      wc_kb_gen/random/genome.py on lines 316..317
      wc_kb_gen/random/genome.py on lines 353..354
      wc_kb_gen/random/genome.py on lines 364..365

      This issue has a mass of 45.

      Similar blocks of code found in 4 locations. Consider refactoring.
      Open

                              tu = self.knowledge_base.cell.loci.get_or_create(
                                  id='tu_{}_{}'.format(i_chr + 1, i_gene + 1), __type=wc_kb.prokaryote.TranscriptionUnitLocus)
      Severity: Major
      Found in wc_kb_gen/random/genome.py and 3 other locations - About 1 hr to fix
      wc_kb_gen/random/genome.py on lines 243..244
      wc_kb_gen/random/genome.py on lines 316..317
      wc_kb_gen/random/genome.py on lines 364..365

      This issue has a mass of 45.

      Function gen_components has a Cognitive Complexity of 10 (exceeds 5 allowed). Consider refactoring.
      Open

          def gen_components(self):
              """ Takes random samples of the generated rnas and proteins and assigns them functions based on the included list of proteins and rnas"""
      
              cell = self.knowledge_base.cell
              cytosol = cell.compartments.get_one(id='c')
      Severity: Minor
      Found in wc_kb_gen/random/observables.py - About 1 hr to fix

      Function gen_genome has 26 lines of code (exceeds 25 allowed). Consider refactoring.
      Open

          def gen_genome(self):
              '''Construct knowledge base components and generate the DNA sequence'''
      
              # get options
              options = self.options
      Severity: Minor
      Found in wc_kb_gen/random/genome.py - About 1 hr to fix

        Identical blocks of code found in 2 locations. Consider refactoring.
        Open

                    codons = [a + b + c for a in bases for b in bases for c in bases]
        Severity: Minor
        Found in wc_kb_gen/random/genome.py and 1 other location - About 45 mins to fix
        wc_kb_gen/random/observables.py on lines 36..36
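
Because genome.py and observables.py build the same codon list, a single shared function (or module-level constant) imported by both would remove the duplicate. A minimal sketch, assuming `bases` is an ordered sequence of single-character strings; the name `all_codons` and the base ordering `'ACGT'` used in the example are assumptions, not taken from the project.

```python
from itertools import product


def all_codons(bases):
    """Return every three-base combination, in the same order as the
    triple-loop comprehension in the excerpt."""
    return [''.join(triplet) for triplet in product(bases, repeat=3)]


# Example with an assumed base ordering
CODONS = all_codons('ACGT')
```

`itertools.product` iterates with the rightmost position varying fastest, which matches the nested `for a ... for b ... for c` order of the original comprehension.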

        This issue has a mass of 35.

        Avoid deeply nested control flow statements.
        Open

                                for gene in tu.genes:
                                    # creates ProteinSpeciesType object for corresponding protein sequence(s)
                                    prot = self.knowledge_base.cell.species_types.get_or_create(
                                        id='prot_{}'.format(gene.id),
                                        __type=wc_kb.prokaryote.ProteinSpeciesType)
        Severity: Major
        Found in wc_kb_gen/random/genome.py - About 45 mins to fix

          Identical blocks of code found in 2 locations. Consider refactoring.
          Open

                      codons = [a + b + c for a in bases for b in bases for c in bases]
          Severity: Minor
          Found in wc_kb_gen/random/observables.py and 1 other location - About 45 mins to fix
          wc_kb_gen/random/genome.py on lines 460..460

          This issue has a mass of 35.
