NationalGenomicsInfrastructure/ngi_pipeline

Showing 242 of 242 total issues

Function parse_flowcell has a Cognitive Complexity of 25 (exceeds 5 allowed). Consider refactoring.
Open

def parse_flowcell(fc_dir):
    """
    Traverse a CASAVA-1.8 or 2.5 generated directory structure for the HiSeq 2500
    and return a dictionary of the elements it contains.

Severity: Minor
Found in ngi_pipeline/conductor/flowcell.py - About 3 hrs to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"
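
The last two rules can be illustrated with a small sketch. The two functions below are hypothetical (they are not taken from ngi_pipeline; only the status strings are borrowed from the excerpts on this page) and make the same decisions. The first nests its branches; the second uses early returns so that no flow-breaking structure sits inside another.

```python
def should_analyze_nested(status, restart_running, restart_finished):
    """Nested version: every inner if/else adds to what the reader must track."""
    if status == "UNDER_ANALYSIS":
        if restart_running:
            return True
        else:
            return False
    else:
        if status == "ANALYZED":
            if restart_finished:
                return True
            else:
                return False
        else:
            return True


def should_analyze_flat(status, restart_running, restart_finished):
    """Same decisions as above, written as guard clauses without nesting."""
    if status == "UNDER_ANALYSIS":
        return restart_running
    if status == "ANALYZED":
        return restart_finished
    return True
```

Both versions return the same result for every input; the flat one simply has fewer nested breaks in the linear flow, which is what the metric penalizes.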

Function record_project_job has a Cognitive Complexity of 24 (exceeds 5 allowed). Consider refactoring.
Open

def record_project_job(project, job_id, analysis_dir, workflow=None, engine='rna_ngi', run_mode='local', config=None, config_file_path=None):
    with get_session() as db_session:
        project_db_obj=ProjectAnalysis(project_id=project.project_id,
                                        job_id=job_id,
                                        project_name=project.name,
Severity: Minor
Found in ngi_pipeline/engines/rna_ngi/local_process_tracking.py - About 3 hrs to fix

Function update_analysis has a Cognitive Complexity of 24 (exceeds 5 allowed). Consider refactoring.
Open

def update_analysis(project_id, status):
    charon_session=CharonSession()
    mail_analysis(project_id, engine_name='rna_ngi', level='INFO' if status else 'ERROR')
    new_sample_status='ANALYZED' if status else 'FAILED'
    new_seqrun_status='DONE' if status else 'FAILED'
Severity: Minor
Found in ngi_pipeline/engines/rna_ngi/local_process_tracking.py - About 3 hrs to fix

Similar blocks of code found in 3 locations. Consider refactoring.
Open

    matches = sorted(matches, key=lambda x:(int(x[0]) if x[0] != 'X' else x[0], int(x[1])))
Severity: Major
Found in scripts/gt_concordance.py and 2 other locations - About 3 hrs to fix
scripts/gt_concordance.py on lines 417..417
scripts/gt_concordance.py on lines 418..418

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).
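
One possible way to remove the three-way duplication flagged in scripts/gt_concordance.py is to give the shared sort key a name. This is only a sketch: `position_key` and `sort_by_position` are hypothetical names, and the key logic is copied unchanged from the flagged lines.

```python
def position_key(record):
    """Sort key from the flagged lines: first field as int unless it is 'X', second field as int."""
    first, second = record[0], record[1]
    return (int(first) if first != 'X' else first, int(second))


def sort_by_position(records):
    """Return the records sorted with the shared key."""
    return sorted(records, key=position_key)


# The three flagged lines would then read (names as in the excerpts):
# matches = sort_by_position(matches)
# mismatches = sort_by_position(mismatches)
# lost = sort_by_position(lost)
```

The sketch keeps the original behavior, including the mixed int/'X' key. Under Python 3, comparing an int key against 'X' raises a TypeError, so a list that contains both kinds of entries would need a different key; that is outside what this issue reports.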

Tuning

This issue has a mass of 66.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.
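
As a rough illustration of how a mass threshold behaves, the sketch below assumes that mass grows with the number of syntax-tree nodes in a block. The engine's actual calculation may differ; `approximate_mass` and `passes_threshold` are hypothetical helpers, not part of codeclimate-duplication.

```python
import ast


def approximate_mass(source):
    """Count AST nodes in the source as a stand-in for a block's mass (an assumption)."""
    return sum(1 for _ in ast.walk(ast.parse(source)))


def passes_threshold(source, threshold):
    """A block is only considered for comparison once its size reaches the threshold."""
    return approximate_mass(source) >= threshold
```

With a proxy like this, a short assignment falls below a high threshold and is ignored, while a longer expression such as the flagged `sorted(...)` call counts more nodes and is more likely to be compared.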

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Similar blocks of code found in 3 locations. Consider refactoring.
Open

    mismatches = sorted(mismatches, key=lambda x:(int(x[0]) if x[0] != 'X' else x[0], int(x[1])))
Severity: Major
Found in scripts/gt_concordance.py and 2 other locations - About 3 hrs to fix
scripts/gt_concordance.py on lines 416..416
scripts/gt_concordance.py on lines 418..418

This issue has a mass of 66.

Similar blocks of code found in 3 locations. Consider refactoring.
Open

    lost = sorted(lost, key=lambda x:(int(x[0]) if x[0] != 'X' else x[0], int(x[1])))
Severity: Major
Found in scripts/gt_concordance.py and 2 other locations - About 3 hrs to fix
scripts/gt_concordance.py on lines 416..416
scripts/gt_concordance.py on lines 417..417

This issue has a mass of 66.

CharonSession has 28 functions (exceeds 20 allowed). Consider refactoring.
Open

class CharonSession(six.with_metaclass(Singleton, requests.Session)):
    
    def __init__(self, config=None, config_file_path=None):
        super(CharonSession, self).__init__()

Severity: Minor
Found in ngi_pipeline/database/classes.py - About 3 hrs to fix

Function handle_sample_status has a Cognitive Complexity of 23 (exceeds 5 allowed). Consider refactoring.
Open

def handle_sample_status(analysis_object, sample, charon_reported_status):
    """ returns true or false whether the sample should be analyzed"""
    if charon_reported_status == "UNDER_ANALYSIS":
        if not analysis_object.restart_running_jobs:
            error_text = ('Charon reports seqrun analysis for project "{}" '
Severity: Minor
Found in ngi_pipeline/engines/utils.py - About 3 hrs to fix

Function sbatch_piper_sample has a Cognitive Complexity of 23 (exceeds 5 allowed). Consider refactoring.
Open

def sbatch_piper_sample(command_line_list, workflow_name, project, sample,
                        libprep=None, restart_finished_jobs=False, files_to_copy=None,
                        config=None, config_file_path=None):
    """sbatch a piper sample-level workflow.

Severity: Minor
Found in ngi_pipeline/engines/piper_ngi/launchers.py - About 3 hrs to fix

Similar blocks of code found in 2 locations. Consider refactoring.
Open

            except CharonError as e:
                error_text = ('Unable to update Charon for {}: '
                              '{}'.format(label, e))
                LOG.error(error_text)
                if not config.get('quiet'):
Severity: Major
Found in ngi_pipeline/engines/piper_ngi/local_process_tracking.py and 1 other location - About 3 hrs to fix
ngi_pipeline/engines/piper_ngi/local_process_tracking.py on lines 231..252

This issue has a mass of 65.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

                        try:
                            remote_sample=charon_session.sample_get(projectid=project_id, sampleid=sample_id)
                            charon_status = remote_sample.get(sample_status_field)
                            if charon_status and not charon_status == set_status:
                                LOG.warning('Tracking inconsistency for {}: Charon status '
Severity: Major
Found in ngi_pipeline/engines/piper_ngi/local_process_tracking.py and 1 other location - About 3 hrs to fix
ngi_pipeline/engines/piper_ngi/local_process_tracking.py on lines 255..260

This issue has a mass of 65.

File database.py has 301 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import contextlib

from ngi_pipeline.database.classes import CharonError, CharonSession
from ngi_pipeline.engines.sarek.process import ProcessRunning, ProcessExitStatusSuccessful, \
    ProcessExitStatusFailed, ProcessExitStatusUnknown, ProcessConnector, SlurmConnector
Severity: Minor
Found in ngi_pipeline/engines/sarek/database.py - About 3 hrs to fix

Function check_concordance has a Cognitive Complexity of 22 (exceeds 5 allowed). Consider refactoring.
Open

def check_concordance(sample, vcf_data, gt_data, config):
    project = sample.split('_')[0]
    matches = []
    mismatches = []
    lost = []
Severity: Minor
Found in scripts/gt_concordance.py - About 3 hrs to fix

Function match_files_under_dir has a Cognitive Complexity of 22 (exceeds 5 allowed). Consider refactoring.
Open

def match_files_under_dir(dirname, pattern, pt_style="regex", realpath=True):
    """Find all the files under a directory that match pattern.

    :param str dirname: The directory under which to search
    :param str pattern: The pattern against which to match
Severity: Minor
Found in ngi_pipeline/utils/filesystem.py - About 3 hrs to fix

SarekAnalysis has 26 functions (exceeds 20 allowed). Consider refactoring.
Open

class SarekAnalysis(object):
    """
    Base class for the SarekAnalysis engine. This class contains the necessary methods for configuring and launching
    an analysis with the Sarek engine. However, some methods are not implemented (they are "abstract") and are
    expected to be implemented in subclasses, providing interfaces to the specialized analysis modes (e.g. Germline or
Severity: Minor
Found in ngi_pipeline/engines/sarek/models/sarek.py - About 3 hrs to fix

Function update_coverage_for_sample_seqruns has a Cognitive Complexity of 21 (exceeds 5 allowed). Consider refactoring.
Open

def update_coverage_for_sample_seqruns(project_id, sample_id, piper_qc_dir,
                                       config=None, config_file_path=None):
    """Find all the valid seqruns for a particular sample, parse their
    qualimap output files, and update Charon with the mean autosomal
    coverage for each.
Severity: Minor
Found in ngi_pipeline/engines/piper_ngi/local_process_tracking.py - About 2 hrs to fix

Function determine_library_prep_from_fcid has a Cognitive Complexity of 21 (exceeds 5 allowed). Consider refactoring.
Open

def determine_library_prep_from_fcid(project_id, sample_name, fcid):
    """Use the information in the database to get the library prep id
    from the project name, sample name, and flowcell id.

    :param str project_id: The ID of the project
Severity: Minor
Found in ngi_pipeline/utils/parsers.py - About 2 hrs to fix

Function sbatch_piper_sample has 65 lines of code (exceeds 25 allowed). Consider refactoring.
Open

def sbatch_piper_sample(command_line_list, workflow_name, project, sample,
                        libprep=None, restart_finished_jobs=False, files_to_copy=None,
                        config=None, config_file_path=None):
    """sbatch a piper sample-level workflow.

Severity: Major
Found in ngi_pipeline/engines/piper_ngi/launchers.py - About 2 hrs to fix

Function run_genotype_sample has a Cognitive Complexity of 19 (exceeds 5 allowed). Consider refactoring.
Open

def run_genotype_sample(context, sample, force=None):
    config = context.obj
    project = sample.split('_')[0]
    output_path = os.path.join(config.get('ANALYSIS_PATH'), project, 'piper_ngi/03_genotype_concordance')

Severity: Minor
Found in scripts/gt_concordance.py - About 2 hrs to fix

Function workflow_fastq_screen has a Cognitive Complexity of 19 (exceeds 5 allowed). Consider refactoring.
Open

def workflow_fastq_screen(input_files, output_dir, config):
    # Get the path to the fastq_screen command
    fastq_screen_path = config.get("paths", {}).get("fastq_screen")
    if not fastq_screen_path:
        if find_on_path("fastq_screen", config):
Severity: Minor
Found in ngi_pipeline/engines/qc_ngi/workflows.py - About 2 hrs to fix
