pombo-lab/gamtools

Showing 473 of 473 total issues

File pipeline.py has 476 lines of code (exceeds 250 allowed). Consider refactoring.
Open

"""
===================
The pipeline module
===================

Severity: Minor
Found in lib/gamtools/pipeline.py - About 7 hrs to fix

    File matrix.py has 451 lines of code (exceeds 250 allowed). Consider refactoring.
    Open

    """
    =================
    The matrix module
    =================
    
    
    Severity: Minor
    Found in lib/gamtools/matrix.py - About 6 hrs to fix

      File cosegregation.py has 444 lines of code (exceeds 250 allowed). Consider refactoring.
      Open

      """
      ========================
      The cosegregation module
      ========================
      
      
      Severity: Minor
      Found in lib/gamtools/cosegregation.py - About 6 hrs to fix

        Similar blocks of code found in 2 locations. Consider refactoring.
        Open

                doublets_df.ix[
                    (doublets_df.chrom == chrom) & (
                        doublets_df.Pos_A >= chrom_lengths[chrom]),
                    ['Pos_A']] = np.array(
                        doublets_df.ix[
        Severity: Major
        Found in lib/gamtools/enrichment.py and 1 other location - About 6 hrs to fix
        lib/gamtools/enrichment.py on lines 171..178

        Duplicated Code

        Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

        Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

        When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

        Tuning

        This issue has a mass of 105.

        We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

        The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

        If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

        See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.


        Similar blocks of code found in 2 locations. Consider refactoring.
        Open

                doublets_df.ix[
                    (doublets_df.chrom == chrom) & (
                        doublets_df.Pos_B >= chrom_lengths[chrom]),
                    ['Pos_B']] = np.array(
                        doublets_df.ix[
        Severity: Major
        Found in lib/gamtools/enrichment.py and 1 other location - About 6 hrs to fix
        lib/gamtools/enrichment.py on lines 161..168

        This issue has a mass of 105.

        File enrichment.py has 359 lines of code (exceeds 250 allowed). Consider refactoring.
        Open

        """
        =====================
        The enrichment module
        =====================
        
        
        Severity: Minor
        Found in lib/gamtools/enrichment.py - About 4 hrs to fix

          File main.py has 352 lines of code (exceeds 250 allowed). Consider refactoring.
          Open

          """
          ===============
          The main module
          ===============
          
          
          Severity: Minor
          Found in lib/gamtools/main.py - About 4 hrs to fix

            Similar blocks of code found in 2 locations. Consider refactoring.
            Open

                def task_extract_contamination(self):
                    """Task to summarize results of fastq_screen runs"""
            
                    fastq_screen_files = [qc.screen.screen_out_path(
                        input_fastq) for input_fastq in self.args.input_fastqs]
            Severity: Major
            Found in lib/gamtools/pipeline.py and 1 other location - About 3 hrs to fix
            lib/gamtools/pipeline.py on lines 459..472
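Only the first lines of `task_extract_contamination` and `task_extract_quality` are visible here, but both start by mapping every input fastq to a per-file output path. One possible way to share that step is sketched below, assuming each method ends up returning a single doit-style task dictionary. `summary_task`, the two path helpers, the `echo` actions, and the target file names are hypothetical stand-ins and are not taken from gamtools.

```python
def summary_task(input_fastqs, per_file_output, action, target):
    """Build one doit-style task that depends on one output file per input fastq."""
    return {
        'actions': [action],
        'file_dep': [per_file_output(fastq) for fastq in input_fastqs],
        'targets': [target],
    }


# Hypothetical path helpers standing in for qc.screen.screen_out_path
# and qc.fastqc.fastqc_data_file, whose real behaviour is not shown.
def screen_output(fastq):
    return fastq + '_screen.txt'


def fastqc_output(fastq):
    return fastq + '_fastqc_data.txt'


fastqs = ['sample1.fq', 'sample2.fq']

# The two flagged methods would then differ only in their arguments.
contamination_task = summary_task(
    fastqs, screen_output, 'echo summarize contamination', 'contamination.txt')
quality_task = summary_task(
    fastqs, fastqc_output, 'echo summarize quality', 'quality.txt')
```

With a helper like this, each flagged method would shrink to a single call, so a later change to the task layout would need to be made in one place only.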

            This issue has a mass of 68.

            Similar blocks of code found in 2 locations. Consider refactoring.
            Open

                def task_extract_quality(self):
                    """Task to summarize results of fastqc runs"""
            
                    fastqc_files = [qc.fastqc.fastqc_data_file(
                        input_fastq) for input_fastq in self.args.input_fastqs]
            Severity: Major
            Found in lib/gamtools/pipeline.py and 1 other location - About 3 hrs to fix
            lib/gamtools/pipeline.py on lines 443..456

            This issue has a mass of 68.

            File call_windows.py has 299 lines of code (exceeds 250 allowed). Consider refactoring.
            Open

            """
            =======================
            The call_windows module
            =======================
            
            
            Severity: Minor
            Found in lib/gamtools/call_windows.py - About 3 hrs to fix

              InputFileMappingTasks has 26 functions (exceeds 20 allowed). Consider refactoring.
              Open

              class InputFileMappingTasks():
                  """Class for generating doit tasks from command-line arguments.
              
                  GAMtools "process_nps" command generates a set of doit tasks at
                  runtime based on a set of parameters passed via the command-line.
              Severity: Minor
              Found in lib/gamtools/pipeline.py - About 3 hrs to fix

                Cyclomatic complexity is too high in function parse_fastq_screen_output. (11)
                Open

                def parse_fastq_screen_output(fastq_screen_output):
                    """
                    Parse the output of a single fastq_screen file.
                
                    :param file fastq_screen_output: Open file object containing \
                Severity: Minor
                Found in lib/gamtools/qc/screen.py by radon

                Cyclomatic Complexity

                Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

                Radon analyzes the AST of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

                Construct         Effect on CC  Reasoning
                if                +1            An if statement is a single decision.
                elif              +1            The elif statement adds another decision.
                else              +0            The else statement does not cause a new decision. The decision is at the if.
                for               +1            There is a decision at the start of the loop.
                while             +1            There is a decision at the while statement.
                except            +1            Each except branch adds a new conditional path of execution.
                finally           +0            The finally block is unconditionally executed.
                with              +1            The with statement roughly corresponds to a try/except block (see PEP 343 for details).
                assert            +1            The assert statement internally roughly equals a conditional statement.
                Comprehension     +1            A list/set/dict comprehension or generator expression is equivalent to a for loop.
                Boolean Operator  +1            Every boolean operator (and, or) adds a decision point.

                Source: http://radon.readthedocs.org/en/latest/intro.html
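The counting rules in the table can be approximated with the standard-library `ast` module. The following is a simplified sketch and not radon's implementation: it scores a whole source string instead of individual functions, counts one point per `for` clause in a comprehension, and ignores constructs the table does not list. The function name `approximate_complexity` and the sample source are illustrative assumptions.

```python
import ast


def approximate_complexity(source):
    """Return 1 plus the number of decision points found in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.If):
            # An elif is parsed as an If nested in the orelse branch,
            # so it is counted here too; a plain else adds nothing.
            complexity += 1
        elif isinstance(node, (ast.For, ast.AsyncFor, ast.While)):
            complexity += 1
        elif isinstance(node, ast.ExceptHandler):
            complexity += 1
        elif isinstance(node, (ast.With, ast.AsyncWith)):
            complexity += 1
        elif isinstance(node, ast.Assert):
            complexity += 1
        elif isinstance(node, ast.comprehension):
            # One node per "for" clause of a comprehension or generator.
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" is one BoolOp holding three values,
            # which corresponds to two boolean operators.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(x):
    if x > 0 and x < 10:
        return "small"
    elif x == 0:
        return "zero"
    else:
        for _ in range(3):
            pass
    return "other"
'''

print(approximate_complexity(SAMPLE))  # prints 5: base 1 + if + and + elif + for
```

The sample scores 5 because the `else` branch and the `return` statements add nothing; only the `if`, the `and`, the `elif`, and the `for` count as decisions.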

                Cyclomatic complexity is too high in function parse_module. (9)
                Open

                def parse_module(fastqc_module):
                    """
                    Parse a fastqc module from the table format to a line format (list).
                    Input is list containing the module. One list-item per line. E.g.:
                
                
                Severity: Minor
                Found in lib/gamtools/qc/fastqc.py by radon

                Similar blocks of code found in 3 locations. Consider refactoring.
                Open

                    def task_index_bam(self):
                        """Task to generate indexes for sorted bam files"""
                
                        for input_fastq in self.args.input_fastqs:
                            yield {
                Severity: Major
                Found in lib/gamtools/pipeline.py and 2 other locations - About 2 hrs to fix
                lib/gamtools/pipeline.py on lines 229..238
                lib/gamtools/pipeline.py on lines 269..278
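The three flagged generators (`task_index_bam`, `task_make_bigwig`, `task_remove_duplicates`) open with the same loop over `self.args.input_fastqs` that yields one task dictionary per file. A minimal sketch of how that loop might be factored out follows, assuming the methods differ only in their action and in how they derive dependency and target paths. `per_file_tasks`, `sorted_bam`, and `bam_index` are hypothetical names, and the file suffixes are invented for illustration. The keys `name`, `actions`, `file_dep`, and `targets` are standard doit task keys.

```python
def per_file_tasks(input_files, action, source_path, target_path):
    """Yield one doit-style task dictionary per input file."""
    for input_file in input_files:
        yield {
            'name': input_file,
            'actions': [action],
            'file_dep': [source_path(input_file)],
            'targets': [target_path(input_file)],
        }


# Hypothetical path helpers; the real gamtools naming scheme is not shown.
def sorted_bam(fastq):
    return fastq + '.sorted.bam'


def bam_index(fastq):
    return sorted_bam(fastq) + '.bai'


# doit substitutes %(dependencies)s in string actions with the file_dep list.
index_tasks = list(per_file_tasks(
    ['a.fq', 'b.fq'],
    'samtools index %(dependencies)s',
    sorted_bam,
    bam_index,
))

for task in index_tasks:
    print(task['name'], task['file_dep'], task['targets'])
```

Each of the three methods could then become a single `yield from per_file_tasks(...)` call with its own action and path functions.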

                This issue has a mass of 52.

                Cyclomatic complexity is too high in function process_file. (8)
                Open

                def process_file(filename):
                    """
                    Process a fastqc output file and calculate some summary statistics.
                    """
                
                
                Severity: Minor
                Found in lib/gamtools/qc/fastqc.py by radon

                Similar blocks of code found in 3 locations. Consider refactoring.
                Open

                    def task_make_bigwig(self):
                        """Task to create bigwigs from bedgraphs"""
                
                        for input_fastq in self.args.input_fastqs:
                            yield {
                Severity: Major
                Found in lib/gamtools/pipeline.py and 2 other locations - About 2 hrs to fix
                lib/gamtools/pipeline.py on lines 229..238
                lib/gamtools/pipeline.py on lines 241..250

                This issue has a mass of 52.

                Similar blocks of code found in 3 locations. Consider refactoring.
                Open

                    def task_remove_duplicates(self):
                        """Task to remove PCR duplicates from sorted bam files"""
                
                        for input_fastq in self.args.input_fastqs:
                            yield {
                Severity: Major
                Found in lib/gamtools/pipeline.py and 2 other locations - About 2 hrs to fix
                lib/gamtools/pipeline.py on lines 241..250
                lib/gamtools/pipeline.py on lines 269..278

                This issue has a mass of 52.

                Cyclomatic complexity is too high in function convert_from_args. (8)
                Open

                def convert_from_args(args):
                    """Wrapper function to call convert from argparse"""
                
                    if args.input_format is None:
                        args.input_format = detect_file_type(args.input_file)
                Severity: Minor
                Found in lib/gamtools/matrix.py by radon

                Cyclomatic complexity is too high in function comparison_from_operator. (7)
                Open

                def comparison_from_operator(operator, left, right):
                    """
                    Perform a comparison between left and right values in a QC
                    conditions file.
                
                
                Severity: Minor
                Found in lib/gamtools/qc/pass_qc.py by radon
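The body of `comparison_from_operator` is not shown, but a complexity of 7 is what a chain of `if`/`elif` branches, one per operator, would produce. A common way to flatten such a chain is a lookup table built on the standard-library `operator` module. The sketch below assumes the function receives the operator as a string; the set of supported symbols and the name `compare` are assumptions and are not taken from gamtools.

```python
import operator

# Assumed set of operator symbols; the real QC conditions syntax may differ.
COMPARISONS = {
    '<': operator.lt,
    '<=': operator.le,
    '>': operator.gt,
    '>=': operator.ge,
    '==': operator.eq,
    '!=': operator.ne,
}


def compare(symbol, left, right):
    """Apply the comparison named by symbol to left and right."""
    try:
        comparison = COMPARISONS[symbol]
    except KeyError:
        raise ValueError('Unsupported comparison operator: {!r}'.format(symbol))
    return comparison(left, right)


print(compare('>=', 3, 3))  # prints True
print(compare('<', 5, 2))   # prints False
```

The dictionary lookup replaces every branch with one `try`/`except`, so supporting another operator adds a table entry and no new decision point.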

                Cyclomatic complexity is too high in function matrix_from_args. (7)
                Open

                def matrix_from_args(args):
                    """Extract parameters from an argparse namespace object and pass them to
                    create_and_save_contact_matrix.
                    """
                
                
                Severity: Minor
                Found in lib/gamtools/cosegregation.py by radon
