stephensolis/kameris

Showing 11 of 11 total issues

Similar blocks of code found in 2 locations. Consider refactoring.
Open

                if not os.path.exists(all_metadata_filename):
                    download_utils.download_file(
                        download_utils.url_for_file(
                            all_metadata_filename, options['urls_file'],
                            'metadata'
Severity: Major
Found in kameris/job_steps/selection.py and 1 other location - About 1 hr to fix
kameris/job_steps/selection.py on lines 96..101

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code tends both to keep replicating and to diverge (leaving bugs as two similar implementations come to differ in subtle ways).
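As a rough illustration of the DRY principle applied to the flagged "download if missing" blocks, the sketch below folds the existence check into one helper. The name `ensure_downloaded` and the `fetch` callback are hypothetical; `fetch` stands in for the `download_utils.download_file(...)` call, whose full argument list is not visible in the excerpts.

```python
import os
from typing import Callable


def ensure_downloaded(filename: str, fetch: Callable[[str], None]) -> bool:
    """Call fetch(filename) only when the file is missing.

    Returns True if fetch was called, False if the file already existed.
    `fetch` is a hypothetical stand-in for the project's download call.
    """
    if os.path.exists(filename):
        return False
    fetch(filename)
    return True
```

Under this assumption, both the metadata and the archive branches would call the same helper with a different `fetch`, so the check exists in one place only.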

Tuning

This issue has a mass of 40.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.
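To make the threshold idea concrete, here is a small sketch that treats "mass" as the number of AST nodes in a snippet. This is an assumption for illustration only: the engine's actual mass calculation is not described on this page, and `approximate_mass` and `worth_checking` are hypothetical names.

```python
import ast


def approximate_mass(source: str) -> int:
    """Count AST nodes in source as a rough stand-in for duplication mass."""
    return sum(1 for _ in ast.walk(ast.parse(source)))


def worth_checking(source: str, threshold: int) -> bool:
    """Mimic the threshold rule: only blocks at or above the threshold are analyzed."""
    return approximate_mass(source) >= threshold
```

With a measure like this, raising the threshold excludes small snippets from comparison, while lowering it lets shorter blocks be reported.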

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.


Similar blocks of code found in 2 locations. Consider refactoring.
Open

                if not os.path.exists(archive_filename):
                    download_utils.download_file(
                        download_utils.url_for_file(
                            archive_filename, options['urls_file'], 'archives'
                        ),
Severity: Major
Found in kameris/job_steps/selection.py and 1 other location - About 1 hr to fix
kameris/job_steps/selection.py on lines 56..62

This issue has a mass of 40.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

            all_metadata_filename = os.path.join(
                options['metadata_dir'],
                group_options['dataset']['metadata'] + '.json'
Severity: Major
Found in kameris/job_steps/selection.py and 1 other location - About 1 hr to fix
kameris/job_steps/selection.py on lines 89..91

This issue has a mass of 38.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

            archive_filename = os.path.join(
                options['archives_dir'],
                group_options['dataset']['archive'] + '.zip'
Severity: Major
Found in kameris/job_steps/selection.py and 1 other location - About 1 hr to fix
kameris/job_steps/selection.py on lines 49..51

This issue has a mass of 38.
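The two flagged filename constructions differ only in the directory, the dataset key, and the extension, so one possible refactoring is a single path helper. `dataset_file_path` is a hypothetical name, not something taken from the repository.

```python
import os


def dataset_file_path(directory: str, name: str, extension: str) -> str:
    """Join a directory with a dataset entry name, appending the extension."""
    return os.path.join(directory, name + extension)
```

Assuming the surrounding variables from the excerpts, the archive case would read `dataset_file_path(options['archives_dir'], group_options['dataset']['archive'], '.zip')` and the metadata case would pass `options['metadata_dir']`, the `'metadata'` entry, and `'.json'`.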

Function classification_run has 7 arguments (exceeds 4 allowed). Consider refactoring.
Open

def classification_run(classifier_factory, features, point_classes,
Severity: Major
Found in kameris/job_steps/classify.py - About 50 mins to fix
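One common way to reduce an argument count is to group related parameters into a single object. The sketch below bundles only the three argument names visible in the excerpt; the remaining four are truncated on this page and are deliberately left out. `ClassificationInputs` and `describe_run` are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass(frozen=True)
class ClassificationInputs:
    """Hypothetical bundle of the three arguments visible in the excerpt."""
    classifier_factory: Callable[[], Any]
    features: Sequence[Any]
    point_classes: Sequence[Any]


def describe_run(inputs: ClassificationInputs) -> str:
    """Toy consumer showing one parameter taking the place of three."""
    return "{} points, {} classes".format(
        len(inputs.features), len(set(inputs.point_classes))
    )
```

A function refactored this way would take the bundle plus whatever arguments remain, which may be enough to bring it under the reported limit of 4.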

Avoid deeply nested control flow statements.
Open

                    for chunk in r.iter_content(chunk_size=1048576):
                        f.write(chunk)
                        pbar.update(len(chunk))

Severity: Major
Found in kameris/utils/download_utils.py - About 45 mins to fix

Avoid deeply nested control flow statements.
Open

                    for opts in expand_values:
                        if opts == expanded_options:
                            break
                        elif all(o in opts for o in sliced_options):
                            new_exp_options['selection_copy_from'] = \
Severity: Major
Found in kameris/subcommands/run_job.py - About 45 mins to fix

Function crossvalidation_run has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

def crossvalidation_run(classifier_factory, features, features_mode,
Severity: Minor
Found in kameris/job_steps/classify.py - About 45 mins to fix

Avoid deeply nested control flow statements.
Open

                    if 'archive_folder' in group_options['dataset']:
                        filename = '{}/{}'.format(
                            group_options['dataset']['archive_folder'],
                            filename
                        )
Severity: Major
Found in kameris/job_steps/selection.py - About 45 mins to fix
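The folder-prefix logic in this excerpt could be lifted into a small function with an early return, which removes one nesting level at the call site. `archive_member_name` is a hypothetical name, and the `dataset` argument is assumed to be the `group_options['dataset']` mapping.

```python
def archive_member_name(dataset: dict, filename: str) -> str:
    """Prefix filename with dataset['archive_folder'] when that key is present."""
    if 'archive_folder' not in dataset:
        return filename
    return '{}/{}'.format(dataset['archive_folder'], filename)
```

The nested block would then become `filename = archive_member_name(group_options['dataset'], filename)`.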

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    if 'iterations' in totals:
        final_stats['average_iterations'] = (
            totals['iterations'] / validation_count
Severity: Minor
Found in kameris/job_steps/classify.py and 1 other location - About 35 mins to fix
kameris/job_steps/classify.py on lines 186..188

This issue has a mass of 33.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    if 'reduced_variance_ratio' in totals:
        final_stats['average_reduced_variance_ratio'] = (
            totals['reduced_variance_ratio'] / validation_count
Severity: Minor
Found in kameris/job_steps/classify.py and 1 other location - About 35 mins to fix
kameris/job_steps/classify.py on lines 182..184

This issue has a mass of 33.
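The two averaging blocks in classify.py differ only in the key name, so a loop over the keys is one possible way to remove the duplication. `add_averages` is a hypothetical helper; the key names and the `'average_'` prefix come from the excerpts.

```python
def add_averages(final_stats: dict, totals: dict, validation_count: int,
                 keys=('reduced_variance_ratio', 'iterations')) -> dict:
    """Store totals[key] / validation_count under 'average_<key>' for each key present."""
    for key in keys:
        if key in totals:
            final_stats['average_' + key] = totals[key] / validation_count
    return final_stats
```

Adding another averaged statistic would then mean extending `keys` instead of copying the block a third time.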
