HumanCellAtlas/ingest-client

Showing 924 of 924 total issues

Identical blocks of code found in 2 locations. Consider refactoring.
Open

                    if column_name.split(".")[-1] == "text":
                        desc = self.get_value_for_column(spreadsheet_tabs_template, column_name, "description")
                        required = bool(self.get_value_for_column(spreadsheet_tabs_template, column_name, "required"))
                        example_text = self.get_value_for_column(spreadsheet_tabs_template, column_name, "example")
                        guidelines = self.get_value_for_column(spreadsheet_tabs_template, column_name, "guidelines")
Severity: Major
Found in ingest/template/linked_spreadsheet_builder.py and 1 other location - About 1 day to fix
ingest/template/vanilla_spreadsheet_builder.py on lines 30..41

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code tends both to keep replicating and to diverge, leaving bugs as two similar implementations drift apart in subtle ways.
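
As a rough illustration of applying DRY to the block flagged above, the four per-column lookups could live in one shared helper that both spreadsheet builders call. This is only a sketch: `read_column_metadata`, the stand-in `fake_get_value_for_column`, and the sample template data are hypothetical names and values, not code from ingest-client. Only the property names and the `bool(...)` cast come from the excerpt.

```python
from functools import partial
from typing import Any, Callable, Dict


def read_column_metadata(get_value: Callable[[str, str], Any], column_name: str) -> Dict[str, Any]:
    """Collect the four per-column properties that both builders read inline.

    `get_value` is any callable taking (column_name, property_name).
    """
    return {
        "description": get_value(column_name, "description"),
        # Mirrors the bool(...) cast used in the flagged excerpt.
        "required": bool(get_value(column_name, "required")),
        "example": get_value(column_name, "example"),
        "guidelines": get_value(column_name, "guidelines"),
    }


# Hypothetical stand-in for the builders' get_value_for_column method,
# backed by a plain dict instead of a real template object.
def fake_get_value_for_column(template, column_name, prop):
    return template[column_name].get(prop)


SAMPLE_TEMPLATE = {
    "project.title.text": {
        "description": "Title of the project",
        "required": True,
        "example": "My project",
        "guidelines": "",
    },
}

# Bind the template once so the helper only needs (column_name, property).
lookup = partial(fake_get_value_for_column, SAMPLE_TEMPLATE)
metadata = read_column_metadata(lookup, "project.title.text")
```

Inside a builder, the same binding could presumably be written as `partial(self.get_value_for_column, spreadsheet_tabs_template)`, so each builder keeps its own layout logic while the lookup exists in one place.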

Tuning

This issue has a mass of 157.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Identical blocks of code found in 2 locations. Consider refactoring.
Open

                    if column_name.split(".")[-1] == "text":
                        desc = self.get_value_for_column(spreadsheet_tabs_template, column_name, "description")
                        required = bool(self.get_value_for_column(spreadsheet_tabs_template, column_name, "required"))
                        example_text = self.get_value_for_column(spreadsheet_tabs_template, column_name, "example")
                        guidelines = self.get_value_for_column(spreadsheet_tabs_template, column_name, "guidelines")
Severity: Major
Found in ingest/template/vanilla_spreadsheet_builder.py and 1 other location - About 1 day to fix
ingest/template/linked_spreadsheet_builder.py on lines 58..69

This issue has a mass of 157.

Function build has a Cognitive Complexity of 66 (exceeds 5 allowed). Consider refactoring.
Open

    def build(self, spreadsheet_tabs_template):
        # TODO(maniarathi): Fill out docstring here once finished parsing and consolidating code.
        tab_config = spreadsheet_tabs_template.tab_config

        if (self.link_config is not False) and (self.autofill_scale != 0):  # some precalculation for whole sheet
Severity: Minor
Found in ingest/template/linked_spreadsheet_builder.py - About 1 day to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"
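
To make the rules above concrete, here is a small, hypothetical pair of functions that compute the same result. The first nests its conditions; the second uses an early return and a conditional expression, which breaks the linear flow less often and avoids nesting. Neither function comes from ingest-client, and no exact complexity scores are claimed for them.

```python
def label_nested(column_name, required):
    """Nested version: every branch sits inside another branch."""
    if column_name:
        if "." in column_name:
            label = column_name.split(".")[-1].upper()
        else:
            label = column_name.upper()
        if required:
            return label + " (Required)"
        else:
            return label
    else:
        return ""


def label_flat(column_name, required):
    """Flatter version: guard clause first, then straight-line code."""
    if not column_name:
        return ""
    # split(".")[-1] returns the whole string when there is no dot,
    # so the inner if/else from the nested version is not needed.
    label = column_name.split(".")[-1].upper()
    return label + " (Required)" if required else label
```

Both versions return the same label for the same input, so a refactoring of this kind can be checked by running the old and new functions side by side.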

IngestApi has 54 functions (exceeds 20 allowed). Consider refactoring.
Open

class IngestApi:
    def __init__(self, url=None, token_manager=None):
        self.logger = logging.getLogger(__name__)
        self.session = create_session_with_retry()
        self.token_manager = token_manager
Severity: Major
Found in ingest/api/ingestapi.py - About 7 hrs to fix

    File ingestexportservice.py has 476 lines of code (exceeds 250 allowed). Consider refactoring.
    Open

    #!/usr/bin/env python
    import json
    import logging
    import os
    import re
    Severity: Minor
    Found in ingest/exporter/ingestexportservice.py - About 7 hrs to fix

      Function build has a Cognitive Complexity of 43 (exceeds 5 allowed). Consider refactoring.
      Open

          def build(self, spreadsheet_tabs_template):
              tabs = spreadsheet_tabs_template.tabs
      
              for tab in tabs:
                  for tab_name, detail in tab.items():
      Severity: Minor
      Found in ingest/template/vanilla_spreadsheet_builder.py - About 6 hrs to fix

      Function _value_linking has a Cognitive Complexity of 43 (exceeds 5 allowed). Consider refactoring.
      Open

          def _value_linking(self):
      
              # # work out from backbone list and autofill_scale how many id values to fill
              # backbone = self.link_config[0]
              # multiplier = []
      Severity: Minor
      Found in ingest/template/linked_spreadsheet_builder.py - About 6 hrs to fix

      File submission.py has 410 lines of code (exceeds 250 allowed). Consider refactoring.
      Open

      import logging
      
      import requests
      
      from ingest.api.ingestapi import IngestApi
      Severity: Minor
      Found in ingest/importer/submission.py - About 5 hrs to fix

        Similar blocks of code found in 2 locations. Consider refactoring.
        Open

                for link_to_tab in _link_to_tab:
                    display_name = self.col_name_mapping.get(link_to_tab)[0]
                    prog_name = self.col_name_mapping.get(link_to_tab)[1]
                    uf = str('DERIVED FROM {}'.format(display_name.upper()))
                    desc = str('Enter biomaterial ID from "{}" tab that this entity was derived from.'.format(display_name))
        Severity: Major
        Found in ingest/template/linked_spreadsheet_builder.py and 1 other location - About 5 hrs to fix
        ingest/template/linked_spreadsheet_builder.py on lines 269..274

        This issue has a mass of 95.

        Similar blocks of code found in 2 locations. Consider refactoring.
        Open

                    for protocol in add_these:
                        display_name = self.col_name_mapping.get(protocol)[0]
                        prog_name = self.col_name_mapping.get(protocol)[1]
                        uf = str('ID OF {} USED'.format(display_name.upper()))
                        desc = str('Enter protocol ID from "{}" tab that this entity was derrived from.'.format(display_name))
        Severity: Major
        Found in ingest/template/linked_spreadsheet_builder.py and 1 other location - About 5 hrs to fix
        ingest/template/linked_spreadsheet_builder.py on lines 245..251
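
The two loops flagged as similar differ only in the header text and in the kind of ID named in the description, so one possible refactoring is a single helper parameterized on those two values. The sketch below is an assumption-laden illustration: `link_column_texts` and the sample mapping are hypothetical, and the only thing taken from the excerpts is that each mapping value holds the display name at index 0 and the programmatic name at index 1. The two original description strings already differ in the spelling of "derived", which is the kind of divergence a shared helper avoids.

```python
def link_column_texts(col_name_mapping, key, header_template, id_kind):
    """Build the programmatic name, header, and description for one link column.

    header_template: e.g. 'DERIVED FROM {}' or 'ID OF {} USED'.
    id_kind: e.g. 'biomaterial' or 'protocol'.
    """
    entry = col_name_mapping[key]
    display_name, prog_name = entry[0], entry[1]
    header = header_template.format(display_name.upper())
    desc = 'Enter {} ID from "{}" tab that this entity was derived from.'.format(
        id_kind, display_name)
    return prog_name, header, desc


# Hypothetical sample data, only for demonstration.
sample_mapping = {
    "donor": ("Donor organism", "donor_organism.biomaterial_id"),
    "dissociation": ("Dissociation protocol", "dissociation_protocol.protocol_id"),
}

biomaterial_texts = link_column_texts(sample_mapping, "donor", "DERIVED FROM {}", "biomaterial")
protocol_texts = link_column_texts(sample_mapping, "dissociation", "ID OF {} USED", "protocol")
```

Each loop body would then reduce to one call with its own template and ID kind, leaving the rest of the loop unchanged.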

        This issue has a mass of 95.

        File ingestapi.py has 362 lines of code (exceeds 250 allowed). Consider refactoring.
        Open

        #!/usr/bin/env python
        """
        desc goes here
        """
        import json
        Severity: Minor
        Found in ingest/api/ingestapi.py - About 4 hrs to fix

          Function get_user_friendly_column_name has a Cognitive Complexity of 25 (exceeds 5 allowed). Consider refactoring.
          Open

              def get_user_friendly_column_name(self, template, column_name, primary_schema=None):
                  # TODO(maniarathi): Make this description better.
                  """ Given a column name derived originally from a metadata schema file that will be inputted as a column name
                  into the generated spreadsheet, turn it into a user friendly name. """
                  if '.text' in column_name:
          Severity: Minor
          Found in ingest/template/spreadsheet_builder.py - About 3 hrs to fix

          Function export_bundle has a Cognitive Complexity of 25 (exceeds 5 allowed). Consider refactoring.
          Open

              def export_bundle(self, bundle_uuid, bundle_version, submission_uuid, process_uuid):
                  start_time = time.time()
                  self.related_entities_cache = {}
                  saved_bundle_uuid = None
          
          
          Severity: Minor
          Found in ingest/exporter/ingestexportservice.py - About 3 hrs to fix

          Function load has a Cognitive Complexity of 23 (exceeds 5 allowed). Consider refactoring.
          Open

              def load(entity_json):
                  dictionary = EntityMap()
          
                  for entity_type, entities_dict in entity_json.items():
                      for entity_id, entity_body in entities_dict.items():
          Severity: Minor
          Found in ingest/importer/submission.py - About 3 hrs to fix

          Function _validate_entity_links has a Cognitive Complexity of 22 (exceeds 5 allowed). Consider refactoring.
          Open

              def _validate_entity_links(self, entity_map, entity):
                  links_by_entity = entity.links_by_entity
          
                  for link_entity_type, link_entity_ids in links_by_entity.items():
                      for link_entity_id in link_entity_ids:
          Severity: Minor
          Found in ingest/importer/submission.py - About 3 hrs to fix

          Identical blocks of code found in 2 locations. Consider refactoring.
          Open

                              if column_name.split(".")[-1] == "ontology" or column_name.split(".")[-1] == "ontology_label":
                                  worksheet.set_column(column_index, column_index, None, None, {'hidden': True})
          Severity: Major
          Found in ingest/template/linked_spreadsheet_builder.py and 1 other location - About 3 hrs to fix
          ingest/template/vanilla_spreadsheet_builder.py on lines 64..65
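
One way to remove this particular duplication would be to move the suffix test into a small shared predicate, so both builders ask the same question in the same way. The names `HIDDEN_SUFFIXES` and `is_hidden_column` below are hypothetical; the two suffixes are the ones that appear in the flagged excerpt.

```python
# Suffixes whose columns the flagged excerpt hides.
HIDDEN_SUFFIXES = frozenset({"ontology", "ontology_label"})


def is_hidden_column(column_name):
    """Return True when the last dotted segment marks an ontology column."""
    return column_name.split(".")[-1] in HIDDEN_SUFFIXES
```

Each builder could then keep its own `worksheet.set_column(column_index, column_index, None, None, {'hidden': True})` call, guarded by `if is_hidden_column(column_name):`.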

          This issue has a mass of 62.

          Identical blocks of code found in 2 locations. Consider refactoring.
          Open

                              if column_name.split(".")[-1] == "ontology" or column_name.split(".")[-1] == "ontology_label":
                                  worksheet.set_column(column_index, column_index, None, None, {'hidden': True})
          Severity: Major
          Found in ingest/template/vanilla_spreadsheet_builder.py and 1 other location - About 3 hrs to fix
          ingest/template/linked_spreadsheet_builder.py on lines 92..93

          This issue has a mass of 62.

          Identical blocks of code found in 2 locations. Consider refactoring.
          Open

                              if column_index == 0:
                                  worksheet.set_row(0, 30)
                                  worksheet.set_row(4, 30)
                                  worksheet.write(4, column_index, "FILL OUT INFORMATION BELOW THIS ROW", self.header_format)
                              else:
          Severity: Major
          Found in ingest/template/vanilla_spreadsheet_builder.py and 1 other location - About 2 hrs to fix
          ingest/template/linked_spreadsheet_builder.py on lines 98..105

          This issue has a mass of 59.

          Function _fill_examples_from_schema has a Cognitive Complexity of 20 (exceeds 5 allowed). Consider refactoring.
          Open

              def _fill_examples_from_schema(self, example_text, worksheet, cols, col_number, tab_name):
                  double_prefix = 'Should be one of: '
                  metadata_fs = ';'
                  if example_text.startswith(double_prefix):
                      example_text_ = example_text[len(double_prefix):].split(',')[0]  # TODO hard coded assumption?
          Severity: Minor
          Found in ingest/template/linked_spreadsheet_builder.py - About 2 hrs to fix

          Identical blocks of code found in 2 locations. Consider refactoring.
          Open

                              if column_index == 0:
                                  worksheet.set_row(0, 30)
                                  worksheet.set_row(4, 30)
          
                                  worksheet.write(4, column_index, "FILL OUT INFORMATION BELOW THIS ROW", self.header_format)
          Severity: Major
          Found in ingest/template/linked_spreadsheet_builder.py and 1 other location - About 2 hrs to fix
          ingest/template/vanilla_spreadsheet_builder.py on lines 70..75

          This issue has a mass of 59.
