HumanCellAtlas/ingest-client

Showing 924 of 924 total issues

Identical blocks of code found in 2 locations. Consider refactoring.
Open

                    if column_index == 0:
                        worksheet.set_row(0, 30)
                        worksheet.set_row(4, 30)
                        worksheet.write(4, column_index, "FILL OUT INFORMATION BELOW THIS ROW", self.header_format)
                    else:
Severity: Major
Found in ingest/template/vanilla_spreadsheet_builder.py and 1 other location - About 2 hrs to fix
ingest/template/linked_spreadsheet_builder.py on lines 98..105

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).
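One possible way to remove the duplication reported above is to move the first-column branch into a helper that both builders call. The sketch below is only an illustration: the helper name `write_first_column_marker`, the `FILL_MARKER` constant, and the `RecordingWorksheet` stand-in are hypothetical, and the `else` branch is left to each caller because it is not shown in the excerpt.

```python
FILL_MARKER = "FILL OUT INFORMATION BELOW THIS ROW"


def write_first_column_marker(worksheet, column_index, header_format):
    """Hypothetical shared helper for the first-column branch of the duplicated block.

    Returns True when it handled the column, so each builder can keep its own
    `else` logic (not visible in the reported excerpt).
    """
    if column_index != 0:
        return False
    worksheet.set_row(0, 30)
    worksheet.set_row(4, 30)
    worksheet.write(4, column_index, FILL_MARKER, header_format)
    return True


class RecordingWorksheet:
    """Minimal stand-in for a worksheet object, used only to demonstrate the helper."""

    def __init__(self):
        self.calls = []

    def set_row(self, row, height):
        self.calls.append(("set_row", row, height))

    def write(self, row, col, value, cell_format=None):
        self.calls.append(("write", row, col, value, cell_format))
```

Each builder would then call the helper with its own `self.header_format` and fall through to its existing logic when the helper returns False.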

Tuning

This issue has a mass of 59.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

IngestExporter has 24 functions (exceeds 20 allowed). Consider refactoring.
Open

class IngestExporter:
    def __init__(self, ingest_api: IngestApi, dss_api: DssApi, staging_service: StagingService, dry_run=False,
                 output_directory=None):
        format = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
        logging.basicConfig(format=format)
Severity: Minor
Found in ingest/exporter/ingestexportservice.py - About 2 hrs to fix

    Function _protocol_linking has a Cognitive Complexity of 19 (exceeds 5 allowed). Consider refactoring.
    Open

        def _protocol_linking(self, backbone_entities):
            protocol_pairings = self.link_config[1]
            # protocols_to_add = []
            protocols_to_add = {}
    
    
    Severity: Minor
    Found in ingest/template/linked_spreadsheet_builder.py - About 2 hrs to fix

    Cognitive Complexity

    Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

    A method's cognitive complexity is based on a few simple rules:

    • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
    • Code is considered more complex for each "break in the linear flow of the code"
    • Code is considered more complex when "flow breaking structures are nested"
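The rules above can be illustrated with a small sketch unrelated to the project code. Both functions below are hypothetical and return the same result; the second replaces three nested conditions with a single guard, which is the kind of change that usually lowers a cognitive complexity score. No exact scores are claimed here.

```python
def first_active_email_nested(users):
    """Deeply nested version: every extra level of nesting adds to the score."""
    for user in users:
        if user is not None:
            if user.get("active"):
                if user.get("email"):
                    return user["email"]
    return None


def first_active_email_flat(users):
    """Same behaviour with one guard clause and no nested conditionals."""
    for user in users:
        if not user or not user.get("active") or not user.get("email"):
            continue
        return user["email"]
    return None
```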

    File importer.py has 259 lines of code (exceeds 250 allowed). Consider refactoring.
    Open

    import logging
    
    from requests import HTTPError
    
    from ingest.importer.conversion import template_manager
    Severity: Minor
    Found in ingest/importer/importer.py - About 2 hrs to fix

      MetadataEntity has 21 functions (exceeds 20 allowed). Consider refactoring.
      Open

      class MetadataEntity:
      
          # TODO enforce definition of concrete and domain types for all MetadataEntity
          # It's only currently done this way to minimise friction with other parts of the system
          def __init__(self, concrete_type=TYPE_UNDEFINED, domain_type=TYPE_UNDEFINED, object_id=None,
      Severity: Minor
      Found in ingest/importer/conversion/metadata_entity.py - About 2 hrs to fix

        Function _add_link_cols has a Cognitive Complexity of 16 (exceeds 5 allowed). Consider refactoring.
        Open

            def _add_link_cols(self, tab_name, col_number, worksheet, hf, backbone_entities):
        
                # given the tab name work out what entity links need to be added
        
                _link_to_tab = []
        Severity: Minor
        Found in ingest/template/linked_spreadsheet_builder.py - About 2 hrs to fix

        Identical blocks of code found in 2 locations. Consider refactoring.
        Open

                        for schema in template.tabs:
                            if schema_name == list(schema.keys())[0]:
                                schema_uf = schema[schema_name]['display_name']
        Severity: Major
        Found in ingest/template/spreadsheet_builder.py and 1 other location - About 1 hr to fix
        ingest/template/spreadsheet_builder.py on lines 125..127
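Since the same lookup appears twice in this file, one option is a small helper that both call sites share. This is a sketch under assumptions: the function name `display_name_for` and the sample `example_tabs` data are hypothetical, and it assumes each entry in `template.tabs` is a single-key dict, as the excerpt suggests. Unlike the original loop, it returns on the first match instead of letting a later match overwrite the value.

```python
def display_name_for(tabs, schema_name):
    """Return the display name of the tab keyed by schema_name, or None if absent."""
    for schema in tabs:
        if schema_name == list(schema.keys())[0]:
            return schema[schema_name]["display_name"]
    return None


# Hypothetical sample data shaped like the single-key dicts in the excerpt.
example_tabs = [
    {"project": {"display_name": "Project"}},
    {"cell_suspension": {"display_name": "Cell suspension"}},
]
schema_uf = display_name_for(example_tabs, "cell_suspension")
```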

        This issue has a mass of 49.

        Identical blocks of code found in 2 locations. Consider refactoring.
        Open

                        for schema in template.tabs:
                            if schema_name == list(schema.keys())[0]:
                                schema_uf = schema[schema_name]['display_name']
        Severity: Major
        Found in ingest/template/spreadsheet_builder.py and 1 other location - About 1 hr to fix
        ingest/template/spreadsheet_builder.py on lines 136..138

        This issue has a mass of 49.

        Function sheet_in_schemas has a Cognitive Complexity of 15 (exceeds 5 allowed). Consider refactoring.
        Open

            def sheet_in_schemas(self, worksheet):
                schemas = self.template_mgr.template.json_schemas
                try:
                    concrete_type = self.template_mgr.get_concrete_type(worksheet.title)
                except UnknownKeySchemaException as e:
        Severity: Minor
        Found in ingest/importer/importer.py - About 1 hr to fix

        Function _generate_direct_links has a Cognitive Complexity of 14 (exceeds 5 allowed). Consider refactoring.
        Open

            def _generate_direct_links(self, entity_map, entity):
                project = entity_map.get_project()
        
                # TODO Revisit if we need to link all entities to the project
                # currently, all entities are indirectly linked to the project via the submission envelope
        Severity: Minor
        Found in ingest/importer/submission.py - About 1 hr to fix

        Function put_bundle has a Cognitive Complexity of 14 (exceeds 5 allowed). Consider refactoring.
        Open

            def put_bundle(self, bundle_uuid, version, bundle_files):
                bundle = None
        
                # retrying file creation 20 times
                max_retries = 20
        Severity: Minor
        Found in ingest/api/dssapi.py - About 1 hr to fix
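The excerpt only shows that `put_bundle` retries up to 20 times; the rest of the function is not visible. A common way to reduce the complexity of retry logic is to move the loop into a generic helper, sketched below. The name `call_with_retries` and its parameters are hypothetical and this is not the project's implementation.

```python
import time


def call_with_retries(operation, max_retries=20, delay_seconds=0.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_retries attempts have failed.

    Re-raises the last error when every attempt fails.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return operation()
        except Exception as error:  # sketch only: real code should catch a narrower type
            last_error = error
            if attempt < max_retries:
                sleep(delay_seconds)
    raise last_error
```

The caller then contains a single call such as `call_with_retries(lambda: create_bundle(...))`, where `create_bundle` stands for whatever request the real function makes.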

        Function put_file has a Cognitive Complexity of 13 (exceeds 5 allowed). Consider refactoring.
        Open

            def put_file(self, bundle_uuid, file):
                url = file["url"]
                uuid = file["dss_uuid"]
        
                update_date = file.get("update_date")
        Severity: Minor
        Found in ingest/api/dssapi.py - About 1 hr to fix

        Similar blocks of code found in 3 locations. Consider refactoring.
        Open

                    if (len(prog_name) == 3) and (prog_name[2].endswith('_description') and (
                            prog_name[1].endswith('_core'))):  # TODO hard coded assumption?
        Severity: Major
        Found in ingest/template/linked_spreadsheet_builder.py and 2 other locations - About 1 hr to fix
        ingest/template/linked_spreadsheet_builder.py on lines 147..148
        ingest/template/linked_spreadsheet_builder.py on lines 290..291

        This issue has a mass of 45.

        Function _make_col_name_mapping has a Cognitive Complexity of 13 (exceeds 5 allowed). Consider refactoring.
        Open

            def _make_col_name_mapping(self, template):
                col_name_mapping = {}
                for entity_type in template._template.get('tabs'):
                    for key, value in entity_type.items():
                        display_name = value.get('display_name')
        Severity: Minor
        Found in ingest/template/linked_spreadsheet_builder.py - About 1 hr to fix

        Similar blocks of code found in 3 locations. Consider refactoring.
        Open

                    if (len(prog_name) == 3) and (
                            prog_name[2].endswith('_name') and (prog_name[1].endswith('_core'))):  # TODO hard coded assumption?
        Severity: Major
        Found in ingest/template/linked_spreadsheet_builder.py and 2 other locations - About 1 hr to fix
        ingest/template/linked_spreadsheet_builder.py on lines 151..152
        ingest/template/linked_spreadsheet_builder.py on lines 290..291

        This issue has a mass of 45.

        Function _write_column_head has a Cognitive Complexity of 13 (exceeds 5 allowed). Consider refactoring.
        Open

            def _write_column_head(self, worksheet, col_number, uf, hf, desc, prog_name, tab_name):
                worksheet.write(0, col_number, uf, hf)  # user friendly name
                worksheet.write(1, col_number, desc, self.desc_format)  # description
                # worksheet.write(2, col_number, ???, self.desc_format) # example
                worksheet.write(3, col_number, prog_name, self.locked_format)  # programmatic name
        Severity: Minor
        Found in ingest/template/linked_spreadsheet_builder.py - About 1 hr to fix

        Similar blocks of code found in 3 locations. Consider refactoring.
        Open

                    if (len(prog_name_list) == 3) and (prog_name_list[2].endswith('biomaterial_id') and (
                            prog_name_list[1].endswith('_core'))):
        Severity: Major
        Found in ingest/template/linked_spreadsheet_builder.py and 2 other locations - About 1 hr to fix
        ingest/template/linked_spreadsheet_builder.py on lines 147..148
        ingest/template/linked_spreadsheet_builder.py on lines 151..152
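The three similar conditions reported in this file (lines 147..148, 151..152 and 290..291) differ only in the suffix checked on the last element. A possible refactoring, sketched here with a hypothetical helper name `is_core_field`, is to pass the suffix as a parameter. It assumes `prog_name` and `prog_name_list` are lists of name parts, which is what the indexing in the excerpts implies; the sample names are illustrative.

```python
def is_core_field(parts, suffix):
    """True when parts has three elements, the middle one ends with '_core'
    and the last one ends with the given suffix."""
    return (
        len(parts) == 3
        and parts[1].endswith("_core")
        and parts[2].endswith(suffix)
    )


# Illustrative input only; the real names come from the spreadsheet template.
example_parts = "project.project_core.project_description".split(".")
is_description = is_core_field(example_parts, "_description")
is_name = is_core_field(example_parts, "_name")
```

The three call sites would then read `is_core_field(prog_name, '_description')`, `is_core_field(prog_name, '_name')` and `is_core_field(prog_name_list, 'biomaterial_id')`.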

        This issue has a mass of 45.

        Function __init__ has 10 arguments (exceeds 4 allowed). Consider refactoring.
        Open

            def __init__(self, entity_type, entity_id, content, ingest_json=None, links_by_entity=None,
        Severity: Major
        Found in ingest/importer/submission.py - About 1 hr to fix
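A usual response to an argument-count finding like this one is to group related parameters into a single object. The sketch below uses only the five parameter names visible in the excerpt; the remaining parameters are truncated in the report and are deliberately not guessed. The class names `EntityFields` and `EntitySketch` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class EntityFields:
    """Hypothetical parameter object holding the arguments visible in the excerpt."""
    entity_type: str
    entity_id: str
    content: Dict[str, Any]
    ingest_json: Optional[Dict[str, Any]] = None
    links_by_entity: Optional[Dict[str, Any]] = None


class EntitySketch:
    """Stand-in for the reported class; takes one argument instead of many."""

    def __init__(self, fields: EntityFields):
        self.fields = fields


example = EntitySketch(EntityFields("biomaterial", "id-1", {"key": "value"}))
```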

          Function recurse_process has a Cognitive Complexity of 11 (exceeds 5 allowed). Consider refactoring.
          Open

              def recurse_process(self, process, process_info):
                  uuid = process['uuid']['uuid']
                  process_info.derived_by_processes[uuid] = process
          
                  # get all derived by processes using input biomaterial and input files
          Severity: Minor
          Found in ingest/exporter/ingestexportservice.py - About 1 hr to fix

          Function __init__ has 9 arguments (exceeds 4 allowed). Consider refactoring.
          Open

              def __init__(self, concrete_type=TYPE_UNDEFINED, domain_type=TYPE_UNDEFINED, object_id=None,
          Severity: Major
          Found in ingest/importer/conversion/metadata_entity.py - About 1 hr to fix