okfn-brasil/serenata-de-amor

research/src/fetch_federal_sanctions.py

Summary

Maintainability: B, 6 hrs
Function translate_inident_and_suspended_companies_dataset has 5 arguments (exceeds 4 allowed). Consider refactoring.
Open

def translate_inident_and_suspended_companies_dataset(filepath, dataset_name, year, month, day):
Severity: Minor
Found in research/src/fetch_federal_sanctions.py - About 35 mins to fix

Function translate_national_register_punished_companies_dataset has 5 arguments (exceeds 4 allowed). Consider refactoring.
Open

def translate_national_register_punished_companies_dataset(filepath, dataset_name, year, month, day):
Severity: Minor
Found in research/src/fetch_federal_sanctions.py - About 35 mins to fix

Function dummy_translation_dataset has 5 arguments (exceeds 4 allowed). Consider refactoring.
Open

def dummy_translation_dataset(filepath, dataset_name, year, month, day):
Severity: Minor
Found in research/src/fetch_federal_sanctions.py - About 35 mins to fix

Function translate_impeded_non_profit_entities_dataset has 5 arguments (exceeds 4 allowed). Consider refactoring.
Open

def translate_impeded_non_profit_entities_dataset(filepath, dataset_name, year, month, day):
Severity: Minor
Found in research/src/fetch_federal_sanctions.py - About 35 mins to fix
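
The four argument-count issues above share the same signature, so one possible fix is to bundle `year`, `month` and `day` into a single value. The sketch below is only an illustration: `DatasetDate` and `translate_dataset` are hypothetical names that do not exist in the repository, and the body just returns the unpacked values to show the call shape.

```python
from collections import namedtuple

# Hypothetical value object bundling the three date parts into one argument.
DatasetDate = namedtuple('DatasetDate', ['year', 'month', 'day'])


def translate_dataset(filepath, dataset_name, dataset_date):
    """Hypothetical three-argument replacement for the five-argument signature."""
    year, month, day = dataset_date
    # A real implementation would read and translate the file here;
    # this sketch only returns the values to show how they are recovered.
    return filepath, dataset_name, year, month, day
```

With three parameters instead of five, each translate function would fall under the 4-argument limit reported above.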

          Similar blocks of code found in 2 locations. Consider refactoring.
          Open

              data['company_cnpj'] = data['company_cnpj'].map(lambda x: str(x).zfill(14))
          Severity: Major
          Found in research/src/fetch_federal_sanctions.py and 1 other location - About 1 hr to fix
          research/src/fetch_federal_budget_datasets.py on lines 45..45

          Duplicated Code

          Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

          Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

          When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).
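
One way to give the CNPJ padding a single representation is a small shared helper that both modules import. This is a minimal sketch: `pad_cnpj` is a hypothetical name, and where such a helper should live in the project is left open.

```python
def pad_cnpj(value):
    """Left-pad a CNPJ with zeros to 14 characters, as the flagged lambda does."""
    return str(value).zfill(14)
```

Each call site could then read `data['company_cnpj'] = data['company_cnpj'].map(pad_cnpj)`, assuming `data` is the same DataFrame as in the flagged line.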

          Tuning

          This issue has a mass of 39.

          We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

          The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

          If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

          See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.


          Similar blocks of code found in 2 locations. Consider refactoring.
          Open

              data.rename(columns={
                  'Tipo de Pessoa': 'entity_type',
                  'CPF ou CNPJ do Sancionado': 'sanctioned_cnpj_cpf',
                  'Nome Informado pelo Órgão Sancionador': 'name_given_by_sanctioning_body',
                  'Razão Social - Cadastro Receita': 'company_name_receita_database',
          Severity: Minor
          Found in research/src/fetch_federal_sanctions.py and 1 other location - About 55 mins to fix
          research/src/fetch_tse_data.py on lines 279..296

          This issue has a mass of 37.

          Identical blocks of code found in 4 locations. Consider refactoring.
          Open

              data.to_csv(path_or_buf='{0}{1}-{2}-{3}-{4}.xz'.format(BASE_DATA_DIR, year, month, day, dataset_name), sep=',',
          Severity: Major
          Found in research/src/fetch_federal_sanctions.py and 3 other locations - About 35 mins to fix
          research/src/fetch_federal_sanctions.py on lines 66..66
          research/src/fetch_federal_sanctions.py on lines 99..99
          research/src/fetch_federal_sanctions.py on lines 117..117
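
Since the same `to_csv` path expression appears in four places, the path construction could be moved into one function. The sketch below covers only the part of the line that is visible in the report; `build_output_path` is a hypothetical name, and the date parts are passed as one tuple so the helper stays under the argument limit.

```python
def build_output_path(base_dir, date_parts, dataset_name):
    """Return '<base_dir><year>-<month>-<day>-<dataset_name>.xz'."""
    year, month, day = date_parts
    # Same format string as the duplicated line, with base_dir passed in
    # instead of read from the BASE_DATA_DIR module constant.
    return '{0}{1}-{2}-{3}-{4}.xz'.format(base_dir, year, month, day, dataset_name)
```

Each of the four call sites could then pass `build_output_path(BASE_DATA_DIR, (year, month, day), dataset_name)` as `path_or_buf`, keeping its remaining `to_csv` arguments unchanged.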

          This issue has a mass of 33.

          Identical blocks of code found in 4 locations. Consider refactoring.
          Open

              data.to_csv(path_or_buf='{0}{1}-{2}-{3}-{4}.xz'.format(BASE_DATA_DIR, year, month, day, dataset_name), sep=',',
          Severity: Major
          Found in research/src/fetch_federal_sanctions.py and 3 other locations - About 35 mins to fix
          research/src/fetch_federal_sanctions.py on lines 66..66
          research/src/fetch_federal_sanctions.py on lines 117..117
          research/src/fetch_federal_sanctions.py on lines 124..124

          This issue has a mass of 33.

          Identical blocks of code found in 4 locations. Consider refactoring.
          Open

              data.to_csv(path_or_buf='{0}{1}-{2}-{3}-{4}.xz'.format(BASE_DATA_DIR, year, month, day, dataset_name), sep=',',
          Severity: Major
          Found in research/src/fetch_federal_sanctions.py and 3 other locations - About 35 mins to fix
          research/src/fetch_federal_sanctions.py on lines 99..99
          research/src/fetch_federal_sanctions.py on lines 117..117
          research/src/fetch_federal_sanctions.py on lines 124..124

          This issue has a mass of 33.

          Identical blocks of code found in 4 locations. Consider refactoring.
          Open

              data.to_csv(path_or_buf='{0}{1}-{2}-{3}-{4}.xz'.format(BASE_DATA_DIR, year, month, day, dataset_name), sep=',',
          Severity: Major
          Found in research/src/fetch_federal_sanctions.py and 3 other locations - About 35 mins to fix
          research/src/fetch_federal_sanctions.py on lines 66..66
          research/src/fetch_federal_sanctions.py on lines 99..99
          research/src/fetch_federal_sanctions.py on lines 124..124

          This issue has a mass of 33.

          Expected 2 blank lines after class or function definition, found 1
          Open

          BASE_URL = 'http://arquivos.portaldatransparencia.gov.br/downloads.asp?'

          Separate top-level function and class definitions with two blank lines.

          Method definitions inside a class are separated by a single blank
          line.
          
          Extra blank lines may be used (sparingly) to separate groups of
          related functions.  Blank lines may be omitted between a bunch of
          related one-liners (e.g. a set of dummy implementations).
          
          Use blank lines in functions, sparingly, to indicate logical
          sections.
          
          Okay: def a():\n    pass\n\n\ndef b():\n    pass
          Okay: def a():\n    pass\n\n\nasync def b():\n    pass
          Okay: def a():\n    pass\n\n\n# Foo\n# Bar\n\ndef b():\n    pass
          Okay: default = 1\nfoo = 1
          Okay: classify = 1\nfoo = 1
          
          E301: class Foo:\n    b = 0\n    def bar():\n        pass
          E302: def a():\n    pass\n\ndef b(n):\n    pass
          E302: def a():\n    pass\n\nasync def b(n):\n    pass
          E303: def a():\n    pass\n\n\n\ndef b(n):\n    pass
          E303: def a():\n\n\n\n    pass
          E304: @decorator\n\ndef a():\n    pass
          E305: def a():\n    pass\na()
          E306: def a():\n    def b():\n        pass\n    def c():\n        pass

          Line too long (122 > 120 characters)
          Open

                      # print('File not found for dataset {0} from date {1}'.format(dataset, datetime.today().strftime("%Y-%m-%d")))

          Limit all lines to a maximum of 79 characters (the PEP 8 default; the check reported here is configured with a limit of 120).

          There are still many devices around that are limited to 80 character
          lines; plus, limiting windows to 80 characters makes it possible to
          have several windows side-by-side.  The default wrapping on such
          devices looks ugly.  Therefore, please limit all lines to a maximum
          of 79 characters. For flowing long blocks of text (docstrings or
          comments), limiting the length to 72 characters is recommended.
          
          Reports error E501.
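
A long `format` call like the flagged one can usually be shortened by computing an argument on its own line first. The sketch below assumes the message is built in a helper; `missing_file_message` is a hypothetical name, and the original line is a commented-out `print`.

```python
from datetime import datetime


def missing_file_message(dataset):
    """Build the 'file not found' message without exceeding the line limit."""
    # Format the date separately so the final line stays short.
    today = datetime.today().strftime('%Y-%m-%d')
    return 'File not found for dataset {0} from date {1}'.format(dataset, today)
```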

          Closing bracket does not match visual indentation
          Open

                  }, inplace=True)

          Continuation lines indentation.

          Continuation lines should align wrapped elements either vertically
          using Python's implicit line joining inside parentheses, brackets
          and braces, or using a hanging indent.
          
          When using a hanging indent these considerations should be applied:
          - there should be no arguments on the first line, and
          - further indentation should be used to clearly distinguish itself
            as a continuation line.
          
          Okay: a = (\n)
          E123: a = (\n    )
          
          Okay: a = (\n    42)
          E121: a = (\n   42)
          E122: a = (\n42)
          E123: a = (\n    42\n    )
          E124: a = (24,\n     42\n)
          E125: if (\n    b):\n    pass
          E126: a = (\n        42)
          E127: a = (24,\n      42)
          E128: a = (24,\n    42)
          E129: if (a or\n    b):\n    pass
          E131: a = (\n    42\n 24)
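
For a call that wraps a dictionary literal, a hanging indent with each closing bracket on its own line, aligned with the line that opened it, avoids the bracket-alignment warnings. The sketch below uses `rename_keys`, a hypothetical stand-in for `data.rename` that works on a plain dict so the example runs without pandas; the sample value `'F'` is made up.

```python
def rename_keys(row, columns):
    """Hypothetical stand-in for DataFrame.rename, working on a plain dict."""
    return {columns.get(key, key): value for key, value in row.items()}


# No arguments on the opening line; each closing bracket lines up with
# the start of the line that opened it.
translated = rename_keys(
    {'Tipo de Pessoa': 'F'},
    columns={
        'Tipo de Pessoa': 'entity_type',
        'CPF ou CNPJ do Sancionado': 'sanctioned_cnpj_cpf',
    },
)
```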
