UnB-KnEDLe/DODFMiner

dodfminer/extract/pure/core.py

Summary

Maintainability: D (2 days)

Function extract_structure has a Cognitive Complexity of 33 (exceeds 5 allowed). Consider refactoring.
Open

    def extract_structure(cls, file, single=False, norm='NFKD'): # pylint: disable=too-many-locals
        """Extract boxes of text with their respective titles.

        Args:
            file: The DODF file to extract titles from.
Severity: Minor
Found in dodfminer/extract/pure/core.py - About 4 hrs to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"
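
As a rough illustration of these rules, the sketch below writes the same computation twice. The function names and the word-counting task are hypothetical and do not come from core.py, and the source does not say how the analyzer scores comprehensions, so the comments describe the rules qualitatively rather than giving exact numbers.

```python
def count_long_words_nested(lines, min_len):
    """Nested version: each structure sits inside the previous one."""
    total = 0
    for line in lines:                    # break in the linear flow
        if line:                          # flow-breaking structure, nested once
            for word in line.split():     # nested twice
                if len(word) >= min_len:  # nested three times
                    total += 1
    return total


def count_long_words_flat(lines, min_len):
    """Flat version: the loops are collapsed into expressions, with no nesting."""
    words = (word for line in lines for word in line.split())
    return sum(1 for word in words if len(word) >= min_len)
```

Both functions return the same count for the same input; the second simply has fewer nested flow-breaking structures for a reader to keep track of.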


File core.py has 347 lines of code (exceeds 250 allowed). Consider refactoring.
Open

# coding=utf-8

"""Extract content from DODFS and export to JSON.

Contains class ContentExtractor, which has two public functions
Severity: Minor
Found in dodfminer/extract/pure/core.py - About 4 hrs to fix

Function extract_text has a Cognitive Complexity of 25 (exceeds 5 allowed). Consider refactoring.
Open

    def extract_text(cls, file, single=False, block=False, is_json=True, sep=' ', norm='NFKD'):
        """Extract block of text from file

        Args:
            file: The DODF to extract titles from.
Severity: Minor
Found in dodfminer/extract/pure/core.py - About 3 hrs to fix

Function extract_to_json has a Cognitive Complexity of 12 (exceeds 5 allowed). Consider refactoring.
Open

    def extract_to_json(cls, folder='./',
                        titles_with_boxes=False, norm='NFKD'):
        """Extract information from DODF to JSON.

        Args:
Severity: Minor
Found in dodfminer/extract/pure/core.py - About 1 hr to fix

Function extract_text has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

    def extract_text(cls, file, single=False, block=False, is_json=True, sep=' ', norm='NFKD'):
Severity: Minor
Found in dodfminer/extract/pure/core.py - About 45 mins to fix
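
One common way to bring an argument count down is to group the optional keyword arguments into a single options object. The sketch below is only an illustration under that assumption: ExtractOptions is a hypothetical name that does not exist in DODFMiner, and the function body is a placeholder, not the project's extraction logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractOptions:
    """Hypothetical options bundle; fields mirror the keyword arguments above."""
    single: bool = False
    block: bool = False
    is_json: bool = True
    sep: str = ' '
    norm: str = 'NFKD'


def extract_text(file, options=ExtractOptions()):
    """Stand-in with a reduced signature; the real extraction logic is not shown."""
    # Placeholder body: report which options would be used.
    return {'file': file, 'sep': options.sep, 'norm': options.norm}
```

A caller would then override only what it needs, for example `extract_text('dodf.pdf', ExtractOptions(sep='\n'))`. The instance is frozen, so sharing it as a default argument is safe.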

Avoid deeply nested control flow statements.
Open

                        if is_json:
                            list_of_boxes.append((text[0], text[1], text[2],
                                                  text[3], norm_text))
                        else:
                            drawboxes_text += (norm_text + sep)
Severity: Major
Found in dodfminer/extract/pure/core.py - About 45 mins to fix
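
One possible way to remove the innermost branch is to always collect the box tuples inside the loop and choose the output shape once, afterwards. The sketch below assumes the loop has already produced a list of (text[0], text[1], text[2], text[3], norm_text) tuples; format_boxes is a hypothetical helper, not a function in core.py.

```python
def format_boxes(boxes, is_json=True, sep=' '):
    """Decide the output shape once, after the loop, instead of per box.

    `boxes` is assumed to be a list of 5-tuples whose last item is the
    normalized text, matching the tuple built in the excerpt above.
    """
    if is_json:
        return list(boxes)
    # Mirrors `drawboxes_text += (norm_text + sep)`: each text is followed by sep.
    return ''.join(box[4] + sep for box in boxes)
```

With this shape, the loop body only appends tuples, and the `is_json` check happens at a single, shallow nesting level.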

Avoid deeply nested control flow statements.
Open

                        if section and (title not in content_dict[section].keys()):
                            content_dict[section].update(
                                {normalized_title: []})
                    else:
Severity: Major
Found in dodfminer/extract/pure/core.py - About 45 mins to fix
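
A check-then-insert branch like this one can often be flattened with a guard clause and `dict.setdefault`. The sketch below is a hypothetical helper, not code from core.py, and it makes one assumption that changes behavior: it tests for normalized_title, whereas the excerpt tests for title and then inserts normalized_title.

```python
def ensure_title(content_dict, section, normalized_title):
    """Create an empty list for the title under the section if it is missing."""
    if not section:
        return  # guard clause: nothing to do without a current section
    # setdefault only inserts when the key is absent, so an existing list is kept.
    content_dict[section].setdefault(normalized_title, [])
```

Called from the loop, this replaces the nested `if` with a single statement, which lowers the nesting depth at the call site.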

Function _get_pdfs_list has a Cognitive Complexity of 6 (exceeds 5 allowed). Consider refactoring.
Open

    def _get_pdfs_list(cls, folder):
        """Get DODFs list from the path.

        Args:
            folder: The folder containing the PDFs to be extracted.
Severity: Minor
Found in dodfminer/extract/pure/core.py - About 25 mins to fix

Function extract_to_txt has a Cognitive Complexity of 6 (exceeds 5 allowed). Consider refactoring.
Open

    def extract_to_txt(cls, folder='./', norm='NFKD'):
        """Extract information from DODF to a .txt file.

        For each PDF file in data/DODFs, the method extracts information from the
        PDF and writes it to the .txt file.
Severity: Minor
Found in dodfminer/extract/pure/core.py - About 25 mins to fix
