HazyResearch/fonduer

View on GitHub

Showing 224 of 224 total issues

Cyclomatic complexity is too high in method split_sentences. (8)
Open

    def split_sentences(self, text: str) -> Iterator[Dict[str, Any]]:
        """Split text into sentences.

        Split input text into sentences that match CoreNLP's default format,
        but are not yet processed.

Cyclomatic Complexity

Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

Construct Effect on CC Reasoning
if +1 An if statement is a single decision.
elif +1 The elif statement adds another decision.
else +0 The else statement does not cause a new decision. The decision is at the if.
for +1 There is a decision at the start of the loop.
while +1 There is a decision at the while statement.
except +1 Each except branch adds a new conditional path of execution.
finally +0 The finally block is unconditionally executed.
with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
assert +1 The assert statement internally roughly equals a conditional statement.
Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

Source: http://radon.readthedocs.org/en/latest/intro.html

File spacy_parser.py has 254 lines of code (exceeds 250 allowed). Consider refactoring.
Open

"""Fonduer Spacy parser."""
import importlib
import logging
from collections import defaultdict
from pathlib import Path
Severity: Minor
Found in src/fonduer/parser/lingual_parser/spacy_parser.py - About 2 hrs to fix

    Function corenlp_to_xmltree_sub has a Cognitive Complexity of 16 (exceeds 5 allowed). Consider refactoring.
    Open

    def corenlp_to_xmltree_sub(
        s: Dict[str, Any], dep_parents: List[int], rid: int = 0
    ) -> _Element:
        """Construct XMLTree with CoreNLP information.
    
    
    Severity: Minor
    Found in src/fonduer/features/feature_libs/tree_structs.py - About 2 hrs to fix

    Cognitive Complexity

    Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

    A method's cognitive complexity is based on a few simple rules:

    • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
    • Code is considered more complex for each "break in the linear flow of the code"
    • Code is considered more complex when "flow breaking structures are nested"

    Further reading

    Cyclomatic complexity is too high in method display_boxes. (7)
    Open

        def display_boxes(
            self,
            pdf_file: str,
            boxes: List[Bbox],
            alternate_colors: bool = False,
    Severity: Minor
    Found in src/fonduer/utils/visualizer.py by radon

    Cyclomatic Complexity

    Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

    Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

    Construct Effect on CC Reasoning
    if +1 An if statement is a single decision.
    elif +1 The elif statement adds another decision.
    else +0 The else statement does not cause a new decision. The decision is at the if.
    for +1 There is a decision at the start of the loop.
    while +1 There is a decision at the while statement.
    except +1 Each except branch adds a new conditional path of execution.
    finally +0 The finally block is unconditionally executed.
    with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
    assert +1 The assert statement internally roughly equals a conditional statement.
    Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
    Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

    Source: http://radon.readthedocs.org/en/latest/intro.html

    Function save_model has 16 arguments (exceeds 4 allowed). Consider refactoring.
    Open

    def save_model(
    Severity: Major
    Found in src/fonduer/packaging/fonduer_model.py - About 2 hrs to fix

      Cyclomatic complexity is too high in method apply. (7)
      Open

          def apply(self, doc: Document, **kwargs: Any) -> Document:
              """Extract mentions from the given Document.
      
              :param doc: A document to process.
              """
      Severity: Minor
      Found in src/fonduer/candidates/mentions.py by radon

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in method apply. (7)
      Open

          def apply(self, mentions: Iterator[TemporaryContext]) -> Iterator[TemporaryContext]:
              """Apply the Matcher to a **generator** of mentions.
      
              Optionally only takes the longest match (NOTE: assumes this is the
              *first* match)
      Severity: Minor
      Found in src/fonduer/candidates/matchers.py by radon

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in function lowest_common_ancestor_depth. (7)
      Open

      def lowest_common_ancestor_depth(c: Tuple[SpanMention, ...]) -> int:
          """Return the lowest common ancestor depth.
      
          In particular, return the minimum distance between a multinary-Mention Candidate to
          their lowest common ancestor.

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in method _get_files. (7)
      Open

          def _get_files(self, path: str) -> List[str]:
              if os.path.isfile(path):
                  fpaths = [path]
              elif os.path.isdir(path):
                  fpaths = [os.path.join(path, f) for f in os.listdir(path)]

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in function common_ancestor. (7)
      Open

      def common_ancestor(c: Tuple[SpanMention, ...]) -> List[str]:
          """Return the path to the root that is shared between a multinary-Mention Candidate.
      
          In particular, this is the common path of HTML tags.
      
      

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in function get_cell_ngrams. (7)
      Open

      def get_cell_ngrams(
          mention: Union[Candidate, Mention, TemporarySpanMention],
          attrib: str = "words",
          n_min: int = 1,
          n_max: int = 1,

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in method _f_gen. (7)
      Open

          def _f_gen(self, c: Candidate) -> Iterator[Tuple[int, str, int]]:
              """Convert lfs into a generator of id, name, and labels.
      
              In particular, catch verbose values and convert to integer ones.
              """
      Severity: Minor
      Found in src/fonduer/supervision/labeler.py by radon

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in method clear. (7)
      Open

          def clear(self) -> None:  # type: ignore
              """Delete Mentions of each class in the extractor from the given split."""
              # Create set of candidate_subclasses associated with each mention_subclass
              cand_subclasses = set()
              for mentions, tablename in [
      Severity: Minor
      Found in src/fonduer/candidates/mentions.py by radon

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in class HOCRDocPreprocessor. (7)
      Open

      class HOCRDocPreprocessor(DocPreprocessor):
          """A ``Document`` generator for hOCR files.
      
          hOCR should comply with `hOCR v1.2`_.
          Note that *ppageno* property of *ocr_page* is optional by `hOCR v1.2`_,

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in method __call__. (7)
      Open

          def __call__(self, tokenized_sentences: List[Sentence]) -> Doc:
              """Apply the custom tokenizer.
      
              :param tokenized_sentences: A list of sentences that was previously
              tokenized/split by spacy

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in method model_installed. (7)
      Open

          @staticmethod
          def model_installed(name: str) -> bool:
              """Check if spaCy language model is installed.
      
              From https://github.com/explosion/spaCy/blob/master/spacy/util.py

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Cyclomatic complexity is too high in function _vizlib_multinary_features. (7)
      Open

      def _vizlib_multinary_features(
          spans: Tuple[SpanMention, ...]
      ) -> Iterator[Tuple[str, int]]:
          """Visual-related features for multiple spans."""
          if same_page(spans):

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct Effect on CC Reasoning
      if +1 An if statement is a single decision.
      elif +1 The elif statement adds another decision.
      else +0 The else statement does not cause a new decision. The decision is at the if.
      for +1 There is a decision at the start of the loop.
      while +1 There is a decision at the while statement.
      except +1 Each except branch adds a new conditional path of execution.
      finally +0 The finally block is unconditionally executed.
      with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert +1 The assert statement internally roughly equals a conditional statement.
      Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
      Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html

      Function _coordinates_from_HTML has a Cognitive Complexity of 15 (exceeds 5 allowed). Consider refactoring.
      Open

          def _coordinates_from_HTML(
              self, page: Tag, page_num: int
          ) -> Tuple[List[PdfWord], Dict[PdfWordId, Bbox]]:
              pdf_word_list: List[PdfWord] = []
              coordinate_map: Dict[PdfWordId, Bbox] = {}
      Severity: Minor
      Found in src/fonduer/parser/visual_parser/pdf_visual_parser.py - About 1 hr to fix

      Cognitive Complexity

      Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

      A method's cognitive complexity is based on a few simple rules:

      • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
      • Code is considered more complex for each "break in the linear flow of the code"
      • Code is considered more complex when "flow breaking structures are nested"

      Further reading

      Function log_model has 15 arguments (exceeds 4 allowed). Consider refactoring.
      Open

      def log_model(
      Severity: Major
      Found in src/fonduer/packaging/fonduer_model.py - About 1 hr to fix

        Cyclomatic complexity is too high in method _map_to_id. (6)
        Open

            def _map_to_id(self) -> None:
                self.X_dict.update(
                    dict([(f"m{i}", []) for i in range(len(self.candidates[0]))])
                )
        
        
        Severity: Minor
        Found in src/fonduer/learning/dataset.py by radon

        Cyclomatic Complexity

        Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

        Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

        Construct Effect on CC Reasoning
        if +1 An if statement is a single decision.
        elif +1 The elif statement adds another decision.
        else +0 The else statement does not cause a new decision. The decision is at the if.
        for +1 There is a decision at the start of the loop.
        while +1 There is a decision at the while statement.
        except +1 Each except branch adds a new conditional path of execution.
        finally +0 The finally block is unconditionally executed.
        with +1 The with statement roughly corresponds to a try/except block (see PEP 343 for details).
        assert +1 The assert statement internally roughly equals a conditional statement.
        Comprehension +1 A list/set/dict comprehension of generator expression is equivalent to a for loop.
        Boolean Operator +1 Every boolean operator (and, or) adds a decision point.

        Source: http://radon.readthedocs.org/en/latest/intro.html

        Severity
        Category
        Status
        Source
        Language