HazyResearch/fonduer

Showing 134 of 224 total issues

Function candidate_subclass has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

def candidate_subclass(
Severity: Minor
Found in src/fonduer/candidates/models/candidate.py - About 45 mins to fix
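
One common way to address an argument-count finding like this is to group related options into a single parameter object. The sketch below is illustrative only: `NgramOptions` and `ngrams` are hypothetical names, not Fonduer's API, and the full signature of `candidate_subclass` is not shown in this listing.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NgramOptions:
    """Hypothetical bundle of related options (not part of Fonduer)."""

    n_min: int = 1
    n_max: int = 1
    lower: bool = True


def ngrams(tokens: List[str], opts: NgramOptions = NgramOptions()) -> List[str]:
    """Return every n-gram of `tokens` with n in [opts.n_min, opts.n_max]."""
    if opts.lower:
        tokens = [t.lower() for t in tokens]
    out = []
    for n in range(opts.n_min, opts.n_max + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out
```

With this shape the function takes two parameters instead of four, and a caller overrides only what it needs, for example `ngrams(words, NgramOptions(n_max=2))`.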

Avoid deeply nested control flow statements.
Open

                        for f, v in _vizlib_unary_features(span):
                            unary_vizlib_feats[span.stable_id].add((f, v))
Severity: Major
Found in src/fonduer/features/feature_libs/visual_features.py - About 45 mins to fix
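
A nesting finding like this can often be reduced by moving the "compute once, then reuse" step into a small helper, so the calling loop loses one level of indentation. The following is a minimal sketch under that assumption; `cached_features`, `emit_features`, and the `FEAT_` prefix are hypothetical and are not taken from Fonduer.

```python
from typing import Callable, Dict, Iterable, Iterator, Set, Tuple


def cached_features(
    cache: Dict[str, Set[str]],
    key: str,
    compute: Callable[[], Iterable[str]],
) -> Set[str]:
    """Compute the feature set for `key` once, then reuse the cached set."""
    if key not in cache:
        cache[key] = set(compute())
    return cache[key]


def emit_features(
    items: Iterable[Tuple[str, str]],
    featurize: Callable[[str], Iterable[str]],
) -> Iterator[Tuple[str, str]]:
    """Yield (item_id, feature_name) pairs, featurizing each text only once."""
    cache: Dict[str, Set[str]] = {}
    for item_id, text in items:
        # The membership check and fill loop now live in the helper.
        feats = cached_features(cache, text, lambda: featurize(text))
        for f in sorted(feats):
            yield item_id, f"FEAT_{f}"
```

The caller keeps only two levels of nesting, and the caching rule lives in one place instead of being repeated per feature library.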

Avoid deeply nested control flow statements.
Open

                        for feature, value in _strlib_unary_features(span):
                            unary_strlib_feats[span.stable_id].add((feature, value))
Severity: Major
Found in src/fonduer/features/feature_libs/structural_features.py - About 45 mins to fix

Avoid deeply nested control flow statements.
Open

                    for f in unary_tdl_feats[span.stable_id]:
                        yield candidate.id, f"TDL_{f}", DEF_VALUE
            for f in _get_word_feats(span):
Severity: Major
Found in src/fonduer/features/feature_libs/textual_features.py - About 45 mins to fix

Avoid deeply nested control flow statements.
Open

                    if candidate.id not in multinary_tdl_feats:
                        multinary_tdl_feats[candidate.id] = set()
                        for f in get_tdl_feats(xmltree.root, s_idxs):
                            multinary_tdl_feats[candidate.id].add(f)
                    for f in multinary_tdl_feats[candidate.id]:
Severity: Major
Found in src/fonduer/features/feature_libs/textual_features.py - About 45 mins to fix

Avoid deeply nested control flow statements.
Open

                        if self.space:
                            word.string.replace_with(" ".join(tokens))
                        else:
                            word.string.replace_with("".join(tokens))
                    word.unwrap()
Severity: Major
Found in src/fonduer/parser/preprocessors/hocr_doc_preprocessor.py - About 45 mins to fix

Avoid deeply nested control flow statements.
Open

                        if len(content) > 0:  # Ignore empty characters
                            word_id: PdfWordId = (page_num, i)
                            pdf_word_list.append((word_id, content))
                            coordinate_map[word_id] = Bbox(
                                page_num,
Severity: Major
Found in src/fonduer/parser/visual_parser/pdf_visual_parser.py - About 45 mins to fix

Function update has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

    def update(
Severity: Minor
Found in src/fonduer/supervision/labeler.py - About 45 mins to fix

Function get_neighbor_sentence_ngrams has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

def get_neighbor_sentence_ngrams(
Severity: Minor
Found in src/fonduer/utils/data_model_utils/tabular.py - About 45 mins to fix

Function apply has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

    def apply(  # type: ignore
Severity: Minor
Found in src/fonduer/candidates/candidates.py - About 45 mins to fix

Function __init__ has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

    def __init__(
Severity: Minor
Found in src/fonduer/candidates/candidates.py - About 45 mins to fix

Avoid deeply nested control flow statements.
Open

                    if (  # True if visually aligned AND not from itself.
                        bbox_direction_aligned(ts.get_bbox(), span.get_bbox())
                        and ts not in span
                        and span not in ts
                    ):
Severity: Major
Found in src/fonduer/utils/data_model_utils/visual.py - About 45 mins to fix

Function get_cell_ngrams has a Cognitive Complexity of 8 (exceeds 5 allowed). Consider refactoring.
Open

def get_cell_ngrams(
    mention: Union[Candidate, Mention, TemporarySpanMention],
    attrib: str = "words",
    n_min: int = 1,
    n_max: int = 1,
Severity: Minor
Found in src/fonduer/utils/data_model_utils/tabular.py - About 45 mins to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

• Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
• Code is considered more complex for each "break in the linear flow of the code"
• Code is considered more complex when "flow breaking structures are nested"
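
To make the rules above concrete, the sketch below shows two hypothetical functions with the same behavior. The first nests every condition inside the loops, so each flow-breaking structure sits deeper than the last. The second uses a guard clause and a generator expression, which should read more linearly under these rules. The function names are invented for illustration, and no exact scores are claimed because they depend on the analyzer.

```python
from typing import Iterable, List


def long_words_nested(lines: Iterable[str], min_len: int) -> List[str]:
    """Nested version: each condition adds another level of indentation."""
    found = []
    for line in lines:
        if line:
            if not line.startswith("#"):
                for word in line.split():
                    if len(word) >= min_len:
                        found.append(word)
    return found


def long_words_flat(lines: Iterable[str], min_len: int) -> List[str]:
    """Flatter version: one guard clause, then a single shorthand expression."""
    found = []
    for line in lines:
        # Skip blank lines and comment lines up front.
        if not line or line.startswith("#"):
            continue
        found.extend(w for w in line.split() if len(w) >= min_len)
    return found
```

Both functions return the same list for the same input, so a refactoring of this kind can be checked with an ordinary equality test before and after the change.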

Avoid deeply nested control flow statements.
Open

                    for f in _get_ddlib_feats(span, get_as_dict(span.sentence), sidxs):
                        yield candidate.id, f"DDL_{f}", DEF_VALUE
                    # Add TreeDLib entity features
                    if span.stable_id not in unary_tdl_feats:
Severity: Major
Found in src/fonduer/features/feature_libs/textual_features.py - About 45 mins to fix

Avoid deeply nested control flow statements.
Open

                        if ptr + i in s2h_multi:  # One spacy token-to-multi hOCR words
                            left = lefts[s2h_multi[ptr + i]]
                            top = tops[s2h_multi[ptr + i]]
                            right = rights[s2h_multi[ptr + i]]
                            bottom = bottoms[s2h_multi[ptr + i]]
Severity: Major
Found in src/fonduer/parser/visual_parser/hocr_visual_parser.py - About 45 mins to fix

Function get_neighbor_sentence_ngrams has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

def get_neighbor_sentence_ngrams(
Severity: Minor
Found in src/fonduer/utils/data_model_utils/textual.py - About 45 mins to fix

Function get_row_ngrams has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

def get_row_ngrams(
Severity: Minor
Found in src/fonduer/utils/data_model_utils/tabular.py - About 45 mins to fix

Function __getitem__ has a Cognitive Complexity of 8 (exceeds 5 allowed). Consider refactoring.
Open

    def __getitem__(self, key: slice) -> "TemporaryImplicitSpanMention":
        """Slice operation returns a new candidate sliced according to **char index**.

        Note that the slicing is w.r.t. the candidate range (not the abs.
        sentence char indexing)
Severity: Minor
Found in src/fonduer/candidates/models/implicit_span_mention.py - About 45 mins to fix

Function __init__ has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

    def __init__(
Severity: Minor
Found in src/fonduer/learning/dataset.py - About 45 mins to fix

Function get_col_ngrams has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

def get_col_ngrams(
Severity: Minor
Found in src/fonduer/utils/data_model_utils/tabular.py - About 45 mins to fix