HazyResearch/fonduer

View on GitHub

Showing 224 of 224 total issues

Avoid deeply nested control flow statements.
Open

                            if r.search(styles.text) is not None:
                                if cur_style_index is not None:
                                    parts["html_attrs"][cur_style_index] += (
                                        r.search(styles.text)
                                        .group(3)
Severity: Major
Found in src/fonduer/parser/parser.py - About 45 mins to fix

    Function update has 6 arguments (exceeds 4 allowed). Consider refactoring.
    Open

        def update(
    Severity: Minor
    Found in src/fonduer/supervision/labeler.py - About 45 mins to fix

      Avoid deeply nested control flow statements.
      Open

                              for f, v in _vizlib_unary_features(span):
                                  unary_vizlib_feats[span.stable_id].add((f, v))
      
      
      Severity: Major
      Found in src/fonduer/features/feature_libs/visual_features.py - About 45 mins to fix

        Avoid deeply nested control flow statements.
        Open

                            for f in unary_tdl_feats[span.stable_id]:
                                yield candidate.id, f"TDL_{f}", DEF_VALUE
                    for f in _get_word_feats(span):
        Severity: Major
        Found in src/fonduer/features/feature_libs/textual_features.py - About 45 mins to fix

          Function __init__ has 6 arguments (exceeds 4 allowed). Consider refactoring.
          Open

              def __init__(
          Severity: Minor
          Found in src/fonduer/utils/udf.py - About 45 mins to fix

            Avoid deeply nested control flow statements.
            Open

                                    if ptr + i in s2h_multi:  # One spacy token-to-multi hOCR words
                                        left = lefts[s2h_multi[ptr + i]]
                                        top = tops[s2h_multi[ptr + i]]
                                        right = rights[s2h_multi[ptr + i]]
                                        bottom = bottoms[s2h_multi[ptr + i]]
            Severity: Major
            Found in src/fonduer/parser/visual_parser/hocr_visual_parser.py - About 45 mins to fix

              Avoid deeply nested control flow statements.
              Open

                                      if len(content) > 0:  # Ignore empty characters
                                          word_id: PdfWordId = (page_num, i)
                                          pdf_word_list.append((word_id, content))
                                          coordinate_map[word_id] = Bbox(
                                              page_num,
              Severity: Major
              Found in src/fonduer/parser/visual_parser/pdf_visual_parser.py - About 45 mins to fix

                Function get_vert_ngrams has 6 arguments (exceeds 4 allowed). Consider refactoring.
                Open

                def get_vert_ngrams(
                Severity: Minor
                Found in src/fonduer/utils/data_model_utils/visual.py - About 45 mins to fix

                  Function get_neighbor_sentence_ngrams has 6 arguments (exceeds 4 allowed). Consider refactoring.
                  Open

                  def get_neighbor_sentence_ngrams(
                  Severity: Minor
                  Found in src/fonduer/utils/data_model_utils/tabular.py - About 45 mins to fix

                    Function get_col_ngrams has 6 arguments (exceeds 4 allowed). Consider refactoring.
                    Open

                    def get_col_ngrams(
                    Severity: Minor
                    Found in src/fonduer/utils/data_model_utils/tabular.py - About 45 mins to fix

                      Function apply has 6 arguments (exceeds 4 allowed). Consider refactoring.
                      Open

                          def apply(  # type: ignore
                      Severity: Minor
                      Found in src/fonduer/candidates/candidates.py - About 45 mins to fix

                        Function _f_gen has a Cognitive Complexity of 8 (exceeds 5 allowed). Consider refactoring.
                        Open

                            def _f_gen(self, c: Candidate) -> Iterator[Tuple[int, str, int]]:
                                """Convert lfs into a generator of id, name, and labels.
                        
                                In particular, catch verbose values and convert to integer ones.
                                """
                        Severity: Minor
                        Found in src/fonduer/supervision/labeler.py - About 45 mins to fix

                        Cognitive Complexity

                        Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

                        A method's cognitive complexity is based on a few simple rules:

                        • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
                        • Code is considered more complex for each "break in the linear flow of the code"
                        • Code is considered more complex when "flow breaking structures are nested"

                        Further reading

                        Function candidate_subclass has 6 arguments (exceeds 4 allowed). Consider refactoring.
                        Open

                        def candidate_subclass(
                        Severity: Minor
                        Found in src/fonduer/candidates/models/candidate.py - About 45 mins to fix

                          Avoid deeply nested control flow statements.
                          Open

                                              if candidate.id not in multinary_tdl_feats:
                                                  multinary_tdl_feats[candidate.id] = set()
                                                  for f in get_tdl_feats(xmltree.root, s_idxs):
                                                      multinary_tdl_feats[candidate.id].add(f)
                                              for f in multinary_tdl_feats[candidate.id]:
                          Severity: Major
                          Found in src/fonduer/features/feature_libs/textual_features.py - About 45 mins to fix

                            Function clear has 6 arguments (exceeds 4 allowed). Consider refactoring.
                            Open

                                def clear(  # type: ignore
                            Severity: Minor
                            Found in src/fonduer/supervision/labeler.py - About 45 mins to fix

                              Avoid deeply nested control flow statements.
                              Open

                                                  if (  # True if visually aligned AND not from itself.
                                                      bbox_direction_aligned(ts.get_bbox(), span.get_bbox())
                                                      and ts not in span
                                                      and span not in ts
                                                  ):
                              Severity: Major
                              Found in src/fonduer/utils/data_model_utils/visual.py - About 45 mins to fix

                                Function __getitem__ has a Cognitive Complexity of 8 (exceeds 5 allowed). Consider refactoring.
                                Open

                                    def __getitem__(self, key: slice) -> "TemporaryImplicitSpanMention":
                                        """Slice operation returns a new candidate sliced according to **char index**.
                                
                                        Note that the slicing is w.r.t. the candidate range (not the abs.
                                        sentence char indexing)
                                Severity: Minor
                                Found in src/fonduer/candidates/models/implicit_span_mention.py - About 45 mins to fix

                                Cognitive Complexity

                                Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

                                A method's cognitive complexity is based on a few simple rules:

                                • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
                                • Code is considered more complex for each "break in the linear flow of the code"
                                • Code is considered more complex when "flow breaking structures are nested"

                                Further reading

                                Function __init__ has 6 arguments (exceeds 4 allowed). Consider refactoring.
                                Open

                                    def __init__(
                                Severity: Minor
                                Found in src/fonduer/candidates/candidates.py - About 45 mins to fix

                                  Avoid deeply nested control flow statements.
                                  Open

                                                          for f, v in _tablelib_unary_features(span):
                                                              unary_tablelib_feats[span.stable_id].add((f, v))
                                  
                                  
                                  Severity: Major
                                  Found in src/fonduer/features/feature_libs/tabular_features.py - About 45 mins to fix

                                    Avoid deeply nested control flow statements.
                                    Open

                                                        for f in multinary_tdl_feats[candidate.id]:
                                                            yield candidate.id, f"TDL_{f}", DEF_VALUE
                                                for i, span in enumerate(spans):
                                    Severity: Major
                                    Found in src/fonduer/features/feature_libs/textual_features.py - About 45 mins to fix
                                      Severity
                                      Category
                                      Status
                                      Source
                                      Language