giganticode/codeprep

Showing 82 of 82 total issues

Function preprocess has 7 arguments (exceeds 4 allowed). Consider refactoring.
Open

def preprocess(text: str, config: PrepConfig, bpe_codes_id: Optional[str] = None, extension: Optional[str] = None,
Severity: Major
Found in codeprep/api/text.py - About 50 mins to fix
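
One common way to address an argument-count finding like this is to group the optional arguments into a single parameter object. The following is a minimal sketch of that idea, not codeprep's actual API: `PreprocessOptions` and `preprocess_text` are hypothetical names, and only the two optional arguments visible in the truncated signature above are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PreprocessOptions:
    """Hypothetical parameter object; it is not part of codeprep."""
    bpe_codes_id: Optional[str] = None
    extension: Optional[str] = None


def preprocess_text(text: str, options: PreprocessOptions = PreprocessOptions()) -> str:
    """Toy stand-in for a preprocessing function: strips the text and tags it.

    The real function takes more arguments; the point here is only that the
    optional ones travel together in one immutable object.
    """
    suffix = f" [{options.extension}]" if options.extension else ""
    return text.strip() + suffix


# Usage: callers name only the options they care about.
result = preprocess_text("  a = 1 ", PreprocessOptions(extension="py"))
```

Because the dataclass is frozen, sharing one default instance across calls is safe, and new options can be added later without changing the function signature.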

    Function __init__ has 7 arguments (exceeds 4 allowed). Consider refactoring.
    Open

        def __init__(self, path: str, prep_config: PrepConfig, normalized_extension_list: Optional[List[str]],
    Severity: Major
    Found in codeprep/pipeline/dataset.py - About 50 mins to fix

      Function __init__ has 7 arguments (exceeds 4 allowed). Consider refactoring.
      Open

          def __init__(self, types_to_be_repr,
      Severity: Major
      Found in codeprep/preprocess/reprconfig.py - About 50 mins to fix

        Function __init__ has 7 arguments (exceeds 4 allowed). Consider refactoring.
        Open

            def __init__(self, merges_done: int, time_for_last_merge: float,
        Severity: Major
        Found in codeprep/bpepkg/wild_bpe.py - About 50 mins to fix

          Function create has 7 arguments (exceeds 4 allowed). Consider refactoring.
          Open

              def create(cls: Type['Dataset'], path_to_dataset: str, prep_config: PrepConfig, extensions: Optional[str],
          Severity: Major
          Found in codeprep/pipeline/dataset.py - About 50 mins to fix

            Avoid deeply nested control flow statements.
            Open

                                if can_be_concat(pair_to_merge, cc, op_side.opposite()):
                                    cc_concat = concat_pairs(pair_to_merge, cc, op_side.opposite())
                                    add_pairs_to_neighbour_index(neighbour_index, appeared_pair, cc_concat, op_side,
                                                                 location_index)
            
            
            Severity: Major
            Found in codeprep/bpepkg/wild_bpe.py - About 45 mins to fix

              Function handle_learnbpe has a Cognitive Complexity of 8 (exceeds 5 allowed). Consider refactoring.
              Open

              def handle_learnbpe(args):
                  set_log_level(args)
                  path = os.path.abspath(args['--path'])
                  bpe_config = create_bpe_config_from_args(args)
                  n_merges = int(args['<n-merges>'])
              Severity: Minor
              Found in codeprep/cli/impl.py - About 45 mins to fix

              Cognitive Complexity

              Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

              A method's cognitive complexity is based on a few simple rules:

              • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
              • Code is considered more complex for each "break in the linear flow of the code"
              • Code is considered more complex when "flow breaking structures are nested"
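
The nesting rule can be illustrated with a small sketch. Both functions below are hypothetical (they are not taken from codeprep) and do the same work; the second one merges the two conditions so that only one `if` sits inside the loop. No scores are shown, since the exact numbers depend on the analyzer.

```python
from typing import Dict, Iterable, List


def drop_empty_nested(index: Dict[str, List[int]], keys: Iterable[str]) -> None:
    """Delete the listed keys whose value is an empty list (nested version)."""
    for key in keys:                  # a break in the linear flow
        if key in index:              # nested one level inside the loop
            if len(index[key]) == 0:  # nested two levels inside the loop
                del index[key]


def drop_empty_flat(index: Dict[str, List[int]], keys: Iterable[str]) -> None:
    """Same behaviour with a single condition, so nesting stops at one level."""
    for key in keys:
        if key in index and not index[key]:
            del index[key]


# Usage: both versions leave the same entries behind.
a = {"x": [], "y": [1], "z": []}
b = dict(a)
drop_empty_nested(a, ["x", "y", "missing"])
drop_empty_flat(b, ["x", "y", "missing"])
```

Keys that are not listed (here `"z"`) are left untouched by both versions, even when their value is empty.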

              Avoid deeply nested control flow statements.
              Open

                                  for disappeared_pair2 in disappearing_pairs:
                                      mm = concat_pairs(pair_to_merge, disappeared_pair2, side)
                                      add_pairs_to_neighbour_index(neighbour_index, appeared_pair, mm, side, location_index)
                                      if can_be_concat(pair_to_merge, mm, side.opposite()):
                                          mm_concat = concat_pairs(pair_to_merge, mm, side.opposite())
              Severity: Major
              Found in codeprep/bpepkg/wild_bpe.py - About 45 mins to fix

                Avoid deeply nested control flow statements.
                Open

                                    if return_dirs_instead_of_regular_files:
                                        yield bin_name
                                for file in files:
                Severity: Major
                Found in codeprep/util/dir.py - About 45 mins to fix

                  Avoid deeply nested control flow statements.
                  Open

                                      if current_merge_candidate_priority < merge_candidate_priority:
                                          merge_candidate_priority = current_merge_candidate_priority
                                          merge_indices = [i]
                                      elif current_merge_candidate_priority == merge_candidate_priority:
                                          if not merge_indices or merge_indices[-1] != i - 1:
                  Severity: Major
                  Found in codeprep/bpepkg/bpe_encode.py - About 45 mins to fix

                    Function create_split_value has 6 arguments (exceeds 4 allowed). Consider refactoring.
                    Open

                    def create_split_value(split_type: str, bpe_codes_id: Optional[str] = None, full_strings: bool = False,
                    Severity: Minor
                    Found in codeprep/api/common.py - About 45 mins to fix

                      Avoid deeply nested control flow statements.
                      Open

                                          if can_be_concat(pair_to_merge, neighbour_of_neighbour, side.opposite()):
                                              neighbour_of_neighbour_concat = concat_pairs(pair_to_merge, neighbour_of_neighbour,
                                                                                           side.opposite())
                                              add_pairs_to_neighbour_index(neighbour_index, appeared_pair, neighbour_of_neighbour_concat,
                                                                           side, location_index)
                      Severity: Major
                      Found in codeprep/bpepkg/wild_bpe.py - About 45 mins to fix

                        Avoid deeply nested control flow statements.
                        Open

                                            if not extensions or has_one_of_extensions(bin_name, extensions_bin):
                                                if not os.path.islink(os.path.join(root, file)):
                                                    f.write(f'{bin_name}\n')
                                                    if not return_dirs_instead_of_regular_files:
                                                        yield bin_name
                        Severity: Major
                        Found in codeprep/util/dir.py - About 45 mins to fix

                          Function add_pairs_to_neighbour_index has 5 arguments (exceeds 4 allowed). Consider refactoring.
                          Open

                          def add_pairs_to_neighbour_index(index, pair1, pair2, side, location_index):
                          Severity: Minor
                          Found in codeprep/bpepkg/wild_bpe.py - About 35 mins to fix

                            Function walk_and_save has 5 arguments (exceeds 4 allowed). Consider refactoring.
                            Open

                            def walk_and_save(path: str, dir_list_path: str, file_list_path: str, return_dirs_instead_of_regular_files: bool,
                            Severity: Minor
                            Found in codeprep/util/dir.py - About 35 mins to fix

                              Method earnedScores has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring.
                              Open

                                  @Override
                                  public int earnedScores(DiceLayout diceLayout) {
                                      int sum = 0;
                                      for (Integer diceValue: DiceValues.getDescendingIterator()) {
                                          if (diceLayout.getCount(diceValue) >= 2) {

                              Function cleanup_location_index has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring.
                              Open

                              def cleanup_location_index(location_index, most_freq_pair, disappearing_pairs):
                                  for side in Side:
                                      for disappearing_pair in disappearing_pairs[side]:
                                          if len(location_index[disappearing_pair]) == 0:
                                              del location_index[disappearing_pair]
                              Severity: Minor
                              Found in codeprep/bpepkg/wild_bpe.py - About 35 mins to fix

                              Function cleanup_neighbour_index has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring.
                              Open

                              def cleanup_neighbour_index(location_index, neighbour_index, most_freq_pair):
                                  for side in Side:
                                      disappearing_pairs = neighbour_index[most_freq_pair][side]
                                      for disappearing_pair in disappearing_pairs:
                                          if disappearing_pair not in location_index and disappearing_pair in neighbour_index:
                              Severity: Minor
                              Found in codeprep/bpepkg/wild_bpe.py - About 35 mins to fix

                              Function _normalize_index_or_slice has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring.
                              Open

                                  def _normalize_index_or_slice(item: Union[int, slice], total: int) -> Tuple[int, int]:
                                      if isinstance(item, int):
                                          start = TokenSequence._normalize_index(item, is_slice=False, total=total)
                                          stop = start + 1 if start < total else start
                                      elif isinstance(item, slice):
                              Severity: Minor
                              Found in codeprep/preprocess/tokens.py - About 35 mins to fix

                              Avoid too many return statements within this function.
                              Open

                                          return '2'
                              Severity: Major
                              Found in codeprep/api/common.py - About 30 mins to fix
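
When a function accumulates many `return` statements because it maps inputs to fixed codes, a lookup table is one possible refactoring. The sketch below assumes that shape; the listing shows only a single `return '2'` line, so the mode names and codes here are hypothetical and do not reflect codeprep's real mapping.

```python
def mode_code_with_returns(mode: str) -> str:
    """Hypothetical function with one return per branch."""
    if mode == "none":
        return "0"
    if mode == "words":
        return "1"
    if mode == "chars":
        return "2"
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical table holding the same mapping as the branches above.
_MODE_CODES = {"none": "0", "words": "1", "chars": "2"}


def mode_code_with_table(mode: str) -> str:
    """Same mapping with a single return statement."""
    try:
        return _MODE_CODES[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
```

Adding a new mode then means adding one table entry rather than another branch and another `return`.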