giganticode/codeprep

Showing 82 of 82 total issues

Function basic has 13 arguments (exceeds 4 allowed). Consider refactoring.
Open

def basic(text: str, extension: Optional[str] = None,
Severity: Major
Found in codeprep/api/text.py - About 1 hr to fix
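
One common way to address an argument-count finding like this is to bundle related keyword arguments into a single options object. The sketch below is only an illustration under stated assumptions: `PrepOptions` and `basic_sketch` are hypothetical names, the two flags are borrowed from other signatures in this listing (`no_spaces`, `no_unicode`), and the body is a placeholder rather than codeprep's actual preprocessing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PrepOptions:
    """Hypothetical options bundle; not part of codeprep."""
    no_spaces: bool = False   # carried along only to show grouping; unused below
    no_unicode: bool = False


def basic_sketch(text: str, extension: Optional[str] = None,
                 options: PrepOptions = PrepOptions()) -> List[str]:
    # Placeholder body: whitespace tokenization stands in for real preprocessing.
    # `extension` is kept only to mirror the reported signature.
    tokens = text.split()
    if options.no_unicode:
        # Drop tokens that contain any non-ASCII character.
        tokens = [t for t in tokens if t.isascii()]
    return tokens


print(basic_sketch("int café = 1", options=PrepOptions(no_unicode=True)))
```

With this shape the function takes three parameters, and adding a new flag changes only the options class, not every call site.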

Function run has a Cognitive Complexity of 13 (exceeds 5 allowed). Consider refactoring.
Open

    def run(self) -> None:
        while True:
            chunk_assigned = self.chunk_queue.get(block=True, timeout=BLOCKING_TIMEOUT_SECONDS_SHORT)
            if chunk_assigned == -1:
                if not self.process_counter.compare_and_dec(1):
Severity: Minor
Found in codeprep/pipeline/vocab.py - About 1 hr to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

• Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
• Code is considered more complex for each "break in the linear flow of the code"
• Code is considered more complex when "flow breaking structures are nested"
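
As a rough illustration of the nesting rule above, the sketch below writes the same filter twice: once with nested flow-breaking structures and once with the condition pulled into a helper. All names are hypothetical and no specific scores are claimed; the flatter version would only be expected to rate lower under rules like these.

```python
from typing import Iterable, Iterator, Optional


def matching_nested(names: Iterable[str], extension: Optional[str] = None) -> Iterator[str]:
    # Three levels of nesting: for > if > if.
    for name in names:
        if name:
            if not extension or name.endswith(extension):
                yield name


def _matches(name: str, extension: Optional[str]) -> bool:
    # The whole condition lives in one place, with no nesting.
    return bool(name) and (not extension or name.endswith(extension))


def matching_flat(names: Iterable[str], extension: Optional[str] = None) -> Iterator[str]:
    # A generator expression is language shorthand for the loop-and-filter above.
    return (name for name in names if _matches(name, extension))


print(list(matching_flat(["a.py", "", "b.txt"], ".py")))
```

Both functions yield the same names for the same input; only the control-flow shape differs.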

Function walk has a Cognitive Complexity of 13 (exceeds 5 allowed). Consider refactoring.
Open

def walk(path:bytes, extension: Optional[bytes] = None) -> Generator[bytes, None, None]:
    if os.path.isfile(path) and (not extension or path.endswith(extension)):
        yield path
    else:
        for root, dirs, files in os.walk(path):
Severity: Minor
Found in codeprep/util/dir.py - About 1 hr to fix

Function create_prep_config has 12 arguments (exceeds 4 allowed). Consider refactoring.
Open

def create_prep_config(spl_type: str, bpe_codes_id: Optional[str] = None, no_spaces: bool = False,
Severity: Major
Found in codeprep/api/common.py - About 1 hr to fix

Function get_all_files has a Cognitive Complexity of 12 (exceeds 5 allowed). Consider refactoring.
Open

    def get_all_files(self, return_dirs_instead_of_regular_files: bool=False) -> Generator[bytes, None, None]:
        if self.files_need_to_be_saved():
            if not os.path.exists(self.path_to_file_list_folder):
                os.makedirs(self.path_to_file_list_folder)
            for filepath in walk_and_save(self.original.path,
Severity: Minor
Found in codeprep/pipeline/dataset.py - About 1 hr to fix

Function merge_lists_both has a Cognitive Complexity of 12 (exceeds 5 allowed). Consider refactoring.
Open

def merge_lists_both(main_list: List[int], list2: List[int], position_shift: Tuple[int, int]) -> Tuple[List[int], List[int]]:
    """
    >>> merge_lists_both([0, 5, 7, 11, 16], [1, 9, 15], (2, -2))
    ([1, 15], [7])
Severity: Minor
Found in codeprep/bpepkg/wild_bpe.py - About 1 hr to fix

Function convert_text has a Cognitive Complexity of 12 (exceeds 5 allowed). Consider refactoring.
Open

def convert_text(text: str, extension: str) -> List[ParsedToken]:
    extension = extension or 'java'
    if extension:
        try:
            lexer = get_lexer_by_name(extension)
Severity: Minor
Found in codeprep/parse/core.py - About 1 hr to fix

Function bpe has 11 arguments (exceeds 4 allowed). Consider refactoring.
Open

def bpe(path: str, bpe_codes_id: str, extensions: Optional[str] = None, no_spaces: bool = False,
Severity: Major
Found in codeprep/api/corpus.py - About 1 hr to fix

Function bpe has 11 arguments (exceeds 4 allowed). Consider refactoring.
Open

def bpe(text: str, bpe_codes_id: str, extension: Optional[str] = None, no_spaces: bool = False,
Severity: Major
Found in codeprep/api/text.py - About 1 hr to fix

Function nosplit has 11 arguments (exceeds 4 allowed). Consider refactoring.
Open

def nosplit(path: str, extensions: Optional[str] = None, no_spaces: bool = False, no_unicode: bool = False,
Severity: Major
Found in codeprep/api/corpus.py - About 1 hr to fix

Function chars has 10 arguments (exceeds 4 allowed). Consider refactoring.
Open

def chars(path: str, extensions: Optional[str] = None, no_spaces: bool = False, no_unicode: bool = False,
Severity: Major
Found in codeprep/api/corpus.py - About 1 hr to fix

Function nosplit has 10 arguments (exceeds 4 allowed). Consider refactoring.
Open

def nosplit(text: str, extension: Optional[str] = None, no_spaces: bool = False, no_unicode: bool = False,
Severity: Major
Found in codeprep/api/text.py - About 1 hr to fix

Function chars has 9 arguments (exceeds 4 allowed). Consider refactoring.
Open

def chars(text: str, extension: Optional[str] = None, no_spaces: bool = False, no_unicode: bool = False,
Severity: Major
Found in codeprep/api/text.py - About 1 hr to fix

Function get_char_iterator_for_dir has a Cognitive Complexity of 10 (exceeds 5 allowed). Consider refactoring.
Open

def get_char_iterator_for_dir(path_to_dir: str) -> Generator[str, None, None]:
    for root, dirs, files in os.walk(path_to_dir):
        for file in files:
            if file.endswith('.py'):
                yield from get_char_iterator_for_file(os.path.join(root, file))
Severity: Minor
Found in codeprep/bpepkg/wild_bpe.py - About 1 hr to fix

Function __getitem__ has a Cognitive Complexity of 10 (exceeds 5 allowed). Consider refactoring.
Open

    def __getitem__(self, item: Union[int, slice]):
        start, stop = TokenSequence._normalize_index_or_slice(item, total=len(self))

        adjusted_before = 0
        adjusted_after = 0
Severity: Minor
Found in codeprep/preprocess/tokens.py - About 1 hr to fix

Function _normalize_index has a Cognitive Complexity of 10 (exceeds 5 allowed). Consider refactoring.
Open

    def _normalize_index(index: Optional[int], is_slice: bool, total: int) -> Optional[int]:
        """
        >>> TokenSequence._normalize_index(None, is_slice=False, total=2) is None
        True
        >>> TokenSequence._normalize_index(-3, is_slice=False, total=2)
Severity: Minor
Found in codeprep/preprocess/tokens.py - About 1 hr to fix

Function __init__ has 8 arguments (exceeds 4 allowed). Consider refactoring.
Open

    def __init__(self, id: int, tasks: Dict[int, Queue], path_to_dump: str, process_counter: AtomicInteger,
Severity: Major
Found in codeprep/pipeline/vocab.py - About 1 hr to fix

Function replace_non_ascii_seqs has a Cognitive Complexity of 9 (exceeds 5 allowed). Consider refactoring.
Open

def replace_non_ascii_seqs(word:str, placeholder: str) -> str:
    """
    >>> replace_non_ascii_seqs("","\xf7")
    ''
Severity: Minor
Found in codeprep/noneng.py - About 55 mins to fix

Function create has 7 arguments (exceeds 4 allowed). Consider refactoring.
Open

    def create(tokens: List[str] = None, metadata: PreppedTokenMetadata = None,
Severity: Major
Found in codeprep/preprocess/tokens.py - About 50 mins to fix

Function preprocess_corpus has 7 arguments (exceeds 4 allowed). Consider refactoring.
Open

def preprocess_corpus(path: str, prep_config: PrepConfig, bpe_codes_id: Optional[str]=None,
Severity: Major
Found in codeprep/api/corpus.py - About 50 mins to fix