iterative/dvc

Showing 519 of 560 total issues

File output.py has 1245 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import errno
import logging
import os
import posixpath
from collections import defaultdict
Severity: Major
Found in dvc/output.py - About 3 days to fix

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    @pytest.mark.xfail(raises=NotImplementedError, strict=False)
    def test_pull_no_00_prefix(self, tmp_dir, dvc, remote, monkeypatch):
        # Related: https://github.com/iterative/dvc/issues/6244

        fs_type = type(dvc.cloud.get_remote_odb("upstream").fs)
Severity: Major
Found in dvc/testing/remote_tests.py and 1 other location - About 2 days to fix
dvc/testing/remote_tests.py on lines 118..142

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code tends both to keep replicating and to diverge (leaving bugs as two similar implementations differ in subtle ways).
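
As a sketch only (not dvc's actual test code): the two test_file variants reported further down in this list differ only in which remote fixture they exercise, so a single parametrized test that resolves the fixture by name via pytest's request.getfixturevalue could hold the one authoritative copy.

    import pytest

    # Illustrative sketch, not dvc's actual code: collapse the two near-identical
    # test_file variants (remote_worktree vs. remote_version_aware) into one test.
    @pytest.mark.parametrize("remote_fixture", ["remote_worktree", "remote_version_aware"])
    def test_file(tmp_dir, dvc, remote_fixture, request):
        request.getfixturevalue(remote_fixture)  # activate the chosen remote fixture

        (stage,) = tmp_dir.dvc_gen("foo", "foo")
        dvc.push()
        assert "version_id" in (tmp_dir / "foo.dvc").read_text()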

Tuning

This issue has a mass of 252.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.
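
For illustration only, a per-language override in .codeclimate.yml might look like the sketch below. The key names follow the plugin-style configuration described in codeclimate-duplication's documentation, and both they and the threshold value are assumptions to verify against that documentation rather than settings to copy as-is.

    # Illustrative .codeclimate.yml sketch; verify key names and defaults against
    # codeclimate-duplication's documentation before using.
    plugins:
      duplication:
        enabled: true
        config:
          languages:
            python:
              mass_threshold: 40   # raise to report less duplication, lower to report more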

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    @pytest.mark.xfail(raises=NotImplementedError, strict=False)
    def test_pull_00_prefix(self, tmp_dir, dvc, remote, monkeypatch):
        # Related: https://github.com/iterative/dvc/issues/6089

        fs_type = type(dvc.cloud.get_remote_odb("upstream").fs)
Severity: Major
Found in dvc/testing/remote_tests.py and 1 other location - About 2 days to fix
dvc/testing/remote_tests.py on lines 144..168

This issue has a mass of 252.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    def test_dir(self, tmp_dir, dvc, remote_worktree):  # pylint: disable=W0613
        (stage,) = tmp_dir.dvc_gen(
            {
                "data_dir": {
                    "data_sub_dir": {"data_sub": "data_sub"},
Severity: Major
Found in dvc/testing/remote_tests.py and 1 other location - About 2 days to fix
dvc/testing/remote_tests.py on lines 187..214

This issue has a mass of 231.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    def test_dir(self, tmp_dir, dvc, remote_version_aware):  # pylint: disable=W0613
        (stage,) = tmp_dir.dvc_gen(
            {
                "data_dir": {
                    "data_sub_dir": {"data_sub": "data_sub"},
Severity: Major
Found in dvc/testing/remote_tests.py and 1 other location - About 2 days to fix
dvc/testing/remote_tests.py on lines 234..261

This issue has a mass of 231.

File base.py has 683 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import logging
import os
import pickle  # nosec B403
import shutil
from abc import ABC, abstractmethod
Severity: Major
Found in dvc/repo/experiments/executor/base.py - About 1 day to fix

File index.py has 640 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import logging
import time
from functools import partial
from typing import (
    TYPE_CHECKING,
Severity: Major
Found in dvc/repo/index.py - About 1 day to fix

File base.py has 612 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import logging
import os
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass
from typing import (
Severity: Major
Found in dvc/repo/experiments/queue/base.py - About 1 day to fix

File celery.py has 537 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import hashlib
import locale
import logging
import os
from collections import defaultdict
Severity: Major
Found in dvc/repo/experiments/queue/celery.py - About 1 day to fix

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    def test_file(self, tmp_dir, dvc, remote_version_aware):  # pylint: disable=W0613
        (stage,) = tmp_dir.dvc_gen("foo", "foo")

        dvc.push()
        assert "version_id" in (tmp_dir / "foo.dvc").read_text()
Severity: Major
Found in dvc/testing/remote_tests.py and 1 other location - About 1 day to fix
dvc/testing/remote_tests.py on lines 219..232

This issue has a mass of 130.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    def test_file(self, tmp_dir, dvc, remote_worktree):  # pylint: disable=W0613
        (stage,) = tmp_dir.dvc_gen("foo", "foo")

        dvc.push()
        assert "version_id" in (tmp_dir / "foo.dvc").read_text()
Severity: Major
Found in dvc/testing/remote_tests.py and 1 other location - About 1 day to fix
dvc/testing/remote_tests.py on lines 172..185

This issue has a mass of 130.

File __init__.py has 474 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import csv
import io
import logging
import os
from collections import defaultdict
Severity: Minor
Found in dvc/repo/plots/__init__.py - About 7 hrs to fix

File context.py has 441 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import logging
from abc import ABC, abstractmethod
from collections import defaultdict
from collections.abc import Mapping, MutableMapping, MutableSequence, Sequence
from contextlib import contextmanager
Severity: Minor
Found in dvc/parsing/context.py - About 6 hrs to fix

File __init__.py has 398 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import logging
import os
from collections.abc import Mapping, Sequence
from copy import deepcopy
from itertools import product
Severity: Minor
Found in dvc/parsing/__init__.py - About 5 hrs to fix

File data_sync.py has 392 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import argparse
import logging

from dvc.cli import completion
from dvc.cli.command import CmdBase
Severity: Minor
Found in dvc/commands/data_sync.py - About 5 hrs to fix

File plots.py has 382 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import argparse
import logging
import os
from typing import TYPE_CHECKING, Dict, List, Optional


Severity: Minor
Found in dvc/commands/plots.py - About 5 hrs to fix

Function dumpd has a Cognitive Complexity of 34 (exceeds 5 allowed). Consider refactoring.
Open

    def dumpd(self, **kwargs):  # noqa: C901, PLR0912
        from dvc.cachemgr import LEGACY_HASH_NAMES

        ret: Dict[str, Any] = {}
        with_files = (
Severity: Minor
Found in dvc/output.py - About 5 hrs to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

• Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
• Code is considered more complex for each "break in the linear flow of the code"
• Code is considered more complex when "flow breaking structures are nested"
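
As a toy illustration of these rules (not code from dvc/output.py or dvc/repo/push.py), the nested version below accumulates a higher score than the flattened one even though both do the same work; reducing nesting with guard clauses or extracted helpers is the usual first step when functions like dumpd or push are flagged.

    # Toy example only: each break in linear flow (for, if) counts, and counts
    # more the deeper it is nested.
    def summarize(outputs):
        result = {}
        for out in outputs:                             # flow break
            if out is not None:                         # nested flow break
                if out.get("hash"):                     # doubly nested flow break
                    result[out["path"]] = out["hash"]
        return result

    # Same behaviour with a guard clause: one shallow flow break instead of two
    # nested ones, so the cognitive complexity score drops.
    def summarize_flat(outputs):
        result = {}
        for out in outputs:
            if out is None or not out.get("hash"):
                continue
            result[out["path"]] = out["hash"]
        return result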

LocalCeleryQueue has 38 functions (exceeds 20 allowed). Consider refactoring.
Open

class LocalCeleryQueue(BaseStashQueue):
    """DVC experiment queue.

    Maps queued experiments to (Git) stash reflog entries.
    """
Severity: Minor
Found in dvc/repo/experiments/queue/celery.py - About 5 hrs to fix

File worktree.py has 374 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import logging
from functools import partial
from typing import TYPE_CHECKING, Any, Dict, Iterable, Optional, Set, Tuple, Union

from funcy import first
Severity: Minor
Found in dvc/repo/worktree.py - About 5 hrs to fix

Function push has a Cognitive Complexity of 33 (exceeds 5 allowed). Consider refactoring.
Open

def push(  # noqa: C901, PLR0913
    self,
    targets=None,
    jobs=None,
    remote=None,
Severity: Minor
Found in dvc/repo/push.py - About 4 hrs to fix
