CenterForOpenScience/waterbutler

Showing 92 of 92 total issues

Similar blocks of code found in 2 locations. Consider refactoring.
Open

class FileSystemFolderMetadata(BaseFileSystemMetadata, metadata.BaseFolderMetadata):

    @property
    def name(self):
        return os.path.split(self.raw['path'].rstrip('/'))[1]
Severity: Major
Found in waterbutler/providers/filesystem/metadata.py and 1 other location - About 3 hrs to fix
waterbutler/providers/cloudfiles/metadata.py on lines 95..103

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).
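As an illustration of removing this kind of duplication: the two folder `name` properties flagged in this report differ only in the key they read from `self.raw` (`'path'` versus `'subdir'`). The sketch below shows one possible consolidation. The mixin `RawKeyFolderNameMixin`, its `_name_key` attribute, and the two stand-in classes are hypothetical names chosen for the example, not part of WaterButler.

```python
import os


class RawKeyFolderNameMixin:
    """Hypothetical mixin: derive a folder name from one key of ``self.raw``."""

    # Subclasses set this to the raw-metadata key that holds the folder path.
    _name_key = None

    @property
    def name(self):
        # Same expression as in both flagged classes, written once.
        return os.path.split(self.raw[self._name_key].rstrip('/'))[1]


class FileSystemFolderSketch(RawKeyFolderNameMixin):
    """Stand-in for the filesystem folder metadata class (assumption)."""
    _name_key = 'path'

    def __init__(self, raw):
        self.raw = raw


class CloudFilesFolderSketch(RawKeyFolderNameMixin):
    """Stand-in for the cloudfiles folder metadata class (assumption)."""
    _name_key = 'subdir'

    def __init__(self, raw):
        self.raw = raw
```

With this shape, a change to how folder names are derived would only need to be made in the mixin.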

Tuning

This issue has a mass of 71.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.
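To make the effect of the threshold concrete, here is a rough Python sketch. It assumes that counting nodes of Python's own `ast` tree is an acceptable stand-in for mass; the engine's real computation is not shown in this report and may differ. The function names `approximate_mass` and `blocks_over_threshold` are hypothetical.

```python
import ast


def approximate_mass(source: str) -> int:
    """Count AST nodes in the source as a rough proxy for 'mass' (assumption)."""
    return sum(1 for _ in ast.walk(ast.parse(source)))


def blocks_over_threshold(source: str, threshold: int) -> list:
    """Return names of functions whose node count meets the threshold.

    Only these blocks would be candidates for duplication analysis
    under the simplified model used in this sketch.
    """
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and sum(1 for _ in ast.walk(node)) >= threshold
    ]


SAMPLE = '''
def tiny():
    return 1

def bigger(items):
    total = 0
    for item in items:
        if item > 0:
            total += item
    return total
'''
```

Lowering the threshold passed to `blocks_over_threshold` can only keep or grow the list of candidate blocks, which matches the "more fine-grained" behavior described above.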

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

class CloudFilesFolderMetadata(BaseCloudFilesMetadata, metadata.BaseFolderMetadata):

    @property
    def name(self):
        return os.path.split(self.raw['subdir'].rstrip('/'))[1]
Severity: Major
Found in waterbutler/providers/cloudfiles/metadata.py and 1 other location - About 3 hrs to fix
waterbutler/providers/filesystem/metadata.py on lines 23..31

This issue has a mass of 71.

File remote_logging.py has 322 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import json
import time
import asyncio
import logging

Severity: Minor
Found in waterbutler/core/remote_logging.py - About 3 hrs to fix

File metadata.py has 321 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import abc
import typing
import hashlib

import furl
Severity: Minor
Found in waterbutler/core/metadata.py - About 3 hrs to fix

WaterButlerPath has 27 functions (exceeds 20 allowed). Consider refactoring.
Open

class WaterButlerPath:
    """ A standardized and validated immutable WaterButler path.  This is our abstraction around
    file paths in storage providers.  A WaterButlerPath is an array of WaterButlerPathPart objects.
    Each PathPart has two important attributes, `value` and `_id`.  `value` is always the
    human-readable component of the path. If the provider assigns ids to entities (see: Box, Google
Severity: Minor
Found in waterbutler/core/path.py - About 3 hrs to fix

File zip.py has 292 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import zlib
import time
import struct
import asyncio
import logging
Severity: Minor
Found in waterbutler/core/streams/zip.py - About 3 hrs to fix

Similar blocks of code found in 2 locations. Consider refactoring.
Open

class GoogleDriveRevision(metadata.BaseFileRevisionMetadata):

    @property
    def version_identifier(self):
        return 'revision'
Severity: Major
Found in waterbutler/providers/googledrive/metadata.py and 1 other location - About 2 hrs to fix
waterbutler/providers/onedrive/metadata.py on lines 91..103

This issue has a mass of 55.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

class OneDriveRevisionMetadata(metadata.BaseFileRevisionMetadata):

    @property
    def version_identifier(self):
        return 'revision'
Severity: Major
Found in waterbutler/providers/onedrive/metadata.py and 1 other location - About 2 hrs to fix
waterbutler/providers/googledrive/metadata.py on lines 178..190

This issue has a mass of 55.

File provider.py has 267 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import hashlib
import logging
import tempfile
from typing import Tuple
from http import HTTPStatus
Severity: Minor
Found in waterbutler/providers/dataverse/provider.py - About 2 hrs to fix

File provider.py has 261 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import aiohttp

from waterbutler.core import streams
from waterbutler.core import provider
from waterbutler.core import exceptions
Severity: Minor
Found in waterbutler/providers/owncloud/provider.py - About 2 hrs to fix

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    @property
    def materialized_path(self):
        if settings.ARTICLE_TYPE_IDENTIFIER in self.raw['url']:
            return '/{0}'.format(self.name)
        # if self.raw['defined_type'] in settings.FOLDER_TYPES:
Severity: Major
Found in waterbutler/providers/figshare/metadata.py and 1 other location - About 2 hrs to fix
waterbutler/providers/figshare/metadata.py on lines 39..43

This issue has a mass of 53.

Function _interpret_query_parameters has a Cognitive Complexity of 17 (exceeds 5 allowed). Consider refactoring.
Open

    def _interpret_query_parameters(self, **kwargs) -> Tuple[str, str, str]:
        """This one hurts.

        Over the life of WB, the github provider has accepted the following parameters to identify
        the ref (commit or branch) that the path entity is to be found on: ``ref``, ``branch``,
Severity: Minor
Found in waterbutler/providers/github/provider.py - About 2 hrs to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

• Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
• Code is considered more complex for each "break in the linear flow of the code"
• Code is considered more complex when "flow breaking structures are nested"
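The nesting rule can be illustrated with a small Python sketch. Both functions below are hypothetical and are not taken from WaterButler. They return the same result, but the first stacks four conditionals inside one another, while the second uses guard clauses so that no flow-breaking structure is nested.

```python
def should_notify_nested(status: int, method: str) -> bool:
    """Nested version: every extra level of nesting adds to the reading burden."""
    if 200 <= status <= 302:
        if method not in ('HEAD', 'OPTIONS'):
            if status != 202:
                if status != 206:
                    return True
    return False


def should_notify_flat(status: int, method: str) -> bool:
    """Guard-clause version: same behavior, but each check sits at one level."""
    if not 200 <= status <= 302:
        return False
    if method in ('HEAD', 'OPTIONS'):
        return False
    if status in (202, 206):
        return False
    return True
```

Under the rules listed above, the flat version would be expected to score lower, because its breaks in linear flow are no longer nested.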

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    @property
    def path(self):
        if settings.ARTICLE_TYPE_IDENTIFIER in self.raw['url']:
            return '/{0}'.format(self.id)
        return '/{0}/{1}'.format(self.article_id, self.id)
Severity: Major
Found in waterbutler/providers/figshare/metadata.py and 1 other location - About 2 hrs to fix
waterbutler/providers/figshare/metadata.py on lines 45..52

This issue has a mass of 53.

Function on_finish has a Cognitive Complexity of 16 (exceeds 5 allowed). Consider refactoring.
Open

    def on_finish(self):
        status, method = self.get_status(), self.request.method.upper()

        # If the response code is not within the 200-302 range, the request was a HEAD or OPTIONS,
        # the response code is 202, or the response was a 206 partial request, then no callbacks
Severity: Minor
Found in waterbutler/server/api/v1/provider/__init__.py - About 2 hrs to fix

File exceptions.py has 252 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import json
from http import HTTPStatus

from aiohttp.client_exceptions import ContentTypeError
Severity: Minor
Found in waterbutler/core/exceptions.py - About 2 hrs to fix

Function write_error has a Cognitive Complexity of 15 (exceeds 5 allowed). Consider refactoring.
Open

    def write_error(self, status_code, exc_info):
        etype, exc, _ = exc_info

        finish_args = []
        with sentry_sdk.configure_scope() as scope:
Severity: Minor
Found in waterbutler/server/api/v1/core.py - About 1 hr to fix

Function dropbox_conflict_error_handler has a Cognitive Complexity of 14 (exceeds 5 allowed). Consider refactoring.
Open

    def dropbox_conflict_error_handler(self, data: dict, error_path: str='') -> None:
        """Takes a standard Dropbox error response and an optional path and tries to throw a
        meaningful error based on it.

        :param dict data: the error received from Dropbox
Severity: Minor
Found in waterbutler/providers/dropbox/provider.py - About 1 hr to fix

Function new_from_response has a Cognitive Complexity of 13 (exceeds 5 allowed). Consider refactoring.
Open

    def new_from_response(cls, response, base_folder_id, base_folder_metadata=None):
        """Build a new `OneDrivePath` object from a OneDrive API response representing a file or
        folder entity.  Requires the ID of the provider base folder.  Requires base folder metadata
        if base folder is neither the provider root nor the immediate parent of the entity being
        built.
Severity: Minor
Found in waterbutler/providers/onedrive/path.py - About 1 hr to fix

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    @property
    def export_name(self):
        title = self._file_title
        if self.is_google_doc:
            ext = utils.get_download_extension(self.raw)
Severity: Major
Found in waterbutler/providers/googledrive/metadata.py and 1 other location - About 1 hr to fix
waterbutler/providers/googledrive/metadata.py on lines 68..74

This issue has a mass of 44.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    @property
    def name(self):
        title = self._file_title
        if self.is_google_doc:
            ext = utils.get_extension(self.raw)
Severity: Major
Found in waterbutler/providers/googledrive/metadata.py and 1 other location - About 1 hr to fix
waterbutler/providers/googledrive/metadata.py on lines 131..137

This issue has a mass of 44.
