datahuborg/datahub

src/core/db/manager.py

Summary

Maintainability: F (1 wk)

File manager.py has 1019 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import six
import hashlib
import os
import errno
import re
Severity: Major
Found in src/core/db/manager.py - About 2 days to fix

    DataHubManager has 69 functions (exceeds 20 allowed). Consider refactoring.
    Open

    class DataHubManager:
    
        def __init__(self, user=settings.ANONYMOUS_ROLE, repo_base=None,
                     is_app=False):
    
    
    Severity: Major
    Found in src/core/db/manager.py - About 1 day to fix

      Function import_file has 9 arguments (exceeds 4 allowed). Consider refactoring.
      Open

          def import_file(username, repo_base, repo, table, file_name,
      Severity: Major
      Found in src/core/db/manager.py - About 1 hr to fix

        Function user_data_path has a Cognitive Complexity of 9 (exceeds 5 allowed). Consider refactoring.
        Open

        def user_data_path(repo_base, repo='', file_name='', file_format=None):
            """
            Returns an absolute path to a file or repo in a user's data folder.
        
            user_data_path('foo') => '/user_data/foo'
        Severity: Minor
        Found in src/core/db/manager.py - About 55 mins to fix

        Cognitive Complexity

        Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

        A method's cognitive complexity is based on a few simple rules:

        • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
        • Code is considered more complex for each "break in the linear flow of the code"
        • Code is considered more complex when "flow breaking structures are nested"
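
The second and third rules can be illustrated with a minimal sketch. Both functions below are hypothetical and are not taken from manager.py; they perform the same checks, but the first nests each condition inside the previous one, while the second uses guard clauses so that no flow-breaking structure sits inside another. Under the rules listed above, the flat version should score lower.

```python
def can_export_nested(user, repo, table):
    # Hypothetical example: every `if` breaks the linear flow and is also
    # nested one level deeper than the one before it.
    if user is not None:
        if repo:
            if table:
                return True
    return False


def can_export_flat(user, repo, table):
    # Same checks as guard clauses: three breaks in flow, none of them nested.
    if user is None:
        return False
    if not repo:
        return False
    if not table:
        return False
    return True
```

Both versions return the same result for every input, so this kind of change can be covered by the existing tests before and after the refactoring.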

        Function update_card has a Cognitive Complexity of 9 (exceeds 5 allowed). Consider refactoring.
        Open

            def update_card(self, repo, card_name, new_query=None,
                            new_name=None, public=None):
                """
                Updates a card's name, query, and/or public visibility.
        
        
        Severity: Minor
        Found in src/core/db/manager.py - About 55 mins to fix

        Function export_table has 6 arguments (exceeds 4 allowed). Consider refactoring.
        Open

            def export_table(self, repo, table, file_name, file_format='CSV',
        Severity: Minor
        Found in src/core/db/manager.py - About 45 mins to fix

          Function update_card has 5 arguments (exceeds 4 allowed). Consider refactoring.
          Open

              def update_card(self, repo, card_name, new_query=None,
          Severity: Minor
          Found in src/core/db/manager.py - About 35 mins to fix

            Function export_view has 5 arguments (exceeds 4 allowed). Consider refactoring.
            Open

                def export_view(self, repo, view, file_format='CSV',
            Severity: Minor
            Found in src/core/db/manager.py - About 35 mins to fix

              Function import_rows has 5 arguments (exceeds 4 allowed). Consider refactoring.
              Open

                  def import_rows(self, repo, table, rows, delimiter=',', header=False):
              Severity: Minor
              Found in src/core/db/manager.py - About 35 mins to fix
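
One common way to address an argument-count finding like this is to group related optional arguments into a single value object. The sketch below is only an illustration: `RowFormat` and `ManagerSketch` are hypothetical names, the method body is a stand-in that merely echoes its inputs, and `dataclasses` assumes Python 3.7 or later (manager.py imports `six`, so the real project may need a Python 2 compatible alternative).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowFormat:
    """Hypothetical bundle for the formatting options of import_rows."""
    delimiter: str = ','
    header: bool = False


class ManagerSketch:
    """Stand-in for DataHubManager; it only reports what it was asked to do."""

    def import_rows(self, repo, table, rows, row_format=RowFormat()):
        # Four arguments besides self instead of five. The frozen dataclass
        # is immutable, so sharing it as a default value is safe.
        return {
            'repo': repo,
            'table': table,
            'row_count': len(rows),
            'delimiter': row_format.delimiter,
            'header': row_format.header,
        }
```

A caller that needs non-default formatting would pass, for example, `RowFormat(delimiter='\t', header=True)`; the same object could in principle be shared with export_table and export_view, which take the same two options.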

                Function paginate_query has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring.
                Open

                    def paginate_query(self, query, current_page, rows_per_page):
                        """
                        Set variables for query pagination, limiting query statement
                        to just the section of the table that will be displayed
                        """
                Severity: Minor
                Found in src/core/db/manager.py - About 35 mins to fix

                Function remove_user has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring.
                Open

                    def remove_user(username, remove_db=True, ignore_missing_user=False):
                        # Delete the Django user
                        try:
                            DataHubManager._remove_django_user(username)
                        except User.DoesNotExist as e:
                Severity: Minor
                Found in src/core/db/manager.py - About 35 mins to fix

                Function _remove_django_user has a Cognitive Complexity of 6 (exceeds 5 allowed). Consider refactoring.
                Open

                    def _remove_django_user(username):
                        # Get the user associated with the username, delete their apps, and
                        # then delete the user
                        try:
                            user = User.objects.get(username=username)
                Severity: Minor
                Found in src/core/db/manager.py - About 25 mins to fix

                Identical blocks of code found in 2 locations. Consider refactoring.
                Open

                def rename_duplicates(columns):
                    columns = [c.lower() for c in columns]
                    new_columns = []
                    col_idx = {c: 1 for c in columns}
                
                
                Severity: Major
                Found in src/core/db/manager.py and 1 other location - About 7 hrs to fix
                src/browser/utils.py on lines 46..59

                Duplicated Code

                Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

                Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

                When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

                Tuning

                This issue has a mass of 116.

                We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

                The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

                If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

                See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                    def export_view(self, repo, view, file_format='CSV',
                                    delimiter=',', header=True):
                        """
                        Exports a view to a file in the same repo.
                
                
                Severity: Major
                Found in src/core/db/manager.py and 1 other location - About 7 hrs to fix
                src/core/db/manager.py on lines 697..729

                This issue has a mass of 114.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                    def export_table(self, repo, table, file_name, file_format='CSV',
                                     delimiter=',', header=True):
                        """
                        Exports a table to a file in the same repo.
                
                
                Severity: Major
                Found in src/core/db/manager.py and 1 other location - About 7 hrs to fix
                src/core/db/manager.py on lines 736..767

                This issue has a mass of 114.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                def clean_str(text, prefix):
                    string = text.strip().lower()
                
                    # replace whitespace with '_'
                    string = re.sub(' ', '_', string)
                Severity: Major
                Found in src/core/db/manager.py and 1 other location - About 5 hrs to fix
                src/browser/utils.py on lines 28..43

                This issue has a mass of 86.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                    def describe_table(self, repo, table, detail=False):
                        """
                        Lists a table's schema. If detail=True, provides all schema info.
                
                        Default return includes column names and types only.
                Severity: Major
                Found in src/core/db/manager.py and 1 other location - About 4 hrs to fix
                src/core/db/manager.py on lines 266..279

                This issue has a mass of 75.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                    def describe_view(self, repo, view, detail=False):
                        """
                        Lists a view's schema. If detail=True, provides all schema info.
                
                        Default return includes column names and types only.
                Severity: Major
                Found in src/core/db/manager.py and 1 other location - About 4 hrs to fix
                src/core/db/manager.py on lines 210..223

                This issue has a mass of 75.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                        try:
                            app = App.objects.get(app_id=collaborator)
                            collaborator_obj, _ = Collaborator.objects.get_or_create(
                                app=app, repo_name=repo, repo_base=self.repo_base)
                Severity: Major
                Found in src/core/db/manager.py and 1 other location - About 1 hr to fix
                src/core/db/manager.py on lines 416..419

                This issue has a mass of 42.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                        except App.DoesNotExist:
                            user = User.objects.get(username=collaborator)
                            collaborator_obj, _ = Collaborator.objects.get_or_create(
                                user=user, repo_name=repo, repo_base=self.repo_base)
                Severity: Major
                Found in src/core/db/manager.py and 1 other location - About 1 hr to fix
                src/core/db/manager.py on lines 412..415
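
The two flagged blocks differ only in which keyword (`app` or `user`) is passed to `get_or_create`. A possible refactoring, sketched here under stated assumptions, is to resolve that keyword first and then make a single call. The helper name `owner_kwargs`, the `AppNotFound` exception, and the injected lookup callables are all hypothetical stand-ins so the example runs without Django.

```python
class AppNotFound(Exception):
    """Stand-in for App.DoesNotExist in this sketch."""


def owner_kwargs(collaborator, get_app, get_user):
    """Return {'app': ...} if the name is an app id, else {'user': ...}."""
    try:
        return {'app': get_app(collaborator)}
    except AppNotFound:
        return {'user': get_user(collaborator)}


# In manager.py the lookups would presumably be
#   App.objects.get(app_id=collaborator)
#   User.objects.get(username=collaborator)
# and the duplicated call would collapse to one statement:
#   Collaborator.objects.get_or_create(
#       repo_name=repo, repo_base=self.repo_base, **owner)
```

Whether this is worth it for two four-line blocks is a judgment call; the report estimates about 1 hr to fix either way.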

                This issue has a mass of 42.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                        if len(invalid_db_privileges) > 0:
                            raise ValueError(
                                "Unsupported db privileges: \"{0}\"".format(
                                    ','.join(invalid_db_privileges)))
                Severity: Minor
                Found in src/core/db/manager.py and 1 other location - About 35 mins to fix
                src/core/db/manager.py on lines 407..410

                This issue has a mass of 33.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                    def delete_table(self, repo, table, force=False):
                        """
                        Deletes a table.
                
                        Set force=True to drop dependent objects (e.g. views).
                Severity: Minor
                Found in src/core/db/manager.py and 1 other location - About 35 mins to fix
                src/core/db/manager.py on lines 281..294

                This issue has a mass of 33.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                    def delete_view(self, repo, view, force=False):
                        """
                        Deletes a view.
                
                        Set force=True to drop dependent objects (e.g. other views).
                Severity: Minor
                Found in src/core/db/manager.py and 1 other location - About 35 mins to fix
                src/core/db/manager.py on lines 296..309
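
Since delete_table and delete_view are reported as similar, one option is to route both through a single private helper that takes the kind of relation as a parameter. The report does not show the method bodies, so the sketch below is an assumption throughout: `ManagerSketch`, `_delete_relation`, and the `FakeBackend.drop` call are hypothetical and only record what would be dropped.

```python
class FakeBackend:
    """Hypothetical backend that records drop requests instead of running SQL."""

    def __init__(self):
        self.calls = []

    def drop(self, kind, repo, name, cascade):
        self.calls.append((kind, repo, name, cascade))
        return True


class ManagerSketch:
    """Stand-in for DataHubManager, reduced to the two delete methods."""

    def __init__(self, backend):
        self.backend = backend

    def _delete_relation(self, kind, repo, name, force):
        # Shared logic lives here; permission checks and similar steps from
        # the real methods would move here as well.
        return self.backend.drop(kind, repo, name, cascade=force)

    def delete_table(self, repo, table, force=False):
        return self._delete_relation('table', repo, table, force)

    def delete_view(self, repo, view, force=False):
        return self._delete_relation('view', repo, view, force)
```

The public signatures stay the same, so callers are unaffected. The same shape might also suit the describe_table/describe_view and export_table/export_view pairs flagged above.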

                This issue has a mass of 33.

                Similar blocks of code found in 2 locations. Consider refactoring.
                Open

                        if len(invalid_file_privileges) > 0:
                            raise ValueError(
                                "Unsupported file privileges: \"{0}\"".format(
                                    ','.join(invalid_file_privileges)))
                Severity: Minor
                Found in src/core/db/manager.py and 1 other location - About 35 mins to fix
                src/core/db/manager.py on lines 402..405
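
The db and file privilege checks differ only in the word used in the error message, so they could share one small helper. This is a minimal sketch; the function name `raise_if_unsupported` is hypothetical, and the message format is copied from the two snippets shown in this report.

```python
def raise_if_unsupported(kind, invalid_privileges):
    """Raise ValueError naming any unsupported privileges of the given kind."""
    if invalid_privileges:
        raise ValueError(
            'Unsupported {0} privileges: "{1}"'.format(
                kind, ','.join(invalid_privileges)))
```

The two call sites would then become `raise_if_unsupported('db', invalid_db_privileges)` and `raise_if_unsupported('file', invalid_file_privileges)`.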

                This issue has a mass of 33.

                Invalid escape sequence '.'
                Open

                    return re.sub('^\.+', '', text)
                Severity: Minor
                Found in src/core/db/manager.py by pep8

                Invalid escape sequences are deprecated in Python 3.6.

                Okay: regex = r'\.png$'
                W605: regex = '\.png$'
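
Applied to the flagged line, the fix is to make the pattern a raw string. The wrapper function name below is hypothetical, since the report shows only the `return` statement; the matching behavior should be unchanged because Python already leaves an unrecognized escape such as `\.` in the string as-is, and only the warning goes away.

```python
import re


def strip_leading_dots(text):
    # The r prefix keeps the backslash literal, so the regex engine
    # receives \. (a literal dot) without relying on an invalid escape.
    return re.sub(r'^\.+', '', text)


print(strip_leading_dots('..hidden.txt'))  # prints hidden.txt
```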

                Do not use bare 'except'
                Open

                            except:
                Severity: Minor
                Found in src/core/db/manager.py by pep8

                When catching exceptions, mention specific exceptions when possible.

                Okay: except Exception:
                Okay: except BaseException:
                E722: except:
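
A bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, which is rarely intended. The report shows only the `except:` line, so which exception the real code should name is unknown; the hypothetical function below simply illustrates the pattern of catching the specific failures a block is expected to produce.

```python
def to_int(value, default=None):
    """Convert value to int, returning default when it is not convertible."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # Only conversion failures are handled here; KeyboardInterrupt,
        # SystemExit, and unrelated bugs still propagate to the caller.
        return default
```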
