crawley-project/crawley

Showing 50 of 50 total issues

Identical blocks of code found in 2 locations. Consider refactoring.
Open

def generate_template(tm_name, project_name, output_dir, new_extension=None):
    """
        Generates a project's file from a template
    """

Severity: Major
Found in crawley/manager/utils.py and 1 other location - About 1 day to fix
crawley/utils/projects.py on lines 32..50

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 149.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.
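To make the idea of a mass threshold concrete, here is a rough sketch. It assumes that mass roughly tracks the size of a block's parsed syntax tree, and it counts nodes with Python's `ast` module. The helper names `approximate_mass` and `blocks_to_analyze` are hypothetical, and the counts will not match the numbers the engine reports; the point is only how a threshold filters out small blocks.

```python
import ast
from typing import Dict, List


def approximate_mass(source: str) -> int:
    """Count AST nodes in a piece of Python source (a rough stand-in for mass)."""
    return sum(1 for _ in ast.walk(ast.parse(source)))


def blocks_to_analyze(blocks: Dict[str, str], threshold: int) -> List[str]:
    """Return the names of blocks whose approximate mass meets the threshold."""
    return [name for name, src in blocks.items()
            if approximate_mass(src) >= threshold]


SMALL = "x = 1\n"
LARGE = (
    "def total(values):\n"
    "    result = 0\n"
    "    for value in values:\n"
    "        if value > 0:\n"
    "            result += value\n"
    "    return result\n"
)

blocks = {"small": SMALL, "large": LARGE}

# A low threshold lets every block through; a higher one skips the small block.
low = blocks_to_analyze(blocks, threshold=1)
high = blocks_to_analyze(blocks, threshold=approximate_mass(LARGE))
```

With a low threshold both blocks are compared; raising it to the size of the larger block leaves only that one, which mirrors the advice above about raising the threshold when too much duplication is reported.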

Identical blocks of code found in 2 locations. Consider refactoring.
Open

def generate_template(tm_name, project_name, output_dir, new_extension=None):
    """
        Generates a project's file from a template
    """

Severity: Major
Found in crawley/utils/projects.py and 1 other location - About 1 day to fix
crawley/manager/utils.py on lines 51..69

This issue has a mass of 149.

File browser.py has 383 lines of code (exceeds 250 allowed). Consider refactoring.
Open

import multiprocessing

from lxml import etree
from PyQt4 import QtCore, QtWebKit, QtGui
from baseBrowser import BaseBrowser, BaseBrowserTab, FrmBaseConfig, FrmBaseSettings
Severity: Minor
Found in crawley/web_browser/browser.py - About 5 hrs to fix

Similar blocks of code found in 4 locations. Consider refactoring.
Open

Severity: Major
Found in examples/pypi_crawler_dsl/settings.py and 3 other locations - About 4 hrs to fix
examples/pypi_crawler_smart/settings.py on lines 0..15
examples/pypi_packages/relational_storage/settings.py on lines 0..15
examples/pypi_packages_template/settings.py on lines 0..15

This issue has a mass of 79.

Similar blocks of code found in 4 locations. Consider refactoring.
Open

Severity: Major
Found in examples/pypi_crawler_smart/settings.py and 3 other locations - About 4 hrs to fix
examples/pypi_crawler_dsl/settings.py on lines 0..15
examples/pypi_packages/relational_storage/settings.py on lines 0..15
examples/pypi_packages_template/settings.py on lines 0..15

This issue has a mass of 79.

Similar blocks of code found in 4 locations. Consider refactoring.
Open

Severity: Major
Found in examples/pypi_packages_template/settings.py and 3 other locations - About 4 hrs to fix
examples/pypi_crawler_dsl/settings.py on lines 0..15
examples/pypi_crawler_smart/settings.py on lines 0..15
examples/pypi_packages/relational_storage/settings.py on lines 0..15

This issue has a mass of 79.

Similar blocks of code found in 4 locations. Consider refactoring.
Open

Severity: Major
Found in examples/pypi_packages/relational_storage/settings.py and 3 other locations - About 4 hrs to fix
examples/pypi_crawler_dsl/settings.py on lines 0..15
examples/pypi_crawler_smart/settings.py on lines 0..15
examples/pypi_packages_template/settings.py on lines 0..15

This issue has a mass of 79.

Identical blocks of code found in 2 locations. Consider refactoring.
Open

class CustomDict(dict):

    def __init__(self, error="[%s] Not valid argument", *args, **kwargs):

        self.error = error
Severity: Major
Found in crawley/utils/collections/custom_dict.py and 1 other location - About 4 hrs to fix
crawley/manager/utils.py on lines 7..19

This issue has a mass of 76.

Identical blocks of code found in 2 locations. Consider refactoring.
Open

class CustomDict(dict):

    def __init__(self, error="[%s] Not valid argument", *args, **kwargs):

        self.error = error
Severity: Major
Found in crawley/manager/utils.py and 1 other location - About 4 hrs to fix
crawley/utils/collections/custom_dict.py on lines 3..15

This issue has a mass of 76.

Function setupUi has 88 lines of code (exceeds 25 allowed). Consider refactoring.
Open

    def setupUi(self, Settings):
        Settings.setObjectName("Settings")
        Settings.resize(340, 476)
        self.groupBox = QtGui.QGroupBox(Settings)
        self.groupBox.setGeometry(QtCore.QRect(10, 10, 541, 241))
Severity: Major
Found in crawley/web_browser/GUI/settings.py - About 3 hrs to fix

Similar blocks of code found in 2 locations. Consider refactoring.
Open

class FrmBaseSettings(FrmSettingsGUI):

    def __init__(self, parent):

        FrmSettingsGUI.__init__(self, parent)
Severity: Major
Found in crawley/web_browser/baseBrowser.py and 1 other location - About 3 hrs to fix
crawley/web_browser/baseBrowser.py on lines 91..97

This issue has a mass of 67.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

class FrmBaseConfig(FrmConfigGUI):

    def __init__(self, parent):

        FrmConfigGUI.__init__(self, parent)
Severity: Major
Found in crawley/web_browser/baseBrowser.py and 1 other location - About 3 hrs to fix
crawley/web_browser/baseBrowser.py on lines 100..106

This issue has a mass of 67.

Function setupUi has 76 lines of code (exceeds 25 allowed). Consider refactoring.
Open

    def setupUi(self, MainWindow):
        MainWindow.setObjectName("MainWindow")
        MainWindow.resize(1132, 671)
        self.centralwidget = QtGui.QWidget(MainWindow)
        font = QtGui.QFont()
Severity: Major
Found in crawley/web_browser/GUI/base.py - About 3 hrs to fix

Identical blocks of code found in 2 locations. Consider refactoring.
Open

    def scrape(self, response):

        project = response.html.xpath("/html/body/div[5]/div/div/div[3]/h1")[0].text
        author = response.html.xpath("/html/body/div[5]/div/div/div[3]/ul/li/span")[0].text

Severity: Major
Found in examples/pypi_crawler_smart/pypi_crawler/crawlers.py and 1 other location - About 2 hrs to fix
examples/pypi_crawler/pypi_crawler/crawlers.py on lines 11..16

This issue has a mass of 61.

Identical blocks of code found in 2 locations. Consider refactoring.
Open

    def scrape(self, response):

        project = response.html.xpath("/html/body/div[5]/div/div/div[3]/h1")[0].text
        author = response.html.xpath("/html/body/div[5]/div/div/div[3]/ul/li/span")[0].text

Severity: Major
Found in examples/pypi_crawler/pypi_crawler/crawlers.py and 1 other location - About 2 hrs to fix
examples/pypi_crawler_smart/pypi_crawler/crawlers.py on lines 14..19

This issue has a mass of 61.

OrderedDict has 25 functions (exceeds 20 allowed). Consider refactoring.
Open

class OrderedDict(dict):
    'Dictionary that remembers insertion order'
    # An inherited dict maps keys to values.
    # The inherited dict provides __getitem__, __len__, __contains__, and get.
    # The remaining methods are order-aware.
Severity: Minor
Found in crawley/utils/collections/ordered_dict.py - About 2 hrs to fix

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    def __iter__(self):
        'od.__iter__() <==> iter(od)'
        root = self.__root
        curr = root[1]
        while curr is not root:
Severity: Major
Found in crawley/utils/collections/ordered_dict.py and 1 other location - About 2 hrs to fix
crawley/utils/collections/ordered_dict.py on lines 70..76

This issue has a mass of 59.

Similar blocks of code found in 2 locations. Consider refactoring.
Open

    def __reversed__(self):
        'od.__reversed__() <==> reversed(od)'
        root = self.__root
        curr = root[0]
        while curr is not root:
Severity: Major
Found in crawley/utils/collections/ordered_dict.py and 1 other location - About 2 hrs to fix
crawley/utils/collections/ordered_dict.py on lines 62..68

This issue has a mass of 59.
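
The `__iter__` and `__reversed__` snippets above differ only in which link index they follow (`root[1]` versus `root[0]`). One possible way to remove that duplication is a single traversal helper parameterized by direction. The sketch below uses a hypothetical stand-in class, `LinkedKeys`, rather than the project's `OrderedDict`; the loop body (yielding the key and advancing along the same index) is assumed from the common `[PREV, NEXT, KEY]` linked-list recipe, since the report shows only the first lines of each method.

```python
class LinkedKeys:
    """Hypothetical stand-in for the order-tracking part of an ordered dict.

    Each link is a list [PREV, NEXT, KEY]; a sentinel root closes the circle.
    """

    PREV, NEXT, KEY = 0, 1, 2

    def __init__(self, keys=()):
        self._root = root = [None, None, None]
        root[self.PREV] = root[self.NEXT] = root
        for key in keys:
            # Append a new link just before the sentinel root.
            last = root[self.PREV]
            last[self.NEXT] = root[self.PREV] = [last, root, key]

    def _walk(self, direction):
        """Yield keys by following links in one direction until back at root."""
        root = self._root
        curr = root[direction]
        while curr is not root:
            yield curr[self.KEY]
            curr = curr[direction]

    def __iter__(self):
        return self._walk(self.NEXT)

    def __reversed__(self):
        return self._walk(self.PREV)


forward = list(LinkedKeys("abc"))
backward = list(reversed(LinkedKeys("abc")))
```

Both public methods become one-liners, so a change to the traversal logic only has to be made in `_walk`.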

BrowserTab has 24 functions (exceeds 20 allowed). Consider refactoring.
Open

class BrowserTab(BaseBrowserTab):
    """
        A Browser Tab representation

        This class overrides all the methods of the
Severity: Minor
Found in crawley/web_browser/browser.py - About 2 hrs to fix

Function _gen_scrape_method has a Cognitive Complexity of 17 (exceeds 5 allowed). Consider refactoring.
Open

    def _gen_scrape_method(self, sentences):
        """
            Generates scrapers methods.
            Returns a dictionary containing methods and attributes for the
            scraper class.
Severity: Minor
Found in crawley/simple_parser/compilers.py - About 2 hrs to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

• Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
• Code is considered more complex for each "break in the linear flow of the code"
• Code is considered more complex when "flow breaking structures are nested"
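
As an illustration of the nesting rule, the sketch below contrasts two hypothetical functions (not taken from the project) that produce the same result. The first nests three conditionals inside a loop; the second uses a guard clause so the remaining logic stays at a single nesting level, which would be expected to score lower under the rules above. No specific scores are claimed here.

```python
def collect_nested(rows):
    """Nested version: three conditionals stacked inside a loop."""
    result = []
    for row in rows:
        if row is not None:
            if "name" in row:
                if row["name"]:
                    result.append(row["name"].strip())
    return result


def collect_flat(rows):
    """Flatter version: one guard clause replaces the nested conditionals."""
    result = []
    for row in rows:
        # Skip missing rows and rows without a usable name in one check.
        if row is None or not row.get("name"):
            continue
        result.append(row["name"].strip())
    return result


rows = [None, {}, {"name": ""}, {"name": " a "}, {"name": "b"}]
nested_names = collect_nested(rows)
flat_names = collect_flat(rows)
```

The two functions agree on this input; the second reads top to bottom without the reader having to track three open conditions at once.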
