SpamExperts/OrangeAssassin

Showing 802 of 802 total issues

File bayes.py has 960 lines of code (exceeds 250 allowed). Consider refactoring.
Open

"""Bayes - determine spam likelihood using a Bayesian classifier.

This is a Bayesian-style probabilistic classifier, using an algorithm
based on the one detailed in Paul Graham's "A Plan For Spam" paper at:

Severity: Major
Found in oa/plugins/bayes.py - About 2 days to fix

    File header_eval.py has 728 lines of code (exceeds 250 allowed). Consider refactoring.
    Open

    """Expose some eval rules that do checks on the headers."""
    
    from __future__ import absolute_import
    from __future__ import division
    
    
    Severity: Major
    Found in oa/plugins/header_eval.py - About 1 day to fix

      Function _handle_line has a Cognitive Complexity of 57 (exceeds 5 allowed). Consider refactoring.
      Open

          def _handle_line(self, filename, line, line_no, _depth=0):
              """Handles a single line."""
              try:
                  line = line.decode("iso-8859-1").strip()
              except UnicodeDecodeError as e:
      Severity: Minor
      Found in oa/rules/parser.py - About 1 day to fix

      Cognitive Complexity

      Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

      A method's cognitive complexity is based on a few simple rules:

      • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
      • Code is considered more complex for each "break in the linear flow of the code"
      • Code is considered more complex when "flow breaking structures are nested"
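The nesting rule above can be illustrated with a small sketch. Both functions below are hypothetical (they are not taken from the OrangeAssassin code base) and return the same result; the second replaces three nested conditions with a guard clause, so it would likely score lower under the rules listed above. Exact scores depend on the analyzer and are not claimed here.

```python
def first_valid_nested(lines):
    """Return the first 'key=value' line split in two, using nested ifs."""
    for line in lines:
        if line:
            if not line.startswith("#"):
                if "=" in line:
                    return line.split("=", 1)
    return None


def first_valid_flat(lines):
    """Same behaviour, but blank and comment lines are skipped up front."""
    for line in lines:
        # Guard clause: reject uninteresting lines early, so the main
        # condition stays at a single level of nesting.
        if not line or line.startswith("#"):
            continue
        if "=" in line:
            return line.split("=", 1)
    return None
```

A function such as _handle_line, which handles many line types in one body, is the kind of code where this flattening tends to help, although whether it applies there depends on code not shown in this listing.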

      File received_parser.py has 535 lines of code (exceeds 250 allowed). Consider refactoring.
      Open

      """
      Parser for Received headers
      It extracts the following metadata:
      :rdns
      :ip
      Severity: Major
      Found in oa/received_parser.py - About 1 day to fix

        Function _tokenise_line has a Cognitive Complexity of 51 (exceeds 5 allowed). Consider refactoring.
        Open

            def _tokenise_line(self, line, tokprefix, region):
                # Include quotes, .'s and -'s for URIs, and [$,]'s for Nigerian-scam
                # strings, and ISO-8859-15 alphas. Do not split on @'s; better
                # results keeping it.
                # Some useful tokens: "$31,000,000" "www.clock-speed.net" "f*ck" "Hits!"
        Severity: Minor
        Found in oa/plugins/bayes.py - About 7 hrs to fix

        Identical blocks of code found in 3 locations. Consider refactoring.
        Open

            def parse_list(self, list_name):
                parsed_list = []
                characters = ["?", "@", ".", "*@"]
                for addr in self[list_name]:
                    if len([e for e in characters if e in addr]):
        Severity: Major
        Found in oa/plugins/wlbl_eval.py and 2 other locations - About 7 hrs to fix
        oa/plugins/bayes.py on lines 560..571
        oa/plugins/spf.py on lines 148..159

        Duplicated Code

        Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

        Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

        When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

        Tuning

        This issue has a mass of 121.

        We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

        The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

        If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

        See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.
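As a rough analogy for how a mass threshold behaves, the sketch below counts the nodes in Python's own syntax tree for a snippet and compares that count with a threshold. This is an assumption made for illustration: the duplication engine computes mass on its own parsed representation, so the numbers produced here will not match the mass values reported above. The names approx_mass and would_be_analyzed are hypothetical.

```python
import ast


def approx_mass(source):
    """Count every node in the Python AST of `source` (illustrative only)."""
    return sum(1 for _ in ast.walk(ast.parse(source)))


def would_be_analyzed(source, threshold):
    """A block is considered only if its mass reaches the threshold."""
    return approx_mass(source) >= threshold


SHORT = "x = 1"
LONGER = '''
def parse(items):
    out = []
    for item in items:
        if "@" in item:
            out.append(item.lower())
    return out
'''
```

Raising the threshold excludes small blocks such as SHORT from comparison, while lowering it lets more of them through, which matches the tuning advice above.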


        Identical blocks of code found in 3 locations. Consider refactoring.
        Open

            def parse_list(self, list_name):
                parsed_list = []
                characters = ["?", "@", ".", "*@"]
                for addr in self[list_name]:
                    if len([e for e in characters if e in addr]):
        Severity: Major
        Found in oa/plugins/spf.py and 2 other locations - About 7 hrs to fix
        oa/plugins/bayes.py on lines 560..571
        oa/plugins/wlbl_eval.py on lines 117..127

        This issue has a mass of 121.

        Identical blocks of code found in 3 locations. Consider refactoring.
        Open

            def parse_list(self, list_name):
                parsed_list = []
                characters = ["?", "@", ".", "*@"]
                for addr in self[list_name]:
                    if len([e for e in characters if e in addr]):
        Severity: Major
        Found in oa/plugins/bayes.py and 2 other locations - About 7 hrs to fix
        oa/plugins/spf.py on lines 148..159
        oa/plugins/wlbl_eval.py on lines 117..127

        This issue has a mass of 121.

        File message.py has 492 lines of code (exceeds 250 allowed). Consider refactoring.
        Open

        """Internal representation of email messages."""
        
        from builtins import str
        from builtins import set
        from builtins import list
        Severity: Minor
        Found in oa/message.py - About 7 hrs to fix

          Cyclomatic complexity is too high in method _handle_line. (31)
          Open

              def _handle_line(self, filename, line, line_no, _depth=0):
                  """Handles a single line."""
                  try:
                      line = line.decode("iso-8859-1").strip()
                  except UnicodeDecodeError as e:
          Severity: Minor
          Found in oa/rules/parser.py by radon

          Cyclomatic Complexity

          Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

          Radon analyzes the AST of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

          Construct         Effect on CC  Reasoning
          if                +1            An if statement is a single decision.
          elif              +1            The elif statement adds another decision.
          else              +0            The else statement does not cause a new decision. The decision is at the if.
          for               +1            There is a decision at the start of the loop.
          while             +1            There is a decision at the while statement.
          except            +1            Each except branch adds a new conditional path of execution.
          finally           +0            The finally block is unconditionally executed.
          with              +1            The with statement roughly corresponds to a try/except block (see PEP 343 for details).
          assert            +1            The assert statement internally roughly equals a conditional statement.
          Comprehension     +1            A list/set/dict comprehension or generator expression is equivalent to a for loop.
          Boolean Operator  +1            Every boolean operator (and, or) adds a decision point.

          Source: http://radon.readthedocs.org/en/latest/intro.html
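The table above can be turned into a small counter. The sketch below is a simplified approximation that follows the table as quoted, not Radon's actual implementation, so its results may differ from Radon's on real code. The name approx_cc and the SAMPLE function are hypothetical.

```python
import ast

# Node types that add one decision each, following the table above.
_PLUS_ONE = (
    ast.If,             # covers elif too: an elif is an If nested in orelse
    ast.For,
    ast.While,
    ast.ExceptHandler,
    ast.With,
    ast.Assert,
    ast.comprehension,  # one per "for" clause in a comprehension
)


def approx_cc(source):
    """Return 1 plus the number of decision points found in `source`."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _PLUS_ONE):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # "a or b or c" holds three values but two operators.
            score += len(node.values) - 1
    return score


SAMPLE = '''
def classify(n):
    if n < 0 or n > 100:
        return "out of range"
    elif n == 0:
        return "zero"
    for d in (2, 3):
        if n % d == 0:
            return "divisible"
    return "other"
'''

print(approx_cc(SAMPLE))  # 6: base 1 + if + or + elif + for + inner if
```

Note that this counts a whole source string rather than one function at a time; a per-function report like the one above would need to walk each function definition separately.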

          Cyclomatic complexity is too high in method _tokenise_line. (30)
          Open

              def _tokenise_line(self, line, tokprefix, region):
                  # Include quotes, .'s and -'s for URIs, and [$,]'s for Nigerian-scam
                  # strings, and ISO-8859-15 alphas. Do not split on @'s; better
                  # results keeping it.
                  # Some useful tokens: "$31,000,000" "www.clock-speed.net" "f*ck" "Hits!"
          Severity: Minor
          Found in oa/plugins/bayes.py by radon

          Function _parse_relays has a Cognitive Complexity of 38 (exceeds 5 allowed). Consider refactoring.
          Open

              def _parse_relays(self, relays):
                  """Walks through a relays list to extract
                  [un]trusted/internal/external relays"""
                  is_trusted = True
                  is_internal = True
          Severity: Minor
          Found in oa/message.py - About 5 hrs to fix

          File wlbl_eval.py has 404 lines of code (exceeds 250 allowed). Consider refactoring.
          Open

          """ WLBLEval plugin."""
          from __future__ import absolute_import
          from builtins import str
          import re
          from collections import defaultdict
          Severity: Minor
          Found in oa/plugins/wlbl_eval.py - About 5 hrs to fix

            Similar blocks of code found in 2 locations. Consider refactoring.
            Open

                def check_for_forged_received_trail(self, msg, option=None, target=None):
                    """Check if there are more than one untrusted relays and verify if
                    rdns is different than the other relay's by."""
                    try:
                        mismatch_from = self.get_global("mismatch_from")
            Severity: Major
            Found in oa/plugins/relay_eval.py and 1 other location - About 5 hrs to fix
            oa/plugins/relay_eval.py on lines 139..150

            This issue has a mass of 90.

            Similar blocks of code found in 2 locations. Consider refactoring.
            Open

                def check_for_forged_received_ip_helo(self, msg, option=None, target=None):
                    """Verify if helo and ip are IP ADDRESSES and if they are different,
                    this means that received ip is forged"""
                    try:
                        mismatch_ip_helo = self.get_global("mismatch_ip_helo")
            Severity: Major
            Found in oa/plugins/relay_eval.py and 1 other location - About 5 hrs to fix
            oa/plugins/relay_eval.py on lines 126..137

            This issue has a mass of 90.

            Identical blocks of code found in 2 locations. Consider refactoring.
            Open

            class MessageList(argparse.FileType):
                def __call__(self, string):
                    if os.path.isdir(string):
                        for x in os.listdir(string):
                            path = os.path.join(string, x)
            Severity: Major
            Found in scripts/compile.py and 1 other location - About 5 hrs to fix
            scripts/match.py on lines 26..34

            This issue has a mass of 89.

            Identical blocks of code found in 2 locations. Consider refactoring.
            Open

            class MessageList(argparse.FileType):
                def __call__(self, string):
                    if os.path.isdir(string):
                        for x in os.listdir(string):
                            path = os.path.join(string, x)
            Severity: Major
            Found in scripts/match.py and 1 other location - About 5 hrs to fix
            scripts/compile.py on lines 21..29

            This issue has a mass of 89.

            BayesPlugin has 38 functions (exceeds 20 allowed). Consider refactoring.
            Open

            class BayesPlugin(oa.plugins.base.BasePlugin):
                """Implement a somewhat Bayesian plug-in."""
            
                learn_caller_will_untie = False
                learn_no_relearn = False
            Severity: Minor
            Found in oa/plugins/bayes.py - About 5 hrs to fix

              Similar blocks of code found in 2 locations. Consider refactoring.
              Open

                      if "remove" in options:
                          targets = options.get("remove").split(",")
                          local = "local" in targets
                          remote = "remote" in targets
                          self.ruleset.ctxt.hook_revoke(msg, spam, local, remote)
              Severity: Major
              Found in oa/protocol/tell.py and 1 other location - About 5 hrs to fix
              oa/protocol/tell.py on lines 18..23

              This issue has a mass of 86.

              Similar blocks of code found in 2 locations. Consider refactoring.
              Open

                      if "set" in options:
                          targets = options.get("set").split(",")
                          local = "local" in targets
                          remote = "remote" in targets
                          self.ruleset.ctxt.hook_report(msg, spam, local, remote)
              Severity: Major
              Found in oa/protocol/tell.py and 1 other location - About 5 hrs to fix
              oa/protocol/tell.py on lines 24..29

              This issue has a mass of 86.
