SpamExperts/OrangeAssassin

oa/plugins/pdf_info.py

Summary

Maintainability: F (4 days)

PDFInfoPlugin has 27 functions (exceeds 20 allowed). Consider refactoring.
Open

class PDFInfoPlugin(oa.plugins.base.BasePlugin):
    """PDFInfoPlugin"""

    eval_rules = (
        "pdf_count",
Severity: Minor
Found in oa/plugins/pdf_info.py - About 3 hrs to fix

    File pdf_info.py has 286 lines of code (exceeds 250 allowed). Consider refactoring.
    Open

    """ PDFInfo Plugin. """
    
    from __future__ import absolute_import
    
    import collections
    Severity: Minor
    Found in oa/plugins/pdf_info.py - About 2 hrs to fix

      Cyclomatic complexity is too high in method _save_stats. (9)
      Open

          def _save_stats(self, msg, payload):
              """Extracts and saves the PDF stats once per unique file"""
              # Use the md5 as ID to avoid duplicated PDFs
              pdf_id = md5(payload).hexdigest()
              self._update_pdf_hashes(msg, pdf_id)
      Severity: Minor
      Found in oa/plugins/pdf_info.py by radon

      Cyclomatic Complexity

      Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.

      Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:

      Construct         Effect on CC  Reasoning
      if                +1            An if statement is a single decision.
      elif              +1            The elif statement adds another decision.
      else              +0            The else statement does not cause a new decision. The decision is at the if.
      for               +1            There is a decision at the start of the loop.
      while             +1            There is a decision at the while statement.
      except            +1            Each except branch adds a new conditional path of execution.
      finally           +0            The finally block is unconditionally executed.
      with              +1            The with statement roughly corresponds to a try/except block (see PEP 343 for details).
      assert            +1            The assert statement internally roughly equals a conditional statement.
      Comprehension     +1            A list/set/dict comprehension or generator expression is equivalent to a for loop.
      Boolean Operator  +1            Every boolean operator (and, or) adds a decision point.

      Source: http://radon.readthedocs.org/en/latest/intro.html
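The table above can be turned into a rough counter. What follows is a minimal sketch built on the standard-library ast module; it is not Radon's implementation, and the names cyclomatic_complexity and SAMPLE are made up for illustration. It starts at 1 and adds one for each construct the table lists, plus one per boolean operator.

```python
import ast

# AST node types that each add one decision, following the table above.
# An elif shows up as a nested ast.If, so it is counted automatically.
# ast.comprehension is one "for" clause inside a comprehension.
_DECISION_NODES = (
    ast.If, ast.For, ast.While, ast.ExceptHandler,
    ast.With, ast.Assert, ast.comprehension,
)


def cyclomatic_complexity(source):
    """Return an approximate McCabe number for a piece of Python source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" is one BoolOp with three values: two operators.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical sample: if, and, elif and for each add one decision.
SAMPLE = '''
def f(x, y):
    if x and y:
        return 1
    elif x:
        return 2
    else:
        for i in range(3):
            pass
    return 0
'''

print(cyclomatic_complexity(SAMPLE))  # prints 5
```

The sketch sums over the whole source rather than reporting one number per function, so it is only useful for reasoning about a single block at a time.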

      Function _save_stats has a Cognitive Complexity of 14 (exceeds 5 allowed). Consider refactoring.
      Open

          def _save_stats(self, msg, payload):
              """Extracts and saves the PDF stats once per unique file"""
              # Use the md5 as ID to avoid duplicated PDFs
              pdf_id = md5(payload).hexdigest()
              self._update_pdf_hashes(msg, pdf_id)
      Severity: Minor
      Found in oa/plugins/pdf_info.py - About 1 hr to fix

      Cognitive Complexity

      Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

      A method's cognitive complexity is based on a few simple rules:

      • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
      • Code is considered more complex for each "break in the linear flow of the code"
      • Code is considered more complex when "flow breaking structures are nested"
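The third rule is easiest to see side by side. The two functions below are a hypothetical illustration (they are not taken from pdf_info.py, and the dict keys are invented): both make the same three decisions and return the same results, but the first nests each condition inside the previous one, while the second uses guard clauses so that no condition is nested.

```python
def describe_nested(pdf):
    """Nested form: each condition sits one level deeper than the last."""
    if pdf is not None:
        if pdf.get("encrypted"):
            if pdf.get("pages", 0) > 10:
                return "large encrypted PDF"
            return "encrypted PDF"
        return "plain PDF"
    return "no PDF"


def describe_flat(pdf):
    """Flat form: guard clauses return early, so nothing is nested."""
    if pdf is None:
        return "no PDF"
    if not pdf.get("encrypted"):
        return "plain PDF"
    if pdf.get("pages", 0) > 10:
        return "large encrypted PDF"
    return "encrypted PDF"
```

Because nesting is penalised on top of each break in flow, a refactoring of this shape would be expected to lower the reported score, although the exact numbers depend on the analyzer.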

      Function pdf_match_details has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring.
      Open

          def pdf_match_details(self, msg, detail, regex, target=None):
              """Match if the detail matches any of the PDF files in the
              message.
      
              :param detail: author, creator, created, modified, producer, title
      Severity: Minor
      Found in oa/plugins/pdf_info.py - About 35 mins to fix

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

          def pdf_name_regex(self, msg, regex, target=None):
              """The same as pdf_named, but you can use regular expressions to
              do partial matches.
      
              :param regex: regular expression, see examples in ruleset.
      Severity: Major
      Found in oa/plugins/pdf_info.py and 1 other location - About 2 hrs to fix
      oa/plugins/image_info.py on lines 178..185

      Duplicated Code

      Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

      Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

      When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

      Tuning

      This issue has a mass of 60.

      We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

      The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

      If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

      See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

      Identical blocks of code found in 2 locations. Consider refactoring.
      Open

          def _add_name(self, msg, name):
              """Add a name to the names list."""
              try:
                  names = self.get_local(msg, "names")
              except KeyError:
      Severity: Major
      Found in oa/plugins/pdf_info.py and 1 other location - About 2 hrs to fix
      oa/plugins/image_info.py on lines 100..107

      This issue has a mass of 57.

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

          def _update_pdf_hashes(self, msg, newhash):
              try:
                  hashes = self.get_local(msg, "md5hashes")
              except KeyError:
                  hashes = set()
      Severity: Major
      Found in oa/plugins/pdf_info.py and 1 other location - About 2 hrs to fix
      oa/plugins/pdf_info.py on lines 257..263

      This issue has a mass of 54.
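One way to remove this kind of duplication is to move the shared try/except KeyError pattern into a single helper. The sketch below is an assumption-heavy illustration, not the project's code: LocalStore is a stand-in for the plugin base class, set_local and the helper name _add_to_local_set are hypothetical, and the part of each method after the excerpt (adding the value and storing the set back) is guessed.

```python
class LocalStore:
    """Stand-in for the plugin base class: per-message storage in a dict."""

    def __init__(self):
        self._data = {}

    def get_local(self, msg, key):
        # Raises KeyError when nothing has been stored yet, as in the excerpt.
        return self._data[(id(msg), key)]

    def set_local(self, msg, key, value):
        # Assumed counterpart to get_local.
        self._data[(id(msg), key)] = value


class PDFInfoSketch(LocalStore):
    def _add_to_local_set(self, msg, key, value):
        """Add value to the set stored under key, creating it on first use."""
        try:
            values = self.get_local(msg, key)
        except KeyError:
            values = set()
        values.add(value)
        self.set_local(msg, key, values)

    # The two flagged methods shrink to one-line wrappers.
    def _update_pdf_hashes(self, msg, newhash):
        self._add_to_local_set(msg, "md5hashes", newhash)

    def _update_is_encrypted(self, msg, enc):
        self._add_to_local_set(msg, "pdf_encrypted", enc)
```

With the logic in one place, a later change to how the set is created or stored only has to be made once.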

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

          def _update_is_encrypted(self, msg, enc):
              try:
                  encrypted = self.get_local(msg, "pdf_encrypted")
              except KeyError:
                  encrypted = set()
      Severity: Major
      Found in oa/plugins/pdf_info.py and 1 other location - About 2 hrs to fix
      oa/plugins/pdf_info.py on lines 176..182

      This issue has a mass of 54.

      Similar blocks of code found in 4 locations. Consider refactoring.
      Open

          def _update_pdf_size(self, msg, incr):
              """Update the cumulative size of PDFs"""
              try:
                  pdfbytes = self.get_local(msg, "pdf_bytes")
              except KeyError:
      Severity: Major
      Found in oa/plugins/pdf_info.py and 3 other locations - About 2 hrs to fix
      oa/plugins/pdf_info.py on lines 45..52
      oa/plugins/pdf_info.py on lines 74..81
      oa/plugins/pdf_info.py on lines 104..113

      This issue has a mass of 53.
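The four cumulative-update methods reported here could plausibly share one increment helper. This is a sketch under stated assumptions: the class is a hypothetical stand-in rather than the real plugin, set_local and _increment_local are invented names, and the starting value of 0 is a guess because the excerpts are cut off at the except branch. _update_pixel_coverage is left out because its storage key is not visible in the excerpt; it would follow the same pattern.

```python
class PDFCountersSketch:
    """Hypothetical stand-in; the real plugin inherits get_local."""

    def __init__(self):
        self._local = {}

    def get_local(self, msg, key):
        # Raises KeyError for a key that has not been stored yet.
        return self._local[(id(msg), key)]

    def set_local(self, msg, key, value):
        # Assumed counterpart to get_local.
        self._local[(id(msg), key)] = value

    def _increment_local(self, msg, key, incr):
        """Add incr to the running total under key (assumed to start at 0)."""
        try:
            total = self.get_local(msg, key)
        except KeyError:
            total = 0
        self.set_local(msg, key, total + incr)

    # Each flagged method becomes a thin wrapper around the helper.
    def _update_counts(self, msg, incr):
        self._increment_local(msg, "counts", incr)

    def _update_image_counts(self, msg, incr):
        self._increment_local(msg, "image_counts", incr)

    def _update_pdf_size(self, msg, incr):
        self._increment_local(msg, "pdf_bytes", incr)
```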

      Similar blocks of code found in 4 locations. Consider refactoring.
      Open

          def _update_pixel_coverage(self, msg, incr):
              """Update the cumulative pixel coverage
              "incr" is the area of the image in pixels
              """
              try:
      Severity: Major
      Found in oa/plugins/pdf_info.py and 3 other locations - About 2 hrs to fix
      oa/plugins/pdf_info.py on lines 45..52
      oa/plugins/pdf_info.py on lines 74..81
      oa/plugins/pdf_info.py on lines 275..282

      This issue has a mass of 53.

      Similar blocks of code found in 4 locations. Consider refactoring.
      Open

          def _update_counts(self, msg, incr):
              """Update the cumulative pdf counts"""
              try:
                  counts = self.get_local(msg, "counts")
              except KeyError:
      Severity: Major
      Found in oa/plugins/pdf_info.py and 3 other locations - About 2 hrs to fix
      oa/plugins/pdf_info.py on lines 74..81
      oa/plugins/pdf_info.py on lines 104..113
      oa/plugins/pdf_info.py on lines 275..282

      This issue has a mass of 53.

      Similar blocks of code found in 4 locations. Consider refactoring.
      Open

          def _update_image_counts(self, msg, incr):
              """Update the cumulative image counts"""
              try:
                  counts = self.get_local(msg, "image_counts")
              except KeyError:
      Severity: Major
      Found in oa/plugins/pdf_info.py and 3 other locations - About 2 hrs to fix
      oa/plugins/pdf_info.py on lines 45..52
      oa/plugins/pdf_info.py on lines 104..113
      oa/plugins/pdf_info.py on lines 275..282

      This issue has a mass of 53.

      Similar blocks of code found in 3 locations. Consider refactoring.
      Open

          def pdf_count(self, msg, minimum, maximum=None, target=None):
              """Check the number of pdf files in the message
      
              :param minimum: required, message contains at least x pdf mime parts
              :param maximum: optional, if specified, must not contain more than x
      Severity: Major
      Found in oa/plugins/pdf_info.py and 2 other locations - About 1 hr to fix
      oa/plugins/pdf_info.py on lines 83..95
      oa/plugins/pdf_info.py on lines 115..126

      This issue has a mass of 47.
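The three eval rules flagged as similar (pdf_count, pdf_image_count and pdf_pixel_coverage) all take a required minimum and an optional maximum, so the comparison itself could live in one function. The helper below is a hypothetical sketch based only on the docstrings shown in the excerpts; the name in_range is invented, and whether the real bounds are inclusive or need type conversion is an assumption.

```python
def in_range(value, minimum, maximum=None):
    """Return True if value is at least minimum and, when maximum is
    given, no more than maximum (both bounds assumed inclusive)."""
    if value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


# Each rule would then only look up its own stored value and delegate:
print(in_range(2, 1))      # True: at least one, no upper bound
print(in_range(3, 1, 2))   # False: more than the maximum
```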

      Similar blocks of code found in 3 locations. Consider refactoring.
      Open

          def pdf_pixel_coverage(self, msg, minimum, maximum=None, target=None):
              """Check the pixel coverage in the PDF files.
      
              :param minimum: required, message contains at least this much pixel area
              :param maximum: optional, if specified, message must not contain more
      Severity: Major
      Found in oa/plugins/pdf_info.py and 2 other locations - About 1 hr to fix
      oa/plugins/pdf_info.py on lines 54..65
      oa/plugins/pdf_info.py on lines 83..95

      This issue has a mass of 47.

      Similar blocks of code found in 3 locations. Consider refactoring.
      Open

          def pdf_image_count(self, msg, minimum, maximum=None, target=None):
              """Check the number of images in the pdf attachments
              
              :param minimum: required, message contains at least x images in pdf
              attachments.
      Severity: Major
      Found in oa/plugins/pdf_info.py and 2 other locations - About 1 hr to fix
      oa/plugins/pdf_info.py on lines 54..65
      oa/plugins/pdf_info.py on lines 115..126

      This issue has a mass of 47.

      Line too long (80 > 79 characters)
      Open

              :param maximum: optional, if specified, must not contain more than x pdf
      Severity: Minor
      Found in oa/plugins/pdf_info.py by pep8

      Limit all lines to a maximum of 79 characters.

      There are still many devices around that are limited to 80 character
      lines; plus, limiting windows to 80 characters makes it possible to
      have several windows side-by-side.  The default wrapping on such
      devices looks ugly.  Therefore, please limit all lines to a maximum
      of 79 characters. For flowing long blocks of text (docstrings or
      comments), limiting the length to 72 characters is recommended.
      
      Reports error E501.
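To see which lines would trip this check before running the linter, a few lines of Python are enough. This is a simplified sketch, not the pep8 tool's own logic (the real checker handles further special cases), and find_long_lines is a made-up name.

```python
def find_long_lines(source, limit=79):
    """Return (line_number, length) for every line longer than limit."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# A 79-character line passes; an 80-character line is reported.
sample = "a" * 79 + "\n" + "b" * 80
print(find_long_lines(sample))  # prints [(2, 80)]
```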

      Blank line contains whitespace
      Open

              
      Severity: Minor
      Found in oa/plugins/pdf_info.py by pep8

      Trailing whitespace is superfluous.

      The warning returned varies on whether the line itself is blank,
      for easier filtering for those who want to indent their blank lines.
      
      Okay: spam(1)\n#
      W291: spam(1) \n#
      W293: class Foo(object):\n    \n    bang = 12
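The difference between the two warnings can be expressed as a small classifier. This is a simplified sketch of the distinction described above, not the pep8 tool's implementation; trailing_whitespace_code is a hypothetical name, and the line is assumed to be passed without its newline.

```python
def trailing_whitespace_code(line):
    """Return 'W293', 'W291' or None for one line (newline excluded)."""
    stripped = line.rstrip(" \t")
    if stripped == line:
        return None  # nothing trailing
    # A line that is only whitespace gets the separate W293 code.
    return "W293" if stripped == "" else "W291"


print(trailing_whitespace_code("spam(1)"))    # None
print(trailing_whitespace_code("spam(1) "))   # W291
print(trailing_whitespace_code("    "))       # W293
```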

      Line too long (80 > 79 characters)
      Open

              :param minimum: required, message contains at least this much pixel area
      Severity: Minor
      Found in oa/plugins/pdf_info.py by pep8

      Reports error E501.
