DeveloperCAP/MLCAT

lib/analysis/thread/hypergraph.py

Summary

Maintainability: F (3 days)
Test Coverage

Function generate_hyperedge_distribution has a Cognitive Complexity of 65 (exceeds 5 allowed). Consider refactoring.
Open

def generate_hyperedge_distribution(nodelist_filename, edgelist_filename, clean_headers_filename, foldername, time_limit=None, ignore_lat=False):
    """
    Generate the distribution of hyperedges for messages in a certain time limit, stores it as hyperedge_distribution.csv based on edge frequency and generates a diagram stored in plots.

    :param nodelist_filename: The csv file containing the nodes.
Severity: Minor
Found in lib/analysis/thread/hypergraph.py - About 1 day to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"
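
To make these rules concrete, here is a minimal sketch. It is not taken from hypergraph.py; the function names and sample data are hypothetical. Both functions return the same result, but the first stacks three conditions inside a loop, while the second uses guard clauses so that no condition is nested inside another.

```python
def count_valid_nested(edges, known):
    """Nested version: every extra level adds a flow break inside a flow break."""
    count = 0
    for edge in edges:
        if len(edge) == 2:
            if edge[0] in known:
                if edge[1] in known:
                    count += 1
    return count


def count_valid_flat(edges, known):
    """Same behavior with guard clauses; each condition sits directly in the loop."""
    count = 0
    for edge in edges:
        if len(edge) != 2:
            continue
        if edge[0] not in known or edge[1] not in known:
            continue
        count += 1
    return count


# Hypothetical sample data: two edges are valid, two are rejected.
edges = [("a", "b"), ("a", "z"), ("b",), ("b", "a")]
known = {"a", "b"}
assert count_valid_nested(edges, known) == count_valid_flat(edges, known) == 2
```

The exact score an analyzer assigns to each version depends on its implementation, so no numbers are claimed here; the point is only that the flat version avoids nested flow-breaking structures.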

Function generate_hyperedges has a Cognitive Complexity of 40 (exceeds 5 allowed). Consider refactoring.
Open

def generate_hyperedges():
    """

    Generates hyperedges from the discussion graph obtained from the nodes and edges stored in graph_nodes.csv and graph_edges.csv.
    All email header information can be represented as one hyperedge of a hypergraph.
Severity: Minor
Found in lib/analysis/thread/hypergraph.py - About 6 hrs to fix

File hypergraph.py has 275 lines of code (exceeds 250 allowed). Consider refactoring.
Open

"""
This module is used to model each discussion thread as one hypergraph. All the email header information can be
represented as one hyperedge of a hypergraph. This concise format for representing a discussion thread as a
hypergraph is then stored as a table to a CSV file, with the author column headers containing the ids of the authors.
All the author columns are sorted left to right in the descending order of out degree, followed by in degree. The
Severity: Minor
Found in lib/analysis/thread/hypergraph.py - About 2 hrs to fix

Function __init__ has 7 arguments (exceeds 4 allowed). Consider refactoring.
Open

    def __init__(self, msg_id=0, height=-1, parent_id=0, time=None, from_addr=None, to_addr=None, cc_addr=None):
Severity: Major
Found in lib/analysis/thread/hypergraph.py - About 50 mins to fix
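
One common way to bring an argument count like this down is to group related parameters into a small parameter object. The sketch below is an assumption about how that might look, not the project's code: the class names `MessageHeader` and `MessageNode` are hypothetical, and the field types are not known from the report.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class MessageHeader:
    """Hypothetical parameter object bundling the header-related fields."""
    time: Optional[Any] = None
    from_addr: Optional[Any] = None
    to_addr: Optional[Any] = None
    cc_addr: Optional[Any] = None


class MessageNode:
    """Hypothetical stand-in for the class whose __init__ was flagged."""

    def __init__(self, msg_id=0, height=-1, parent_id=0, header=None):
        self.msg_id = msg_id
        self.height = height
        self.parent_id = parent_id
        # Fall back to an empty header so attribute access stays uniform.
        self.header = header if header is not None else MessageHeader()


node = MessageNode(msg_id=42, height=1, parent_id=7,
                   header=MessageHeader(from_addr="alice@example.org"))
```

With this grouping the constructor takes four arguments besides `self`, and callers that only care about the thread position can omit the header entirely.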

Function add_thread_nodes has 7 arguments (exceeds 4 allowed). Consider refactoring.
Open

def add_thread_nodes(thread_authors, nbunch, parent_id, curr_height, json_data, thread_nodes, conn_subgraph):
Severity: Major
Found in lib/analysis/thread/hypergraph.py - About 50 mins to fix

Avoid deeply nested control flow statements.
Open

                    if edge[0] in msgs_before_time and edge[1] in msgs_before_time:
                        discussion_graph.add_edge(*edge)
            edge_file.close()
Severity: Major
Found in lib/analysis/thread/hypergraph.py - About 45 mins to fix
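
A typical remedy for this kind of finding is to move the innermost check into a small helper, so the calling code loses a level of nesting. The sketch below assumes only what the excerpt shows: edges are pairs, `msgs_before_time` supports membership tests, and the graph object has an `add_edge` method. The helper name `edges_within` and the `FakeGraph` stand-in are hypothetical.

```python
def edges_within(edges, allowed_ids):
    """Yield only the edges whose two endpoints are both in allowed_ids."""
    for src, dst in edges:
        if src in allowed_ids and dst in allowed_ids:
            yield src, dst


class FakeGraph:
    """Minimal stand-in for the real graph object; only add_edge is assumed."""

    def __init__(self):
        self.edges = []

    def add_edge(self, src, dst):
        self.edges.append((src, dst))


discussion_graph = FakeGraph()
msgs_before_time = {"1", "2", "3"}
all_edges = [("1", "2"), ("2", "9"), ("3", "1")]

# The caller now contains a single loop with no nested condition.
for edge in edges_within(all_edges, msgs_before_time):
    discussion_graph.add_edge(*edge)
```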

Function generate_hyperedge_distribution has 6 arguments (exceeds 4 allowed). Consider refactoring.
Open

def generate_hyperedge_distribution(nodelist_filename, edgelist_filename, clean_headers_filename, foldername, time_limit=None, ignore_lat=False):
Severity: Minor
Found in lib/analysis/thread/hypergraph.py - About 45 mins to fix

Identical blocks of code found in 2 locations. Consider refactoring.
Open

    with open("graph_edges.csv", "r") as edge_file:
        for pair in edge_file:
            edge = pair.split(';')
            edge[1] = edge[1].strip()
            try:
Severity: Major
Found in lib/analysis/thread/hypergraph.py and 1 other location - About 7 hrs to fix
lib/analysis/thread/graph/generate.py on lines 24..34
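
Since the same edge-reading loop appears in both files, one option is to move the parsing into a single shared helper that both call. The sketch below is an assumption, not the project's code: the name `read_edges` is hypothetical, the body of the original `try:` block is not visible in the report, and skipping malformed lines is a choice made here for illustration. It also strips the whole line rather than only the second field.

```python
import io


def read_edges(lines, sep=";"):
    """Yield (source, target) pairs from lines of the form 'source;target'."""
    for line in lines:
        parts = line.strip().split(sep)
        if len(parts) != 2:
            # Assumption: malformed lines are skipped; the original handling is not shown.
            continue
        yield parts[0], parts[1]


# An in-memory file stands in for graph_edges.csv in this example.
sample = io.StringIO("1;2\n2;3\nbroken line\n")
edges = list(read_edges(sample))
```

Both call sites could then open `graph_edges.csv` with a `with` statement and iterate over `read_edges(edge_file)`, keeping any graph-specific logic in the caller.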

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 119.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.
