DeveloperCAP/MLCAT

Showing 81 of 109 total issues

Function generate_keyword_digest has a Cognitive Complexity of 119 (exceeds 5 allowed). Consider refactoring.
Open

def generate_keyword_digest(mbox_filename, output_filename, author_uid_filename, json_filename, top_n = None, console_output=True):
"""
From the .MBOX file, this function extracts the email content using two predefined classes
available in the Python Standard Library: Mailbox and Message. Feature vectors are created for all the authors
by obtaining meaningful words from the mail content, after removing the stop words, using NLTK libraries.
Severity: Minor
Found in lib/input/mbox/keyword_digest.py - About 2 days to fix
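
The docstring above describes building per-author feature vectors from mail text after stop-word removal. The following is a minimal sketch of that idea using only the standard library. The small `STOP_WORDS` set is a stand-in for NLTK's stop-word corpus, and `author_keyword_counts`, `top_keywords`, and the `(author, body)` input shape are hypothetical names chosen for illustration, not MLCAT's actual code.

```python
import re
from collections import Counter, defaultdict

# Assumption: a tiny stand-in for NLTK's English stop-word list.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "for", "on"}

def author_keyword_counts(mails):
    """Map each author to a Counter of non-stop-words from their mails.

    `mails` is a hypothetical iterable of (author, body) pairs.
    """
    counts = defaultdict(Counter)
    for author, body in mails:
        # Lowercase and keep alphabetic tokens only.
        words = re.findall(r"[a-z]+", body.lower())
        counts[author].update(w for w in words if w not in STOP_WORDS)
    return counts

def top_keywords(counts, author, top_n=None):
    """Return the author's keywords, most frequent first."""
    return [word for word, _ in counts[author].most_common(top_n)]
```

Splitting the counting and the ranking into two small functions like this is also one way a long function could be broken up to lower its Cognitive Complexity score.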

Function conversation_refresh_times has a Cognitive Complexity of 106 (exceeds 5 allowed). Consider refactoring.
Open

def conversation_refresh_times(headers_filename, nodelist_filename, edgelist_filename, foldername, time_ubound = None, time_lbound = None, plot=False, ignore_lat = False):
"""
 
:param headers_filename: The JSON file containing the headers.
:param nodelist_filename: The csv file containing the nodes.
Severity: Minor
Found in lib/analysis/author/time_statistics.py - About 2 days to fix

Function vertex_clustering has a Cognitive Complexity of 83 (exceeds 5 allowed). Consider refactoring.
Open

def vertex_clustering(json_filename, nodelist_filename, edgelist_filename, foldername, time_limit=None, ignore_lat=False):
"""
This function performs vertex clustering on the dataset passed in the parameters and saves the dendrogram resulting
from the vertex clustering as a PDF along with the visualization of the vertex cluster itself. It is recommended to
limit these graphs to 200 authors as the visualization becomes incomprehensible beyond that.
Severity: Minor
Found in lib/analysis/author/community.py - About 1 day to fix

Function get has a Cognitive Complexity of 73 (exceeds 5 allowed). Consider refactoring.
Open

def get(json_filename, output_filename, active_score, passive_score, write_to_file=True):
"""
 
:param json_data: The JSON file containing the headers.
:param output_filename: Stores authors' email address, score and rank.
Severity: Minor
Found in lib/analysis/author/ranking.py - About 1 day to fix

Function generate_wh_table_authors has a Cognitive Complexity of 66 (exceeds 5 allowed). Consider refactoring.
Open

def generate_wh_table_authors(nodelist_filename, edgelist_filename, output_filename, ignore_lat=False, time_limit=None):
"""
This module is used to generate the author version of the width height table. The width height table for the
authors is a representation of the number of total and new authors in a thread aggregated at a given generation.
The table, which itself is temporarily stored in a two dimensional array, is then written into a CSV file. These
Severity: Minor
Found in lib/analysis/author/wh_table.py - About 1 day to fix

Function generate_hyperedge_distribution has a Cognitive Complexity of 65 (exceeds 5 allowed). Consider refactoring.
Open

def generate_hyperedge_distribution(nodelist_filename, edgelist_filename, clean_headers_filename, foldername, time_limit=None, ignore_lat=False):
"""
Generates the distribution of hyperedges for messages in a certain time limit, stores it as hyperedge_distribution.csv based on edge frequency, and generates a diagram stored in plots.
 
:param nodelist_filename: The csv file containing the nodes.
Severity: Minor
Found in lib/analysis/thread/hypergraph.py - About 1 day to fix

Function generate_time_stats_threads has a Cognitive Complexity of 55 (exceeds 5 allowed). Consider refactoring.
Open

def generate_time_stats_threads(nodelist_filename, edgelist_filename, clean_headers_filename, foldername, time_lbound=None, time_ubound=None, plot=False):
"""
Generates and plots statistics for the inter-arrival times of consecutive messages and the distribution of the length of each discussion thread.
 
:param nodelist_filename: The csv file containing the nodes.
Severity: Minor
Found in lib/analysis/thread/time_statistics.py - About 1 day to fix

Function generate_wh_table_threads has a Cognitive Complexity of 45 (exceeds 5 allowed). Consider refactoring.
Open

def generate_wh_table_threads(nodelist_filename, edgelist_filename, output_filename, ignore_lat=False, time_limit=None):
"""
Generate the thread width height table, which is a representation of the number of nodes in the graph that have a
given height and a given number of children in a tabular form. This table provides an aggregate statistical view of
Severity: Minor
Found in lib/analysis/thread/wh_table.py - About 6 hrs to fix

Function generate_edge_list has a Cognitive Complexity of 42 (exceeds 5 allowed). Consider refactoring.
Open

def generate_edge_list(author_nodes, author_edges, graph_nodes,
graph_edges, threads_json, author_json, ignore_lat=True):
"""
:param author_nodes: The csv file containing the author nodes data.
:param author_edges: The csv file containing the author edges data.
Severity: Minor
Found in lib/analysis/author/edge_list.py - About 6 hrs to fix

Function generate has a Cognitive Complexity of 42 (exceeds 5 allowed). Consider refactoring.
Open

def generate(ignore_lat=False, time_limit=None):
"""
 
This function generates a table containing the number of mails in a thread and the corresponding aggregate count
of the number of threads that have that number of mails in them, along with the total number of authors who have
Severity: Minor
Found in lib/analysis/thread/ps_table.py - About 6 hrs to fix
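
The first part of the docstring above (mails per thread against the number of threads of that size) can be sketched with two passes of `collections.Counter`. The function name `thread_size_table` and the message-id-to-thread-id mapping are assumptions made for this example. The author counts mentioned in the truncated docstring are not covered.

```python
from collections import Counter

def thread_size_table(thread_of_message):
    """Return sorted (mails_in_thread, number_of_threads) rows.

    `thread_of_message` is a hypothetical mapping of message id -> thread id.
    """
    # First pass: how many mails does each thread contain?
    mails_per_thread = Counter(thread_of_message.values())
    # Second pass: how many threads share each size?
    threads_per_size = Counter(mails_per_thread.values())
    return sorted(threads_per_size.items())
```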

Function author_interaction has a Cognitive Complexity of 41 (exceeds 5 allowed). Consider refactoring.
Open

def author_interaction(clean_data, graph_nodes, graph_edges, pajek_file, ignore_lat=True):
"""
Prints the number of strongly connected components, weakly connected components, nodes and edges from the author graph.
 
:param clean_data: Path to clean_data.json file
Severity: Minor
Found in lib/analysis/author/graph/generate.py - About 6 hrs to fix
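
For readers unfamiliar with the two component counts named in the docstring above, here is a dependency-free sketch. It assumes the author graph is given as a list of node names plus directed `(source, target)` edge pairs whose endpoints all appear in the node list. Both function names are hypothetical, and the project may well rely on a graph library instead.

```python
from collections import defaultdict

def weakly_connected_count(nodes, edges):
    """Count components when edge direction is ignored (union-find)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(n) for n in nodes})

def strongly_connected_count(nodes, edges):
    """Count strongly connected components with Kosaraju's two-pass DFS."""
    forward, backward = defaultdict(list), defaultdict(list)
    for u, v in edges:
        forward[u].append(v)
        backward[v].append(u)

    def dfs(start, graph, seen, out):
        # Iterative DFS that appends nodes to `out` in finishing order.
        stack = [(start, iter(graph[start]))]
        seen.add(start)
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                stack.pop()
                out.append(node)

    # Pass 1: record finishing order on the original graph.
    seen, order = set(), []
    for n in nodes:
        if n not in seen:
            dfs(n, forward, seen, order)

    # Pass 2: each DFS tree on the reversed graph is one component.
    seen, count = set(), 0
    for n in reversed(order):
        if n not in seen:
            dfs(n, backward, seen, [])
            count += 1
    return count
```

In an author graph, two authors who have replied to each other end up in the same strongly connected component, while a one-way reply only joins them weakly.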

Function weighted_multigraph has a Cognitive Complexity of 40 (exceeds 5 allowed). Consider refactoring.
Open

def weighted_multigraph(graph_nodes, graph_edges, clean_data, output_dir, ignore_lat = False):
"""
 
Calls other functions to generate graphs that show the interaction between authors either through multiple edges or
through edge weights.
Severity: Minor
Found in lib/analysis/author/graph/interaction.py - About 6 hrs to fix

Function generate_hyperedges has a Cognitive Complexity of 40 (exceeds 5 allowed). Consider refactoring.
Open

def generate_hyperedges():
"""
 
Generates hyperedges from the discussion graph obtained from the nodes and edges stored in graph_nodes.csv and graph_edges.csv.
All email header information can be represented as one hyperedge of a hypergraph.
Severity: Minor
Found in lib/analysis/thread/hypergraph.py - About 6 hrs to fix

Function remove_invalid_references has a Cognitive Complexity of 40 (exceeds 5 allowed). Consider refactoring.
Open

def remove_invalid_references(input_json_filename, output_json_filename, ref_toggle=False):
"""
This function is used to remove headers associated with invalid references.
:param input_json_filename: The json file containing all the references.
Severity: Minor
Found in lib/input/data_cleanup.py - About 6 hrs to fix

Function get_mail_header has a Cognitive Complexity of 39 (exceeds 5 allowed). Consider refactoring.
Open

def get_mail_header(to_get, range_=True, uid_map_filename='thread_uid_map.json'):
"""
This function fetches the emails from the IMAP server as per the parameters passed.
:param to_get: List of UIDs of the mails to get. Default value is 2000.
Severity: Minor
Found in lib/input/imap/header.py - About 5 hrs to fix

Function generate_message_activity_heatmaps has a Cognitive Complexity of 38 (exceeds 5 allowed). Consider refactoring.
Open

def generate_message_activity_heatmaps(clean_headers_filename, foldername, timeline=True):
"""
Extract header information and call functions to generate various timelines or heatmaps.
 
:param clean_headers_filename: The JSON file containing cleaned headers.
Severity: Minor
Found in lib/analysis/thread/message_activity.py - About 5 hrs to fix

Function get_message_body has a Cognitive Complexity of 35 (exceeds 5 allowed). Consider refactoring.
Open

def get_message_body(message):
"""
Gets the message body of the message passed as a parameter.
 
:param message: The message whose body is to be extracted.
Severity: Minor
Found in lib/input/mbox/keyword_digest.py - About 5 hrs to fix
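
Extracting a message body usually means walking the MIME parts and picking a text part. The sketch below uses the standard library `email` package. The helper name `plain_text_body`, the choice to return only the first `text/plain` part, and the UTF-8 fallback are assumptions for illustration, not a description of what `get_message_body` does.

```python
from email import message_from_string

def plain_text_body(message):
    """Return the first text/plain part of an email.message.Message as str."""
    # walk() yields the message itself and, for multipart mail, every subpart.
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes, or None for containers
            if payload is not None:
                charset = part.get_content_charset() or "utf-8"
                return payload.decode(charset, errors="replace")
    return ""

# Usage with a minimal single-part message.
msg = message_from_string("Subject: hi\n\nHello list\n")
print(plain_text_body(msg))
```

Using a flat loop with early returns keeps the nesting shallow, which is the kind of structure that Cognitive Complexity rewards.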

Function get_message_body has a Cognitive Complexity of 35 (exceeds 5 allowed). Consider refactoring.
Open

def get_message_body(message):
"""
Gets the message body of the message.
 
:param message: The message whose body is to be extracted.
Severity: Minor
Found in lib/input/mbox/keyword_clustering.py - About 5 hrs to fix

Function extract_mail_header has a Cognitive Complexity of 33 (exceeds 5 allowed). Consider refactoring.
Open

def extract_mail_header(mbox_filename, json_filename='headers.json', thread_uid_filename='thread_uid_map.json', author_uid_filename='author_uid_map.json'):
"""
From the .MBOX file, this function extracts the header information using two predefined classes
available in the Python Standard Library: Mailbox and Message, for accessing and manipulating on-disk mailboxes
and the messages they contain respectively. The headers are then saved to a JSON file. A unique Message-ID is
Severity: Minor
Found in lib/input/mbox/mbox_hdr.py - About 4 hrs to fix

Function digraph has a Cognitive Complexity of 23 (exceeds 5 allowed). Consider refactoring.
Open

def digraph():
"""
This function is used to generate a thread-wise view of the entire mailing list by saving a graph representing the
messages in a thread as a tree using the References and In-Reply-To fields from the mail headers. The thread graphs are
then saved to GEXF, DOT and PNG formats. All authors of a thread are identified and each author is given a unique color.
Severity: Minor
Found in lib/analysis/thread/graph/generate.py - About 3 hrs to fix
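
The tree construction described in the docstring above can be sketched as follows. This version links messages through `In-Reply-To` only and ignores `References`. The function name `build_thread_trees` and the list-of-dicts header shape are assumptions for this example, and graph export and coloring are left out.

```python
from collections import defaultdict

def build_thread_trees(headers):
    """Return (roots, children) linking each message to its replies.

    `headers` is a hypothetical list of dicts with a "Message-ID" key and
    an optional "In-Reply-To" key.
    """
    known = {h["Message-ID"] for h in headers}
    children = defaultdict(list)
    roots = []
    for h in headers:
        parent = h.get("In-Reply-To")
        if parent in known:
            children[parent].append(h["Message-ID"])
        else:
            # Thread starter, or a reply whose parent is not in the archive.
            roots.append(h["Message-ID"])
    return roots, dict(children)
```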