DeveloperCAP/MLCAT

Showing 109 of 109 total issues

Identical blocks of code found in 2 locations. Consider refactoring.
Open

for i in range(1, num_vertices+1):
    line = lines_in_file[i].split()
    line[1] = "\"" + line[1] + "\""
    del line[2:]
    line.append("\n")
Severity: Major
Found in lib/analysis/author/graph/generate.py and 1 other location - About 5 hrs to fix
lib/analysis/author/community.py on lines 34..39
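
One way to remove this duplication is to move the loop into a helper that both modules import. The sketch below is only an illustration: the name `quote_vertex_labels` and its return shape are assumptions, since the excerpt does not show what either module does with `line` afterwards.

```python
def quote_vertex_labels(lines_in_file, num_vertices):
    """Return one token list per vertex line, with the label in double quotes.

    Hypothetical helper (not part of MLCAT): it mirrors the duplicated loop,
    which reads lines 1..num_vertices, quotes the second token, drops the
    remaining tokens, and appends a newline.
    """
    processed = []
    for i in range(1, num_vertices + 1):
        tokens = lines_in_file[i].split()
        # Keep only the vertex id and its quoted label, then end the line.
        processed.append([tokens[0], '"' + tokens[1] + '"', "\n"])
    return processed
```

Each caller could then iterate over the returned list instead of repeating the loop body.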

Identical blocks of code found in 2 locations. Consider refactoring.
Open

if edge[0] not in lone_author_threads and edge[1] not in lone_author_threads:
    if edge[0] in msgs_before_time and edge[1] in msgs_before_time:
        try:
            discussion_graph.node[edge[0]]['sender']
            discussion_graph.node[edge[1]]['sender']
Severity: Major
Found in lib/analysis/author/wh_table.py and 1 other location - About 5 hrs to fix
lib/analysis/author/wh_table.py on lines 47..52

Identical blocks of code found in 2 locations. Consider refactoring.
Open

for i in range(1, num_vertices+1):
    line = lines_in_file[i].split()
    line[1] = "\"" + line[1] + "\""
    del line[2:]
    line.append("\n")
Severity: Major
Found in lib/analysis/author/community.py and 1 other location - About 5 hrs to fix
lib/analysis/author/graph/generate.py on lines 26..31

Function generate_message_activity_heatmaps has a Cognitive Complexity of 38 (exceeds 5 allowed). Consider refactoring.
Open

def generate_message_activity_heatmaps(clean_headers_filename, foldername, timeline=True):
    """
    Extract header information and call functions to generate various timelines or heatmaps.

    :param clean_headers_filename: The JSON file containing cleaned headers.
Severity: Minor
Found in lib/analysis/thread/message_activity.py - About 5 hrs to fix

Function get_message_body has a Cognitive Complexity of 35 (exceeds 5 allowed). Consider refactoring.
Open

def get_message_body(message):
    """
    Gets the message body of the message passed as a parameter.

    :param message: The message whose body is to be extracted.
Severity: Minor
Found in lib/input/mbox/keyword_digest.py - About 5 hrs to fix

Function get_message_body has a Cognitive Complexity of 35 (exceeds 5 allowed). Consider refactoring.
Open

def get_message_body(message):
    """
    Gets the message body of the message.

    :param message: The message whose body is to be extracted.
Severity: Minor
Found in lib/input/mbox/keyword_clustering.py - About 5 hrs to fix

Function extract_mail_header has a Cognitive Complexity of 33 (exceeds 5 allowed). Consider refactoring.
Open

def extract_mail_header(mbox_filename, json_filename='headers.json', thread_uid_filename='thread_uid_map.json', author_uid_filename='author_uid_map.json'):
    """
    From the .MBOX file, the header information is extracted using two predefined classes
    available in the Python Standard Library: Mailbox and Message, for accessing and manipulating on-disk mailboxes
    and the messages they contain respectively. The headers are then saved to a JSON file. A unique Message-ID is
Severity: Minor
Found in lib/input/mbox/mbox_hdr.py - About 4 hrs to fix

Identical blocks of code found in 2 locations. Consider refactoring.
Open

if jfile['References']:
    ref_list = str(jfile['References']).split(',')
    # Message Id of the parent mail is appended to the end of the list of references.
    parent_id = int(ref_list[-1])
    if parent_id and parent_id < msg_id:
Severity: Major
Found in lib/analysis/thread/graph/edge_list.py and 1 other location - About 3 hrs to fix
lib/analysis/thread/graph/edge_list.py on lines 23..28
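
Since both copies live in the same file, a small module-level function could replace them. This is a minimal sketch under stated assumptions: `parent_from_references` is a hypothetical name, and returning `None` for "no usable parent" is an assumption, because the excerpt is cut off before showing what happens inside the final `if`.

```python
def parent_from_references(references, msg_id):
    """Return the parent message id from a References value, or None.

    Hypothetical helper (not part of MLCAT). It follows the duplicated
    block: the parent id is the last entry of the comma-separated list and
    is accepted only if it is non-zero and smaller than msg_id.
    """
    if not references:
        return None
    ref_list = str(references).split(',')
    # The parent mail's id is the last entry in the list of references.
    parent_id = int(ref_list[-1])
    if parent_id and parent_id < msg_id:
        return parent_id
    return None
```

Both call sites would then reduce to one call on `jfile['References']` followed by a check for `None`.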

Identical blocks of code found in 2 locations. Consider refactoring.
Open

if jfile['References']:
    ref_list = str(jfile['References']).split(',')
    # Message Id of the parent mail is appended to the end of the list of references.
    parent_id = int(ref_list[-1])
    if parent_id and parent_id < msg_id:
Severity: Major
Found in lib/analysis/thread/graph/edge_list.py and 1 other location - About 3 hrs to fix
lib/analysis/thread/graph/edge_list.py on lines 60..65

Function digraph has a Cognitive Complexity of 23 (exceeds 5 allowed). Consider refactoring.
Open

def digraph():
    """
    This function is used to generate a thread-wise view of the entire mailing list by saving a graph representing the
    messages in a thread as a tree using the References and In-Reply-To fields from the mail headers. The thread graphs are
    then saved to GEXF, DOT and PNG formats. All authors of a thread are identified and each author is given a unique color.
Severity: Minor
Found in lib/analysis/thread/graph/generate.py - About 3 hrs to fix

Function check_validity has a Cognitive Complexity of 22 (exceeds 5 allowed). Consider refactoring.
Open

def check_validity(self, check_unavailable_uid='False', json_headers='headers.json'):
    """

    This function checks for and prints duplicate, missing, and invalid objects in the "headers.json" file.
    This function can be run first to generate a list of duplicate, missing, or invalid objects' UIDs which
Severity: Minor
Found in lib/input/check_headers.py - About 3 hrs to fix

Function generate_kmeans_clustering has a Cognitive Complexity of 22 (exceeds 5 allowed). Consider refactoring.
Open

def generate_kmeans_clustering(mbox_filename, output_filename, author_uid_filename, json_filename, top_n=None):
    """
    From the .MBOX file, the email content is extracted using two predefined classes
    available in the Python Standard Library: Mailbox and Message. Feature vectors are created for all the authors
    by obtaining meaningful words from the mail content, after removing the stop words, using NLTK libraries.
Severity: Minor
Found in lib/input/mbox/keyword_clustering.py - About 3 hrs to fix

Function get_utc_time has a Cognitive Complexity of 20 (exceeds 5 allowed). Consider refactoring.
Open

def get_utc_time(orig_time):
    """
    A function to convert a formatted string containing date and time from a local timezone to UTC, by taking into
    consideration multiple formats of the input parameter
 
Severity: Minor
Found in lib/util/read.py - About 2 hrs to fix

Identical blocks of code found in 3 locations. Consider refactoring.
Open

bin_number = 2*json_obj['Time'].hour if json_obj['Time'].minute < 30 else 2*json_obj['Time'].hour+1
Severity: Major
Found in lib/analysis/thread/message_activity.py and 2 other locations - About 2 hrs to fix
lib/analysis/thread/message_activity.py on lines 17..17
lib/analysis/thread/message_activity.py on lines 54..54
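
The repeated expression maps a timestamp to one of 48 half-hour bins, so it could live in a single named function. The sketch below assumes `json_obj['Time']` is a `datetime`-like object with `hour` and `minute` attributes, as the excerpt implies; the name `half_hour_bin` is hypothetical.

```python
from datetime import datetime


def half_hour_bin(timestamp):
    """Map a timestamp to a half-hour bin index in the range 0..47.

    Hypothetical helper (not part of MLCAT); equivalent to the duplicated
    expression: 2*hour for minutes 0..29, 2*hour + 1 for minutes 30..59.
    """
    return 2 * timestamp.hour + (1 if timestamp.minute >= 30 else 0)


# Example call with a timestamp in the second half of the 13:00 hour.
example_bin = half_hour_bin(datetime(2020, 1, 1, 13, 45))
```

Each of the three lines would then become `bin_number = half_hour_bin(json_obj['Time'])`.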

Identical blocks of code found in 3 locations. Consider refactoring.
Open

bin_number = 2*json_obj['Time'].hour if json_obj['Time'].minute < 30 else 2*json_obj['Time'].hour+1
Severity: Major
Found in lib/analysis/thread/message_activity.py and 2 other locations - About 2 hrs to fix
lib/analysis/thread/message_activity.py on lines 17..17
lib/analysis/thread/message_activity.py on lines 36..36

Identical blocks of code found in 3 locations. Consider refactoring.
Open

bin_number = 2*json_obj['Time'].hour if json_obj['Time'].minute < 30 else 2*json_obj['Time'].hour+1
Severity: Major
Found in lib/analysis/thread/message_activity.py and 2 other locations - About 2 hrs to fix
lib/analysis/thread/message_activity.py on lines 36..36
lib/analysis/thread/message_activity.py on lines 54..54

Function get_datetime_object has a Cognitive Complexity of 20 (exceeds 5 allowed). Consider refactoring.
Open

def get_datetime_object(orig_time):
    """A function to convert a formatted string containing date and time from a local timezone to UTC, by taking into consideration multiple formats of the input parameter and then return the corresponding datetime object.

    :param orig_time: Formatted string containing a date and time from a local timezone
    :return: A datetime object corresponding to the input string in UTC
Severity: Minor
Found in lib/util/read.py - About 2 hrs to fix

File hypergraph.py has 275 lines of code (exceeds 250 allowed). Consider refactoring.
Open

"""
This module is used to model each discussion thread as one hypergraph. All the email header information can be
represented as one hyperedge of a hypergraph. This concise format for representing a discussion thread as a
hypergraph is then stored as a table to a CSV file, with the author column headers containing the ids of the authors.
All the author columns are sorted left to right in the descending order of out degree, followed by in degree. The
Severity: Minor
Found in lib/analysis/thread/hypergraph.py - About 2 hrs to fix

Function add_to_weighted_graph has a Cognitive Complexity of 18 (exceeds 5 allowed). Consider refactoring.
Open

def add_to_weighted_graph(graph_obj, discussion_graph, json_data, nbunch, node_enum=list()):
    """
    Add weighted edges to the DiGraph object recursively.

    :param graph_obj: Object for a directed graph with multiple edges.
Severity: Minor
Found in lib/analysis/author/graph/interaction.py - About 2 hrs to fix

Identical blocks of code found in 2 locations. Consider refactoring.
Open

if jfile['In-Reply-To']:
    parent_id = jfile['In-Reply-To']
    if parent_id and parent_id < msg_id:
        edges.add((parent_id, msg_id))
Severity: Major
Found in lib/analysis/thread/graph/edge_list.py and 1 other location - About 1 hr to fix
lib/analysis/thread/graph/edge_list.py on lines 29..32