File parser.py
has 675 lines of code (exceeds 250 allowed). Consider refactoring. Open
"""Fonduer parser."""
import itertools
import logging
import re
import warnings
Cyclomatic complexity is too high in method _parse_table. (18) Open
def _parse_table(self, node: HtmlElement, state: Dict[str, Any]) -> Dict[str, Any]:
"""Parse a table node.
:param node: The lxml table node to parse
:param state: The global state necessary to place the node in context
- Read upRead up
- Exclude checks
Cyclomatic Complexity
Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.
Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:
Construct | Effect on CC | Reasoning |
---|---|---|
if | +1 | An if statement is a single decision. |
elif | +1 | The elif statement adds another decision. |
else | +0 | The else statement does not cause a new decision. The decision is at the if. |
for | +1 | There is a decision at the start of the loop. |
while | +1 | There is a decision at the while statement. |
except | +1 | Each except branch adds a new conditional path of execution. |
finally | +0 | The finally block is unconditionally executed. |
with | +1 | The with statement roughly corresponds to a try/except block (see PEP 343 for details). |
assert | +1 | The assert statement internally roughly equals a conditional statement. |
Comprehension | +1 | A list/set/dict comprehension of generator expression is equivalent to a for loop. |
Boolean Operator | +1 | Every boolean operator (and, or) adds a decision point. |
Cyclomatic complexity is too high in method _parse_sentence. (18) Open
def _parse_sentence(
self, paragraph: Paragraph, node: HtmlElement, state: Dict[str, Any]
) -> Iterator[Sentence]:
"""Parse the Sentences of the node.
- Read upRead up
- Exclude checks
Cyclomatic Complexity
Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.
Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:
Construct | Effect on CC | Reasoning |
---|---|---|
if | +1 | An if statement is a single decision. |
elif | +1 | The elif statement adds another decision. |
else | +0 | The else statement does not cause a new decision. The decision is at the if. |
for | +1 | There is a decision at the start of the loop. |
while | +1 | There is a decision at the while statement. |
except | +1 | Each except branch adds a new conditional path of execution. |
finally | +0 | The finally block is unconditionally executed. |
with | +1 | The with statement roughly corresponds to a try/except block (see PEP 343 for details). |
assert | +1 | The assert statement internally roughly equals a conditional statement. |
Comprehension | +1 | A list/set/dict comprehension of generator expression is equivalent to a for loop. |
Boolean Operator | +1 | Every boolean operator (and, or) adds a decision point. |
Cyclomatic complexity is too high in method _parse_paragraph. (17) Open
def _parse_paragraph(
self, node: HtmlElement, state: Dict[str, Any]
) -> Iterator[Sentence]:
"""Parse a Paragraph of the node.
- Read upRead up
- Exclude checks
Cyclomatic Complexity
Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.
Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:
Construct | Effect on CC | Reasoning |
---|---|---|
if | +1 | An if statement is a single decision. |
elif | +1 | The elif statement adds another decision. |
else | +0 | The else statement does not cause a new decision. The decision is at the if. |
for | +1 | There is a decision at the start of the loop. |
while | +1 | There is a decision at the while statement. |
except | +1 | Each except branch adds a new conditional path of execution. |
finally | +0 | The finally block is unconditionally executed. |
with | +1 | The with statement roughly corresponds to a try/except block (see PEP 343 for details). |
assert | +1 | The assert statement internally roughly equals a conditional statement. |
Comprehension | +1 | A list/set/dict comprehension of generator expression is equivalent to a for loop. |
Boolean Operator | +1 | Every boolean operator (and, or) adds a decision point. |
Cyclomatic complexity is too high in method _parse_figure. (15) Open
def _parse_figure(self, node: HtmlElement, state: Dict[str, Any]) -> Dict[str, Any]:
"""Parse the figure node.
:param node: The lxml img node to parse
:param state: The global state necessary to place the node in context
- Read upRead up
- Exclude checks
Cyclomatic Complexity
Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.
Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:
Construct | Effect on CC | Reasoning |
---|---|---|
if | +1 | An if statement is a single decision. |
elif | +1 | The elif statement adds another decision. |
else | +0 | The else statement does not cause a new decision. The decision is at the if. |
for | +1 | There is a decision at the start of the loop. |
while | +1 | There is a decision at the while statement. |
except | +1 | Each except branch adds a new conditional path of execution. |
finally | +0 | The finally block is unconditionally executed. |
with | +1 | The with statement roughly corresponds to a try/except block (see PEP 343 for details). |
assert | +1 | The assert statement internally roughly equals a conditional statement. |
Comprehension | +1 | A list/set/dict comprehension of generator expression is equivalent to a for loop. |
Boolean Operator | +1 | Every boolean operator (and, or) adds a decision point. |
Cyclomatic complexity is too high in method parse. (12) Open
def parse(self, document: Document, text: str) -> Iterator[Sentence]:
"""Depth-first search over the provided tree.
Implemented as an iterative procedure. The structure of the state
needed to parse each node is also defined in this function.
- Read upRead up
- Exclude checks
Cyclomatic Complexity
Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.
Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:
Construct | Effect on CC | Reasoning |
---|---|---|
if | +1 | An if statement is a single decision. |
elif | +1 | The elif statement adds another decision. |
else | +0 | The else statement does not cause a new decision. The decision is at the if. |
for | +1 | There is a decision at the start of the loop. |
while | +1 | There is a decision at the while statement. |
except | +1 | Each except branch adds a new conditional path of execution. |
finally | +0 | The finally block is unconditionally executed. |
with | +1 | The with statement roughly corresponds to a try/except block (see PEP 343 for details). |
assert | +1 | The assert statement internally roughly equals a conditional statement. |
Comprehension | +1 | A list/set/dict comprehension of generator expression is equivalent to a for loop. |
Boolean Operator | +1 | Every boolean operator (and, or) adds a decision point. |
Cyclomatic complexity is too high in class ParserUDF. (10) Open
class ParserUDF(UDF):
"""Parser UDF class."""
def __init__(
self,
- Read upRead up
- Exclude checks
Cyclomatic Complexity
Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.
Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:
Construct | Effect on CC | Reasoning |
---|---|---|
if | +1 | An if statement is a single decision. |
elif | +1 | The elif statement adds another decision. |
else | +0 | The else statement does not cause a new decision. The decision is at the if. |
for | +1 | There is a decision at the start of the loop. |
while | +1 | There is a decision at the while statement. |
except | +1 | Each except branch adds a new conditional path of execution. |
finally | +0 | The finally block is unconditionally executed. |
with | +1 | The with statement roughly corresponds to a try/except block (see PEP 343 for details). |
assert | +1 | The assert statement internally roughly equals a conditional statement. |
Comprehension | +1 | A list/set/dict comprehension of generator expression is equivalent to a for loop. |
Boolean Operator | +1 | Every boolean operator (and, or) adds a decision point. |
Function __init__
has 18 arguments (exceeds 4 allowed). Consider refactoring. Open
def __init__(
Cyclomatic complexity is too high in method __init__. (8) Open
def __init__(
self,
structural: bool,
blacklist: Union[str, List[str]],
flatten: Union[str, List[str]],
- Read upRead up
- Exclude checks
Cyclomatic Complexity
Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.
Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:
Construct | Effect on CC | Reasoning |
---|---|---|
if | +1 | An if statement is a single decision. |
elif | +1 | The elif statement adds another decision. |
else | +0 | The else statement does not cause a new decision. The decision is at the if. |
for | +1 | There is a decision at the start of the loop. |
while | +1 | There is a decision at the while statement. |
except | +1 | Each except branch adds a new conditional path of execution. |
finally | +0 | The finally block is unconditionally executed. |
with | +1 | The with statement roughly corresponds to a try/except block (see PEP 343 for details). |
assert | +1 | The assert statement internally roughly equals a conditional statement. |
Comprehension | +1 | A list/set/dict comprehension of generator expression is equivalent to a for loop. |
Boolean Operator | +1 | Every boolean operator (and, or) adds a decision point. |
Cyclomatic complexity is too high in method apply. (6) Open
def apply( # type: ignore
self, document: Document, **kwargs: Any
) -> Optional[Document]:
"""Parse a text in an instance of Document.
- Read upRead up
- Exclude checks
Cyclomatic Complexity
Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.
Radon analyzes the AST tree of a Python program to compute Cyclomatic Complexity. Statements have the following effects on Cyclomatic Complexity:
Construct | Effect on CC | Reasoning |
---|---|---|
if | +1 | An if statement is a single decision. |
elif | +1 | The elif statement adds another decision. |
else | +0 | The else statement does not cause a new decision. The decision is at the if. |
for | +1 | There is a decision at the start of the loop. |
while | +1 | There is a decision at the while statement. |
except | +1 | Each except branch adds a new conditional path of execution. |
finally | +0 | The finally block is unconditionally executed. |
with | +1 | The with statement roughly corresponds to a try/except block (see PEP 343 for details). |
assert | +1 | The assert statement internally roughly equals a conditional statement. |
Comprehension | +1 | A list/set/dict comprehension of generator expression is equivalent to a for loop. |
Boolean Operator | +1 | Every boolean operator (and, or) adds a decision point. |
Function __init__
has 11 arguments (exceeds 4 allowed). Consider refactoring. Open
def __init__(
Function __init__
has a Cognitive Complexity of 9 (exceeds 5 allowed). Consider refactoring. Open
def __init__(
self,
structural: bool,
blacklist: Union[str, List[str]],
flatten: Union[str, List[str]],
- Read upRead up
Cognitive Complexity
Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.
A method's cognitive complexity is based on a few simple rules:
- Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
- Code is considered more complex for each "break in the linear flow of the code"
- Code is considered more complex when "flow breaking structures are nested"
Further reading
Avoid deeply nested control flow statements. Open
if r.search(styles.text) is not None:
if cur_style_index is not None:
parts["html_attrs"][cur_style_index] += (
r.search(styles.text)
.group(3)
Function apply
has 5 arguments (exceeds 4 allowed). Consider refactoring. Open
def apply( # type: ignore