Function collect_segmented_results
has a Cognitive Complexity of 9 (exceeds 5 allowed). Consider refactoring. Open
def collect_segmented_results(
input_df: SparkDataFrame,
schema: DatasetSchema,
dataset_timestamp: Optional[datetime] = None,
creation_timestamp: Optional[datetime] = None,
- Read upRead up
Cognitive Complexity
Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.
A method's cognitive complexity is based on a few simple rules:
- Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
- Code is considered more complex for each "break in the linear flow of the code"
- Code is considered more complex when "flow breaking structures are nested"
Further reading
Function whylogs_pandas_segmented_profiler
has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring. Open
def whylogs_pandas_segmented_profiler(
pdf_iterator: Iterable[pd.DataFrame], schema: Optional[DatasetSchema] = None
) -> Iterable[pd.DataFrame]:
if schema is None or not schema.segments:
raise ValueError(
- Read upRead up
Cognitive Complexity
Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.
A method's cognitive complexity is based on a few simple rules:
- Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
- Code is considered more complex for each "break in the linear flow of the code"
- Code is considered more complex when "flow breaking structures are nested"
Further reading
TODO found Open
# TODO: optimize this so we split the dataframe by segment first rather than split in the pandas dataframe
- Exclude checks
Line too long (101 > 79 characters) Open
raise ValueError("Segment key is missing required key values or parent_id: {segment_params}")
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (88 > 79 characters) Open
segmented_column_profile_views: Dict[Segment, Dict[str, ColumnProfileView]] = dict()
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (116 > 79 characters) Open
segmented_column_profile_views[segment_key][column_name] = collected_segment_column_profile_views[key_tuple]
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (104 > 79 characters) Open
segment_column_views_dict = collect_segmented_column_profile_views(input_df=input_df, schema=schema)
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (80 > 79 characters) Open
pdf_iterator: Iterable[pd.DataFrame], schema: Optional[DatasetSchema] = None
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (87 > 79 characters) Open
from whylogs.api.pyspark.experimental.profiler import COL_NAME_FIELD, COL_PROFILE_FIELD
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (117 > 79 characters) Open
"Cannot profile segments without segmentation defined in the specified DatasetSchema: no segments found."
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (110 > 79 characters) Open
logger.warning(f"Skipping segment: could not find partition with id {partition_id} in: {schema.segments}")
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (82 > 79 characters) Open
logger.info(f"Processing segmented profiling in spark with {schema.segments}")
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (101 > 79 characters) Open
for col_name, col_profile in segmented_results.view(segmented_key).get_columns().items():
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (106 > 79 characters) Open
def column_profile_bytes_aggregator(group_by_cols: Tuple[str], profiles_df: pd.DataFrame) -> pd.DataFrame:
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (115 > 79 characters) Open
col_name for col_name in input_df.schema.names if isinstance(input_df.schema[col_name].dataType, VectorUDT)
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (93 > 79 characters) Open
res_df = pd.DataFrame(columns=[SEGMENT_KEY_FIELD, COL_NAME_FIELD, COL_PROFILE_FIELD])
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (87 > 79 characters) Open
COL_PROFILE_FIELD: [col_profile.to_protobuf().SerializeToString()],
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (106 > 79 characters) Open
(_string_to_segment(row.segment_key), row.col_name): ColumnProfileView.from_bytes(row.col_profile)
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (103 > 79 characters) Open
whylogs_pandas_map_profiler_with_schema = partial(whylogs_pandas_segmented_profiler, schema=schema)
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (90 > 79 characters) Open
raise ValueError(f"Collision when collecting profiles for segment {segment}!")
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (103 > 79 characters) Open
lambda acc, x: acc.merge(x), profiles_df[COL_PROFILE_FIELD].apply(ColumnProfileView.from_bytes)
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (116 > 79 characters) Open
raise ValueError("Cannot collect segmented results without segments defined in the passed in DatasetSchema")
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (91 > 79 characters) Open
cp = f"{SEGMENT_KEY_FIELD} string, {COL_NAME_FIELD} string, {COL_PROFILE_FIELD} binary"
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (119 > 79 characters) Open
columns=column_views_dict, dataset_timestamp=_dataset_timestamp, creation_timestamp=_creation_timestamp
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (114 > 79 characters) Open
# TODO: optimize this so we split the dataframe by segment first rather than split in the pandas dataframe
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (106 > 79 characters) Open
input_df_arrays = input_df_arrays.withColumn(col_name, vector_to_array(input_df_arrays[col_name]))
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (128 > 79 characters) Open
segmented_profile_bytes_df = input_df_arrays.mapInPandas(whylogs_pandas_map_profiler_with_schema, schema=cp) # type: ignore
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (92 > 79 characters) Open
collected_segment_column_profile_views: Dict[Tuple[Segment, str], ColumnProfileView] = {
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (120 > 79 characters) Open
logger.warning("Unable to load pyspark; install pyspark to get whylogs profiling support in spark environment.")
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (113 > 79 characters) Open
def _lookup_segment_partition_by_id(schema: DatasetSchema, partition_id: str) -> Optional[SegmentationPartition]:
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.
Line too long (160 > 79 characters) Open
f"Skipping segment: could not collect profiles for segment {segment} because the schema has no matching partition ids: {schema.segments.keys()}"
- Read upRead up
- Exclude checks
Limit all lines to a maximum of 79 characters.
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to
have several windows side-by-side. The default wrapping on such
devices looks ugly. Therefore, please limit all lines to a maximum
of 79 characters. For flowing long blocks of text (docstrings or
comments), limiting the length to 72 characters is recommended.
Reports error E501.