Line length Open
* **Top frequent items** (default is 128). Note that this configuration affects the memory footprint, especially for text features.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
| Lending Club | 1.6GB | 2.2M | 151 | 14MB | 7.4MB |
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
WhyLogs resolves this by allowing users to merge sketches from different machines. To merge two WhyLogs
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Fenced code blocks should be surrounded by blank lines Open
```
- Read upRead up
- Exclude checks
MD031 - Fenced code blocks should be surrounded by blank lines
Tags: code, blank_lines
Aliases: blanks-around-fences
This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:
Some text
```
Code block
```
```
Another code block
```
Some more text
To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
```
Code block
```
```
Another code block
```
Some more text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.
Fenced code blocks should be surrounded by blank lines Open
```
- Read upRead up
- Exclude checks
MD031 - Fenced code blocks should be surrounded by blank lines
Tags: code, blank_lines
Aliases: blanks-around-fences
This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:
Some text
```
Code block
```
```
Another code block
```
Some more text
To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
```
Code block
```
```
Another code block
```
Some more text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.
Line length Open
* **Data Insight:** WhyLogs provides complex statistics across different stages of your ML/AI pipelines and applications.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
**Pipeline:** One or more datasets used to build a single model or application. A project may contain multiple pipelines.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
**WhyLogs Output:** WhyLogs returns profile summary files for a dataset in JSON format. For convenience, these files are provided in flat table, histogram, and frequency formats.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
WhyLogs collects approximate statistics and sketches of data on a column-basis into a statistical profile. These metrics include:
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
WhyLogs uses Protobuf as the backing storage format. To write the data to disk, use the standard Protobuf
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
.groupBy("City", "Priority") // tag and group the data with categorical information
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
For further analysis, dataframes can be stored in a Parquet file, or collected to the driver if the number of entries is small enough.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Headers should be surrounded by blank lines Open
## Glossary/Concepts
- Read upRead up
- Exclude checks
MD022 - Headers should be surrounded by blank lines
Tags: headers, blank_lines
Aliases: blanks-around-headers
This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:
# Header 1
Some text
Some more text
## Header 2
To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):
# Header 1
Some text
Some more text
## Header 2
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.
Lists should be surrounded by blank lines Open
1. Simpler API that gets you started quickly.
- Read upRead up
- Exclude checks
MD032 - Lists should be surrounded by blank lines
Tags: bullet, ul, ol, blank_lines
Aliases: blanks-around-lists
This rule is triggered when lists (of any kind) are either not preceded or not followed by a blank line:
Some text
* Some
* List
1. Some
2. List
Some text
To fix this, ensure that all lists have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
* Some
* List
1. Some
2. List
Some text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse lists that don't have blank lines before and after them.
Note: List items without hanging indents are a violation of this rule; list items with hanging indents are okay:
* This is
not okay
* This is
okay
Line length Open
Understanding the properties of data as it moves through applications is essential to keeping your ML/AI pipeline stable and improving your user experience, whether your pipeline is built for production or experimentation. WhyLogs is an open source statistical logging library that allows data science and ML teams to effortlessly profile ML/AI pipelines and applications, producing log files that can be used for monitoring, alerts, analytics, and error analysis.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
* **Scalability:** WhyLogs scales with your system, from local development mode to live production systems in multi-node clusters, and works well with batch and streaming architectures.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Multiple top level headers in the same document Open
# WhyLogs Java Library
- Read upRead up
- Exclude checks
MD025 - Multiple top level headers in the same document
Tags: headers
Aliases: single-h1
Parameters: level (number; default 1)
This rule is triggered when a top level header is in use (the first line of the file is a h1 header), and more than one h1 header is in use in the document:
# Top level header
# Another top level header
To fix, structure your document so that there is a single h1 header that is the title for the document, and all later headers are h2 or lower level headers:
# Title
## Header
## Another header
Rationale: A top level header is a h1 on the first line of the file, and serves as the title for the document. If this convention is in use, then there can not be more than one title for the document, and the entire document should be contained within this header.
Note: The level
parameter can be used to change the top level (ex: to h2) in
cases where an h1 is added externally.
Multiple top level headers in the same document Open
# Key Features
- Read upRead up
- Exclude checks
MD025 - Multiple top level headers in the same document
Tags: headers
Aliases: single-h1
Parameters: level (number; default 1)
This rule is triggered when a top level header is in use (the first line of the file is a h1 header), and more than one h1 header is in use in the document:
# Top level header
# Another top level header
To fix, structure your document so that there is a single h1 header that is the title for the document, and all later headers are h2 or lower level headers:
# Title
## Header
## Another header
Rationale: A top level header is a h1 on the first line of the file, and serves as the title for the document. If this convention is in use, then there can not be more than one title for the document, and the entire document should be contained within this header.
Note: The level
parameter can be used to change the top level (ex: to h2) in
cases where an h1 is added externally.
Fenced code blocks should be surrounded by blank lines Open
```xml
- Read upRead up
- Exclude checks
MD031 - Fenced code blocks should be surrounded by blank lines
Tags: code, blank_lines
Aliases: blanks-around-fences
This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:
Some text
```
Code block
```
```
Another code block
```
Some more text
To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
```
Code block
```
```
Another code block
```
Some more text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.
Fenced code blocks should be surrounded by blank lines Open
```
- Read upRead up
- Exclude checks
MD031 - Fenced code blocks should be surrounded by blank lines
Tags: code, blank_lines
Aliases: blanks-around-fences
This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:
Some text
```
Code block
```
```
Another code block
```
Some more text
To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
```
Code block
```
```
Another code block
```
Some more text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.
Headers should be surrounded by blank lines Open
## Performance
- Read upRead up
- Exclude checks
MD022 - Headers should be surrounded by blank lines
Tags: headers, blank_lines
Aliases: blanks-around-headers
This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:
# Header 1
Some text
Some more text
## Header 2
To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):
# Header 1
Some text
Some more text
## Header 2
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.
Trailing punctuation in header Open
# Version upgrade to 1.0.0 coming soon!
- Read upRead up
- Exclude checks
MD026 - Trailing punctuation in header
Tags: headers
Aliases: no-trailing-punctuation
Parameters: punctuation (string; default ".,;:!?")
This rule is triggered on any header that has a punctuation character as the last character in the line:
# This is a header.
To fix this, remove any trailing punctuation:
# This is a header
Note: The punctuation parameter can be used to specify what characters class
as punctuation at the end of the header. For example, you can set it to
'.,;:!'
to allow headers with question marks in them, such as might be used
in an FAQ.
Line length Open
* **Observability:** In addition to supporting traditional monitoring approaches, WhyLogs data can support advanced ML-focused analytics, error analysis, and data quality and data drift detection.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
try (final OutputStream fos = Files.newOutputStream(Paths.get("profile.bin"))) {
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Headers should be surrounded by blank lines Open
# WhyLogs Java Library
- Read upRead up
- Exclude checks
MD022 - Headers should be surrounded by blank lines
Tags: headers, blank_lines
Aliases: blanks-around-headers
This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:
# Header 1
Some text
Some more text
## Header 2
To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):
# Header 1
Some text
Some more text
## Header 2
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.
Lists should be surrounded by blank lines Open
* Have the same name
- Read upRead up
- Exclude checks
MD032 - Lists should be surrounded by blank lines
Tags: bullet, ul, ol, blank_lines
Aliases: blanks-around-lists
This rule is triggered when lists (of any kind) are either not preceded or not followed by a blank line:
Some text
* Some
* List
1. Some
2. List
Some text
To fix this, ensure that all lists have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
* Some
* List
1. Some
2. List
Some text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse lists that don't have blank lines before and after them.
Note: List items without hanging indents are a violation of this rule; list items with hanging indents are okay:
* This is
not okay
* This is
okay
Line length Open
For now, follow this readme to get started in v0, and check back here for updates. If you'd like to watch the development unfold or contribute, check out the repo out [here](https://github.com/whylabs/whylogs/tree/mainline/java)
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
| NYC Tickets | 1.9GB | 10.8M | 43 | 14MB | 2.3MB |
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
.aggProfiles() // runs the aggregation. returns a dataframe of <timestamp, datasetProfile> entries
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Fenced code blocks should be surrounded by blank lines Open
```
- Read upRead up
- Exclude checks
MD031 - Fenced code blocks should be surrounded by blank lines
Tags: code, blank_lines
Aliases: blanks-around-fences
This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:
Some text
```
Code block
```
```
Another code block
```
Some more text
To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
```
Code block
```
```
Another code block
```
Some more text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.
Line length Open
* **Histograms** for numerical features. WhyLogs binary output can be queried to with dynamic binning based on the shape of your data.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
For examples in different languages, please checkout our [whylogs-examples](https://github.com/whylabs/whylogs-examples) repository.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
This is a Java implementation of WhyLogs v0 with support for Apache Spark integration for large scale datasets which can be seen [here](https://github.com/whylabs/whylogs/tree/maintenance/0.7.x/java). The Python implementation can be found [here](https://github.com/whylabs/whylogs-python).
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
**Dataset:** A collection of records. WhyLogs v0.0.2 supports structured datasets, which represent data as a table where each row is a different record and each column is a feature of the record.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Inline HTML Open
<version>0.1.0</version>
- Read upRead up
- Exclude checks
MD033 - Inline HTML
Tags: html
Aliases: no-inline-html
This rule is triggered whenever raw HTML is used in a markdown document:
Inline HTML header
To fix this, use 'pure' markdown instead of including raw HTML:
# Markdown header
Rationale: Raw HTML is allowed in markdown, but this rule is included for those who want their documents to only include "pure" markdown, or for those who are rendering markdown documents in something other than HTML.
Line length Open
* **Unified data instrumentation:** To enable data engineering pipelines and ML pipelines to share a common framework for tracking data quality and drifts, the WhyLogs library supports multiple languages and integrations.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
| Dataset | Size | No. of Entries | No. of Features | Est. Memory Consumption | Output Size (uncompressed) |
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
final DatasetProfile profile = new DatasetProfile("test-session", Instant.now(), tags);
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Fenced code blocks should be surrounded by blank lines Open
```
- Read upRead up
- Exclude checks
MD031 - Fenced code blocks should be surrounded by blank lines
Tags: code, blank_lines
Aliases: blanks-around-fences
This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:
Some text
```
Code block
```
```
Another code block
```
Some more text
To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
```
Code block
```
```
Another code block
```
Some more text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.
Inline HTML Open
<groupId>ai.whylabs</groupId>
- Read upRead up
- Exclude checks
MD033 - Inline HTML
Tags: html
Aliases: no-inline-html
This rule is triggered whenever raw HTML is used in a markdown document:
Inline HTML header
To fix this, use 'pure' markdown instead of including raw HTML:
# Markdown header
Rationale: Raw HTML is allowed in markdown, but this rule is included for those who want their documents to only include "pure" markdown, or for those who are rendering markdown documents in something other than HTML.
Line length Open
We're excited to announce that v1 of the API to the Java side is coming soon! You can look forward to:
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
We tested WhyLogs Java performance on the following datasets to validate WhyLogs memory footprint and the output binary.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
| Pain pills in the USA | 75GB | 178M | 42 | 15MB | 2MB |
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Headers should be surrounded by blank lines Open
## Serialization and deserialization
- Read upRead up
- Exclude checks
MD022 - Headers should be surrounded by blank lines
Tags: headers, blank_lines
Aliases: blanks-around-headers
This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:
# Header 1
Some text
Some more text
## Header 2
To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):
# Header 1
Some text
Some more text
## Header 2
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.
Line length Open
WhyLogs calculates approximate statistics for datasets of any size up to TB-scale, making it easy for users to identify changes in the statistical properties of a model's inputs or outputs. Using approximate statistics allows the package to run on minimal infrastructure and monitor an entire dataset, rather than miss outliers and other anomalies by only using a sample of the data to calculate statistics. These qualities make WhyLogs an excellent solution for profiling production ML/AI pipelines that operate on TB-scale data and with enterprise SLAs.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
**Feature:** In the context of WhyLogs v0.0.2 and structured data, a feature is a column in a dataset. A feature can be discrete (like gender or eye color) or continuous (like age or salary).
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
This example shows how we use WhyLogs to profile a dataset based on time and categorical information. The data is from the
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
val profiles = df.newProfilingSession("profilingSession") // start a new WhyLogs profiling job
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Fenced code blocks should be surrounded by blank lines Open
```xml
- Read upRead up
- Exclude checks
MD031 - Fenced code blocks should be surrounded by blank lines
Tags: code, blank_lines
Aliases: blanks-around-fences
This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:
Some text
```
Code block
```
```
Another code block
```
Some more text
To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
```
Code block
```
```
Another code block
```
Some more text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.
Lists should be surrounded by blank lines Open
* **Simple counters**: boolean, null values, data types.
- Read upRead up
- Exclude checks
MD032 - Lists should be surrounded by blank lines
Tags: bullet, ul, ol, blank_lines
Aliases: blanks-around-lists
This rule is triggered when lists (of any kind) are either not preceded or not followed by a blank line:
Some text
* Some
* List
1. Some
2. List
Some text
To fix this, ensure that all lists have a blank line both before and after (except where the block is at the beginning or end of the document):
Some text
* Some
* List
1. Some
2. List
Some text
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse lists that don't have blank lines before and after them.
Note: List items without hanging indents are a violation of this rule; list items with hanging indents are okay:
* This is
not okay
* This is
okay
Headers should be surrounded by blank lines Open
## Simple tracking
- Read upRead up
- Exclude checks
MD022 - Headers should be surrounded by blank lines
Tags: headers, blank_lines
Aliases: blanks-around-headers
This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:
# Header 1
Some text
Some more text
## Header 2
To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):
# Header 1
Some text
Some more text
## Header 2
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.
Line length Open
* **Unique value counter** or **cardinality**: tracks an approximate unique value of your feature using HyperLogLog algorithm.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
In enterprise systems, data is often partitioned across multiple machines for distributed processing. Online systems may also process data on multiple machines, requiring engineers to run ad-hoc analysis using an ETL-based system to build complex metrics, such as counting unique visitors to a website.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Headers should be surrounded by blank lines Open
## Statistical Profile
- Read upRead up
- Exclude checks
MD022 - Headers should be surrounded by blank lines
Tags: headers, blank_lines
Aliases: blanks-around-headers
This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:
# Header 1
Some text
Some more text
## Header 2
To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):
# Header 1
Some text
Some more text
## Header 2
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.
Line length Open
We ran our profile (in `cli` sub-module in this package) on each the dataset and collected JMX metrics.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Multiple top level headers in the same document Open
# Usage
- Read upRead up
- Exclude checks
MD025 - Multiple top level headers in the same document
Tags: headers
Aliases: single-h1
Parameters: level (number; default 1)
This rule is triggered when a top level header is in use (the first line of the file is a h1 header), and more than one h1 header is in use in the document:
# Top level header
# Another top level header
To fix, structure your document so that there is a single h1 header that is the title for the document, and all later headers are h2 or lower level headers:
# Title
## Header
## Another header
Rationale: A top level header is a h1 on the first line of the file, and serves as the title for the document. If this convention is in use, then there can not be more than one title for the document, and the entire document should be contained within this header.
Note: The level
parameter can be used to change the top level (ex: to h2) in
cases where an h1 is added externally.
Inline HTML Open
<dependency>
- Read upRead up
- Exclude checks
MD033 - Inline HTML
Tags: html
Aliases: no-inline-html
This rule is triggered whenever raw HTML is used in a markdown document:
Inline HTML header
To fix this, use 'pure' markdown instead of including raw HTML:
# Markdown header
Rationale: Raw HTML is allowed in markdown, but this rule is included for those who want their documents to only include "pure" markdown, or for those who are rendering markdown documents in something other than HTML.
Inline HTML Open
<artifactId>whylogs-core</artifactId>
- Read upRead up
- Exclude checks
MD033 - Inline HTML
Tags: html
Aliases: no-inline-html
This rule is triggered whenever raw HTML is used in a markdown document:
Inline HTML header
To fix this, use 'pure' markdown instead of including raw HTML:
# Markdown header
Rationale: Raw HTML is allowed in markdown, but this rule is included for those who want their documents to only include "pure" markdown, or for those who are rendering markdown documents in something other than HTML.
Line length Open
* **Lightweight:** WhyLogs produces small mergeable lightweight outputs in a variety of formats, using sketching algorithms and summarizing statistics.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Line length Open
**Statistical Profile:** A collection of statistical properties of a feature. Properties can be different for discrete and continuous features.
- Read upRead up
- Exclude checks
MD013 - Line length
Tags: line_length
Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)
This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.
This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.
You also have the option to exclude this rule for code blocks and tables. To
do this, set the code_blocks
and/or tables
parameters to false.
Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.
Headers should be surrounded by blank lines Open
# Version upgrade to 1.0.0 coming soon!
- Read upRead up
- Exclude checks
MD022 - Headers should be surrounded by blank lines
Tags: headers, blank_lines
Aliases: blanks-around-headers
This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:
# Header 1
Some text
Some more text
## Header 2
To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):
# Header 1
Some text
Some more text
## Header 2
Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.