whylabs/whylogs-python

View on GitHub
java/README.md

Summary

Maintainability
Test Coverage

Line length
Open

* **Top frequent items** (default is 128). Note that this configuration affects the memory footprint, especially for text features.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

| Lending Club          | 1.6GB | 2.2M           | 151             | 14MB                    | 7.4MB                  |
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

WhyLogs resolves this by allowing users to merge sketches from different machines. To merge two WhyLogs
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Fenced code blocks should be surrounded by blank lines
Open

```
Severity: Info
Found in java/README.md by markdownlint

MD031 - Fenced code blocks should be surrounded by blank lines

Tags: code, blank_lines

Aliases: blanks-around-fences

This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:

Some text
```
Code block
```

```
Another code block
```
Some more text

To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

```
Code block
```

```
Another code block
```

Some more text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.

Fenced code blocks should be surrounded by blank lines
Open

```
Severity: Info
Found in java/README.md by markdownlint

MD031 - Fenced code blocks should be surrounded by blank lines

Tags: code, blank_lines

Aliases: blanks-around-fences

This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:

Some text
```
Code block
```

```
Another code block
```
Some more text

To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

```
Code block
```

```
Another code block
```

Some more text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.

Line length
Open

* **Data Insight:** WhyLogs provides complex statistics across different stages of your ML/AI pipelines and applications.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

**Pipeline:** One or more datasets used to build a single model or application. A project may contain multiple pipelines.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

**WhyLogs Output:** WhyLogs returns profile summary files for a dataset in JSON format. For convenience, these files are provided in flat table, histogram, and frequency formats.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

WhyLogs collects approximate statistics and sketches of data on a column-basis into a statistical profile. These metrics include:
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

WhyLogs uses Protobuf as the backing storage format. To write the data to disk, use the standard Protobuf
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

                 .groupBy("City", "Priority") // tag and group the data with categorical information
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

For further analysis, dataframes can be stored in a Parquet file, or collected to the driver if the number of entries is small enough.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Headers should be surrounded by blank lines
Open

## Glossary/Concepts
Severity: Info
Found in java/README.md by markdownlint

MD022 - Headers should be surrounded by blank lines

Tags: headers, blank_lines

Aliases: blanks-around-headers

This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:

# Header 1
Some text

Some more text
## Header 2

To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):

# Header 1

Some text

Some more text

## Header 2

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.

Lists should be surrounded by blank lines
Open

1. Simpler API that gets you started quickly.
Severity: Info
Found in java/README.md by markdownlint

MD032 - Lists should be surrounded by blank lines

Tags: bullet, ul, ol, blank_lines

Aliases: blanks-around-lists

This rule is triggered when lists (of any kind) are either not preceded or not followed by a blank line:

Some text
* Some
* List

1. Some
2. List
Some text

To fix this, ensure that all lists have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

* Some
* List

1. Some
2. List

Some text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse lists that don't have blank lines before and after them.

Note: List items without hanging indents are a violation of this rule; list items with hanging indents are okay:

* This is
not okay

* This is
  okay

Line length
Open

Understanding the properties of data as it moves through applications is essential to keeping your ML/AI pipeline stable and improving your user experience, whether your pipeline is built for production or experimentation. WhyLogs is an open source statistical logging library that allows data science and ML teams to effortlessly profile ML/AI pipelines and applications, producing log files that can be used for monitoring, alerts, analytics, and error analysis.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

* **Scalability:** WhyLogs scales with your system, from local development mode to live production systems in multi-node clusters, and works well with batch and streaming architectures.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Multiple top level headers in the same document
Open

# WhyLogs Java Library
Severity: Info
Found in java/README.md by markdownlint

MD025 - Multiple top level headers in the same document

Tags: headers

Aliases: single-h1

Parameters: level (number; default 1)

This rule is triggered when a top level header is in use (the first line of the file is a h1 header), and more than one h1 header is in use in the document:

# Top level header

# Another top level header

To fix, structure your document so that there is a single h1 header that is the title for the document, and all later headers are h2 or lower level headers:

# Title

## Header

## Another header

Rationale: A top level header is a h1 on the first line of the file, and serves as the title for the document. If this convention is in use, then there can not be more than one title for the document, and the entire document should be contained within this header.

Note: The level parameter can be used to change the top level (ex: to h2) in cases where an h1 is added externally.

Multiple top level headers in the same document
Open

# Key Features
Severity: Info
Found in java/README.md by markdownlint

MD025 - Multiple top level headers in the same document

Tags: headers

Aliases: single-h1

Parameters: level (number; default 1)

This rule is triggered when a top level header is in use (the first line of the file is a h1 header), and more than one h1 header is in use in the document:

# Top level header

# Another top level header

To fix, structure your document so that there is a single h1 header that is the title for the document, and all later headers are h2 or lower level headers:

# Title

## Header

## Another header

Rationale: A top level header is a h1 on the first line of the file, and serves as the title for the document. If this convention is in use, then there can not be more than one title for the document, and the entire document should be contained within this header.

Note: The level parameter can be used to change the top level (ex: to h2) in cases where an h1 is added externally.

Fenced code blocks should be surrounded by blank lines
Open

```xml
Severity: Info
Found in java/README.md by markdownlint

MD031 - Fenced code blocks should be surrounded by blank lines

Tags: code, blank_lines

Aliases: blanks-around-fences

This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:

Some text
```
Code block
```

```
Another code block
```
Some more text

To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

```
Code block
```

```
Another code block
```

Some more text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.

Fenced code blocks should be surrounded by blank lines
Open

```
Severity: Info
Found in java/README.md by markdownlint

MD031 - Fenced code blocks should be surrounded by blank lines

Tags: code, blank_lines

Aliases: blanks-around-fences

This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:

Some text
```
Code block
```

```
Another code block
```
Some more text

To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

```
Code block
```

```
Another code block
```

Some more text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.

Headers should be surrounded by blank lines
Open

## Performance
Severity: Info
Found in java/README.md by markdownlint

MD022 - Headers should be surrounded by blank lines

Tags: headers, blank_lines

Aliases: blanks-around-headers

This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:

# Header 1
Some text

Some more text
## Header 2

To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):

# Header 1

Some text

Some more text

## Header 2

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.

Trailing punctuation in header
Open

# Version upgrade to 1.0.0 coming soon!
Severity: Info
Found in java/README.md by markdownlint

MD026 - Trailing punctuation in header

Tags: headers

Aliases: no-trailing-punctuation

Parameters: punctuation (string; default ".,;:!?")

This rule is triggered on any header that has a punctuation character as the last character in the line:

# This is a header.

To fix this, remove any trailing punctuation:

# This is a header

Note: The punctuation parameter can be used to specify what characters class as punctuation at the end of the header. For example, you can set it to '.,;:!' to allow headers with question marks in them, such as might be used in an FAQ.

Line length
Open

* **Observability:** In addition to supporting traditional monitoring approaches, WhyLogs data can support advanced ML-focused analytics, error analysis, and data quality and data drift detection.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

        try (final OutputStream fos = Files.newOutputStream(Paths.get("profile.bin"))) {
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Headers should be surrounded by blank lines
Open

# WhyLogs Java Library
Severity: Info
Found in java/README.md by markdownlint

MD022 - Headers should be surrounded by blank lines

Tags: headers, blank_lines

Aliases: blanks-around-headers

This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:

# Header 1
Some text

Some more text
## Header 2

To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):

# Header 1

Some text

Some more text

## Header 2

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.

Lists should be surrounded by blank lines
Open

* Have the same name
Severity: Info
Found in java/README.md by markdownlint

MD032 - Lists should be surrounded by blank lines

Tags: bullet, ul, ol, blank_lines

Aliases: blanks-around-lists

This rule is triggered when lists (of any kind) are either not preceded or not followed by a blank line:

Some text
* Some
* List

1. Some
2. List
Some text

To fix this, ensure that all lists have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

* Some
* List

1. Some
2. List

Some text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse lists that don't have blank lines before and after them.

Note: List items without hanging indents are a violation of this rule; list items with hanging indents are okay:

* This is
not okay

* This is
  okay

Line length
Open

For now, follow this readme to get started in v0, and check back here for updates. If you'd like to watch the development unfold or contribute, check out the repo out [here](https://github.com/whylabs/whylogs/tree/mainline/java)
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

| NYC Tickets           | 1.9GB | 10.8M          | 43              | 14MB                    | 2.3MB                 |
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

                 .aggProfiles() //  runs the aggregation. returns a dataframe of <timestamp, datasetProfile> entries
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Fenced code blocks should be surrounded by blank lines
Open

```
Severity: Info
Found in java/README.md by markdownlint

MD031 - Fenced code blocks should be surrounded by blank lines

Tags: code, blank_lines

Aliases: blanks-around-fences

This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:

Some text
```
Code block
```

```
Another code block
```
Some more text

To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

```
Code block
```

```
Another code block
```

Some more text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.

Line length
Open

* **Histograms** for numerical features. WhyLogs binary output can be queried to with dynamic binning based on the shape of your data.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

For examples in different languages, please checkout our [whylogs-examples](https://github.com/whylabs/whylogs-examples) repository.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

This is a Java implementation of WhyLogs v0 with support for Apache Spark integration for large scale datasets which can be seen [here](https://github.com/whylabs/whylogs/tree/maintenance/0.7.x/java). The Python implementation can be found [here](https://github.com/whylabs/whylogs-python).
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

**Dataset:** A collection of records. WhyLogs v0.0.2 supports structured datasets, which represent data as a table where each row is a different record and each column is a feature of the record.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Inline HTML
Open

  <version>0.1.0</version>
Severity: Info
Found in java/README.md by markdownlint

MD033 - Inline HTML

Tags: html

Aliases: no-inline-html

This rule is triggered whenever raw HTML is used in a markdown document:

Inline HTML header

To fix this, use 'pure' markdown instead of including raw HTML:

# Markdown header

Rationale: Raw HTML is allowed in markdown, but this rule is included for those who want their documents to only include "pure" markdown, or for those who are rendering markdown documents in something other than HTML.

Line length
Open

* **Unified data instrumentation:** To enable data engineering pipelines and ML pipelines to share a common framework for tracking data quality and drifts, the WhyLogs library supports multiple languages and integrations.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

|        Dataset        | Size  | No. of Entries | No. of Features | Est. Memory Consumption | Output Size (uncompressed) |
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

        final DatasetProfile profile = new DatasetProfile("test-session", Instant.now(), tags);
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Fenced code blocks should be surrounded by blank lines
Open

```
Severity: Info
Found in java/README.md by markdownlint

MD031 - Fenced code blocks should be surrounded by blank lines

Tags: code, blank_lines

Aliases: blanks-around-fences

This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:

Some text
```
Code block
```

```
Another code block
```
Some more text

To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

```
Code block
```

```
Another code block
```

Some more text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.

Inline HTML
Open

  <groupId>ai.whylabs</groupId>
Severity: Info
Found in java/README.md by markdownlint

MD033 - Inline HTML

Tags: html

Aliases: no-inline-html

This rule is triggered whenever raw HTML is used in a markdown document:

Inline HTML header

To fix this, use 'pure' markdown instead of including raw HTML:

# Markdown header

Rationale: Raw HTML is allowed in markdown, but this rule is included for those who want their documents to only include "pure" markdown, or for those who are rendering markdown documents in something other than HTML.

Line length
Open

We're excited to announce that v1 of the API to the Java side is coming soon! You can look forward to:
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

We tested WhyLogs Java performance on the following datasets to validate WhyLogs memory footprint and the output binary.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

| Pain pills in the USA | 75GB  | 178M           | 42              | 15MB                    | 2MB                   |
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Headers should be surrounded by blank lines
Open

## Serialization and deserialization
Severity: Info
Found in java/README.md by markdownlint

MD022 - Headers should be surrounded by blank lines

Tags: headers, blank_lines

Aliases: blanks-around-headers

This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:

# Header 1
Some text

Some more text
## Header 2

To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):

# Header 1

Some text

Some more text

## Header 2

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.

Line length
Open

WhyLogs calculates approximate statistics for datasets of any size up to TB-scale, making it easy for users to identify changes in the statistical properties of a model's inputs or outputs. Using approximate statistics allows the package to run on minimal infrastructure and monitor an entire dataset, rather than miss outliers and other anomalies by only using a sample of the data to calculate statistics. These qualities make WhyLogs an excellent solution for profiling production ML/AI pipelines that operate on TB-scale data and with enterprise SLAs.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

**Feature:** In the context of WhyLogs v0.0.2 and structured data, a feature is a column in a dataset. A feature can be discrete (like gender or eye color) or continuous (like age or salary).
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

This example shows how we use WhyLogs to profile a dataset based on time and categorical information. The data is from the
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

val profiles = df.newProfilingSession("profilingSession") // start a new WhyLogs profiling job
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Fenced code blocks should be surrounded by blank lines
Open

```xml
Severity: Info
Found in java/README.md by markdownlint

MD031 - Fenced code blocks should be surrounded by blank lines

Tags: code, blank_lines

Aliases: blanks-around-fences

This rule is triggered when fenced code blocks are either not preceded or not followed by a blank line:

Some text
```
Code block
```

```
Another code block
```
Some more text

To fix this, ensure that all fenced code blocks have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

```
Code block
```

```
Another code block
```

Some more text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse fenced code blocks that don't have blank lines before and after them.

Lists should be surrounded by blank lines
Open

* **Simple counters**: boolean, null values, data types.
Severity: Info
Found in java/README.md by markdownlint

MD032 - Lists should be surrounded by blank lines

Tags: bullet, ul, ol, blank_lines

Aliases: blanks-around-lists

This rule is triggered when lists (of any kind) are either not preceded or not followed by a blank line:

Some text
* Some
* List

1. Some
2. List
Some text

To fix this, ensure that all lists have a blank line both before and after (except where the block is at the beginning or end of the document):

Some text

* Some
* List

1. Some
2. List

Some text

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse lists that don't have blank lines before and after them.

Note: List items without hanging indents are a violation of this rule; list items with hanging indents are okay:

* This is
not okay

* This is
  okay

Headers should be surrounded by blank lines
Open

## Simple tracking
Severity: Info
Found in java/README.md by markdownlint

MD022 - Headers should be surrounded by blank lines

Tags: headers, blank_lines

Aliases: blanks-around-headers

This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:

# Header 1
Some text

Some more text
## Header 2

To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):

# Header 1

Some text

Some more text

## Header 2

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.

Line length
Open

* **Unique value counter** or **cardinality**: tracks an approximate unique value of your feature using HyperLogLog algorithm.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

In enterprise systems, data is often partitioned across multiple machines for distributed processing. Online systems may also process data on multiple machines, requiring engineers to run ad-hoc analysis using an ETL-based system to build complex metrics, such as counting unique visitors to a website.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Headers should be surrounded by blank lines
Open

## Statistical Profile
Severity: Info
Found in java/README.md by markdownlint

MD022 - Headers should be surrounded by blank lines

Tags: headers, blank_lines

Aliases: blanks-around-headers

This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:

# Header 1
Some text

Some more text
## Header 2

To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):

# Header 1

Some text

Some more text

## Header 2

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.

Line length
Open

We ran our profile (in `cli` sub-module in this package) on each the dataset and collected JMX metrics.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Multiple top level headers in the same document
Open

# Usage
Severity: Info
Found in java/README.md by markdownlint

MD025 - Multiple top level headers in the same document

Tags: headers

Aliases: single-h1

Parameters: level (number; default 1)

This rule is triggered when a top level header is in use (the first line of the file is a h1 header), and more than one h1 header is in use in the document:

# Top level header

# Another top level header

To fix, structure your document so that there is a single h1 header that is the title for the document, and all later headers are h2 or lower level headers:

# Title

## Header

## Another header

Rationale: A top level header is a h1 on the first line of the file, and serves as the title for the document. If this convention is in use, then there can not be more than one title for the document, and the entire document should be contained within this header.

Note: The level parameter can be used to change the top level (ex: to h2) in cases where an h1 is added externally.

Inline HTML
Open

<dependency>
Severity: Info
Found in java/README.md by markdownlint

MD033 - Inline HTML

Tags: html

Aliases: no-inline-html

This rule is triggered whenever raw HTML is used in a markdown document:

Inline HTML header

To fix this, use 'pure' markdown instead of including raw HTML:

# Markdown header

Rationale: Raw HTML is allowed in markdown, but this rule is included for those who want their documents to only include "pure" markdown, or for those who are rendering markdown documents in something other than HTML.

Inline HTML
Open

  <artifactId>whylogs-core</artifactId>
Severity: Info
Found in java/README.md by markdownlint

MD033 - Inline HTML

Tags: html

Aliases: no-inline-html

This rule is triggered whenever raw HTML is used in a markdown document:

Inline HTML header

To fix this, use 'pure' markdown instead of including raw HTML:

# Markdown header

Rationale: Raw HTML is allowed in markdown, but this rule is included for those who want their documents to only include "pure" markdown, or for those who are rendering markdown documents in something other than HTML.

Line length
Open

* **Lightweight:** WhyLogs produces small mergeable lightweight outputs in a variety of formats, using sketching algorithms and summarizing statistics.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Line length
Open

**Statistical Profile:** A collection of statistical properties of a feature. Properties can be different for discrete and continuous features.
Severity: Info
Found in java/README.md by markdownlint

MD013 - Line length

Tags: line_length

Aliases: line-length Parameters: linelength, codeblocks, tables (number; default 80, boolean; default true)

This rule is triggered when there are lines that are longer than the configured line length (default: 80 characters). To fix this, split the line up into multiple lines.

This rule has an exception where there is no whitespace beyond the configured line length. This allows you to still include items such as long URLs without being forced to break them in the middle.

You also have the option to exclude this rule for code blocks and tables. To do this, set the code_blocks and/or tables parameters to false.

Code blocks are included in this rule by default since it is often a requirement for document readability, and tentatively compatible with code rules. Still, some languages do not lend themselves to short lines.

Headers should be surrounded by blank lines
Open

# Version upgrade to 1.0.0 coming soon!
Severity: Info
Found in java/README.md by markdownlint

MD022 - Headers should be surrounded by blank lines

Tags: headers, blank_lines

Aliases: blanks-around-headers

This rule is triggered when headers (any style) are either not preceded or not followed by a blank line:

# Header 1
Some text

Some more text
## Header 2

To fix this, ensure that all headers have a blank line both before and after (except where the header is at the beginning or end of the document):

# Header 1

Some text

Some more text

## Header 2

Rationale: Aside from aesthetic reasons, some parsers, including kramdown, will not parse headers that don't have a blank line before, and will parse them as regular text.

There are no issues that match your filters.

Category
Status