# Examples

Welcome to our examples!
If you want to get your hands dirty, check out the [Getting Started](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/basic/Getting_Started.ipynb) Notebook.

## ๐Ÿง‘๐Ÿผโ€๐Ÿซ Basic examples

The table below lists use cases that will help you get started and show what whylogs can do to make your data and ML pipelines more reliable and sustainable.

| Example                                                                                                                                                 | Description                                                                                                |
| ------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- |
| [Visualizing Profiles](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/basic/Notebook_Profile_Visualizer.ipynb)               | Compare profiles to detect distribution shifts, visualize histograms and bar charts and explore your data. |
| [Logging Data](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/basic/Logging_Different_Data.ipynb)                            | See the different ways you can log your data with whylogs.                                                 |
| [Inspecting Profiles](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/basic/Inspecting_Profiles.ipynb)                        | A deeper dive on the metrics generated by whylogs.                                                         |
| [Schema Configuration for Tracking Metrics](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/basic/Schema_Configuration.ipynb) | Configure tracking metrics according to data type or column features.                                      |
| [Constraints Suite](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/basic/Constraints_Suite.ipynb)                            | A collection of simple out-of-the-box constraints for the most common use-cases.                           |
| [Merging Profiles](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/basic/Merging_Profiles.ipynb)                              | Merge your profiles logged across different computing instances, time periods or data segments.            |

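To give a feel for why profiles logged on different instances or time periods can be merged, here is a conceptual sketch in plain Python. It does not use the whylogs API: `ColumnSummary`, `track`, `merge` and `summarize` are hypothetical names invented for illustration, and real whylogs profiles track far richer metrics. See the Merging Profiles notebook for the actual workflow.

```python
import math
from dataclasses import dataclass


@dataclass
class ColumnSummary:
    """Hypothetical mergeable summary of one numeric column (not a whylogs class)."""

    count: int = 0
    total: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    def track(self, value: float) -> None:
        # Update the running statistics with a single observation.
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "ColumnSummary") -> "ColumnSummary":
        # Each statistic combines independently, so no raw data is needed.
        return ColumnSummary(
            count=self.count + other.count,
            total=self.total + other.total,
            minimum=min(self.minimum, other.minimum),
            maximum=max(self.maximum, other.maximum),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else math.nan


def summarize(values) -> ColumnSummary:
    """Build a summary from an iterable of numbers."""
    summary = ColumnSummary()
    for value in values:
        summary.track(value)
    return summary


# Two batches, e.g. logged on different machines or in different time windows.
batch_a = summarize([1.0, 2.0, 3.0])
batch_b = summarize([10.0, 20.0])

# Merging the two summaries matches summarizing all the data at once.
merged = batch_a.merge(batch_b)
whole = summarize([1.0, 2.0, 3.0, 10.0, 20.0])
```

In this sketch `merged` and `whole` hold the same statistics, which is the property that makes summaries combinable without access to the original rows.
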
## 🌉 Whylogs Integrations

Welcome! In this section you will find examples of how to integrate `whylogs` with different tools and platforms.

### Data Pipelines

| Integration                                                                                                               | Description                                                                  |
| ------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- |
| [Apache Spark](https://github.com/whylabs/whylogs/blob/mainline/python/examples/integrations/Pyspark_Profiling.ipynb)     | Profile data in an Apache Spark environment                                  |
| [BigQuery](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/BigQuery_Example.ipynb) | Profile data queried from a Google BigQuery table                            |
| [Dask](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/Dask_Profiling.ipynb)       | Profile data in parallel with Dask                                           |
| [Databricks](https://docs.whylabs.ai/docs/integrations-databricks)                                                        | Learn how to configure and run whylogs on a Databricks cluster               |
| [Fugue](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/Fugue_Profiling.ipynb)     | Use Fugue to unify parallel whylogs profiling tasks                          |
| [Kafka](https://github.com/whylabs/whylogs/tree/mainline/python/examples/integrations/kafka-example)                      | Learn how to consume and profile streaming data from an existing Kafka topic |
| [Ray](https://docs.whylabs.ai/docs/ray-integration)                                                                       | Profile Big Data in parallel with the Ray integration                        |

### Storage

| Integration                                                                                                                  | Description                                                        |
| ---------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ |
| [S3](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/writers/Writing_Profiles.ipynb)  | See how to write your whylogs profiles to AWS S3 object storage    |
| [GCS](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/writers/Writing_Profiles.ipynb) | See how to write your whylogs profiles to Google Cloud Storage     |

### Model lifecycle and deployment

| Integration                                                                                                                              | Description                                                                              |
| ---------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- |
| [Apache Airflow](https://github.com/whylabs/airflow-provider-whylogs)                                                                    | Use Airflow Operators to create drift reports and run constraint validations on your data |
| [BentoML](https://github.com/whylabs/whylogs/blob/mainline/python/examples/integrations/bentoml)                                         | Learn how to monitor ML models managed and served with BentoML                           |
| [FastAPI](https://github.com/whylabs/whylogs/blob/mainline/python/examples/integrations/fastapi)                                         | Learn how to monitor ML models served with FastAPI                                       |
| [Feast](https://github.com/whylabs/whylogs/blob/mainline/python/examples/integrations/Feature_Stores_and_whylogs.ipynb)                  | Learn how to log features from your Feature Store with Feast and whylogs                 |
| [Flask](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/flask_streaming/flask_with_whylogs.ipynb) | See how you can create a Flask app with this whylogs + WhyLabs integration               |
| [Flyte](https://docs.flyte.org/projects/cookbook/en/stable/auto/integrations/flytekit_plugins/whylogs_examples/index.html)               | Learn how to use whylogs' DatasetProfileView type natively on your Flyte workflows       |
| [GitHub Actions](https://docs.whylabs.ai/docs/integration-github-actions)                                                                | Monitor your ML datasets as part of your GitOps CI/CD pipeline                           |
| [MLflow](https://github.com/whylabs/whylogs/blob/mainline/python/examples/integrations/Mlflow_Logging.ipynb)                             | Log your whylogs profiles to an MLflow experiment                                        |
| [ZenML](https://blog.zenml.io/zero-six-zero-release/)                                                                                    | Combine different MLOps tools together with ZenML and whylogs!                           |

### WhyLabs

You can monitor your profiles continuously with the WhyLabs Observability Platform and get a single view of your different projects, data and ML models. To learn more about how to combine whylogs with WhyLabs and send over different profiles, refer to the following integration examples:

| Integration                                                                                                                                                                          | Description                                                                       |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------- |
| [Writing profiles](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/writers/Writing_to_WhyLabs.ipynb)                                          | Send profiles to your WhyLabs Dashboard                                           |
| [Reference Profile](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/writers/Writing_Reference_Profiles_to_WhyLabs.ipynb)                      | Send profiles as Reference (Static) Profiles to WhyLabs                           |
| [Regression Metrics](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/writers/Writing_Regression_Performance_Metrics_to_WhyLabs.ipynb)         | Monitor Regression Model Performance Metrics with whylogs and WhyLabs             |
| [Classification Metrics](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/writers/Writing_Classification_Performance_Metrics_to_WhyLabs.ipynb) | Monitor Classification Model Performance Metrics with whylogs and WhyLabs         |
| [Ranking Metrics](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/experimental/Writing_Ranking_Performance_Metrics_to_WhyLabs.ipynb)                       | Monitor Ranking Model Performance Metrics with whylogs and WhyLabs (experimental) |
| [Writing Feature Weights](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/integrations/writers/Writing_Feature_Weights_to_WhyLabs.ipynb)                   | Send Feature Weights / Feature Importance information to your WhyLabs Dashboard   |

### Others

| Integration                                                                       | Description                                                                                   |
| --------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
| [whylogs Container](https://docs.whylabs.ai/docs/integrations-whylogs-container/) | A low code solution to profile your data with a Docker container deployed to your environment |
| [Java](https://docs.whylabs.ai/docs/java-integration/)                            | Profile data with whylogs in Java                                                             |

## ๐Ÿง‘๐Ÿผโ€๐Ÿ”ฌ Advanced examples

Here you will find more advanced use cases for `whylogs` and learn how to make the most of the profiles you create. Pick any example in the table below to get started.

| Example                                                                                                                                                                                       | Description                                                                               |
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| [Streaming Data with Log Rotation](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Log_Rotation_for_Streaming_Data/Streaming_Data_with_Log_Rotation.ipynb) | Generate profiles automatically at fixed intervals with rolling loggers                   |
| [Condition Count Metrics](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Condition_Count_Metrics.ipynb)                                                   | Create simple counter metrics with user-defined conditions                                |
| [Condition Validators](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Condition_Validators.ipynb)                                                         | Real-time Data Validation with Condition Validators.                                      |
| [Data Constraints](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Metric_Constraints.ipynb)                                                               | Set constraints to your data to ensure its quality.                                       |
| [Custom Metrics](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Custom_Metrics.ipynb)                                                                     | Create your own metrics and metric components                                             |
| [String Tracking](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/String_Tracking.ipynb)                                                                   | Track unicode ranges and character length distribution metrics for your textual features. |
| [Image Logging](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Image_Logging.ipynb)                                                                       | Log image properties and EXIF tags into profiles and send them to WhyLabs                 |
| [Segments](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Segments.ipynb)                                                                                 | Segment your data to improve visibility to the sub-group level                            |
| [Metric Constraints with Condition Count Metrics](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Metric_Constraints_with_Condition_Count_Metrics.ipynb)   | Build Metric Constraints on top of Condition Count Metrics                                |
| [Drift Algorithm Configuration](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Drift_Algorithm_Configuration.ipynb)                                       | Choose different drift algorithms and internal parameters for drift detection             |
| [Converting profiles from v0 to v1](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/advanced/converting_v0_to_v1.ipynb)                                             | Convert whylogs v0 profiles to v1 profiles                                                |

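As a rough illustration of the idea behind counter metrics with user-defined conditions, the sketch below counts how many values satisfy each named predicate. It is plain Python rather than the whylogs API: `count_conditions` and the condition names are hypothetical, chosen only for this example. The Condition Count Metrics notebook shows how this is done with whylogs itself.

```python
from typing import Callable, Dict, Iterable


def count_conditions(
    values: Iterable[str],
    conditions: Dict[str, Callable[[str], bool]],
) -> Dict[str, object]:
    """Count how many values satisfy each named, user-defined condition."""
    matches = {name: 0 for name in conditions}
    total = 0
    for value in values:
        total += 1
        for name, predicate in conditions.items():
            if predicate(value):
                matches[name] += 1
    return {"total": total, "matches": matches}


# Hypothetical conditions for a text column.
conditions = {
    "looks_like_email": lambda s: "@" in s and "." in s.split("@")[-1],
    "is_blank": lambda s: s.strip() == "",
}

result = count_conditions(
    ["a@example.com", "", "not an email", "b@example.org"],
    conditions,
)
```

Keeping only the total and the per-condition match counts means the result stays small no matter how many rows are inspected.
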
## 🧪 Experimental

Here you will find examples of features that are still at an experimental stage. Expect changes to the API and the functionality of these features.

| Example                                                                                                                                                                                                | Description                                                                         |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------- |
| [Performance Estimation - Estimating Accuracy for Binary Classification Problems](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/experimental/performance_estimation.ipynb) | Estimate accuracy for unlabeled target datasets for binary classification problems  |
| [Extracting and Monitoring Audio Samples](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/experimental/whylogs_Audio_examples.ipynb)                                         | Extract features from audio samples for the purpose of monitoring for drift/quality |
| [NLP Summarization](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/experimental/NLP_Summarization.ipynb)                                                                    | Monitor a document summarization task with whylogs                                  |
| [Embeddings Distance Logging](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/experimental/embeddings/Embeddings_Distance_Logging.ipynb)                                     | Profile embedding values by comparing them to reference data points                 |
| [Condition Validator UDFs](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/experimental/Condition_Validator_UDF.ipynb)                                                       | Easily create condition validators based on user-defined functions                  |

## 📓 Benchmarks

Here you will find experiments that benchmark different aspects of the whylogs package, such as computational performance and different statistical algorithms.

| Example                                                                                                                                                                                | Description                                                                                                                               |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
| [Understanding Kolmogorov-Smirnov (KS) Tests for Data Drift on Profiled Data](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/benchmarks/KS_Profiling.ipynb) | Experiments comparing the whylogs implementation of the Kolmogorov-Smirnov test on profiled data with the traditional implementation on complete data |

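For readers unfamiliar with the test, the following is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic computed on complete data, that is, the largest gap between the two empirical CDFs. The function name `ks_statistic` is an assumption made for this example, and the code says nothing about how whylogs computes the statistic on profiled data; the notebook above covers that comparison.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Two-sample KS statistic: the maximum distance between empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")

    a = sorted(sample_a)
    b = sorted(sample_b)

    max_gap = 0.0
    # The empirical CDFs only change at observed points, so checking
    # every observed value is enough to find the largest gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


identical = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

A value near 0 means the two samples have similar distributions, while a value near 1 means they barely overlap.
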
## ๐Ÿซ Tutorials

Here you will find tutorials that span two or more concepts discussed in the previous sections. They are meant to be more in-depth, and possibly domain-specific, explanations of those concepts.

| Example                                                                                                                                                                                       | Description                                                                                                          |
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
| [Data Validation for Spark Dataframes with whylogs](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/tutorials/Pyspark_and_Constraints.ipynb)                        | Profile a Spark Dataframe and Perform Data Validation with Condition Count Metrics and Metric Constraints            |
| [Monitoring Embeddings for Text Data](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/tutorials/Monitoring_Embeddings.ipynb)                                        | Monitor Embeddings, Tokens and Performance of your text classifier application                                       |
| [Data Validation at Scale - Detecting and Responding to Data Misbehavior](https://nbviewer.org/github/whylabs/whylogs/blob/mainline/python/examples/tutorials/Data_Validation_Tutorial.ipynb) | Log, validate, and debug failed conditions with Metric Constraints, Condition Count Metrics and Condition Validators |

## Get in touch

If you want to get more involved with whylogs and interact with other practitioners, make sure to [join our community Slack](http://join.slack.whylabs.ai/).