python/examples/integrations/Mlflow_Logging.ipynb from whylabs/whylogs-python

python/examples/integrations/Mlflow_Logging.ipynb
Summary

Maintainability

Test Coverage

Issues
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">### 🚩 *Create a free WhyLabs account to get more value out of whylogs!*<br> \n",
    ">*Did you know you can store, visualize, and monitor whylogs profiles with the [WhyLabs Observability Platform](https://whylabs.ai/whylogs-free-signup?utm_source=whylogs-Github&utm_medium=whylogs-example&utm_campaign=Mlflow_Logging)? Sign up for a [free WhyLabs account](https://whylabs.ai/whylogs-free-signup?utm_source=whylogs-Github&utm_medium=whylogs-example&utm_campaign=Mlflow_Logging) to leverage the power of whylogs and WhyLabs together!*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "# MLflow Logging\n",
    "\n",
    "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/whylabs/whylogs/blob/mainline/python/examples/integrations/Mlflow_Logging.ipynb)\n",
    "\n",
    "[MLflow](https://www.mlflow.org/) is an open-source model platform that can track, manage and help users deploy their models to production with a very consistent API and good software engineering practices. Whylogs users can benefit from our API to seamlessly log profiles to their Mlflow environment. Let's see how."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Setup\n",
    "\n",
    "For this tutorial we will simplify the approach by using MLflow's local client. One of MLflow's advantages is that it uses the exact same API to work both locally and in the cloud. So with a minor setup, the code shown here can be easily extended if you're working with MLflow in Kubernetes or in Databricks, for example. In order to get started, make sure you have both `mlflow` and `whylogs` installed in your environment by uncommenting the following cells:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "# Note: you may need to restart the kernel to use updated packages.\n",
    "%pip install 'whylogs[mlflow]'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "We are also installing `pandas`, `scikit-learn` and `matplotlib` in order to have a very simple training example and show you how you can start profiling your training data with `whylogs`. So, if you still haven't, also run the following cell:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "%pip install -q scikit-learn matplotlib pandas mlflow-skinny"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Get the data\n",
    "\n",
    "Now let us get an example dataset from the `scikit-learn` library and create a function that returns an aggregated dataframe with it. We will use this same function later on!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from sklearn.datasets import load_iris\n",
    "\n",
    "def get_data() -> pd.DataFrame:\n",
    "    iris_data = load_iris()\n",
    "    dataframe = pd.DataFrame(iris_data.data, columns=iris_data.feature_names)\n",
    "    dataframe[\"target\"] = pd.DataFrame(iris_data.target)\n",
    "    return dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = get_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>sepal length (cm)</th>\n",
       "      <th>sepal width (cm)</th>\n",
       "      <th>petal length (cm)</th>\n",
       "      <th>petal width (cm)</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5.1</td>\n",
       "      <td>3.5</td>\n",
       "      <td>1.4</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>4.9</td>\n",
       "      <td>3.0</td>\n",
       "      <td>1.4</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>4.7</td>\n",
       "      <td>3.2</td>\n",
       "      <td>1.3</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4.6</td>\n",
       "      <td>3.1</td>\n",
       "      <td>1.5</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5.0</td>\n",
       "      <td>3.6</td>\n",
       "      <td>1.4</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  \\\n",
       "0                5.1               3.5                1.4               0.2   \n",
       "1                4.9               3.0                1.4               0.2   \n",
       "2                4.7               3.2                1.3               0.2   \n",
       "3                4.6               3.1                1.5               0.2   \n",
       "4                5.0               3.6                1.4               0.2   \n",
       "\n",
       "   target  \n",
       "0       0  \n",
       "1       0  \n",
       "2       0  \n",
       "3       0  \n",
       "4       0  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train a model\n",
    "\n",
    "Let's define the simplest model to be trained with `scikit-learn`. We aren't interested in model performance nor deep ML concepts, but only in having some baseline model being trained and having the overall idea of how to use `whylogs` with your existing training pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.tree import DecisionTreeClassifier\n",
    "\n",
    "\n",
    "def train(dataframe: pd.DataFrame) -> None:\n",
    "    model = DecisionTreeClassifier(max_depth=2)\n",
    "    model.fit(dataframe.drop(\"target\", axis=1), y=dataframe[\"target\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We could serialize a model, but we will take a shortcut here taking advantage of `mlflow`'s awesome `autolog` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import mlflow\n",
    "\n",
    "with mlflow.start_run() as run:\n",
    "    mlflow.sklearn.autolog()\n",
    "\n",
    "    df = get_data()\n",
    "    train(dataframe=df)\n",
    "\n",
    "    run_id = run.info.run_id\n",
    "\n",
    "    mlflow.end_run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And now we should see that a `mlruns/` directory was created and that we already have our trained model in there!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['python_env.yaml', 'requirements.txt', 'MLmodel', 'model.pkl', 'conda.yaml']"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import os \n",
    "os.listdir(f\"mlruns/0/{run_id}/artifacts/model\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Profile the training data with `whylogs`\n",
    "\n",
    "Now in order to profile your training data with `whylogs`, you'll basically need to use our `logger` API, which is as simple as:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "import whylogs as why\n",
    "\n",
    "profile_result = why.log(df)\n",
    "profile_view = profile_result.view()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>counts/n</th>\n",
       "      <th>counts/null</th>\n",
       "      <th>types/integral</th>\n",
       "      <th>types/fractional</th>\n",
       "      <th>types/boolean</th>\n",
       "      <th>types/string</th>\n",
       "      <th>types/object</th>\n",
       "      <th>distribution/mean</th>\n",
       "      <th>distribution/stddev</th>\n",
       "      <th>distribution/n</th>\n",
       "      <th>...</th>\n",
       "      <th>distribution/q_90</th>\n",
       "      <th>distribution/q_95</th>\n",
       "      <th>distribution/q_99</th>\n",
       "      <th>ints/max</th>\n",
       "      <th>ints/min</th>\n",
       "      <th>cardinality/est</th>\n",
       "      <th>cardinality/upper_1</th>\n",
       "      <th>cardinality/lower_1</th>\n",
       "      <th>frequent_items/frequent_strings</th>\n",
       "      <th>type</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>column</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>target</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.819232</td>\n",
       "      <td>150</td>\n",
       "      <td>...</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>3.000150</td>\n",
       "      <td>3.0</td>\n",
       "      <td>[FrequentItem(value='0.000000', est=50, upper=...</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>petal width (cm)</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1.199333</td>\n",
       "      <td>0.762238</td>\n",
       "      <td>150</td>\n",
       "      <td>...</td>\n",
       "      <td>2.2</td>\n",
       "      <td>2.3</td>\n",
       "      <td>2.5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>22.000001</td>\n",
       "      <td>22.001100</td>\n",
       "      <td>22.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>sepal width (cm)</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3.057333</td>\n",
       "      <td>0.435866</td>\n",
       "      <td>150</td>\n",
       "      <td>...</td>\n",
       "      <td>3.7</td>\n",
       "      <td>3.8</td>\n",
       "      <td>4.2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>23.000001</td>\n",
       "      <td>23.001150</td>\n",
       "      <td>23.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>petal length (cm)</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3.758000</td>\n",
       "      <td>1.765298</td>\n",
       "      <td>150</td>\n",
       "      <td>...</td>\n",
       "      <td>5.8</td>\n",
       "      <td>6.1</td>\n",
       "      <td>6.7</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>43.000004</td>\n",
       "      <td>43.002151</td>\n",
       "      <td>43.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>sepal length (cm)</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>5.843333</td>\n",
       "      <td>0.828066</td>\n",
       "      <td>150</td>\n",
       "      <td>...</td>\n",
       "      <td>6.9</td>\n",
       "      <td>7.3</td>\n",
       "      <td>7.7</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>35.000003</td>\n",
       "      <td>35.001750</td>\n",
       "      <td>35.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 28 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                   counts/n  counts/null  types/integral  types/fractional  \\\n",
       "column                                                                       \n",
       "target                  150            0             150                 0   \n",
       "petal width (cm)        150            0               0               150   \n",
       "sepal width (cm)        150            0               0               150   \n",
       "petal length (cm)       150            0               0               150   \n",
       "sepal length (cm)       150            0               0               150   \n",
       "\n",
       "                   types/boolean  types/string  types/object  \\\n",
       "column                                                         \n",
       "target                         0             0             0   \n",
       "petal width (cm)               0             0             0   \n",
       "sepal width (cm)               0             0             0   \n",
       "petal length (cm)              0             0             0   \n",
       "sepal length (cm)              0             0             0   \n",
       "\n",
       "                   distribution/mean  distribution/stddev  distribution/n  \\\n",
       "column                                                                      \n",
       "target                      1.000000             0.819232             150   \n",
       "petal width (cm)            1.199333             0.762238             150   \n",
       "sepal width (cm)            3.057333             0.435866             150   \n",
       "petal length (cm)           3.758000             1.765298             150   \n",
       "sepal length (cm)           5.843333             0.828066             150   \n",
       "\n",
       "                   ...  distribution/q_90  distribution/q_95  \\\n",
       "column             ...                                         \n",
       "target             ...                2.0                2.0   \n",
       "petal width (cm)   ...                2.2                2.3   \n",
       "sepal width (cm)   ...                3.7                3.8   \n",
       "petal length (cm)  ...                5.8                6.1   \n",
       "sepal length (cm)  ...                6.9                7.3   \n",
       "\n",
       "                   distribution/q_99  ints/max  ints/min  cardinality/est  \\\n",
       "column                                                                      \n",
       "target                           2.0       2.0       0.0         3.000000   \n",
       "petal width (cm)                 2.5       NaN       NaN        22.000001   \n",
       "sepal width (cm)                 4.2       NaN       NaN        23.000001   \n",
       "petal length (cm)                6.7       NaN       NaN        43.000004   \n",
       "sepal length (cm)                7.7       NaN       NaN        35.000003   \n",
       "\n",
       "                   cardinality/upper_1  cardinality/lower_1  \\\n",
       "column                                                        \n",
       "target                        3.000150                  3.0   \n",
       "petal width (cm)             22.001100                 22.0   \n",
       "sepal width (cm)             23.001150                 23.0   \n",
       "petal length (cm)            43.002151                 43.0   \n",
       "sepal length (cm)            35.001750                 35.0   \n",
       "\n",
       "                                     frequent_items/frequent_strings  \\\n",
       "column                                                                 \n",
       "target             [FrequentItem(value='0.000000', est=50, upper=...   \n",
       "petal width (cm)                                                 NaN   \n",
       "sepal width (cm)                                                 NaN   \n",
       "petal length (cm)                                                NaN   \n",
       "sepal length (cm)                                                NaN   \n",
       "\n",
       "                                 type  \n",
       "column                                 \n",
       "target             SummaryType.COLUMN  \n",
       "petal width (cm)   SummaryType.COLUMN  \n",
       "sepal width (cm)   SummaryType.COLUMN  \n",
       "petal length (cm)  SummaryType.COLUMN  \n",
       "sepal length (cm)  SummaryType.COLUMN  \n",
       "\n",
       "[5 rows x 28 columns]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "profile_view.to_pandas()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Writing your profile to `mlflow`\n",
    "\n",
    "Now even more interesting than writing this profile locally is the ability to use `mlflow`'s API **together** with `whylogs`', in order to store the training data profile and analyze the results of your experiments over time. For that, we basically need to define a function that will\n",
    "\n",
    "1. Profile our training data\n",
    "2. Log the profile as an `mlflow` artifact\n",
    "\n",
    "Let's see how this function can be written:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "def log_profile(dataframe: pd.DataFrame) -> None:\n",
    "    profile_result = why.log(dataframe)\n",
    "    profile_result.writer(\"mlflow\").write()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we can call that function we defined in our `mlflow` run experiment, like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "with mlflow.start_run() as run:\n",
    "    mlflow.sklearn.autolog()\n",
    "\n",
    "    df = get_data()\n",
    "    train(dataframe=df)\n",
    "\n",
    "    log_profile(dataframe=df)\n",
    "\n",
    "    run_id = run.info.run_id\n",
    "\n",
    "    mlflow.end_run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we inspect the recently created experiment folder, we will see that a `whylogs` directory was created there with our profile."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['whylogs_profile_4724587f9aa146b6a19be2f4268c5005.bin']"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "os.listdir(f\"mlruns/0/{run_id}/artifacts/whylogs\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we can even use `mlflow`'s API to fetch and read back our profile, like:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mlflow.tracking import MlflowClient\n",
    "\n",
    "client = MlflowClient()\n",
    "\n",
    "local_dir = \"/tmp/artifact_downloads\"\n",
    "if not os.path.exists(local_dir):\n",
    "    os.mkdir(local_dir)\n",
    "local_path = client.download_artifacts(run_id, \"whylogs\", local_dir)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['whylogs_profile_4724587f9aa146b6a19be2f4268c5005.bin']"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "os.listdir(local_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "profile_name = os.listdir(local_path)[0]\n",
    "result = why.read(path=f\"{local_path}/{profile_name}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>counts/n</th>\n",
       "      <th>counts/null</th>\n",
       "      <th>types/integral</th>\n",
       "      <th>types/fractional</th>\n",
       "      <th>types/boolean</th>\n",
       "      <th>types/string</th>\n",
       "      <th>types/object</th>\n",
       "      <th>cardinality/est</th>\n",
       "      <th>cardinality/upper_1</th>\n",
       "      <th>cardinality/lower_1</th>\n",
       "      <th>...</th>\n",
       "      <th>distribution/q_25</th>\n",
       "      <th>distribution/median</th>\n",
       "      <th>distribution/q_75</th>\n",
       "      <th>distribution/q_90</th>\n",
       "      <th>distribution/q_95</th>\n",
       "      <th>distribution/q_99</th>\n",
       "      <th>type</th>\n",
       "      <th>ints/max</th>\n",
       "      <th>ints/min</th>\n",
       "      <th>frequent_items/frequent_strings</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>column</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>petal length (cm)</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>43.000004</td>\n",
       "      <td>43.002151</td>\n",
       "      <td>43.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.6</td>\n",
       "      <td>4.4</td>\n",
       "      <td>5.1</td>\n",
       "      <td>5.8</td>\n",
       "      <td>6.1</td>\n",
       "      <td>6.7</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>petal width (cm)</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>22.000001</td>\n",
       "      <td>22.001100</td>\n",
       "      <td>22.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.3</td>\n",
       "      <td>1.3</td>\n",
       "      <td>1.8</td>\n",
       "      <td>2.2</td>\n",
       "      <td>2.3</td>\n",
       "      <td>2.5</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>sepal length (cm)</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>35.000003</td>\n",
       "      <td>35.001750</td>\n",
       "      <td>35.0</td>\n",
       "      <td>...</td>\n",
       "      <td>5.1</td>\n",
       "      <td>5.8</td>\n",
       "      <td>6.4</td>\n",
       "      <td>6.9</td>\n",
       "      <td>7.3</td>\n",
       "      <td>7.7</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>sepal width (cm)</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>23.000001</td>\n",
       "      <td>23.001150</td>\n",
       "      <td>23.0</td>\n",
       "      <td>...</td>\n",
       "      <td>2.8</td>\n",
       "      <td>3.0</td>\n",
       "      <td>3.3</td>\n",
       "      <td>3.7</td>\n",
       "      <td>3.8</td>\n",
       "      <td>4.2</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>target</th>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>150</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>3.000150</td>\n",
       "      <td>3.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>SummaryType.COLUMN</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>[FrequentItem(value='0.000000', est=50, upper=...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 28 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                   counts/n  counts/null  types/integral  types/fractional  \\\n",
       "column                                                                       \n",
       "petal length (cm)       150            0               0               150   \n",
       "petal width (cm)        150            0               0               150   \n",
       "sepal length (cm)       150            0               0               150   \n",
       "sepal width (cm)        150            0               0               150   \n",
       "target                  150            0             150                 0   \n",
       "\n",
       "                   types/boolean  types/string  types/object  cardinality/est  \\\n",
       "column                                                                          \n",
       "petal length (cm)              0             0             0        43.000004   \n",
       "petal width (cm)               0             0             0        22.000001   \n",
       "sepal length (cm)              0             0             0        35.000003   \n",
       "sepal width (cm)               0             0             0        23.000001   \n",
       "target                         0             0             0         3.000000   \n",
       "\n",
       "                   cardinality/upper_1  cardinality/lower_1  ...  \\\n",
       "column                                                       ...   \n",
       "petal length (cm)            43.002151                 43.0  ...   \n",
       "petal width (cm)             22.001100                 22.0  ...   \n",
       "sepal length (cm)            35.001750                 35.0  ...   \n",
       "sepal width (cm)             23.001150                 23.0  ...   \n",
       "target                        3.000150                  3.0  ...   \n",
       "\n",
       "                   distribution/q_25  distribution/median  distribution/q_75  \\\n",
       "column                                                                         \n",
       "petal length (cm)                1.6                  4.4                5.1   \n",
       "petal width (cm)                 0.3                  1.3                1.8   \n",
       "sepal length (cm)                5.1                  5.8                6.4   \n",
       "sepal width (cm)                 2.8                  3.0                3.3   \n",
       "target                           0.0                  1.0                2.0   \n",
       "\n",
       "                   distribution/q_90  distribution/q_95  distribution/q_99  \\\n",
       "column                                                                       \n",
       "petal length (cm)                5.8                6.1                6.7   \n",
       "petal width (cm)                 2.2                2.3                2.5   \n",
       "sepal length (cm)                6.9                7.3                7.7   \n",
       "sepal width (cm)                 3.7                3.8                4.2   \n",
       "target                           2.0                2.0                2.0   \n",
       "\n",
       "                                 type  ints/max  ints/min  \\\n",
       "column                                                      \n",
       "petal length (cm)  SummaryType.COLUMN       NaN       NaN   \n",
       "petal width (cm)   SummaryType.COLUMN       NaN       NaN   \n",
       "sepal length (cm)  SummaryType.COLUMN       NaN       NaN   \n",
       "sepal width (cm)   SummaryType.COLUMN       NaN       NaN   \n",
       "target             SummaryType.COLUMN       2.0       0.0   \n",
       "\n",
       "                                     frequent_items/frequent_strings  \n",
       "column                                                                \n",
       "petal length (cm)                                                NaN  \n",
       "petal width (cm)                                                 NaN  \n",
       "sepal length (cm)                                                NaN  \n",
       "sepal width (cm)                                                 NaN  \n",
       "target             [FrequentItem(value='0.000000', est=50, upper=...  \n",
       "\n",
       "[5 rows x 28 columns]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "result.view().to_pandas()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And with those few lines we have successfully fetched the profile artifact from our experiment. Over time, we will be able to track down some very relevant information on how our data behaves, **why** is our model generating the results and walk towards a more Robust and Responsible AI field.\n",
    "\n",
    "Hope this tutorial will help you get started with `whylogs`. Stay tuned to our [Github repo](https://github.com/whylabs/whylogs) and also our [community Slack](https://github.com/whylabs/whylogs#:~:text=us%2C%20please%20join-,our%20Slack%20Community,-.%20In%20addition%20to) to get the latest from `whylogs`.\n",
    "\n",
    "See you soon!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.8.13 ('v1.x')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  },
  "vscode": {
   "interpreter": {
    "hash": "f76ec28949fecf16b926a3fc5a03c1aa6468ee82fa5da4ce6fd607df021af5b5"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}