{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "![FHIR Kindling](assets/kindling_header.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "<center>Python toolkit for interacting with HL7 FHIR servers and resources</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Why this project was created\n",
    "- PHT required FHIR project warehouses\n",
    "- Data transfer between FHIR servers difficult and tedious\n",
    "- No automatic conversion to tabular format for analysis\n",
    "- Existing libraries felt slow\n",
    "- Simplify FHIR data science & engineering tasks\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "\n",
    "# Features\n",
    "\n",
    "- Create, Read, Update and Delete resources using a server's REST API\n",
    "- Resource validation powered by pydantic models\n",
    "- Transfer resources between FHIR servers\n",
    "- CSV/Dataframe serialization for resources & bundles\n",
    "- Synthetic data generation and upload\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "## In this presentation\n",
    "\n",
    "- Core feature refresher\n",
    "- Graph-based probabilistic dataset generation\n",
    "- Resource transfer between servers\n",
    "- Benchmarks for servers\n",
    "- Kindling App"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Installation\n",
    "\n",
    "Install the latest published version from PyPI:\n",
    "```bash\n",
    "pip install --user fhir-kindling\n",
    "```\n",
    "or install the newest version directly from GitHub:\n",
    "```bash\n",
    "pip install --user git+https://github.com/migraf/fhir-kindling.git\n",
    "```\n",
    "\n",
    "More details can be found in the [documentation]()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "!pip install --upgrade fhir-kindling\n",
    "!pip install RISE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "<center><h2>👨‍💻   How to use the library</h2></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Connecting to a server\n",
    "\n",
    "- Different auth methods: Basic, Bearer, OIDC\n",
    "- Configuration of proxies and custom headers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\Michael Graf\\tbi\\repos\\fhir_kindling\\fhir_kindling\\fhir_server\\transfer.py:10: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)\n",
      "  from tqdm.autonotebook import tqdm\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "from dotenv import load_dotenv, find_dotenv\n",
    "from fhir_kindling import FhirServer\n",
    "\n",
    "_ = load_dotenv(find_dotenv())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [],
   "source": [
    "\n",
    "fhir_api = \"http://localhost:9090/fhir\"\n",
    "server = FhirServer(\n",
    "    api_address=fhir_api,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Query for resources\n",
    "\n",
    "Query the server with the `query()` method of the server class."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Three ways to define a query:\n",
    "- Iteratively build the query on a resource using methods like `where()`, `include()`, `has()`\n",
    "- Use an existing `query_string` to define the query, e.g. `Patient?_id=123`\n",
    "- Pass a `FhirQueryParameters` object to the query method"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Iteratively building a query\n",
    "\n",
    "Start building a query by selecting the base resource first"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'http://localhost:9090/fhir/Patient?_count=5000&_format=json'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query = server.query(\"Patient\")\n",
    "query.query_url"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Querying the server\n",
    "The query is executed against the server using one of the methods `all()`, `first()` or `limit()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<QueryResponse(resource=Patient, n=13)>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "response = query.all()\n",
    "response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<QueryResponse(resource=Patient, n=5)>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "response = query.limit(5)\n",
    "response"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Accessing the resources in a `QueryResponse` object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Patient(resource_type='Patient', fhir_comments=None, id='DCFPI2LRYF7Y6L2J', implicitRules=None, implicitRules__ext=None, language=None, language__ext=None, meta=Meta(resource_type='Meta', fhir_comments=None, extension=None, id=None, lastUpdated=datetime.datetime(2023, 6, 12, 11, 3, 39, 37000, tzinfo=datetime.timezone.utc), lastUpdated__ext=None, profile=None, profile__ext=None, security=None, source=None, source__ext=None, tag=None, versionId='4', versionId__ext=None), contained=None, extension=None, modifierExtension=None, text=None, active=None, active__ext=None, address=None, birthDate=datetime.date(1939, 2, 9), birthDate__ext=None, communication=None, contact=None, deceasedBoolean=None, deceasedBoolean__ext=None, deceasedDateTime=None, deceasedDateTime__ext=None, gender='female', gender__ext=None, generalPractitioner=None, identifier=None, link=None, managingOrganization=None, maritalStatus=None, multipleBirthBoolean=None, multipleBirthBoolean__ext=None, multipleBirthInteger=None, multipleBirthInteger__ext=None, name=[HumanName(resource_type='HumanName', fhir_comments=None, extension=None, id=None, family='Valdez', family__ext=None, given=['Johnathan'], given__ext=None, period=None, prefix=None, prefix__ext=None, suffix=None, suffix__ext=None, text=None, text__ext=None, use=None, use__ext=None)], photo=None, telecom=None)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "response.resources[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Adding filter conditions\n",
    "\n",
    "Filter parameters are added on the fields of the base resource using the `where()` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'http://localhost:9090/fhir/Patient?birthdate=lt1990-01-01&_count=5000&_format=json'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query_2 = server.query(\"Patient\").where(\"birthdate\", \"lt\", \"1990-01-01\")\n",
    "query_2.query_url"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<QueryResponse(resource=Patient, n=12)>"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query_2.all()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Including related resources"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'http://localhost:9090/fhir/Patient?birthdate=lt1990-01-01&_revinclude=Condition:subject&_count=5000&_format=json'"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query_3 = query_2.include(resource=\"Condition\", reference_param=\"subject\", reverse=True)\n",
    "query_3.query_url"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<QueryResponse(resource=Patient, n=12)>"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "resp = query_3.all()\n",
    "resp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Working with the response\n",
    "\n",
    "The response to the query is a `QueryResponse` object.\n",
    "\n",
    "- The `resources` attribute contains a list of resources of the base resource type returned by the query\n",
    "- The `included_resources` attribute contains a list of included resources. Each entry in the list represents a list of resources of a certain type\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Condition']"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[resource.resource_type for resource in resp.included_resources]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Condition(resource_type='Condition', fhir_comments=None, id='DCFPIXWM7Q6NWZ2A', implicitRules=None, implicitRules__ext=None, language=None, language__ext=None, meta=Meta(resource_type='Meta', fhir_comments=None, extension=None, id=None, lastUpdated=datetime.datetime(2023, 6, 12, 11, 2, 55, 513000, tzinfo=datetime.timezone.utc), lastUpdated__ext=None, profile=None, profile__ext=None, security=None, source=None, source__ext=None, tag=None, versionId='2', versionId__ext=None), contained=None, extension=None, modifierExtension=None, text=None, abatementAge=None, abatementDateTime=None, abatementDateTime__ext=None, abatementPeriod=None, abatementRange=None, abatementString=None, abatementString__ext=None, asserter=None, bodySite=None, category=None, clinicalStatus=None, code=CodeableConcept(resource_type='CodeableConcept', fhir_comments=None, extension=None, id=None, coding=[Coding(resource_type='Coding', fhir_comments=None, extension=None, id=None, code='RA01.0', code__ext=None, display='COVID-19, virus identified', display__ext=None, system='http://id.who.int/icd/release/11/mms', system__ext=None, userSelected=None, userSelected__ext=None, version=None, version__ext=None)], text='COVID-19', text__ext=None), encounter=None, evidence=None, identifier=None, note=None, onsetAge=None, onsetDateTime=None, onsetDateTime__ext=None, onsetPeriod=None, onsetRange=None, onsetString=None, onsetString__ext=None, recordedDate=None, recordedDate__ext=None, recorder=None, severity=None, stage=None, subject=Reference(resource_type='Reference', fhir_comments=None, extension=None, id=None, display=None, display__ext=None, identifier=None, reference='Patient/DCFPIXVY34SQMPSQ', reference__ext=None, type=None, type__ext=None), verificationStatus=None)"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "resp.included_resources[0].resources[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Saving the response\n",
    "\n",
    "Responses can be saved to a file using the `save()` method of the `QueryResponse` class.\n",
    "Supported formats are `json`, `xml` (if the query was executed with `xml` format) and `csv`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "path = os.path.join(os.getcwd(), \"query_response.json\")\n",
    "\n",
    "resp.save(file_path=path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"id\": \"DCFQACHFUYO32KDI\",\n",
      "  \"type\": \"searchset\",\n",
      "  \"entry\": [\n",
      "    {\n",
      "      \"fullUrl\": \"http://localhost:9090/fhir/Patient/DCFPIXVY34SQMPST\",\n",
      "      \"resource\": {\n",
      "        \"meta\": {\n",
      "\n"
     ]
    }
   ],
   "source": [
    "with open(path, \"r\") as f:\n",
    "    print(\"\".join(f.readlines()[:8]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Serializing resources into a pandas dataframe\n",
    "\n",
    "A response (or any bundle) can be serialized into pandas dataframes.\n",
    "If the response contains resources of different types, the resources are serialized into separate dataframes for each type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>resourceType</th>\n",
       "      <th>id</th>\n",
       "      <th>meta_versionId</th>\n",
       "      <th>meta_lastUpdated</th>\n",
       "      <th>name_0_family</th>\n",
       "      <th>name_0_given_0</th>\n",
       "      <th>gender</th>\n",
       "      <th>birthDate</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPST</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Jones</td>\n",
       "      <td>Brianna</td>\n",
       "      <td>male</td>\n",
       "      <td>1987-06-13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSM</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Martinez</td>\n",
       "      <td>Tracey</td>\n",
       "      <td>female</td>\n",
       "      <td>1984-01-28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPI2LVEWAH3V23</td>\n",
       "      <td>6</td>\n",
       "      <td>2023-06-12 11:03:39.091000+00:00</td>\n",
       "      <td>Bennett</td>\n",
       "      <td>Lauren</td>\n",
       "      <td>female</td>\n",
       "      <td>1966-07-03</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSO</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Cordova</td>\n",
       "      <td>William</td>\n",
       "      <td>male</td>\n",
       "      <td>1959-07-09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSS</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Williams</td>\n",
       "      <td>David</td>\n",
       "      <td>male</td>\n",
       "      <td>1956-12-02</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  resourceType                id meta_versionId  \\\n",
       "0      Patient  DCFPIXVY34SQMPST              1   \n",
       "1      Patient  DCFPIXVY34SQMPSM              1   \n",
       "2      Patient  DCFPI2LVEWAH3V23              6   \n",
       "3      Patient  DCFPIXVY34SQMPSO              1   \n",
       "4      Patient  DCFPIXVY34SQMPSS              1   \n",
       "\n",
       "                  meta_lastUpdated name_0_family name_0_given_0  gender  \\\n",
       "0 2023-06-12 11:02:55.144000+00:00         Jones        Brianna    male   \n",
       "1 2023-06-12 11:02:55.144000+00:00      Martinez         Tracey  female   \n",
       "2 2023-06-12 11:03:39.091000+00:00       Bennett         Lauren  female   \n",
       "3 2023-06-12 11:02:55.144000+00:00       Cordova        William    male   \n",
       "4 2023-06-12 11:02:55.144000+00:00      Williams          David    male   \n",
       "\n",
       "    birthDate  \n",
       "0  1987-06-13  \n",
       "1  1984-01-28  \n",
       "2  1966-07-03  \n",
       "3  1959-07-09  \n",
       "4  1956-12-02  "
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from fhir_kindling.serde.flatten import flatten_response\n",
    "\n",
    "dfs = flatten_response(resp)\n",
    "\n",
    "dfs[0].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>resourceType</th>\n",
       "      <th>id</th>\n",
       "      <th>meta_versionId</th>\n",
       "      <th>meta_lastUpdated</th>\n",
       "      <th>code_coding_0_system</th>\n",
       "      <th>code_coding_0_code</th>\n",
       "      <th>code_coding_0_display</th>\n",
       "      <th>code_text</th>\n",
       "      <th>subject_reference</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Condition</td>\n",
       "      <td>DCFPIXWM7Q6NWZ2A</td>\n",
       "      <td>2</td>\n",
       "      <td>2023-06-12 11:02:55.513000+00:00</td>\n",
       "      <td>http://id.who.int/icd/release/11/mms</td>\n",
       "      <td>RA01.0</td>\n",
       "      <td>COVID-19, virus identified</td>\n",
       "      <td>COVID-19</td>\n",
       "      <td>Patient/DCFPIXVY34SQMPSQ</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Condition</td>\n",
       "      <td>DCFPIXWM7Q6NWZXE</td>\n",
       "      <td>2</td>\n",
       "      <td>2023-06-12 11:02:55.513000+00:00</td>\n",
       "      <td>http://id.who.int/icd/release/11/mms</td>\n",
       "      <td>RA01.0</td>\n",
       "      <td>COVID-19, virus identified</td>\n",
       "      <td>COVID-19</td>\n",
       "      <td>Patient/DCFPIXVY34SQMPSK</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Condition</td>\n",
       "      <td>DCFPIXWM7Q6NWZYE</td>\n",
       "      <td>2</td>\n",
       "      <td>2023-06-12 11:02:55.513000+00:00</td>\n",
       "      <td>http://id.who.int/icd/release/11/mms</td>\n",
       "      <td>RA01.0</td>\n",
       "      <td>COVID-19, virus identified</td>\n",
       "      <td>COVID-19</td>\n",
       "      <td>Patient/DCFPIXVY34SQMPSM</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Condition</td>\n",
       "      <td>DCFPIXWM7Q6NWZZB</td>\n",
       "      <td>2</td>\n",
       "      <td>2023-06-12 11:02:55.513000+00:00</td>\n",
       "      <td>http://id.who.int/icd/release/11/mms</td>\n",
       "      <td>RA01.0</td>\n",
       "      <td>COVID-19, virus identified</td>\n",
       "      <td>COVID-19</td>\n",
       "      <td>Patient/DCFPIXVY34SQMPSO</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Condition</td>\n",
       "      <td>DCFPIXWM7Q6NWZ23</td>\n",
       "      <td>2</td>\n",
       "      <td>2023-06-12 11:02:55.513000+00:00</td>\n",
       "      <td>http://id.who.int/icd/release/11/mms</td>\n",
       "      <td>RA01.0</td>\n",
       "      <td>COVID-19, virus identified</td>\n",
       "      <td>COVID-19</td>\n",
       "      <td>Patient/DCFPIXVY34SQMPSS</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  resourceType                id meta_versionId  \\\n",
       "0    Condition  DCFPIXWM7Q6NWZ2A              2   \n",
       "1    Condition  DCFPIXWM7Q6NWZXE              2   \n",
       "2    Condition  DCFPIXWM7Q6NWZYE              2   \n",
       "3    Condition  DCFPIXWM7Q6NWZZB              2   \n",
       "4    Condition  DCFPIXWM7Q6NWZ23              2   \n",
       "\n",
       "                  meta_lastUpdated                  code_coding_0_system  \\\n",
       "0 2023-06-12 11:02:55.513000+00:00  http://id.who.int/icd/release/11/mms   \n",
       "1 2023-06-12 11:02:55.513000+00:00  http://id.who.int/icd/release/11/mms   \n",
       "2 2023-06-12 11:02:55.513000+00:00  http://id.who.int/icd/release/11/mms   \n",
       "3 2023-06-12 11:02:55.513000+00:00  http://id.who.int/icd/release/11/mms   \n",
       "4 2023-06-12 11:02:55.513000+00:00  http://id.who.int/icd/release/11/mms   \n",
       "\n",
       "  code_coding_0_code       code_coding_0_display code_text  \\\n",
       "0             RA01.0  COVID-19, virus identified  COVID-19   \n",
       "1             RA01.0  COVID-19, virus identified  COVID-19   \n",
       "2             RA01.0  COVID-19, virus identified  COVID-19   \n",
       "3             RA01.0  COVID-19, virus identified  COVID-19   \n",
       "4             RA01.0  COVID-19, virus identified  COVID-19   \n",
       "\n",
       "          subject_reference  \n",
       "0  Patient/DCFPIXVY34SQMPSQ  \n",
       "1  Patient/DCFPIXVY34SQMPSK  \n",
       "2  Patient/DCFPIXVY34SQMPSM  \n",
       "3  Patient/DCFPIXVY34SQMPSO  \n",
       "4  Patient/DCFPIXVY34SQMPSS  "
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dfs[1].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Converting a list of resources to a dataframe\n",
    "\n",
    "Any list of resources (pydantic models or dicts) can be converted to a dataframe using the `flatten_resources()` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from fhir_kindling.serde import flatten_resources\n",
    "\n",
    "# get a list of patient resources\n",
    "patients = server.query(\"Patient\").limit(100).resources"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>resourceType</th>\n",
       "      <th>id</th>\n",
       "      <th>meta_versionId</th>\n",
       "      <th>meta_lastUpdated</th>\n",
       "      <th>name_0_family</th>\n",
       "      <th>name_0_given_0</th>\n",
       "      <th>gender</th>\n",
       "      <th>birthDate</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPI2LRYF7Y6L2J</td>\n",
       "      <td>4</td>\n",
       "      <td>2023-06-12 11:03:39.037000+00:00</td>\n",
       "      <td>Valdez</td>\n",
       "      <td>Johnathan</td>\n",
       "      <td>female</td>\n",
       "      <td>1939-02-09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPI2LTTE3N3KDN</td>\n",
       "      <td>5</td>\n",
       "      <td>2023-06-12 11:03:39.066000+00:00</td>\n",
       "      <td>Hill</td>\n",
       "      <td>Ryan</td>\n",
       "      <td>male</td>\n",
       "      <td>1941-07-20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPI2LVEWAH3V23</td>\n",
       "      <td>6</td>\n",
       "      <td>2023-06-12 11:03:39.091000+00:00</td>\n",
       "      <td>Bennett</td>\n",
       "      <td>Lauren</td>\n",
       "      <td>female</td>\n",
       "      <td>1966-07-03</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSK</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Hutchinson</td>\n",
       "      <td>Jason</td>\n",
       "      <td>male</td>\n",
       "      <td>1939-06-28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSL</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>MD</td>\n",
       "      <td>Mrs. Renee Wagner</td>\n",
       "      <td>male</td>\n",
       "      <td>1933-01-10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSM</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Martinez</td>\n",
       "      <td>Tracey</td>\n",
       "      <td>female</td>\n",
       "      <td>1984-01-28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSN</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Williams</td>\n",
       "      <td>Jennifer</td>\n",
       "      <td>female</td>\n",
       "      <td>1999-05-02</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSO</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Cordova</td>\n",
       "      <td>William</td>\n",
       "      <td>male</td>\n",
       "      <td>1959-07-09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSP</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Smith</td>\n",
       "      <td>Taylor</td>\n",
       "      <td>female</td>\n",
       "      <td>1923-10-14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSQ</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Lyons</td>\n",
       "      <td>Jordan</td>\n",
       "      <td>male</td>\n",
       "      <td>1951-05-05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSR</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Yu</td>\n",
       "      <td>Christine</td>\n",
       "      <td>female</td>\n",
       "      <td>1922-10-14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPSS</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Williams</td>\n",
       "      <td>David</td>\n",
       "      <td>male</td>\n",
       "      <td>1956-12-02</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Patient</td>\n",
       "      <td>DCFPIXVY34SQMPST</td>\n",
       "      <td>1</td>\n",
       "      <td>2023-06-12 11:02:55.144000+00:00</td>\n",
       "      <td>Jones</td>\n",
       "      <td>Brianna</td>\n",
       "      <td>male</td>\n",
       "      <td>1987-06-13</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   resourceType                id meta_versionId  \\\n",
       "0       Patient  DCFPI2LRYF7Y6L2J              4   \n",
       "1       Patient  DCFPI2LTTE3N3KDN              5   \n",
       "2       Patient  DCFPI2LVEWAH3V23              6   \n",
       "3       Patient  DCFPIXVY34SQMPSK              1   \n",
       "4       Patient  DCFPIXVY34SQMPSL              1   \n",
       "5       Patient  DCFPIXVY34SQMPSM              1   \n",
       "6       Patient  DCFPIXVY34SQMPSN              1   \n",
       "7       Patient  DCFPIXVY34SQMPSO              1   \n",
       "8       Patient  DCFPIXVY34SQMPSP              1   \n",
       "9       Patient  DCFPIXVY34SQMPSQ              1   \n",
       "10      Patient  DCFPIXVY34SQMPSR              1   \n",
       "11      Patient  DCFPIXVY34SQMPSS              1   \n",
       "12      Patient  DCFPIXVY34SQMPST              1   \n",
       "\n",
       "                   meta_lastUpdated name_0_family     name_0_given_0  gender  \\\n",
       "0  2023-06-12 11:03:39.037000+00:00        Valdez          Johnathan  female   \n",
       "1  2023-06-12 11:03:39.066000+00:00          Hill               Ryan    male   \n",
       "2  2023-06-12 11:03:39.091000+00:00       Bennett             Lauren  female   \n",
       "3  2023-06-12 11:02:55.144000+00:00    Hutchinson              Jason    male   \n",
       "4  2023-06-12 11:02:55.144000+00:00            MD  Mrs. Renee Wagner    male   \n",
       "5  2023-06-12 11:02:55.144000+00:00      Martinez             Tracey  female   \n",
       "6  2023-06-12 11:02:55.144000+00:00      Williams           Jennifer  female   \n",
       "7  2023-06-12 11:02:55.144000+00:00       Cordova            William    male   \n",
       "8  2023-06-12 11:02:55.144000+00:00         Smith             Taylor  female   \n",
       "9  2023-06-12 11:02:55.144000+00:00         Lyons             Jordan    male   \n",
       "10 2023-06-12 11:02:55.144000+00:00            Yu          Christine  female   \n",
       "11 2023-06-12 11:02:55.144000+00:00      Williams              David    male   \n",
       "12 2023-06-12 11:02:55.144000+00:00         Jones            Brianna    male   \n",
       "\n",
       "     birthDate  \n",
       "0   1939-02-09  \n",
       "1   1941-07-20  \n",
       "2   1966-07-03  \n",
       "3   1939-06-28  \n",
       "4   1933-01-10  \n",
       "5   1984-01-28  \n",
       "6   1999-05-02  \n",
       "7   1959-07-09  \n",
       "8   1923-10-14  \n",
       "9   1951-05-05  \n",
       "10  1922-10-14  \n",
       "11  1956-12-02  \n",
       "12  1987-06-13  "
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "flatten_resources(patients)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Additional CRUD operations\n",
    "\n",
    "All other CRUD operations (and their asynchronous equivalents) are exposed as methods on the `FhirServer` object.\n",
    "\n",
    "- Create: `add()`, `add_all()`\n",
    "- Read: `get()`\n",
    "- Update: `update()`\n",
    "- Delete: `delete()`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Adding resources & resource validation\n",
    "\n",
    "- Resources as pydantic models or simple dictionaries. \n",
    "- Dictionaries are validated with the corresponding model before being added to the server\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<ResourceCreateResponse(resource_id=DCFQASRT5IP7RDVE, location=http://localhost:9090/fhir/Patient/DCFQASRT5IP7RDVE, version=None)>"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "patient_dict = {\"resourceType\": \"Patient\", \"birthDate\": \"2000-01-01\", \"name\": [{\"family\": \"Mustermann\", \"given\": [\"Max\"]}]}\n",
    "create_resp = server.add(patient_dict)\n",
    "create_resp"
   ]
  },
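  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "To see the validation step in isolation, here is a minimal sketch. It assumes the models come from the `fhir.resources` package, whose `construct_fhir_element` helper appears in the `FhirServer.get` traceback further down. The invalid `birthDate` is made up for illustration, and whether `server.add()` surfaces exactly this error is an assumption."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "from fhir.resources import construct_fhir_element\n",
    "\n",
    "# Hypothetical invalid resource: birthDate is not a valid FHIR date\n",
    "invalid_patient = {\"resourceType\": \"Patient\", \"birthDate\": \"not-a-date\"}\n",
    "\n",
    "try:\n",
    "    construct_fhir_element(\"Patient\", invalid_patient)\n",
    "except ValueError as err:  # pydantic's ValidationError subclasses ValueError\n",
    "    print(f\"Validation failed: {err}\")"
   ]
  },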
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Get, Update, Delete"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.date(2000, 1, 1)"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\n",
    "patient_ref = f\"Patient/{create_resp.resource_id}\" \n",
    "patient = server.get(patient_ref)\n",
    "patient.birthDate "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.date(1990, 1, 1)"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import datetime\n",
    "\n",
    "patient.birthDate = datetime.date(1990, 1, 1)\n",
    "server.update([patient])\n",
    "updated = server.get(patient_ref)\n",
    "updated.birthDate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "ename": "HTTPStatusError",
     "evalue": "Client error '410 Gone' for url 'http://localhost:9090/fhir/Patient/DCFQASRT5IP7RDVE'\nFor more information check: https://httpstatuses.com/410",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mHTTPStatusError\u001b[0m                           Traceback (most recent call last)",
      "Cell \u001b[1;32mIn[25], line 2\u001b[0m\n\u001b[0;32m      1\u001b[0m server\u001b[38;5;241m.\u001b[39mdelete([updated])\n\u001b[1;32m----> 2\u001b[0m \u001b[43mserver\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget\u001b[49m\u001b[43m(\u001b[49m\u001b[43mpatient_ref\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[1;32m~\\tbi\\repos\\fhir_kindling\\fhir_kindling\\fhir_server\\fhir_server.py:300\u001b[0m, in \u001b[0;36mFhirServer.get\u001b[1;34m(self, reference)\u001b[0m\n\u001b[0;32m    298\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_sync_client() \u001b[38;5;28;01mas\u001b[39;00m client:\n\u001b[0;32m    299\u001b[0m     r \u001b[38;5;241m=\u001b[39m client\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mapi_address\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m/\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mreference\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m--> 300\u001b[0m \u001b[43mr\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mraise_for_status\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m    301\u001b[0m resource_dict \u001b[38;5;241m=\u001b[39m r\u001b[38;5;241m.\u001b[39mjson()\n\u001b[0;32m    302\u001b[0m resource \u001b[38;5;241m=\u001b[39m construct_fhir_element(resource_dict[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mresourceType\u001b[39m\u001b[38;5;124m\"\u001b[39m], resource_dict)\n",
      "File \u001b[1;32m~\\AppData\\Local\\pypoetry\\Cache\\virtualenvs\\fhir-kindling-aPmNxlUm-py3.11\\Lib\\site-packages\\httpx\\_models.py:749\u001b[0m, in \u001b[0;36mResponse.raise_for_status\u001b[1;34m(self)\u001b[0m\n\u001b[0;32m    747\u001b[0m error_type \u001b[38;5;241m=\u001b[39m error_types\u001b[38;5;241m.\u001b[39mget(status_class, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mInvalid status code\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m    748\u001b[0m message \u001b[38;5;241m=\u001b[39m message\u001b[38;5;241m.\u001b[39mformat(\u001b[38;5;28mself\u001b[39m, error_type\u001b[38;5;241m=\u001b[39merror_type)\n\u001b[1;32m--> 749\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m HTTPStatusError(message, request\u001b[38;5;241m=\u001b[39mrequest, response\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m)\n",
      "\u001b[1;31mHTTPStatusError\u001b[0m: Client error '410 Gone' for url 'http://localhost:9090/fhir/Patient/DCFQASRT5IP7RDVE'\nFor more information check: https://httpstatuses.com/410"
     ]
    }
   ],
   "source": [
    "server.delete([updated])\n",
    "server.get(patient_ref)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Generating synthetic data\n",
    "\n",
    "Generate complex synthetic data sets using dataset and resource generator functions.\n",
    "Interdependencies between resources and the likelihood of a resource being generated can be defined."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "## Generators\n",
    "Fhir kindling provides generator classses for different components of a dataset\n",
    "- FieldGenerator & FieldValue\n",
    "- ResourceGenerator\n",
    "- TimeSeriesGenerator\n",
    "- DatasetResourceGenerator\n",
    "- DatasetGenerator\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "We will reproduce the benchmark dataset which contains:\n",
    "- Patients\n",
    "- with Covid-19 conditions\n",
    "- a certain likelihood of being vaccinated.\n",
    "- that have an emergency room visit\n",
    "- get admitted to the ICU\n",
    "- data about vital parameters in the ICU"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Start by importing some constants from the benchmark module. These will define most of the constant values of our resources i.e. Codes, Codings.\n",
    "We also need to import the generators that we will use and set up the dataset generator to which we will iteratively add resources."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CodeableConcept(resource_type='CodeableConcept', fhir_comments=None, extension=None, id=None, coding=[Coding(resource_type='Coding', fhir_comments=None, extension=None, id=None, code='XM0GQ8', code__ext=None, display='COVID-19 vaccine, RNA based', display__ext=None, system='http://id.who.int/icd/release/11/mms', system__ext=None, userSelected=None, userSelected__ext=None, version=None, version__ext=None)], text='COVID vaccination', text__ext=None)"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import random\n",
    "from fhir_kindling.benchmark.constants import Codes\n",
    "from fhir_kindling.util.date_utils import (\n",
    "    add,\n",
    "    subtract,\n",
    "    to_iso_string,\n",
    ")\n",
    "\n",
    "from fhir_kindling.generators.dataset import DatasetGenerator\n",
    "from fhir_kindling.generators.field_generator import FieldGenerator\n",
    "from fhir_kindling.generators.resource_generator import (\n",
    "    FieldValue,\n",
    "    GeneratorParameters,\n",
    "    ResourceGenerator,\n",
    ")\n",
    "from fhir_kindling.generators.time_series_generator import TimeSeriesGenerator\n",
    "\n",
    "Codes.COVID_VACC_RNA.value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Now we set up the dataset_generator and define our first resource generator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<DatasetGenerator(name=25db9bc3-49ca-4b50-9dbf-d4d06acacdaf, resource_types={'Patient'}, n=10, generators=[<DataSetResourceGenerator base, generator=<PatientGenerator(n=1, age_range=None, gender_distribution=None>>])>"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "n_patients = 10\n",
    "\n",
    "dataset_generator = DatasetGenerator(\"Patient\", n=n_patients)\n",
    "dataset_generator"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Now add a condition resource for covid"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "covid_params = GeneratorParameters(\n",
    "        field_values=[\n",
    "            FieldValue(field=\"code\", value=Codes.COVID.value),\n",
    "        ]\n",
    "    )\n",
    "covid_generator = ResourceGenerator(\"Condition\", generator_parameters=covid_params)\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    covid_generator, name=\"covid\", depends_on=\"base\", reference_field=\"subject\"\n",
    ")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Now add the first shot of a MRNA based covid vaccination to the dataset. But this will only occur with a certain probability"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "vaccination_date_generator = FieldGenerator(\n",
    "        field=\"occurrenceDateTime\",\n",
    "        generator_function=lambda: to_iso_string(subtract(datetime.datetime.now(), days=720)),\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "first_vax_params = GeneratorParameters(\n",
    "    field_values=[\n",
    "        FieldValue(field=\"vaccineCode\", value=Codes.COVID_VACC_RNA.value),\n",
    "        FieldValue(field=\"status\", value=\"completed\"),\n",
    "    ],\n",
    "    field_generators=[vaccination_date_generator],\n",
    ")\n",
    "vaccination_generator = ResourceGenerator(\n",
    "    \"Immunization\",\n",
    "    generator_parameters=first_vax_params,\n",
    ")\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    vaccination_generator,\n",
    "    \"vacc-mrna-1\",\n",
    "    depends_on=\"base\",\n",
    "    likelihood=0.7,\n",
    "    reference_field=\"patient\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Lets have a look at what we have so far "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<module 'matplotlib.pyplot' from 'C:\\\\Users\\\\Michael Graf\\\\AppData\\\\Local\\\\pypoetry\\\\Cache\\\\virtualenvs\\\\fhir-kindling-aPmNxlUm-py3.11\\\\Lib\\\\site-packages\\\\matplotlib\\\\pyplot.py'>"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "dataset_generator.draw_graph()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Now generate the second and third vaccinations with different probablitiies. They all need to depend on the previous vaccinations and the patient."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# second shot\n",
    "second_vax_params = GeneratorParameters(\n",
    "    field_values=[\n",
    "        FieldValue(field=\"vaccineCode\", value=Codes.COVID_VACC_RNA.value),\n",
    "        FieldValue(field=\"status\", value=\"completed\"),\n",
    "    ],\n",
    "    field_generators=[vaccination_date_generator],\n",
    ")\n",
    "second_vaccination_generator = ResourceGenerator(\n",
    "    \"Immunization\", generator_parameters=second_vax_params\n",
    ")\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    second_vaccination_generator,\n",
    "    \"vacc-mrna-2\",\n",
    "    depends_on=[\"base\", \"vacc-mrna-1\"],\n",
    "    likelihood=0.9,\n",
    "    reference_field=[\"patient\", None],\n",
    ")\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "# third shot\n",
    "second_vax_params = GeneratorParameters(\n",
    "    field_values=[\n",
    "        FieldValue(field=\"vaccineCode\", value=Codes.COVID_VACC_RNA.value),\n",
    "        FieldValue(field=\"status\", value=\"completed\"),\n",
    "    ],\n",
    "    field_generators=[vaccination_date_generator],\n",
    ")\n",
    "third_vaccination_generator = ResourceGenerator(\n",
    "    \"Immunization\", generator_parameters=second_vax_params\n",
    ")\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    third_vaccination_generator,\n",
    "    \"vacc-mrna-3\",\n",
    "    depends_on=[\"base\", \"vacc-mrna-1\", \"vacc-mrna-2\"],\n",
    "    reference_field=[\"patient\", None, None],\n",
    "    likelihood=0.8,\n",
    ")"
   ]
  },
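  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The likelihoods compound along the dependency chain. The arithmetic below is a sketch under the assumption that a generator is skipped whenever one of the generators it depends on produced nothing, which this notebook does not state explicitly. The node names and likelihoods are the ones used above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# Assumption: each likelihood is conditional on the previous shot having been generated\n",
    "shot_likelihoods = {\"vacc-mrna-1\": 0.7, \"vacc-mrna-2\": 0.9, \"vacc-mrna-3\": 0.8}\n",
    "\n",
    "cumulative = 1.0\n",
    "for shot_name, likelihood in shot_likelihoods.items():\n",
    "    # Probability that a patient receives this shot and all shots before it\n",
    "    cumulative *= likelihood\n",
    "    print(f\"{shot_name}: {cumulative:.3f}\")"
   ]
  },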
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Now lets look at the graph again"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<module 'matplotlib.pyplot' from 'C:\\\\Users\\\\Michael Graf\\\\AppData\\\\Local\\\\pypoetry\\\\Cache\\\\virtualenvs\\\\fhir-kindling-aPmNxlUm-py3.11\\\\Lib\\\\site-packages\\\\matplotlib\\\\pyplot.py'>"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "dataset_generator.draw_graph()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Generate an emergency room encounter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "emergency_encounter_period_generator = FieldGenerator(\n",
    "    field=\"period\",\n",
    "    generator_function=lambda: {\n",
    "        \"start\": pendulum.now().subtract(days=730).to_date_string(),\n",
    "        \"end\": pendulum.now().subtract(days=729).to_date_string(),\n",
    "    },\n",
    ")\n",
    "\n",
    "# emergency encounter\n",
    "emergency_encounter_params = GeneratorParameters(\n",
    "    field_values=[\n",
    "        FieldValue(field=\"class\", value=Codes.EMERGENCY_ENCOUNTER.value),\n",
    "        FieldValue(field=\"status\", value=\"finished\"),\n",
    "    ],\n",
    "    field_generators=[emergency_encounter_period_generator],\n",
    ")\n",
    "\n",
    "emergency_encounter_generator = ResourceGenerator(\n",
    "    \"Encounter\", generator_parameters=emergency_encounter_params\n",
    ")\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    emergency_encounter_generator,\n",
    "    \"emergency-encounter\",\n",
    "    depends_on=\"base\",\n",
    "    reference_field=\"subject\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "and and ICU encounter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "icu_encounter_period_generator = FieldGenerator(\n",
    "    field=\"period\",\n",
    "    generator_function=lambda: {\n",
    "        \"start\": pendulum.now().subtract(days=720).to_date_string(),\n",
    "        \"end\": pendulum.now().subtract(days=710).to_date_string(),\n",
    "    },\n",
    ")\n",
    "\n",
    "# icu encounter\n",
    "icu_encounter_params = GeneratorParameters(\n",
    "    field_values=[\n",
    "        FieldValue(field=\"class\", value=Codes.ICU_ENCOUNTER.value),\n",
    "        FieldValue(\n",
    "            field=\"type\", value=Codes.ICU_ENCOUNTER_TYPE.value, list_field=True\n",
    "        ),\n",
    "        FieldValue(field=\"status\", value=\"finished\"),\n",
    "    ],\n",
    "    field_generators=[icu_encounter_period_generator],\n",
    ")\n",
    "\n",
    "icu_encounter_generator = ResourceGenerator(\n",
    "    \"Encounter\", generator_parameters=icu_encounter_params\n",
    ")\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    icu_encounter_generator,\n",
    "    \"icu-encounter\",\n",
    "    depends_on=[\"base\", \"emergency-encounter\"],\n",
    "    reference_field=[\"subject\", None],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<module 'matplotlib.pyplot' from 'C:\\\\Users\\\\Michael Graf\\\\AppData\\\\Local\\\\pypoetry\\\\Cache\\\\virtualenvs\\\\fhir-kindling-aPmNxlUm-py3.11\\\\Lib\\\\site-packages\\\\matplotlib\\\\pyplot.py'>"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "dataset_generator.draw_graph()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Now we generate some observations for the patient. We start with blood oxygen saturation as a time series generator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# blood oxygen saturation\n",
    "blood_oxygen_params = GeneratorParameters(\n",
    "    field_values=[\n",
    "        FieldValue(field=\"code\", value=Codes.OXYGEN_SATURATION.value),\n",
    "        FieldValue(field=\"status\", value=\"final\"),\n",
    "    ],\n",
    "    field_generators=[\n",
    "        FieldGenerator(\n",
    "            field=\"valueQuantity\",\n",
    "            generator_function=lambda: {\n",
    "                \"value\": random.randint(80, 100),\n",
    "                \"unit\": \"%\",\n",
    "            },\n",
    "        ),\n",
    "    ],\n",
    ")\n",
    "blood_oxygen_saturation_generator = ResourceGenerator(\n",
    "    \"Observation\", generator_parameters=blood_oxygen_params\n",
    ")\n",
    "\n",
    "bo_time_series_generator = TimeSeriesGenerator(\n",
    "    resource_generator=blood_oxygen_saturation_generator,\n",
    "    start=pendulum.now().subtract(days=720),\n",
    "    n=10,\n",
    "    time_field=\"effectiveDateTime\",\n",
    "    freq=\"daily\"\n",
    ")\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    bo_time_series_generator,\n",
    "    \"blood-oxygen-saturation\",\n",
    "    depends_on=\"base\",\n",
    "    reference_field=\"subject\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Body temperature"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# body temperature\n",
    "body_temperature_params = GeneratorParameters(\n",
    "    field_values=[\n",
    "        FieldValue(field=\"code\", value=Codes.BODY_TEMPERATURE.value),\n",
    "        FieldValue(field=\"status\", value=\"final\"),\n",
    "    ],\n",
    "    field_generators=[\n",
    "        FieldGenerator(\n",
    "            field=\"valueQuantity\",\n",
    "            generator_function=lambda: {\n",
    "                \"value\": random.randint(36, 40) + random.random(),\n",
    "                \"unit\": \"°C\",\n",
    "            },\n",
    "        ),\n",
    "    ],\n",
    ")\n",
    "\n",
    "body_temperature_generator = ResourceGenerator(\n",
    "    \"Observation\", generator_parameters=body_temperature_params\n",
    ")\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    body_temperature_generator,\n",
    "    \"body-temperature\",\n",
    "    depends_on=\"icu-encounter\",\n",
    "    reference_field=\"encounter\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "respiratory rate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "# respiratory rate\n",
    "respiratory_rate_params = GeneratorParameters(\n",
    "    field_values=[\n",
    "        FieldValue(field=\"code\", value=Codes.RESPIRATORY_RATE.value),\n",
    "        FieldValue(field=\"status\", value=\"final\"),\n",
    "    ],\n",
    "    field_generators=[\n",
    "        FieldGenerator(\n",
    "            field=\"valueQuantity\",\n",
    "            generator_function=lambda: {\n",
    "                \"value\": random.randint(12, 30),\n",
    "                \"unit\": \"breaths/min\",\n",
    "            },\n",
    "        ),\n",
    "    ],\n",
    ")\n",
    "\n",
    "respiratory_rate_generator = ResourceGenerator(\n",
    "    \"Observation\", generator_parameters=respiratory_rate_params\n",
    ")\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    respiratory_rate_generator,\n",
    "    \"respiratory-rate\",\n",
    "    depends_on=\"icu-encounter\",\n",
    "    reference_field=\"encounter\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "heart rate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "heart_rate_params = GeneratorParameters(\n",
    "    field_values=[\n",
    "        FieldValue(field=\"code\", value=Codes.HEART_RATE.value),\n",
    "        FieldValue(field=\"status\", value=\"final\"),\n",
    "    ],\n",
    "    field_generators=[\n",
    "        FieldGenerator(\n",
    "            field=\"valueQuantity\",\n",
    "            generator_function=lambda: {\n",
    "                \"value\": random.randint(60, 100),\n",
    "                \"unit\": \"beats/min\",\n",
    "            },\n",
    "        ),\n",
    "    ],\n",
    ")\n",
    "\n",
    "heart_rate_generator = ResourceGenerator(\n",
    "    \"Observation\", generator_parameters=heart_rate_params\n",
    ")\n",
    "\n",
    "dataset_generator = dataset_generator.add_resource_generator(\n",
    "    heart_rate_generator,\n",
    "    name=\"heart-rate\",\n",
    "    depends_on=\"icu-encounter\",\n",
    "    reference_field=\"encounter\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "With this we are done for now and can look at the graph again"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<module 'matplotlib.pyplot' from 'C:\\\\Users\\\\Michael Graf\\\\AppData\\\\Local\\\\pypoetry\\\\Cache\\\\virtualenvs\\\\fhir-kindling-aPmNxlUm-py3.11\\\\Lib\\\\site-packages\\\\matplotlib\\\\pyplot.py'>"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "dataset_generator.draw_graph()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Now we can actually use the dataset generator to generate data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "8df3708f8ed548dda6174c7d36cb8cef",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Generating dataset:   0%|          | 0/10 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "dataset = dataset_generator.generate(display=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "183"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dataset.n_resources"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Upload the dataset to a server"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "6c2cb9abb9dd48fca9bf19aecc0c9c2c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/183 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "res = dataset.upload(server, display=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<QueryResponse(resource=Patient, n=20)>"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "covid_query = server.query(\"Patient\").has(\n",
    "    resource=\"Condition\",\n",
    "    search_param=\"code\",\n",
    "    operator=\"eq\",\n",
    "    value=\"RA01.0\",\n",
    "    reference_param=\"subject\",\n",
    ").include(\n",
    "    resource=\"Condition\",\n",
    "    reference_param=\"subject\",\n",
    "    reverse=True\n",
    ")\n",
    "\n",
    "covid_response = covid_query.all()\n",
    "covid_response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Transferring resources from one server to another\n",
    "\n",
    "Use the `transfer()` function on a server object to transfer resources from one server to another while keeping referential integrity and using server assigned IDs.  \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The transfer is a three-step process:\n",
    "1. Analyze the resources to be transferred and build a DAG modeling the references\n",
    "2. Obtain any missing resources that are referenced from the source server\n",
    "3. Upload the resources to the target server based on the reference DAG\n",
    "4. Generate a record linkage dictionary that links the transfered remsources"
   ]
  },
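  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "As a rough illustration of the ordering and linkage steps above, the sketch below sorts a handful of resources with the standard library's `graphlib.TopologicalSorter` so that referenced resources are uploaded first, and records a linkage dictionary from source IDs to newly assigned IDs. The resource IDs, the `assign_id` callback and both helper functions are hypothetical and not part of the fhir-kindling API; the library's actual implementation may differ.\n",
    "\n",
    "```python\n",
    "from graphlib import TopologicalSorter\n",
    "\n",
    "# Hypothetical sample data: each resource id maps to the ids it references.\n",
    "references = {\n",
    "    'Patient/1': [],\n",
    "    'Encounter/1': ['Patient/1'],\n",
    "    'Condition/1': ['Patient/1', 'Encounter/1'],\n",
    "}\n",
    "\n",
    "\n",
    "def upload_order(references):\n",
    "    # Referenced resources come before the resources that point at them.\n",
    "    return list(TopologicalSorter(references).static_order())\n",
    "\n",
    "\n",
    "def transfer(references, assign_id):\n",
    "    # assign_id stands in for the target server creating a resource\n",
    "    # and returning the id it assigned.\n",
    "    linkage = {}\n",
    "    for source_id in upload_order(references):\n",
    "        linkage[source_id] = assign_id(source_id)\n",
    "    return linkage\n",
    "\n",
    "\n",
    "# Simulate a target server that hands out increasing numeric ids.\n",
    "counter = iter(range(100, 200))\n",
    "linkage = transfer(\n",
    "    references,\n",
    "    lambda source_id: source_id.split('/')[0] + '/' + str(next(counter)),\n",
    ")\n",
    "```\n",
    "\n",
    "With the sample data, `Patient/1` is uploaded first, followed by `Encounter/1` and `Condition/1`, because each resource is only sent after everything it references."
   ]
  },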
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Resources form a reference graph\n",
    "\n",
    "![FHIR Kindling](assets/resource_graphs.png)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Which gets resolved step by step\n",
    "![FHIR Kindling](assets/upload_graph.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Connect to an additional server and transfer based on the query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "8df9df80bd4b4fa2a0be8e650efb72d7",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/40 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "<TransferResponse(origin_server=http://localhost:9090/fhir, destination_server=http://localhost:9091/fhir)"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# define a new server\n",
    "transfer_api_url = \"http://localhost:9091/fhir\"\n",
    "transfer_server = FhirServer(api_address=transfer_api_url)\n",
    "\n",
    "server.transfer(transfer_server, query=covid_query, display=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Benchmarking"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Benchmarks for clients (to see if the library is faster :)) as well as generalizable tools to benchmark the performance of FHIR servers."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Client benchmark\n",
    "![FHIR Kindling](assets/query_plot.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Server Benchmark"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Compare the performance of multiple FHIR servers on standard CRUD operations as well as search\n",
    "- Create\n",
    "- Batch Create\n",
    "- Search\n",
    "- Delete\n",
    "\n",
    "The benchmarking tool will generate the same synthetic data that we created before and insert the resources into each server.\n",
    "The created resources are tracked and removed at the end."
   ]
  },
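  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "To make the measurement idea concrete, here is a minimal sketch that times create, search and delete steps over several attempts and leaves the server empty afterwards. `InMemoryServer`, `time_step` and `run_benchmark` are hypothetical stand-ins written for this slide; they are not the `ServerBenchmark` implementation, and a real run would talk to the servers over HTTP.\n",
    "\n",
    "```python\n",
    "import statistics\n",
    "import time\n",
    "\n",
    "\n",
    "class InMemoryServer:\n",
    "    # Hypothetical stand-in for a FHIR server; not a fhir_kindling class.\n",
    "    def __init__(self):\n",
    "        self.store = {}\n",
    "\n",
    "    def create(self, resource_id, resource):\n",
    "        self.store[resource_id] = resource\n",
    "\n",
    "    def search(self, resource_type):\n",
    "        return [r for r in self.store.values() if r['resourceType'] == resource_type]\n",
    "\n",
    "    def delete(self, resource_id):\n",
    "        # pop with a default so repeated attempts do not raise KeyError.\n",
    "        self.store.pop(resource_id, None)\n",
    "\n",
    "\n",
    "def time_step(step, n_attempts=3):\n",
    "    # Run the step n_attempts times and return the mean wall-clock duration.\n",
    "    durations = []\n",
    "    for _ in range(n_attempts):\n",
    "        start = time.perf_counter()\n",
    "        step()\n",
    "        durations.append(time.perf_counter() - start)\n",
    "    return statistics.mean(durations)\n",
    "\n",
    "\n",
    "def run_benchmark(server, dataset_size=100, n_attempts=3):\n",
    "    # Track the created ids so they can be removed again in the delete step.\n",
    "    ids = ['Patient/' + str(i) for i in range(dataset_size)]\n",
    "    patient = {'resourceType': 'Patient'}\n",
    "    return {\n",
    "        'create': time_step(lambda: [server.create(i, patient) for i in ids], n_attempts),\n",
    "        'search': time_step(lambda: server.search('Patient'), n_attempts),\n",
    "        'delete': time_step(lambda: [server.delete(i) for i in ids], n_attempts),\n",
    "    }\n",
    "\n",
    "\n",
    "demo_server = InMemoryServer()\n",
    "results = run_benchmark(demo_server, dataset_size=10, n_attempts=2)\n",
    "```\n",
    "\n",
    "Averaging over several attempts smooths out one-off delays, which is the role `n_attempts` plays in this sketch."
   ]
  },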
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Compare the three most common fhir servers\n",
    "- Blaze\n",
    "- Hapi\n",
    "- Linux4Health"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "servers = [\n",
    "    {\"name\": \"blaze\", \"api_address\": \"http://localhost:9090/fhir\"},\n",
    "    {\"name\": \"hapi\", \"api_address\": \"http://localhost:9091/fhir\"},\n",
    "    {\n",
    "        \"name\": \"linux4h\",\n",
    "        \"api_address\": \"http://localhost:9080/fhir-server/api/v4/\",\n",
    "        \"credentials\": {\"username\": \"fhiruser\", \"password\": \"change-password\"},\n",
    "    },\n",
    "]\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Setup the servers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "initializing Server blaze -- http://localhost:9090/fhir\n",
      "initializing Server hapi -- http://localhost:9091/fhir\n",
      "initializing Server linux4h -- http://localhost:9080/fhir-server/api/v4/\n"
     ]
    }
   ],
   "source": [
    "benchmark_servers = []\n",
    "for s in servers:\n",
    "    print(f\"initializing Server {s['name']} -- {s['api_address']}\")\n",
    "    credentials = s.get(\"credentials\", None)\n",
    "    if credentials:\n",
    "        benchmark_servers.append(\n",
    "            FhirServer(\n",
    "                api_address=s[\"api_address\"],\n",
    "                **credentials,\n",
    "            )\n",
    "        )\n",
    "    else:\n",
    "        benchmark_servers.append(FhirServer(api_address=s[\"api_address\"]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Configure the benchmark"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "from fhir_kindling.benchmark import ServerBenchmark\n",
    "benchmark = ServerBenchmark(\n",
    "    servers=benchmark_servers,\n",
    "    server_names=[s[\"name\"] for s in servers],\n",
    "    dataset_size=100,\n",
    "    n_attempts=2,\n",
    ")\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Run the test suite"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "8f40b12ab63147e0a3376873484376c8",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Generating dataset:   0%|          | 0/100 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "8bc23f8ca4054fc0a5807bdaabd2cf4c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Running bechmarks for 3 servers:: 0it [00:00, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Server http://localhost:9090/fhir:   0%|          | 0/7 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "c835f3605914435ea5c5c05a423f6a73",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Server http://localhost:9091/fhir:   0%|          | 0/7 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "benchmark.run_suite()\n",
    "figure = benchmark.plot()\n",
    "figure.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### How to use & customize the benchmark\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "##### Test the performance of the servers on a new vm\n",
    "\n",
    "1. Clone & install the library\n",
    "2. Navigate into the benchmarks folder\n",
    "3. Start the compose file containing the three servers `docker compose up`\n",
    "4. Run the preconfigured benchmark script `python benchmark_servers.py`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "#### Add custom queries for specific pain points\n",
    "You can customize the queries to be tracked by providing `custom_queries` to the server benchmarks.\n",
    "You can also configure which steps should be run by providing a list of steps:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Custom benchmark"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "benchmark = ServerBenchmark(\n",
    "    servers=benchmark_servers,\n",
    "    server_names=[s[\"name\"] for s in servers],\n",
    "    dataset_size=100,\n",
    "    n_attempts=3,\n",
    "    steps=[\"search\"],\n",
    "    custom_query\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Outlook\n",
    "\n",
    "- Optimizations for handling large amounts of data\n",
    "- Stable 1.0 release soon\n",
    "\n",
    "### Privacy Methods\n",
    "- use automatic tabular serialization to evaluate bundle responses with methods like k-anonymity\n",
    "- automatic anonymization of fields (i.e. rounding dates to the year recoreded)\n",
    "\n",
    "### User interface\n",
    "- Graphical userinterface to create, save and execute queries\n",
    "- Autocomplete for resources and their fields"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "<center><h2>Questions?</h2></center>\n",
    "\n",
    "\n",
    "<center><h2>Feature requests, contributions and 🌟 are very welcome!</h2></center>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}