Journey Overview
Patient Chart Search empowers clinicians to quickly find critical information within patient records using natural language queries. It leverages semantic understanding to deliver fast, accurate, and context-aware results across notes, labs, medications, and more. For clinicians, it reduces time spent searching and supports informed decision-making. For health systems, it enhances clinical efficiency and improves documentation quality. For patients, it contributes to safer, more responsive care.
Federated Search: The Foundation of Patient Chart Search
Federated search is a core feature that enables Patient Chart Search to deliver comprehensive results by simultaneously querying multiple disparate data sources within a health system. This capability allows clinicians to search across different resources, such as:
- Clinical notes from various EMR systems
- Allergy and intolerance records
- Condition and diagnosis information
- Encounter and visit details from multiple systems
- Medication statements from pharmacy systems
- Procedure records from clinical systems
- Vital signs and laboratory observations
- Immunization records
- Diagnostic reports from multiple departments
- Historical records from legacy systems
- And many more!
Getting Started
To get a broad understanding of the components within our data fabric, you can refer to our platform overview. To get started in this journey, you will need:
- A Clinia workspace
- A Clinia service account (API Key)
- Ability to execute HTTP requests
- Some data to ingest
Workspace Configuration
To leverage semantic search and Clinia’s query understanding capabilities, you will need an HGS partition instead of our Standard partition offering.
Create a Data Source
Currently, the only data source type available is a Registry. To create your {name} data source, run the following request:
Create your Patient Chart Profiles
FHIR to Clinia Platform Mapping: The profiles below are designed to work with the included Python mapping script (fhir_to_clinia_mapper.py) that converts FHIR resources to Clinia Platform datatypes. This provides a realistic approach where you transform your FHIR data before ingestion.
FHIR to Clinia Platform Mapping Script
The profiles above are designed to work with proper Clinia Platform datatypes. Since most healthcare systems work with FHIR data, we’ve provided a Python script that helps transform FHIR resources into the correct Clinia Platform format before ingestion.
- Pydantic Models: Type-safe models for both FHIR input and Clinia Platform output formats
- Validation: Automatic validation of data structure during conversion
- Identifier Handling: Proper conversion of patient references to Clinia Platform identifier datatype
- Comprehensive Mapping: Support for all FHIR resource types used in patient chart search
Key Features of the Mapping Script
- Patient ID Normalization: Converts FHIR patient references (Patient/PAT-12345) to proper Clinia Platform identifiers
- Complex Object Mapping: Handles nested FHIR structures like Coding, Reference, and Period datatypes
- Bulk Operation Support: Generates the correct format for Clinia’s bulk ingestion API
- Type Safety: Uses Pydantic v2 for runtime validation and IDE support
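To make the patient ID normalization step concrete, here is a minimal sketch of how a FHIR reference such as Patient/PAT-12345 could be turned into an identifier. It is not taken from fhir_to_clinia_mapper.py: the Identifier class, its system and value fields, the default system URI, and the function name are all assumptions made for illustration, and the sketch uses standard-library dataclasses rather than the Pydantic v2 models the script relies on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """Hypothetical stand-in for the Clinia Platform identifier datatype."""
    system: str
    value: str


def normalize_patient_reference(
    reference: str, system: str = "urn:example:patient"  # assumed system URI
) -> Identifier:
    """Turn a FHIR reference like 'Patient/PAT-12345' into an Identifier."""
    # Split on the last slash so absolute references
    # (e.g. https://host/fhir/Patient/PAT-12345) are handled too.
    prefix, sep, resource_id = reference.rpartition("/")
    resource_type = prefix.rsplit("/", 1)[-1]
    if not sep or not resource_id or resource_type != "Patient":
        raise ValueError(f"not a Patient reference: {reference!r}")
    return Identifier(system=system, value=resource_id)
```

Rejecting anything that is not a Patient reference keeps malformed input from silently producing an identifier that points at the wrong record.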
Example Usage
Installation Requirements
The mapping script (fhir_to_clinia_mapper.py) is included in this documentation and provides mapping functions for all supported FHIR resource types.
FHIR to Clinia Platform Mapping Files
A GitHub repository to help you quick-start your Clinia journey is coming soon.
Create your Data Partition
Once your profiles are created, you can create the data partition, which is a virtual, searchable view of your data. Let’s create a patient_chart_search partition for your EMR search application. Specifying HEALTH_GRADE_SEARCH is key: it indicates that this partition should be built for semantic search functionalities instead of regular keyword search.
Ingestion Pipeline
Now on to the fun stuff. To leverage Clinia’s HGS engine, you will need to augment your raw data using our various processors. For patient chart search, we typically deal with diverse types of clinical content, from dense clinical notes to structured FHIR resources. While dense retrievers excel at understanding clinical context, the challenge in a federated search system is to return the most relevant information from across all patient data sources to feed to the LLM in a token-efficient manner.
For Clinical Notes, we use the Chunker processor to break down lengthy narrative text into meaningful segments. For FHIR resources (Allergies, Conditions, Encounters, Medications, Procedures, Observations, Immunizations, and DiagnosticReports), we apply the Vectorizer processor directly to create semantic representations of the structured clinical data.
The Segmenter takes as input a symbol or markdown data type and returns a list of chunks. Internally, chunks are object datatypes with a text symbol property that we can then search on, or apply other processors to!
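As a rough illustration of what the chunking step produces, the sketch below splits a note into sentence-aligned chunks, each represented as an object with a text property. This is not Clinia's processor: the splitting rule, the max_chars parameter, and the function name are assumptions chosen only to show the shape of the output.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[dict[str, str]]:
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[dict[str, str]] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append({"text": current})
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append({"text": current})
    return chunks
```

Keeping sentences intact means each chunk stays readable on its own, which matters when a chunk is later shown to a clinician or passed to an LLM.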
To properly support semantic search across patient charts, we will also need a Vectorizer processor to create semantic representations of the clinical content. We recommend using our mte-base-clinical model to do this, as it was expressly trained on clinical data and designed to work well in patient care workflows.
The Vectorizer takes as input symbol data types and returns vectors (arrays of float-value points) representing your data in the vector space. This vector space is built in such a way that semantically related ideas or sentences (e.g. “diabetes” and “hyperglycemia”) are closer together and dissimilar ideas (e.g. “banana” and “psychologist”) are farther apart.
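The notion of "closer together" is commonly measured with cosine similarity. The sketch below uses hand-written three-dimensional toy vectors to show the idea; they are invented for illustration and are not outputs of mte-base-clinical, whose real vectors have many more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors, NOT real model embeddings.
toy_vectors = {
    "diabetes": [0.9, 0.8, 0.1],
    "hyperglycemia": [0.85, 0.75, 0.2],
    "banana": [0.1, 0.2, 0.9],
}

related = cosine_similarity(toy_vectors["diabetes"], toy_vectors["hyperglycemia"])
unrelated = cosine_similarity(toy_vectors["diabetes"], toy_vectors["banana"])
print(f"related={related:.3f} unrelated={unrelated:.3f}")
```

With these toy values, the related pair scores much higher than the unrelated pair, which is the property a semantic search engine relies on when ranking results.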
In the context of patient chart search, we will focus on generating vectors for properties where traditional keyword search may not be sufficient and that can benefit from semantic understanding (synonyms, hierarchical relationships, etc.). Since FHIR data contains a great number of clinical coding datatypes, we will use our model specialized in clinical terminology that is most efficient at representing clinical concept names and labels, mte-base-clinical.1.
Ingesting data
Once everything is configured, you can create your FHIR-compliant patient chart records using our Standard or Bulk API. The Bulk API allows you to ingest data from multiple collections representing different aspects of patient care through a single endpoint.
Important: The bulk ingestion example below shows data that has already been converted from FHIR to Clinia Platform format using the fhir_to_clinia_mapper.py script. In a real implementation, you would first use the mapping script to convert your FHIR resources before sending them to Clinia.
Use the bulkId from the response to the request above to track the status of the bulk ingestion request. Use this request to do so:
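Because bulk ingestion is asynchronous, client code typically polls the status until it reaches a terminal state. The sketch below shows one way to structure that loop. It deliberately does not call the Clinia API: fetch_status is a hypothetical callable you would implement around the status request, and the status names ("succeeded", "failed"), the polling interval, and the attempt limit are assumptions rather than documented values.

```python
import time
from typing import Callable


def wait_for_bulk(
    bulk_id: str,
    fetch_status: Callable[[str], str],  # hypothetical: wraps your status request
    *,
    done_states: frozenset = frozenset({"succeeded", "failed"}),  # assumed names
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status(bulk_id) until it returns a terminal state."""
    for attempt in range(max_attempts):
        status = fetch_status(bulk_id)
        if status in done_states:
            return status
        # Do not sleep after the final attempt; fail fast instead.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"bulk {bulk_id!r} not finished after {max_attempts} checks")
```

Passing the status fetcher and the sleep function as parameters keeps the loop independent of any HTTP client and makes it easy to test without network access.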
Querying your Federated Patient Chart Search Partition
Once the ingestion is complete, you are now ready to use your HGS partition! The federated search capability allows you to query across all patient data collections simultaneously, providing comprehensive results from clinical notes, allergies, conditions, encounters, medications, procedures, observations, immunizations, and diagnostic reports. You can use the Data Partition API to perform federated search queries. Here is one example of a query that uses the knn operator for your semantic fields across multiple collections to find information about a patient’s cardiac condition:
The query also uses highlighting support to tell you why a given result was relevant. Using highlighting, you will be able to tell which of the chunks or passages within each patient record hit was most relevant. This is particularly useful for display purposes in clinical interfaces, but also to generate the best answer possible using our Summarization API.
The federated search response will include results from all relevant FHIR-compliant collections (clinical notes, allergies, conditions, encounters, medications, procedures, observations, immunizations, and diagnostic reports), allowing clinicians to see a comprehensive view of patient information related to their query.
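Conceptually, a federated result list can be thought of as a merge of per-collection rankings into one global ranking. The sketch below illustrates that idea only; it does not describe how Clinia's engine works internally. The function name, the (record_id, score) hit shape, and the assumption that scores are directly comparable across collections are all hypothetical.

```python
import heapq
from itertools import islice


def merge_federated_hits(
    hits_by_collection: dict[str, list[tuple[str, float]]], k: int = 5
) -> list[tuple[float, str, str]]:
    """Merge per-collection hit lists into the global top k.

    Each input list holds (record_id, score) pairs already sorted by
    descending score. Returns (score, collection, record_id) tuples.
    """
    tagged = [
        [(score, name, record_id) for record_id, score in hits]
        for name, hits in hits_by_collection.items()
    ]
    # heapq.merge lazily interleaves the sorted inputs, so only the
    # first k merged hits are ever materialized.
    merged = heapq.merge(*tagged, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, k))
```

For example, merging condition, medication, and immunization hits for a cardiac query would interleave the strongest matches from each collection, so a highly relevant medication record can outrank a weakly relevant condition record.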