Combine lexical and semantic retrieval for high recall and precision.
Hybrid search blends multiple retrieval methods in a single query—typically lexical (keyword) and semantic (vector/embedding) search—to return results that are both precise and comprehensive. Instead of choosing one approach, you combine signals so users get exact matches for explicit terms plus conceptually related content for intent. This page focuses specifically on the match parameter of the query object. For a higher‑level overview of how search works in Clinia, see Search Parameters.
Hybrid search is most effective when your data contains both well‑structured fields (names, codes, IDs) and unstructured text (notes, descriptions), or when queries mix exact terms and fuzzy natural language.
Broader recall for natural language. Vectors capture synonyms and intent (“heart attack” ≈ “myocardial infarction”).
Higher precision for critical terms. Lexical matching anchors on exact fields (identifiers, codes, brands, dosages).
Resilience to query drift. If a user’s terms don’t exist verbatim, semantic matching still retrieves plausible candidates, while keyword matching prevents off‑topic drift.
Better ranking quality. Fusing multiple relevance signals typically outperforms either signal alone on heterogeneous data.
Consumer or clinical search where language varies (“stomach bug” vs ICD/LOINC terms).
Knowledge, directory, and chart search where documents blend metadata and narrative.
Safety‑critical workflows where you must honor exact constraints (coverage, specialty, location) yet still support natural‑language discovery.
If your corpus is small, highly structured, and queries are consistent, pure lexical search may suffice. If queries are open‑ended with few exact identifiers, semantic‑only may be competitive. Most real‑world health data benefits from hybrid.
Hybrid queries commonly wrap sub‑queries in boolean operators. Choosing or vs and controls recall vs precision:
Use or to broaden recall. Return results that match either lexical or semantic criteria. Best for exploration or conversational search use cases, where missing a relevant item is more costly than including a few loosely related results that a downstream agent is likely to ignore anyway.
Use and to enforce precision. Require that results satisfy both lexical and semantic conditions (for example, match a specialty code and be semantically similar to the narrative).
Start with or at the query stage for discovery, and combine it with strict filter constraints (coverage, geography) to keep results focused.
Use and when you must guarantee a hard match (e.g., an identifier, specialty, or vocabulary binding) and want semantic relevance inside that slice.
Prefer and for short, ambiguous queries that otherwise yield too many broad semantic matches; prefer or for longer, specific queries where either signal could be sufficient.
Roughly speaking, knn returns a fixed‑size nearest‑neighbor candidate set (top‑K) from the chosen vector field, ordered by semantic similarity to the query vector.
When you combine knn with and, you are filtering that candidate bucket: only items in the knn set that also satisfy the other clauses remain eligible, and their semantic similarity continues to influence ranking alongside any lexical scores.
With or, you take the union of candidates from all clauses and blend ranking signals, so results that satisfy both semantic and lexical conditions typically rank higher than those matching only one.
Top‑level filter constraints gate eligibility; they do not add positive score.
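The candidate-set behavior described above can be sketched in plain Python. This is a minimal illustration, not the engine's implementation: the Doc fields, the precomputed similarity values, the additive score blend, and the choice to apply the filter before candidate selection are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    specialty: str     # field checked by the lexical clause
    accepting: bool    # field checked by the top-level filter
    similarity: float  # precomputed similarity to the query vector (made up)


def knn(docs, k):
    """Return the k docs most similar to the query: a fixed-size candidate set."""
    return sorted(docs, key=lambda d: d.similarity, reverse=True)[:k]


def hybrid(docs, specialty, k, operator):
    # The filter only gates eligibility; it contributes nothing to the score.
    # Applying it before knn is an assumption of this sketch.
    eligible = [d for d in docs if d.accepting]

    semantic = {d.doc_id for d in knn(eligible, k)}
    lexical = {d.doc_id for d in eligible if d.specialty == specialty}

    if operator == "and":
        candidates = semantic & lexical  # keep only knn hits that also match
    elif operator == "or":
        candidates = semantic | lexical  # union of both candidate sets
    else:
        raise ValueError("operator must be 'and' or 'or'")

    similarity = {d.doc_id: d.similarity for d in eligible}

    def score(doc_id):
        # Hypothetical blend: semantic similarity (if in the knn set)
        # plus a flat 1.0 for a lexical hit.
        sem = similarity[doc_id] if doc_id in semantic else 0.0
        lex = 1.0 if doc_id in lexical else 0.0
        return sem + lex

    # Sort by descending score, breaking ties by id for determinism.
    return sorted(candidates, key=lambda i: (-score(i), i))


DOCS = [
    Doc("a", "cardiology", True, 0.92),
    Doc("b", "internal-medicine", True, 0.88),
    Doc("c", "cardiology", True, 0.40),
    Doc("d", "cardiology", False, 0.95),  # removed by the filter
    Doc("e", "dermatology", True, 0.10),
]

and_results = hybrid(DOCS, "cardiology", k=2, operator="and")
or_results = hybrid(DOCS, "cardiology", k=2, operator="or")
print(and_results)  # ['a']
print(or_results)   # ['a', 'c', 'b']
```

With k=2, document c matches the specialty but falls outside the nearest-neighbor bucket, so and drops it while or keeps it. Document d never becomes a candidate, despite having the highest similarity, because the filter excludes it.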
The examples below show explicit query objects with operators and values written out manually. In a real app, when users type natural language, you may be better served by a semanticQuery. When specifying operators manually, you need to already know “what” you are looking for. The knn operator will work best with good match operators and filter constraints that you already know.
Mix consumer phrasing with structured constraints; maximize recall.
Patients rarely use taxonomy codes. They type “heart doctor near me that takes Blue Cross.” Hybrid search lets you keep hard eligibility filters (plan, distance, status) while still understanding lay language via semantic similarity, reducing zero‑result queries and time‑to‑find.
Using or here increases recall by allowing semantic matches on the bio property even if the specialty term is not explicitly mentioned.
In practice, providers that match both clauses (specialty + bio) will rank higher, but more candidates are considered. This is especially useful when there are very few exact matches for the specialty term alone.
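As a rough illustration of the query described above, the sketch below assembles a request body as a Python dictionary. Only the names query, filter, match, knn, or, and and come from this page; the nesting, the specialty and bio field names, and the shape of the filter are hypothetical placeholders rather than a confirmed schema, so check the API reference before relying on them.

```python
import json


def build_hybrid_query(operator, specialty, bio_text, filters):
    """Assemble a hypothetical hybrid request body.

    The nesting and all field names are illustrative assumptions,
    not a confirmed schema.
    """
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    return {
        "query": {
            operator: [
                {"match": {"specialty": specialty}},  # lexical clause
                {"knn": {"bio": bio_text}},           # semantic clause
            ]
        },
        # Placeholder filter shape: hard eligibility constraints only.
        "filter": filters,
    }


request_body = build_hybrid_query(
    operator="or",
    specialty="cardiology",
    bio_text="heart doctor",
    filters={"plan": "blue-cross", "status": "active"},
)
print(json.dumps(request_body, indent=2))
```

Keeping the operator as a parameter makes it easy to compare the broader or form with the stricter and form against the same clauses and filters.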
Set up
This example uses a minimal provider directory dataset with vectorized bios.
The and operator ensures that all results match the requested specialty code (match operator), while knn finds the most semantically relevant providers within that set.
Using and here gives us precision at the cost of some recall (semantically related specialties will be excluded).
In referral routing, you must honor exact specialty taxonomy codes for compliance and payer rules. Hybrid search narrows results to the right specialty first, then uses semantic signals to surface the most relevant providers within that compliant set.
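The precision and recall trade-off of and can be seen in a small, self-contained sketch. The provider records, the placeholder specialty codes, and the three-dimensional bio embeddings below are invented for illustration; a real system would use model-generated vectors and the engine's own ranking.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_by_similarity(candidates, query_vector):
    """Return provider ids ordered by descending similarity to the query."""
    ranked = sorted(
        candidates, key=lambda p: cosine(p[2], query_vector), reverse=True
    )
    return [p[0] for p in ranked]


def route_referral(providers, specialty_code, query_vector):
    """Keep only the exact specialty code, then rank that slice semantically."""
    in_specialty = [p for p in providers if p[1] == specialty_code]
    return rank_by_similarity(in_specialty, query_vector)


# (id, placeholder specialty code, toy bio embedding)
PROVIDERS = [
    ("p1", "SPEC-CARD", [0.9, 0.1, 0.0]),
    ("p2", "SPEC-CARD", [0.2, 0.9, 0.1]),
    ("p3", "SPEC-INTV-CARD", [1.0, 0.0, 0.0]),  # related specialty, other code
    ("p4", "SPEC-DERM", [0.0, 0.1, 0.9]),
]
QUERY_VECTOR = [1.0, 0.0, 0.0]

routed = route_referral(PROVIDERS, "SPEC-CARD", QUERY_VECTOR)
semantic_only = rank_by_similarity(PROVIDERS, QUERY_VECTOR)
print(routed)            # ['p1', 'p2']
print(semantic_only[0])  # p3
```

Provider p3 is the closest semantic match but carries a different code, so the and form excludes it. That is the recall given up in exchange for a guaranteed specialty match.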
Combine precise mentions with conceptual similarity over structured or unstructured content.
Clinical content and biomedical literature mix attributes that carry canonical entities (titles, tags, concepts, related terms, acronyms, etc.) with attributes that contain answers and insights, often relying on varying appellations and synonyms to convey meaning. Hybrid search lets you catch these exact entity mentions while retrieving conceptually related content, improving answer quality by balancing recall and precision.
Set up
This example uses a lightweight knowledge base with chunked content.