It is important to be able to show your user why a certain result is relevant to their search.
Highlighting can fill this role in a few different ways.
Highlights in a vector search context only make sense if the search is done on arrays (usually segmented symbol properties).
The highlight returns the most relevant array items along with their scores, while the resource property
returns the entire property content. This allows you to show the user which part of the array
matched the query.

Highlights are returned for the three most relevant items of every matched resource. Matched
resources are found through HNSW (Hierarchical Navigable Small World) search.
Consider the threshold an implementation detail: it may change from model to model, and Clinia takes
care of fine-tuning it to return relevant results.

The score varies between 0 and 1. It is the cosine similarity between the passage and the knn operator value.
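To make the score more concrete, here is a minimal sketch of cosine similarity in plain Python. The function name and the toy vectors are illustrative assumptions; the actual embeddings, and any normalization applied before a score is returned, are not described in this guide.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for a query and a passage embedding.
query_vector = [0.2, 0.7, 0.1]
passage_vector = [0.25, 0.6, 0.2]
print(round(cosine_similarity(query_vector, passage_vector), 4))
```

Vectors pointing in the same direction score 1 and orthogonal vectors score 0, which is why a higher score indicates a passage closer to the query.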
// Example highlighting property of a query hit where path.to.property is an array property
{
  "path.to.property": [
    {
      "data": "passage matched by the query",
      "path": "path.to.property.0", // where 0 is the index of the passage when highlighting a segmented property
      "score": 0.9, // score range can vary from model to model, use this in relation to other highlights
      "type": "vector" // discriminator against textual highlights
    }
  ]
}
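On the client side, the type discriminator and the trailing index in path are typically the two fields needed to map a highlight back to an array item. The sketch below shows one way to read them in Python. The helper names are hypothetical, the sample object only mirrors the example above, and the textual entry is included on the assumption that both kinds of highlight may appear for the same property.

```python
from typing import Any


def passage_index(path: str) -> int:
    """Return the trailing array index of a highlight path such as 'a.b.0'."""
    return int(path.rsplit(".", 1)[-1])


def vector_highlights(
    highlighting: dict[str, list[dict[str, Any]]], prop: str
) -> list[dict[str, Any]]:
    """Keep only vector highlights for one property, best score first."""
    items = [h for h in highlighting.get(prop, []) if h.get("type") == "vector"]
    return sorted(items, key=lambda h: h["score"], reverse=True)


# Sample shaped like the example above (values are illustrative).
highlighting = {
    "path.to.property": [
        {
            "data": "passage matched by the query",
            "path": "path.to.property.0",
            "score": 0.9,
            "type": "vector",
        },
        {"highlight": "a <em>textual</em> match", "type": "textual"},
    ]
}

for h in vector_highlights(highlighting, "path.to.property"):
    print(passage_index(h["path"]), h["score"], h["data"])
```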
The following setup is required to run the subsequent code snippets, but you can skip it and still
follow along with the rest of the tutorial if you prefer!

In short, it sets up and ingests data for a profile with three properties:
title: a symbol property
abstract: a symbol property that is segmented into passages and vectorized
content: an array of objects, each object having two symbol properties: sectionTitle and text.
The text property is vectorized.
Required setup
To correctly showcase highlights, we need to set up our environment for a hybrid search.
For the sake of this tutorial, we will be using the prestigious-journal source, the articles profile
and the abstracts partition.
curl -X POST "https://$CLINIA_WORKSPACE/sources/prestigious-journal/v1/resources/bulk" \
  -H "X-Clinia-API-Key: $CLINIA_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "operations": [
      {
        "action": "CREATE",
        "create": {
          "type": "articles",
          "data": {
            "title": "Metabolic Resilience Index: Continuous Multi-Sensor Signatures for Early Detection of Dysmetabolic Risk",
            "abstract": "Researchers proposing an integrated “Metabolic Resilience Index” argue that subtle shifts in glucose variability precede overt fasting hyperglycemia. In their conceptual framework, circadian misalignment amplifies low‑grade inflammation through maladaptive cortisol rhythms. They describe how wearable sensor data—heart rate variability, peripheral temperature, and sleep fragmentation—can triangulate emerging autonomic imbalance. The model asserts that postprandial spikes combined with elevated nocturnal glucose plateaus predict mitochondrial oxidative stress. Mitochondrial efficiency is inferred indirectly via delayed recovery of resting heart rate after mild exertion. The authors layer in gut microbiome diversity metrics, noting decreased short‑chain fatty acid proxy scores in tandem with rising inflammatory cytokine panels. They suggest that composite biomarker clustering outperforms any single lab value for early metabolic syndrome detection. Beta cell “whisper distress” is depicted as a phase where insulin pulsatility dampens before fasting labs appear abnormal. Subjective fatigue ratings correlate with sleep efficiency dips on days of higher glycemic excursions. A proposed dashboard flags when rolling 7‑day variability in glucose exceeds a personalized threshold. Cortisol awakening response flattening is labeled a sentinel of hypothalamic–pituitary axis strain. The narrative links microglial priming to systemic inflammatory tone during chronic circadian disruption. A feedback loop is illustrated where inflammation impairs mitochondrial turnover, worsening energetic flexibility. They emphasize that early intervention windows are missed when clinicians focus solely on annual fasting labs. Proposed interventions include light timing hygiene and meal distribution realignment rather than immediate pharmacology. The concept paper also mentions that residual post-lunch glycemic tails predict evening cravings. A pilot simulation shows that reducing late eating narrows nocturnal glucose variability bands. The framework highlights that continuous metrics enable detection of inflection points, not just static abnormalities. An uncertainty layer is added to prevent overconfidence in noisy wearable signals. Finally, the authors call for federated learning to refine the composite inflammation and glucose variability signatures across diverse populations.",
            "content": [
              {
                "sectionTitle": "Methods",
                "text": "We conducted a longitudinal study involving 500 participants monitored over 12 months using wearable sensors that tracked heart rate variability, glucose levels, sleep patterns, and physical activity."
              },
              {
                "sectionTitle": "Methods",
                "text": "Data were collected in real-time and analyzed using machine learning algorithms to identify patterns indicative of metabolic resilience or vulnerability."
              },
              {
                "sectionTitle": "Results",
                "text": "Preliminary findings indicate that individuals with lower MRI scores exhibited higher variability in glucose levels and reduced heart rate variability, correlating with increased inflammatory markers."
              },
              {
                "sectionTitle": "Conclusion",
                "text": "Notably, these changes were detectable weeks before traditional clinical indicators of metabolic syndrome appeared."
              }
            ]
          }
        }
      }
    ]
  }'
Wait for ingestion to complete
Look at the Task API guide to better understand how
to poll for task status. This is not done here since it cannot be expressed as a single curl command.
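As a rough illustration of what polling involves, the sketch below repeatedly calls a status function until a caller-supplied predicate reports that the task has finished. It deliberately does not call the Task API: fetch_status, is_done, and the status strings are hypothetical placeholders, so refer to the Task API guide for the real endpoint and status values.

```python
import time
from typing import Callable


def wait_until_done(
    fetch_status: Callable[[], str],
    is_done: Callable[[str], bool],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
) -> str:
    """Poll fetch_status until is_done(status) is true or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if is_done(status):
            return status
        # Give up if the next sleep would take us past the deadline.
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"task still '{status}' after {timeout_s}s")
        time.sleep(interval_s)


# Stand-in for a real status call: reports "pending" twice, then "done".
_fake_statuses = iter(["pending", "pending", "done"])
final = wait_until_done(
    lambda: next(_fake_statuses), lambda s: s == "done", interval_s=0.01
)
print(final)
```

In a real client, fetch_status would wrap the HTTP request for the ingestion task and is_done would check for whichever terminal statuses the Task API documents.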
Vector search using the knn operator on the segmented and vectorized abstract.passages.vector property.
One highlight is returned for each of the three most relevant (highest-scoring) passages.
// response shortened to .hits[0].highlighting
{
  "highlighting": {
    "abstract.passages": [
      {
        "data": "Researchers proposing an integrated “Metabolic Resilience Index” argue that subtle shifts in glucose variability precede overt fasting hyperglycemia. In their conceptual framework, circadian misalignment amplifies low‑grade inflammation through maladaptive cortisol rhythms. They describe how wearable sensor data—heart rate variability, peripheral temperature, and sleep fragmentation—can triangulate emerging autonomic imbalance. The model asserts that postprandial spikes combined with elevated nocturnal glucose plateaus predict mitochondrial oxidative stress. Mitochondrial efficiency is inferred indirectly via delayed recovery of resting heart rate after mild exertion. The authors layer in gut microbiome diversity metrics, noting decreased short‑chain fatty acid proxy scores in tandem with rising inflammatory cytokine panels. They suggest that composite biomarker clustering outperforms any single lab value for early metabolic syndrome detection. Beta cell “whisper distress” is depicted as a phase where insulin pulsatility dampens before fasting labs appear abnormal. Subjective fatigue ratings correlate with sleep efficiency dips on days of higher glycemic excursions. A proposed dashboard flags when rolling 7‑day variability in glucose exceeds a personalized threshold. Cortisol awakening response flattening is labeled a sentinel of hypothalamic–pituitary axis strain. The narrative links microglial priming to systemic inflammatory tone during chronic circadian disruption. A feedback loop is illustrated where inflammation impairs mitochondrial turnover, worsening energetic flexibility.",
        "path": "abstract.passages.0",
        "score": 0.9013,
        "type": "vector"
      },
      {
        "data": " Proposed interventions include light timing hygiene and meal distribution realignment rather than immediate pharmacology. The concept paper also mentions that residual post-lunch glycemic tails predict evening cravings. A pilot simulation shows that reducing late eating narrows nocturnal glucose variability bands. The framework highlights that continuous metrics enable detection of inflection points, not just static abnormalities. An uncertainty layer is added to prevent overconfidence in noisy wearable signals. Finally, the authors call for federated learning to refine the composite inflammation and glucose variability signatures across diverse populations.",
        "path": "abstract.passages.2",
        "score": 0.8862,
        "type": "vector"
      },
      {
        "data": " They emphasize that early intervention windows are missed when clinicians focus solely on annual fasting labs.",
        "path": "abstract.passages.1",
        "score": 0.8699,
        "type": "vector"
      }
    ]
  }
}
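The response above lists highlights by descending score (passages 0, 2, then 1). A client that wants to display them in reading order, or to compare each score with the best one, could do something like the following. This is a sketch with hypothetical helper names; the paths and scores are copied from the response, and the data fields are shortened for brevity.

```python
def in_reading_order(highlights: list[dict]) -> list[dict]:
    """Sort highlights by the array index at the end of their path."""
    return sorted(highlights, key=lambda h: int(h["path"].rsplit(".", 1)[-1]))


def relative_scores(highlights: list[dict]) -> list[float]:
    """Express each score as a fraction of the best score in the list."""
    best = max(h["score"] for h in highlights)
    return [h["score"] / best for h in highlights]


# Paths and scores copied from the response above; data shortened.
highlights = [
    {"data": "Researchers proposing ...", "path": "abstract.passages.0", "score": 0.9013, "type": "vector"},
    {"data": " Proposed interventions ...", "path": "abstract.passages.2", "score": 0.8862, "type": "vector"},
    {"data": " They emphasize ...", "path": "abstract.passages.1", "score": 0.8699, "type": "vector"},
]

print([h["path"] for h in in_reading_order(highlights)])
print([round(r, 3) for r in relative_scores(highlights)])
```

Comparing scores within one response avoids relying on absolute values, which may differ from model to model.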
Textual search can also be highlighted in a segmented property, in which case a separate highlight
is returned for every segment that matches the query.
// response shortened to .hits[0].highlighting
{
  "highlighting": {
    "abstract.passages": [
      {
        "highlight": "Researchers proposing an integrated “Metabolic Resilience Index” argue that subtle shifts in <em>glucose</em>",
        "type": "textual"
      },
      {
        "highlight": "The model asserts that postprandial spikes combined with elevated nocturnal <em>glucose</em> plateaus predict",
        "type": "textual"
      },
      {
        "highlight": "A proposed dashboard flags when rolling 7‑day variability in <em>glucose</em> exceeds a personalized threshold",
        "type": "textual"
      },
      {
        "highlight": "A pilot simulation shows that reducing late eating narrows nocturnal <em>glucose</em> variability bands.",
        "type": "textual"
      },
      {
        "highlight": "Finally, the authors call for federated learning to refine the composite inflammation and <em>glucose</em> variability",
        "type": "textual"
      }
    ]
  }
}
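Textual highlights wrap each matched term in em tags, as the response above shows. When the highlight is not rendered as HTML, a small helper can extract the matched terms or swap the tags for a plain-text marker. The sketch below uses only Python's standard re module; the helper names and the choice of marker are assumptions.

```python
import re

# Non-greedy match so several <em> spans in one highlight stay separate.
EM_TAG = re.compile(r"<em>(.*?)</em>")


def matched_terms(highlight: str) -> list[str]:
    """Return the text of every <em>...</em> span in a textual highlight."""
    return EM_TAG.findall(highlight)


def to_plain_text(highlight: str, marker: str = "*") -> str:
    """Replace <em> tags with a plain-text marker, e.g. for terminal output."""
    return EM_TAG.sub(lambda m: f"{marker}{m.group(1)}{marker}", highlight)


sample = (
    "A pilot simulation shows that reducing late eating narrows nocturnal "
    "<em>glucose</em> variability bands."
)
print(matched_terms(sample))
print(to_plain_text(sample))
```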
Multiple vector queries on the same Array property
The following query does not work. You cannot request highlights for a vector property that is queried
by two different knn operators since there would be no way to know which highlight corresponds to
which knn operator.
You can also request highlights on an array of objects, as long as your operator matches a textual
property inside the object!

In this example, the content property is an array of objects, each object having two symbol properties: sectionTitle and text.