Pipeline overview
- Single pipeline per collection: Each collection can declare one pipeline. You can sequence as many processors as needed inside it.
- Versioned definitions: Updating a pipeline creates a new version (for example, `clinics.2`). In-flight ingestions finish on the version that was active when they started.
- Deterministic execution: Steps run in order. A failure stops the pipeline and the ingestion fails.
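The versioning rule can be illustrated with a minimal sketch. `PipelineRegistry` and its methods are hypothetical names invented for this example, not part of the Clinia API, and numbering versions from 1 is an assumption. The point is only that an ingestion pins the version it started on while later updates create new versions.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineRegistry:
    """Hypothetical in-memory store of pipeline versions per collection."""

    versions: dict[str, list[list[str]]] = field(default_factory=dict)

    def update(self, collection: str, steps: list[str]) -> str:
        # Every update appends a new version; older versions stay readable.
        self.versions.setdefault(collection, []).append(list(steps))
        return self.latest(collection)

    def latest(self, collection: str) -> str:
        # Assumption: versions are numbered from 1, e.g. "clinics.2".
        return f"{collection}.{len(self.versions[collection])}"

    def steps_for(self, version_id: str) -> list[str]:
        collection, _, number = version_id.rpartition(".")
        return self.versions[collection][int(number) - 1]


registry = PipelineRegistry()
registry.update("clinics", ["segmenter"])

# An ingestion starts and pins the version that is active right now.
pinned = registry.latest("clinics")

# The pipeline is updated while that ingestion is still in flight.
registry.update("clinics", ["segmenter", "ocr"])

# The in-flight ingestion still resolves the steps of its pinned version.
in_flight_steps = registry.steps_for(pinned)
```

New ingestions would call `latest` again and pick up the updated definition.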
Work in progress
Pipeline creation APIs are rolling out. Contact Clinia support if you need early access. You can already monitor executions.
Dataflow through the pipeline
- Ingestion request hits a source (`bulk`, `bundle`, or single writes).
- Pipeline dispatch loads the latest pipeline version configured on the target collection.
- Processor chain runs sequentially. Each step can:
- Enrich the payload (segmenters, vectorizers)
- Mutate properties (address augmentation, Clinia functions, OCR)
- Validate intermediate results before continuing
- Default schema validation executes after all processors to ensure the resource still complies with its profile.
- Persistence writes the transformed data into the registry and emits receipts for observability.
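The flow above can be sketched as a sequential runner. This is an illustration under stated assumptions, not Clinia's implementation: `run_pipeline`, `PipelineError`, and the two sample functions are hypothetical names, and payloads are modeled as plain dictionaries.

```python
from typing import Callable

Payload = dict
Processor = Callable[[Payload], Payload]


class PipelineError(Exception):
    """Raised when a step or the final validation fails (hypothetical type)."""


def run_pipeline(
    payload: Payload,
    processors: list[tuple[str, Processor]],
    validate_schema: Callable[[Payload], None],
) -> Payload:
    current = dict(payload)
    # Steps run strictly in order; the first failure stops the whole run.
    for name, processor in processors:
        try:
            current = processor(current)
        except Exception as exc:
            raise PipelineError(f"step '{name}' failed: {exc}") from exc
    # The default schema validation runs once, after every processor.
    try:
        validate_schema(current)
    except Exception as exc:
        raise PipelineError(f"default schema validation failed: {exc}") from exc
    return current


def add_segments(payload: Payload) -> Payload:
    # Stand-in for an enrichment step such as a segmenter.
    return {**payload, "segments": payload["text"].split(". ")}


def require_name(payload: Payload) -> None:
    # Stand-in for validating the resource against its profile.
    if "name" not in payload:
        raise ValueError("missing required property 'name'")


result = run_pipeline(
    {"name": "Clinic A", "text": "Open daily. Walk-ins welcome"},
    [("segmenter", add_segments)],
    require_name,
)
```

Persistence is left out of the sketch; in this model it would happen only after `run_pipeline` returns without raising.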
Creating a pipeline
Triggers
Triggers control whether a processor runs for a specific payload. Trigger types include:
- Operator trigger
- Always trigger
Set `onlyOnTriggeredPipeline` to true when a processor should run only if a prior step has already executed (for example, conditional validation).
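One way to read the `onlyOnTriggeredPipeline` rule is sketched below. The real trigger syntax is not shown in this section, so everything here is an assumption made for illustration: a trigger is modeled as a predicate over the payload, and `Step`, `always`, and `select_steps` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    trigger: Callable[[dict], bool]
    # Mirrors the onlyOnTriggeredPipeline flag described above.
    only_on_triggered_pipeline: bool = False


def always(_payload: dict) -> bool:
    """Models an always trigger: the step qualifies for every payload."""
    return True


def select_steps(payload: dict, steps: list[Step]) -> list[str]:
    """Return the names of the steps that would run, in order."""
    ran: list[str] = []
    for step in steps:
        # A gated step is skipped unless some earlier step already executed.
        if step.only_on_triggered_pipeline and not ran:
            continue
        if step.trigger(payload):
            ran.append(step.name)
    return ran


steps = [
    # Models an operator-style trigger as a condition on the payload.
    Step("ocr", lambda p: p.get("hasScan", False)),
    Step("validate-ocr", always, only_on_triggered_pipeline=True),
]

with_scan = select_steps({"hasScan": True}, steps)
without_scan = select_steps({}, steps)
```

In this model the conditional validator runs only when the OCR step ran first, which matches the conditional validation case mentioned above.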
Schema validation
- Pipelines include an automatic validation pass at the end of the chain.
- Add explicit Schema Validator steps earlier to fail fast before expensive processing or to validate post-mutation states.
Monitoring pipelines
Use the pipeline execution APIs to audit and debug ingestion flows:
- Get pipeline execution for a specific execution ID.
- Query pipeline executions to build dashboards or human-in-the-loop queues.
Set `withOperationBody=true` if you need to inspect the payload that triggered the execution.
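As a rough sketch, the `withOperationBody` flag can be attached as a query parameter when building the request URL. Only the parameter name comes from the text above. The base URL, the `/pipeline-executions/` path segment, and the `execution_url` helper are placeholders invented for this example, so check the API reference for the real route.

```python
from urllib.parse import urlencode


def execution_url(
    base_url: str, execution_id: str, with_operation_body: bool = False
) -> str:
    # Placeholder path: the actual route is not given in this section.
    url = f"{base_url.rstrip('/')}/pipeline-executions/{execution_id}"
    if with_operation_body:
        # Ask for the payload that triggered the execution.
        url += "?" + urlencode({"withOperationBody": "true"})
    return url


url = execution_url("https://example.invalid/", "exec-1", with_operation_body=True)
```

Leaving the flag off by default keeps responses small when only the execution status is needed.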
Next steps:
- See built-in processors for enrichment, validation, and OCR options.
- Explore custom processors to extend the pipeline with bespoke logic.