May 13th, 2025
Added Registry Ingestion Pipelines and Task API
Task API
Create, update and delete requests to the registry generate and return a task ID specific to your individual request. Using this task ID, you can retrieve the task with the GetTask endpoint to get details about its overall status. Use the withReceipts=true query parameter to include the underlying operation receipts.
Task Status
A task status is one of the following:
- SUCCESS indicates the task and its underlying operations have completed successfully, and the operations were either persisted to source or dropped by a REJECT action of the user.
- PENDING indicates at least one operation linked to the task is still being processed by the system. This also includes tasks that need a manual action (IN_REVIEW). Even if the task is marked as PENDING, individual operations can be marked as SUCCESS.
- FAILURE indicates the task and its underlying operations have finished processing but at least one operation failed or was cancelled.
- CANCELLED indicates that all the operations in the task were cancelled due to a concurrent operation.
Operation Status
An operation receipt status is one of the following:
- SUCCESS indicates the operation has completed successfully and was either persisted to source or dropped by a REJECT action of the user.
- PENDING indicates the operation is actively being processed by the system.
- FAILURE indicates the operation failed to be fully processed and was not persisted to source.
- CANCELLED indicates the operation was cancelled by the system and was not persisted to source.
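One way to read the task status rules is as a function of the operation receipt statuses. The sketch below is only an illustration of that reading, not the registry's actual implementation: the function name, the order in which the rules are checked, and the handling of an empty task are assumptions.

```python
from enum import Enum
from typing import Iterable, Union


class Status(str, Enum):
    SUCCESS = "SUCCESS"
    PENDING = "PENDING"
    FAILURE = "FAILURE"
    CANCELLED = "CANCELLED"


def derive_task_status(operation_statuses: Iterable[Union[Status, str]]) -> Status:
    """Aggregate operation statuses into a task status (illustrative only)."""
    statuses = [Status(s) for s in operation_statuses]
    if not statuses:
        # Assumption: a task always has at least one operation.
        raise ValueError("a task needs at least one operation")
    # Any operation still being processed keeps the whole task PENDING.
    if Status.PENDING in statuses:
        return Status.PENDING
    # Every operation was cancelled: the task itself is CANCELLED.
    if all(s is Status.CANCELLED for s in statuses):
        return Status.CANCELLED
    # Processing finished, but at least one operation failed or was cancelled.
    if any(s in (Status.FAILURE, Status.CANCELLED) for s in statuses):
        return Status.FAILURE
    return Status.SUCCESS
```

For example, a task with one SUCCESS operation and one PENDING operation stays PENDING under this reading, which matches the remark that individual operations can succeed while the task is still pending.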
Operation Pipeline Execution
When an operation triggers the ingestion pipeline, you will see a pipelineExecution object in the associated operation receipt. It contains:
- The id of the pipeline execution. Use this ID to retrieve the full pipeline execution using the GetPipelineExecution endpoint.
- The status of the execution.
- The pipelineDefinitionId of the execution, indicating which collection pipeline the execution is a part of.
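As a rough sketch of how a client might use these fields, the helper below collects the pipeline execution IDs from a list of operation receipts that has already been parsed from JSON. The field names pipelineExecution, id, status and pipelineDefinitionId come from the description above. The assumption that receipts without a pipeline execution simply omit the object, and all sample values, are illustrative.

```python
from typing import Any, Dict, List


def pipeline_execution_ids(receipts: List[Dict[str, Any]]) -> List[str]:
    """Collect the pipeline execution IDs found in operation receipts."""
    ids = []
    for receipt in receipts:
        # Assumption: operations that never entered the pipeline
        # carry no pipelineExecution object at all.
        execution = receipt.get("pipelineExecution")
        if execution is not None:
            ids.append(execution["id"])
    return ids


# Made-up sample receipts, for illustration only.
receipts = [
    {
        "status": "PENDING",
        "pipelineExecution": {
            "id": "exec-1",
            "status": "IN_REVIEW",
            "pipelineDefinitionId": "def-1",
        },
    },
    {"status": "SUCCESS"},
]
print(pipeline_execution_ids(receipts))  # prints ['exec-1']
```

Each returned ID can then be passed to the GetPipelineExecution endpoint.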
Pipeline Execution API
When an operation enters the ingestion pipeline, a pipeline execution is created. You can retrieve pipeline executions using the GetPipelineExecution endpoint.
Pipeline Execution Status
A pipeline execution status is one of the following:
- SUCCESSFUL indicates the pipeline execution completed successfully. Every step of the pipeline was executed successfully and the change has been persisted.
  Note: Completing successfully does not indicate that the data is valid. It only means that the pipeline successfully executed all of its steps.
- ERROR indicates the pipeline execution failed at a step due to a system error.
- IN_PROGRESS indicates the pipeline execution is being processed by the system.
- IN_REVIEW indicates the pipeline execution is waiting for a manual action; the entity is sitting in a queue.
- CANCELLED indicates the pipeline execution has been short-circuited by the system and never completed. This is usually the result of a concurrent request arriving for the same record/relationship.
- DROPPED indicates the pipeline execution has been dropped because the change was rejected by the user following review.
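The statuses above split into those that can still change (IN_PROGRESS, IN_REVIEW) and those that are final. A minimal polling sketch under that reading follows. The fetch callable is a hypothetical stand-in for however you call the GetPipelineExecution endpoint, and the interval and attempt limits are arbitrary defaults.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which the execution is not expected to change again.
TERMINAL_STATUSES = {"SUCCESSFUL", "ERROR", "CANCELLED", "DROPPED"}


def wait_for_execution(
    fetch: Callable[[str], Dict[str, Any]],
    execution_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Poll a pipeline execution until it reaches a terminal status.

    `fetch` is a hypothetical function that returns the parsed pipeline
    execution for the given ID.
    """
    for attempt in range(max_attempts):
        execution = fetch(execution_id)
        if execution["status"] in TERMINAL_STATUSES:
            return execution
        # IN_REVIEW waits on a person, so it may stay non-terminal for long.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"execution {execution_id} did not finish in time")
```

Because IN_REVIEW depends on a manual action, a real client would likely use a much longer interval, or stop polling and come back later, once that status is observed.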
Pipeline Execution Operation
The operation at the origin of this pipeline execution. It contains:
- The action that was performed, either CREATE, UPSERT, UPDATE or DELETE.
- The target record or relationship.
- The associated taskId.
Pipeline Definition
The pipelineDefinitionId contains the type of the pipeline execution and its revision.
Pipeline Execution Steps
Pipeline execution steps are populated when an execution is created. This means you can determine the progression of a pipeline execution at any time. Note that these steps do not include entity resolution and the associated resolution queue, which happen after a record has been persisted to the source. Individual pipeline steps are presented in the same order they are configured in the pipeline definition. They contain:
- The instant at which they were processed (processedAt)
- The status of the step. It is one of the following:
  - PROCESSED indicates the step has been processed.
  - ERROR indicates the step has failed due to a system failure. The step will contain an error object when that is the case.
  - IN_PROGRESS indicates the step is being processed by the system.
  - IN_REVIEW indicates the step is waiting for a manual action.
  - CANCELLED indicates the step has been cancelled. This is usually the result of a concurrent request arriving for the same record/relationship.
  - PENDING indicates the step evaluation has not yet happened.
  - NOT_TRIGGERED indicates the record/relationship did not match the trigger for that step, and therefore the step was skipped.
- For manual review steps, a decision object containing the following information:
  - An id for the decision
  - A description of the decision, or the reason for the decision
  - The author of the decision
  - The instant at which the decision was taken (processedAt)
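Because steps are listed in their configured order and all exist from the start, the progression of an execution can be read by scanning for the first step that is neither processed nor skipped. The helper below is a small sketch of that idea. The status values come from the list above, while the name key used in the sample data is hypothetical.

```python
from typing import Any, Dict, List, Optional

# Steps that need no further work: either done or skipped by their trigger.
SETTLED = {"PROCESSED", "NOT_TRIGGERED"}


def current_step(steps: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the first step that is not settled, or None if all are."""
    for step in steps:
        if step["status"] not in SETTLED:
            return step
    return None


# Made-up sample steps; "name" is a hypothetical field used for readability.
steps = [
    {"name": "validate", "status": "PROCESSED"},
    {"name": "enrich", "status": "NOT_TRIGGERED"},
    {"name": "review", "status": "IN_REVIEW"},
    {"name": "persist", "status": "PENDING"},
]
blocking = current_step(steps)
```

In this sample, blocking is the review step, which tells you the execution is waiting on a manual decision.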
How Requests Are Processed
Single Record/Relationship Request
List of endpoints: Create Resource, Upsert Resource, Delete Resource, Create Relationship, Update Relationship, Delete Relationship
Single record/relationship CREATE and UPDATE requests are processed independently from one another. Each request results in a single task being created by the system.
- If the record/relationship matches any of the ingestion pipeline step triggers, an ACCEPTED response is returned by the system and the record/relationship will be processed asynchronously. You will receive a task ID with which you can query for the task corresponding to this request.
- If the record/relationship does not match any of the ingestion pipeline step triggers, a PERSISTED response is returned by the system, indicating the record/relationship has been processed synchronously. You will still receive a task ID with which you can query for the task corresponding to this request.
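On the client side, the two cases above mostly differ in whether there is anything left to wait for. The sketch below shows one way to branch on the response. It assumes the parsed response body exposes the status and the task ID under the keys status and taskId; the taskId key name for single-request responses is an assumption, so check the actual payload.

```python
from typing import Any, Dict, NamedTuple


class WriteOutcome(NamedTuple):
    task_id: str
    asynchronous: bool  # True when the task still has to be polled


def handle_write_response(body: Dict[str, Any]) -> WriteOutcome:
    """Interpret the body of a single create/update response."""
    status = body["status"]
    task_id = body["taskId"]  # assumed key name
    if status == "ACCEPTED":
        # The record entered the ingestion pipeline: follow up with the task.
        return WriteOutcome(task_id, asynchronous=True)
    if status == "PERSISTED":
        # Processed synchronously: nothing left to wait for.
        return WriteOutcome(task_id, asynchronous=False)
    raise ValueError(f"unexpected status: {status!r}")
```

A caller would typically start polling the task only when asynchronous is true, and keep the task ID for auditing in both cases.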
DELETE requests do not create pipeline executions, but they are still assigned a task ID the same way CREATE and UPDATE requests are.
Concurrent Requests
Only one active pipeline execution can exist for a specific record at any given time. If a record is currently sitting in a queue and another update to the same record is sent to the system, the initial pipeline execution will be cancelled by the system in favor of the second.
Bulk Request
List of endpoints: Bulk Resources, Bulk Relationships
Bulk requests are processed as a whole, but each operation is processed independently from the others. A bulk request results in a single task being created by the system. That task is associated with multiple pipeline executions, one for every CREATE and UPDATE operation.
Bulk request processing is always asynchronous. An ACCEPTED response is returned by the system and you will receive a task ID with which you can query for the task corresponding to this request. Even if the taskId and bulkId are the same, use the taskId. The bulkId will eventually be deprecated.
Some operations can be marked as SUCCESS while others remain in the ingestion pipeline. When an operation is marked as SUCCESS, it has been persisted to source. As long as at least one operation remains in the ingestion pipeline, the bulk task status will remain PENDING.
DELETE operations do not result in a pipeline execution, but they are still assigned a task ID like the CREATE and UPDATE operations.
Concurrent Requests
Only one active pipeline execution can exist for a specific record at any given time. If a record is currently sitting in a queue and another update to the same record is sent to the system, the initial pipeline execution will be cancelled by the system in favor of the second.
Since bulk operations from a single bulk request are independent from one another, this doesn’t affect other operations from the same bulk request if they do not target the same record.
Bundle Request
List of endpoints: Bundle
Bundle requests are processed as a whole, and each operation needs to successfully exit the ingestion pipeline for the bundle to be persisted. A bundle request results in a single task being created by the system. That task is associated with multiple pipeline executions, one for every record’s CREATE and UPDATE operations.
- If any record/relationship matches any of the ingestion pipeline step triggers, an ACCEPTED response is returned by the system and the records/relationships will be processed asynchronously. You will receive a task ID with which you can query for the task corresponding to this request.
- If no record/relationship matches any of the ingestion pipeline step triggers, a PERSISTED response is returned by the system, indicating the records/relationships have been processed synchronously. You will still receive a task ID with which you can query for the task corresponding to this request.
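The two cases above amount to an "any match" rule over the whole bundle: a single matching record is enough to make the entire request asynchronous. The sketch below models that rule only. The matches_trigger predicate and the needsReview flag in the usage lines are hypothetical stand-ins for the pipeline's step triggers, which are evaluated by the system and not by the client.

```python
from typing import Any, Callable, Dict, Iterable


def bundle_response_status(
    operations: Iterable[Dict[str, Any]],
    matches_trigger: Callable[[Dict[str, Any]], bool],
) -> str:
    """Model which response status a bundle request would receive."""
    # One matching record/relationship makes the whole bundle asynchronous.
    if any(matches_trigger(operation) for operation in operations):
        return "ACCEPTED"
    return "PERSISTED"


# Hypothetical trigger: records flagged as needing review enter the pipeline.
needs_review = lambda operation: operation.get("needsReview", False)
status = bundle_response_status(
    [{"id": "a"}, {"id": "b", "needsReview": True}], needs_review
)
```

Here status is ACCEPTED because the second operation matches, even though the first one would have been persisted synchronously on its own.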
ACCEPTED instead of PERSISTED.
PENDING.
DELETE operations do not result in a pipeline execution, but they are still assigned a task ID like the CREATE and UPDATE operations.
Concurrent Requests
Only one active pipeline execution can exist for a specific record at any given time. If a record is currently sitting in a queue and another update to the same record is sent to the system, the initial pipeline execution will be cancelled by the system in favor of the second.
Since all operations of a bundle depend on one another to succeed, cancelling the pipeline execution of any bundle operation will result in the entire bundle being dropped. This means a concurrent request to a record/relationship that is part of a bundle will short-circuit the execution of the entire bundle if it is still PENDING.
April 22nd, 2025
Registry APIs now support asynchronous processing with status indicators and global task IDs for workflows.
Breaking Changes
New status for registry operations
The behaviour of the Registry APIs has changed to support asynchronous processes. The response payload for all POST, PUT and DELETE endpoints will now include a status property. Depending on the status of the operation, a different response code will be returned.
- ACCEPTED indicates that the request was captured and an asynchronous workflow has started. This returns a 202 HTTP response code.
- PERSISTED indicates that the request is completed and the data has been persisted. This returns a 201 HTTP response code when a record is created and 200 when a record has been updated or deleted.
Method | Endpoint |
---|---|
POST | /sources/{sourceKey}/v1/resources/{resourceType} |
PUT | /sources/{sourceKey}/v1/resources/{resourceType}/{id} |
DELETE | /sources/{sourceKey}/v1/resources/{resourceType}/{id} |
POST | /sources/{sourceKey}/v1/relationships/{relationshipType} |
PATCH | /sources/{sourceKey}/v1/relationships/{relationshipType}/{id} |
DELETE | /sources/{sourceKey}/v1/relationships/{relationshipType}/{id} |
PUT | /sources/{sourceKey}/v1/resources/{resourceType}/{id}/{nestedContainsPath} |
POST | /sources/{sourceKey}/v1/resources/{resourceType}/{id}/{nestedContainsPath} |
DELETE | /sources/{sourceKey}/v1/resources/{resourceType}/{id}/{nestedContainsPath} |
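The relationship between the status property and the HTTP response code described in this section can be summarised in a few lines. This is a sketch for client-side checks such as test assertions. The function name and the created flag are illustrative and not part of the API.

```python
def expected_http_code(status: str, created: bool = False) -> int:
    """Map a registry response status to the HTTP code described above.

    `created` only matters for PERSISTED: True when the request created
    a record, False when it updated or deleted one.
    """
    if status == "ACCEPTED":
        return 202  # captured, asynchronous workflow started
    if status == "PERSISTED":
        return 201 if created else 200
    raise ValueError(f"unknown status: {status!r}")
```

For instance, a synchronous delete is expected to answer 200 with a PERSISTED status, while any request that enters an asynchronous workflow answers 202.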
Bulk and purge responses now include a taskId property in addition to their workflow IDs. The IDs will be identical. The bulkId and purgeId properties are planned to be deprecated in future versions, so you should migrate to using the taskId property whenever possible. For now, both the bulkId and purgeId will live side-by-side with a global taskId.
New payload for single registry operations
When creating, upserting or updating a single resource or relationship, the record data is now nested within a resource object in the response payload instead of being at the root level.
Updated format for partition record source identifiers
A partition record often contains source identifiers in the meta.identifier section, pointing back to the source records that make it up. The previous convention for those identifiers was the following:
Updated format for relationship primary ID and partition relationship source identifiers
Source and Partition Relationship ID
A source or partition relationship has a primary ID, located in the id property of a relationship. The previous convention accepted multiple formats for this ID, and they were the following:
fromRecordID and toRecordID are the source record IDs of the from and to source records for this relationship. In the partitions, they are the unified record IDs of the from and to unified records for the relationship.
Partition Relationship Source Identifiers
A partition relationship often contains source identifiers in the meta.identifier section, pointing back to the source relationships that make it up. The previous convention for those identifiers was the following:
Features
- Adds option to use Clinia models in the ingestion pipeline for chunking and embedding data
Improvements
- Improved propagation performance of relationships from source to MDM to partitions
- Improved search performance for data partitions with better indexing
- Clearer error messages for purge operations
- Improved consistency of relationship identifiers in GET endpoints and query filters
January 9th, 2025
Improved Bulk request status management
- Modified how the Registry’s Bulk request status is set so that it remains PENDING until all operations have been processed, instead of switching to FAILURE as soon as a bulk operation fails.