added

Registry Ingestion Pipelines and Task API

The ingestion pipeline enables a multi-step process for ingesting source records and relationships, augmenting, validating, and reviewing incoming data before it is persisted in the system. Ingestion pipelines are powered by asynchronous tasks that are accessible through the new Task API.

Task API

Create, update, and delete requests to the registry generate and return a task ID specific to each request. With this task ID, you can retrieve the task through the GetTask endpoint to get details about its overall status. Add the withReceipts=true query parameter to include the underlying operation receipts.

{
    "receipts": [
        {
            "action": "CREATE",
            "id": "2wil2zeXT0V0wAw25pb9Xqha5qw",
            "pipelineExecution": {
                "id": "2wil31Z26Qht2Zkq3NoVKeK8Ste",
                "pipelineDefinitionId": "clinic.1",
                "status": "IN_REVIEW"
            },
            "receivedAt": "2025-05-06T11:47:51.654Z",
            "status": "PENDING",
            "targetType": "RESOURCE",
            "taskId": "bk_2wil2vg5uqi08x1aet097gLW8DO",
            "type": "clinic"
        },
        {
            "action": "CREATE",
            "completedAt": "2025-05-06T11:47:51.714Z",
            "id": "2wil2yilLFtscmwIxfB5R93d0Wv",
            "receivedAt": "2025-05-06T11:47:51.671Z",
            "status": "SUCCESS",
            "targetType": "RESOURCE",
            "taskId": "bk_2wil2vg5uqi08x1aet097gLW8DO",
            "type": "clinic"
        }
    ],
    "status": "PENDING",
    "taskId": "bk_2wil2vg5uqi08x1aet097gLW8DO"
}
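
As an illustration of how a client might follow a task, the sketch below polls until the task leaves PENDING. It is a minimal sketch under stated assumptions: fetch_task is a hypothetical callable standing in for whatever HTTP call your client makes to the GetTask endpoint, and the polling interval and timeout values are arbitrary.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_task: Callable[[str], dict],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll a task until its status is no longer PENDING.

    fetch_task is a hypothetical wrapper around the GetTask endpoint;
    it must return the decoded JSON body of the task.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_task(task_id)
        if task["status"] != "PENDING":
            return task
        if time.monotonic() >= deadline:
            # A task waiting on a manual review stays PENDING, so stop eventually.
            raise TimeoutError(f"task {task_id} is still PENDING")
        sleep(interval_s)
```

Because a task that is waiting for a manual review remains PENDING, a bounded timeout is likely a better fit than polling indefinitely.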

Task Status

A task status is one of the following:

  1. SUCCESS indicates the task and its underlying operations have completed successfully, and each operation was either persisted to source or dropped by a REJECT action of the user.
  2. PENDING indicates at least one operation linked to the task is still being processed by the system. This also includes tasks that need a manual action (IN_REVIEW). Even if the task is marked as PENDING, individual operations can be marked as SUCCESS.
  3. FAILURE indicates the task and its underlying operations have finished processing but at least one operation failed or was cancelled.
  4. CANCELLED indicates that all the operations in the task were cancelled due to a concurrent operation.

Operation Status

An operation receipt status is one of the following:

  1. SUCCESS indicates the operation has successfully completed and is persisted to source or was dropped by a REJECT action of the user.
  2. PENDING indicates the operation is actively being processed by the system.
  3. FAILURE indicates the operation failed to be fully processed and was not persisted to source.
  4. CANCELLED indicates the operation was cancelled by the system and was not persisted to source.
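
The task status rules can be read as a function of the operation receipt statuses. The sketch below is one reading of the documented rules, not the service's implementation; the function name and the order in which the rules are checked are assumptions.

```python
def derive_task_status(operation_statuses: list[str]) -> str:
    """Combine operation receipt statuses into a task status."""
    if not operation_statuses:
        # Assumption: a task always has at least one operation.
        raise ValueError("expected at least one operation status")
    statuses = set(operation_statuses)
    if "PENDING" in statuses:
        return "PENDING"    # at least one operation is still being processed
    if statuses == {"CANCELLED"}:
        return "CANCELLED"  # every operation was cancelled
    if statuses & {"FAILURE", "CANCELLED"}:
        return "FAILURE"    # finished, but at least one failed or was cancelled
    return "SUCCESS"


print(derive_task_status(["PENDING", "SUCCESS"]))    # PENDING
print(derive_task_status(["SUCCESS", "CANCELLED"]))  # FAILURE
```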

Operation Pipeline Execution

When an operation triggers the ingestion pipeline, you will see a pipelineExecution object in the associated operation receipt. It contains:

  1. The id of the pipeline execution. Use this ID to retrieve the full pipeline execution using the GetPipelineExecution endpoint.
  2. The status of the execution.
  3. The pipelineDefinitionId of the execution, indicating which collection pipeline the execution is a part of.
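
For example, a client could scan the receipts of a task to find the pipeline executions that are waiting for a manual action. This is a minimal sketch; the function name is hypothetical, and it relies only on the receipt fields shown in the task example above.

```python
def executions_awaiting_review(task: dict) -> list[str]:
    """Return the IDs of pipeline executions whose status is IN_REVIEW."""
    ids = []
    for receipt in task.get("receipts", []):
        # Receipts for operations that did not trigger the pipeline have no
        # pipelineExecution object.
        execution = receipt.get("pipelineExecution")
        if execution and execution.get("status") == "IN_REVIEW":
            ids.append(execution["id"])
    return ids


task = {
    "receipts": [
        {
            "id": "2wil2zeXT0V0wAw25pb9Xqha5qw",
            "pipelineExecution": {
                "id": "2wil31Z26Qht2Zkq3NoVKeK8Ste",
                "pipelineDefinitionId": "clinic.1",
                "status": "IN_REVIEW",
            },
            "status": "PENDING",
        },
        {"id": "2wil2yilLFtscmwIxfB5R93d0Wv", "status": "SUCCESS"},
    ],
    "status": "PENDING",
}
print(executions_awaiting_review(task))  # ['2wil31Z26Qht2Zkq3NoVKeK8Ste']
```

Each returned ID can then be passed to the GetPipelineExecution endpoint.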

Pipeline Execution API

When an operation enters the ingestion pipeline, a pipeline execution is created. You can retrieve pipeline executions using the GetPipelineExecution endpoint.

{
    "id": "2winvG4ysVmapoh25oDdO1OlSJh",
    "operation": {
        "action": "CREATE",
        "target": {
            "id": "2winvJuYG1YMarssGdN5ojTE8Sg",
            "targetType": "RESOURCE",
            "type": "clinic"
        },
        "taskId": "bd_2winvHvw8oMXWAtefbcI9j6zytL"
    },
    "pipelineDefinitionId": "clinic.2",
    "status": "IN_REVIEW",
    "steps": [
        {
            "processedAt": "2025-05-06T12:11:30.85Z",
            "status": "PROCESSED"
        },
        {
            "status": "IN_REVIEW"
        },
        {
            "status": "PENDING"
        }
    ]
}

Pipeline Execution Status

A pipeline execution status is one of the following:

  1. SUCCESSFUL indicates the pipeline execution completed successfully. Every step of the pipeline was executed successfully and the change has been persisted.
    Note: Completed successfully does not indicate that the data is valid. It only means that the pipeline successfully executed all of its steps.
  2. ERROR indicates the pipeline execution failed at a step due to a system error.
  3. IN_PROGRESS indicates the pipeline execution is being processed by the system.
  4. IN_REVIEW indicates the pipeline execution is waiting for a manual action while the entity sits in a queue.
  5. CANCELLED indicates the pipeline execution has been short-circuited by the system and never completed. This is usually the result of a concurrent request arriving in the system for the same record/relationship.
  6. DROPPED indicates the pipeline execution has been dropped because the change has been rejected by the user following review.
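
When deciding whether to keep watching a pipeline execution, it can help to separate statuses that may still change from those that will not. The sketch below assumes that SUCCESSFUL, ERROR, CANCELLED, and DROPPED are final; the list above suggests this but does not state it outright. The enum and function names are hypothetical.

```python
from enum import Enum


class PipelineExecutionStatus(str, Enum):
    SUCCESSFUL = "SUCCESSFUL"
    ERROR = "ERROR"
    IN_PROGRESS = "IN_PROGRESS"
    IN_REVIEW = "IN_REVIEW"
    CANCELLED = "CANCELLED"
    DROPPED = "DROPPED"


# Assumption: an execution in one of these statuses never changes again.
TERMINAL = frozenset({
    PipelineExecutionStatus.SUCCESSFUL,
    PipelineExecutionStatus.ERROR,
    PipelineExecutionStatus.CANCELLED,
    PipelineExecutionStatus.DROPPED,
})


def is_terminal(status: str) -> bool:
    """Return True if the status is final; raises ValueError if it is unknown."""
    return PipelineExecutionStatus(status) in TERMINAL


print(is_terminal("IN_REVIEW"))  # False
print(is_terminal("DROPPED"))    # True
```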

Pipeline Execution Operation

The operation object describes the operation at the origin of the pipeline execution. It contains:

  1. The action that was performed: CREATE, UPSERT, UPDATE, or DELETE.
  2. The target record or relationship.
  3. The associated taskId.

Pipeline Definition

The pipelineDefinitionId contains the type of the pipeline execution and its revision. For example, clinic.2 in the response above refers to revision 2 of the clinic pipeline.
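
If you need the type and revision separately, a value such as clinic.2 can be split on its last dot. This sketch assumes the format is always a type name, a dot, and an integer revision, which is inferred from the example values in this document rather than from a stated contract. The class and function names are hypothetical.

```python
from typing import NamedTuple


class PipelineDefinitionRef(NamedTuple):
    pipeline_type: str
    revision: int


def parse_pipeline_definition_id(value: str) -> PipelineDefinitionRef:
    """Split an ID such as "clinic.2" into its type and revision."""
    # Split on the last dot so that a type containing dots is kept whole.
    type_name, sep, revision = value.rpartition(".")
    if not sep or not type_name or not revision.isdigit():
        raise ValueError(f"unexpected pipelineDefinitionId: {value!r}")
    return PipelineDefinitionRef(type_name, int(revision))


print(parse_pipeline_definition_id("clinic.2"))
# PipelineDefinitionRef(pipeline_type='clinic', revision=2)
```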

Pipeline Execution Steps

Pipeline execution steps are populated when an execution is created, so you can determine the progression of a pipeline execution at any time. These steps do not include entity resolution and the associated resolution queue, which happen after a record has been persisted to the source. Individual pipeline steps are presented in the same order as they are configured in the pipeline definition. They contain:

  1. The instant at which they have been processed (processedAt)
  2. The status of the step. It is one of the following:
    1. PROCESSED indicates the step has been processed.
    2. ERROR indicates the step has failed due to a system failure. The step will contain an error object when that is the case.
    3. IN_PROGRESS indicates the step is being processed by the system.
    4. IN_REVIEW indicates the step is waiting for a manual action.
    5. CANCELLED indicates the step has been cancelled. This is usually the result of a concurrent request coming in the system on the same record/relationship.
    6. PENDING indicates the step evaluation has not yet happened.
    7. NOT_TRIGGERED indicates the record/relationship did not match the trigger for that step, and therefore the step was skipped.
  3. For manual review steps, a decision object will also contain the following information:
    1. An id for the decision.
    2. A description of the decision, or the reason for the decision.
    3. The author of the decision.
    4. The instant at which the decision was taken (processedAt).
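
Because steps are listed in configuration order, the first step that is neither PROCESSED nor NOT_TRIGGERED shows where the execution currently stands. The sketch below is illustrative; the function name is hypothetical, and treating those two statuses as "done" is an assumption drawn from the descriptions above.

```python
from typing import Optional

# Assumption: a step in one of these statuses needs no further processing.
DONE = {"PROCESSED", "NOT_TRIGGERED"}


def current_step(steps: list[dict]) -> Optional[tuple[int, str]]:
    """Return (index, status) of the first step that is not done, or None."""
    for index, step in enumerate(steps):
        if step["status"] not in DONE:
            return index, step["status"]
    return None


# Steps from the pipeline execution example earlier in this document.
steps = [
    {"processedAt": "2025-05-06T12:11:30.85Z", "status": "PROCESSED"},
    {"status": "IN_REVIEW"},
    {"status": "PENDING"},
]
print(current_step(steps))  # (1, 'IN_REVIEW')
```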

How Requests Are Processed

Single Record/Relationship Request

List of endpoints

Create Resource

Upsert Resource

Delete Resource

Create Relationship

Update Relationship

Delete Relationship

Single record/relationship CREATE and UPDATE requests are processed independently from one another. Each request results in a single task being created by the system.

  1. If the record/relationship matches any of the ingestion pipeline step triggers, an ACCEPTED response is returned by the system and the record/relationship will be processed asynchronously. You will receive a task id with which you can query for the task corresponding to this request.

    {
        "status": "ACCEPTED",
        "taskId": "s_2wgHKz8Zqt9p9JVAWy1kFh8wwnq"
    }
  2. If the record/relationship does not match any of the ingestion pipeline step triggers, a PERSISTED response is returned by the system indicating the record/relationship has been processed synchronously. You will still receive a task id with which you can query for the task corresponding to this request.

    {
        "resource": {
            // ...
        },
        "status": "PERSISTED",
        "taskId": "s_2wgHKz8Zqt9p9JVAWy1kFh8wwnq"
    }

DELETE requests do not create pipeline executions, but they will still be attributed a task ID the same way CREATE and UPDATE requests are.
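
A client can branch on the status field to tell the two outcomes apart. The following is a minimal sketch; WriteResult and interpret_write_response are hypothetical names, and only the fields shown in the two example responses above are read.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WriteResult:
    task_id: str
    persisted: bool           # False: the request is still in the pipeline
    resource: Optional[dict]  # only present on PERSISTED responses


def interpret_write_response(body: dict) -> WriteResult:
    """Map an ACCEPTED or PERSISTED response body to a WriteResult."""
    status = body["status"]
    if status not in ("ACCEPTED", "PERSISTED"):
        raise ValueError(f"unexpected status: {status!r}")
    return WriteResult(
        task_id=body["taskId"],
        persisted=(status == "PERSISTED"),
        resource=body.get("resource"),
    )


result = interpret_write_response(
    {"status": "ACCEPTED", "taskId": "s_2wgHKz8Zqt9p9JVAWy1kFh8wwnq"}
)
print(result.persisted)  # False
```

In both cases the taskId is kept, so the task can be queried later regardless of how the request was processed.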

sIngle-request.png

Concurrent Requests

Only one active pipeline execution can exist for a specific record at any given time. If a record is currently sitting in a queue and another update to the same record is sent to the system, the initial pipeline execution will be cancelled by the system in favor of the second.
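
The rule can be pictured with a toy in-memory model. This is only an illustration of the behaviour described above, not how the system is implemented; the class, its method, and the choice of IN_REVIEW as the status of a newly started execution are all assumptions.

```python
class ActiveExecutions:
    """Toy model: at most one active pipeline execution per record."""

    def __init__(self) -> None:
        self._active: dict[str, str] = {}  # record ID -> active execution ID
        self.status: dict[str, str] = {}   # execution ID -> status

    def start(self, record_id: str, execution_id: str) -> None:
        previous = self._active.get(record_id)
        if previous is not None:
            # The earlier execution is cancelled in favor of the newer one.
            self.status[previous] = "CANCELLED"
        self._active[record_id] = execution_id
        self.status[execution_id] = "IN_REVIEW"


model = ActiveExecutions()
model.start("record-1", "exec-a")
model.start("record-1", "exec-b")  # concurrent update to the same record
model.start("record-2", "exec-c")  # a different record is unaffected
print(model.status)
# {'exec-a': 'CANCELLED', 'exec-b': 'IN_REVIEW', 'exec-c': 'IN_REVIEW'}
```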

Bulk Request

List of endpoints

Bulk Resources

Bulk Relationships

Bulk requests are processed as a whole, but each operation is processed independently from the others. A bulk request results in a single task being created by the system. That task is associated with multiple pipeline executions, one for every CREATE and UPDATE operation.

Bulk request processing is always asynchronous. An ACCEPTED response is returned by the system and you will receive a task id with which you can query for the task corresponding to this request. Even if the taskId and bulkId are the same, use the taskId. The bulkId will eventually be deprecated.

{
    "bulkId": "bk_2wgIwVfJWi30JO80HvuK6cJ3fJa",
    "status": "ACCEPTED",
    "taskId": "bk_2wgIwVfJWi30JO80HvuK6cJ3fJa"
}

Because bulk operations from a single request are independent from one another, it is possible for some operations to be marked as SUCCESS while others remain in the ingestion pipeline. When an operation is marked as SUCCESS, it has been persisted to source. As long as at least one operation remains in the ingestion pipeline, the bulk task status will remain PENDING.

{
    "receipts": [
        {
            "action": "CREATE",
            "id": "2wil2zeXT0V0wAw25pb9Xqha5qw",
            "pipelineExecution": {
                "id": "2wil31Z26Qht2Zkq3NoVKeK8Ste",
                "pipelineDefinitionId": "clinic.1",
                "status": "IN_REVIEW"
            },
            "receivedAt": "2025-05-06T11:47:51.654Z",
            "status": "PENDING",
            "targetType": "RESOURCE",
            "taskId": "bk_2wil2vg5uqi08x1aet097gLW8DO",
            "type": "clinic"
        },
        {
            "action": "CREATE",
            "completedAt": "2025-05-06T11:47:51.714Z",
            "id": "2wil2yilLFtscmwIxfB5R93d0Wv",
            "receivedAt": "2025-05-06T11:47:51.671Z",
            "status": "SUCCESS",
            "targetType": "RESOURCE",
            "taskId": "bk_2wil2vg5uqi08x1aet097gLW8DO",
            "type": "clinic"
        }
    ],
    "status": "PENDING",
    "taskId": "bk_2wil2vg5uqi08x1aet097gLW8DO"
}
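
To see which operations of a bulk task are finished and which are still in the ingestion pipeline, the receipts can be split by status. The helper below is a hypothetical sketch that reads only the id and status fields shown in the response above.

```python
def split_bulk_receipts(task: dict) -> tuple[list[str], list[str]]:
    """Return (finished receipt IDs, receipt IDs still PENDING)."""
    finished: list[str] = []
    pending: list[str] = []
    for receipt in task.get("receipts", []):
        if receipt["status"] == "PENDING":
            pending.append(receipt["id"])
        else:
            finished.append(receipt["id"])
    return finished, pending


# Abbreviated version of the bulk task shown above.
bulk_task = {
    "receipts": [
        {"id": "2wil2zeXT0V0wAw25pb9Xqha5qw", "status": "PENDING"},
        {"id": "2wil2yilLFtscmwIxfB5R93d0Wv", "status": "SUCCESS"},
    ],
    "status": "PENDING",
}
finished, pending = split_bulk_receipts(bulk_task)
print(finished)  # ['2wil2yilLFtscmwIxfB5R93d0Wv']
print(pending)   # ['2wil2zeXT0V0wAw25pb9Xqha5qw']
```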

DELETE operations do not result in a pipeline execution, but they will still be attributed a task ID like the CREATE and UPDATE operations.

bulk-request.png

Concurrent Requests

Only one active pipeline execution can exist for a specific record at any given time. If a record is currently sitting in a queue and another update to the same record is sent to the system, the initial pipeline execution will be cancelled by the system in favor of the second.

Since bulk operations from a single bulk request are independent from one another, this doesn’t affect other operations from the same bulk request if they do not target the same record.

Bundle Request

List of endpoints

Bundle

Bundle requests are processed as a whole, and each operation needs to successfully exit the ingestion pipeline for the bundle to be persisted. A bundle request results in a single task being created by the system. That task is associated with multiple pipeline executions, one for every record's CREATE and UPDATE operation.

  1. If any record/relationship matches any of the ingestion pipeline step triggers, an ACCEPTED response is returned by the system and the records/relationships will be processed asynchronously. You will receive a task id with which you can query for the task corresponding to this request.

    {
        "status": "ACCEPTED",
        "taskId": "s_2wgHKz8Zqt9p9JVAWy1kFh8wwnq"
    }
  2. If no record/relationship matches any of the ingestion pipeline step triggers, a PERSISTED response is returned by the system indicating the records/relationships have been processed synchronously. You will still receive a task id with which you can query for the task corresponding to this request.

Synchronous bundle processing will be deprecated in the future, and bundles will become fully asynchronous, whether or not they enter the ingestion pipeline. We recommend processing all bundle responses as if they were ACCEPTED instead of PERSISTED.

{
    "relationshipOperationReceipts": [],
    "resourceOperationReceipts": [],
    "status": "PERSISTED",
    "taskId": "s_2wgHKz8Zqt9p9JVAWy1kFh8wwnq"
}
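
One way to follow the recommendation is to ignore the difference between the two statuses and always keep the taskId for a later lookup. This is a minimal sketch; the function name is hypothetical.

```python
def bundle_task_to_follow(body: dict) -> str:
    """Return the task ID to query, whether the bundle was ACCEPTED or PERSISTED."""
    if body.get("status") not in ("ACCEPTED", "PERSISTED"):
        raise ValueError(f"unexpected status: {body.get('status')!r}")
    # Deliberately no branch on PERSISTED: the outcome is always confirmed
    # by querying the task, as it would be for an ACCEPTED response.
    return body["taskId"]


print(bundle_task_to_follow({
    "relationshipOperationReceipts": [],
    "resourceOperationReceipts": [],
    "status": "PERSISTED",
    "taskId": "s_2wgHKz8Zqt9p9JVAWy1kFh8wwnq",
}))  # s_2wgHKz8Zqt9p9JVAWy1kFh8wwnq
```

Client code written this way should keep working unchanged once bundles become fully asynchronous.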

Because bundles are processed as a transaction, all records/relationships need to exit their respective pipeline execution before the bundle is persisted. As long as at least one operation remains in the ingestion pipeline, the bundle's status and all underlying operations will remain PENDING.

{
    "receipts": [
        {
            "action": "CREATE",
            "id": "2winvFQpOPlwmS8hAf3HjQbteCJ",
            "pipelineExecution": {
                "id": "2winvKoJxdEIMeGnRAu1B6UmyJn",
                "pipelineDefinitionId": "clinic.1",
                "status": "SUCCESSFUL"
            },
            "receivedAt": "2025-05-06T12:11:30.838Z",
            "status": "PENDING",
            "targetType": "RESOURCE",
            "taskId": "bd_2winvHvw8oMXWAtefbcI9j6zytL",
            "type": "clinic"
        },
        {
            "action": "CREATE",
            "id": "2winvJuYG1YMarssGdN5ojTE8Sg",
            "pipelineExecution": {
                "id": "2winvG4ysVmapoh25oDdO1OlSJh",
                "pipelineDefinitionId": "clinic.1",
                "status": "IN_REVIEW"
            },
            "receivedAt": "2025-05-06T12:11:30.838Z",
            "status": "PENDING",
            "targetType": "RESOURCE",
            "taskId": "bd_2winvHvw8oMXWAtefbcI9j6zytL",
            "type": "clinic"
        }
    ],
    "status": "PENDING",
    "taskId": "bd_2winvHvw8oMXWAtefbcI9j6zytL"
}

DELETE operations do not result in a pipeline execution, but they will still be attributed a task ID like the CREATE and UPDATE operations.

bundle-request.png

Concurrent Requests

Only one active pipeline execution can exist for a specific record at any given time. If a record is currently sitting in a queue and another update to the same record is sent to the system, the initial pipeline execution will be cancelled by the system in favor of the second.

Since all operations of a bundle depend on one another to succeed, cancelling the pipeline execution of any bundle operation results in the entire bundle being dropped. This means a concurrent request to a record/relationship that is part of a bundle will short-circuit the execution of the entire bundle if it is still PENDING.
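
The all-or-nothing behaviour of a bundle can be summarised as a function of its pipeline execution statuses. The sketch below is illustrative only: the function name and the return labels are hypothetical, and treating DROPPED and ERROR the same way as CANCELLED is an assumption based on the requirement that every operation exits the pipeline successfully.

```python
# Assumption: any of these statuses prevents the whole bundle from persisting.
BLOCKING = {"CANCELLED", "DROPPED", "ERROR"}


def bundle_outcome(execution_statuses: list[str]) -> str:
    """Summarise a bundle from the statuses of its pipeline executions."""
    statuses = set(execution_statuses)
    if statuses & BLOCKING:
        return "NOT_PERSISTED"  # one execution did not exit successfully
    if statuses <= {"SUCCESSFUL"}:
        # Also covers a bundle with no pipeline executions at all.
        return "PERSISTED"
    return "PENDING"            # at least one execution is still in the pipeline


print(bundle_outcome(["SUCCESSFUL", "IN_REVIEW"]))  # PENDING
print(bundle_outcome(["SUCCESSFUL", "CANCELLED"]))  # NOT_PERSISTED
print(bundle_outcome(["SUCCESSFUL", "SUCCESSFUL"]))  # PERSISTED
```

The first call mirrors the bundle task response earlier in this section, where one execution is SUCCESSFUL but both receipts stay PENDING because the other execution is still IN_REVIEW.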