Batches
Introduction
Batches were introduced in 20.2 to operate easily and safely on large amounts of data. A batch operates on a set of records selected by a filter and links that set to one of the available tasks. Batches can be started manually through the MediaHaven REST API or created by a workflow process.
Error Handling & Reporting 24.2+
A batch handles every record matching the provided query. Depending on the outcome for each record, the following properties of the batch change:
The record did change → Completed increments by 1
The record did not change → Skipped increments by 1
Failure → Failed increments by 1
The batch does not abort on a failure; it keeps processing the subsequent records unless at least 20% of the total number of records have failed. In that case, the batch is assigned the status TooManyFailed.
Failed records can be retrieved through monitoring (Batches → Failed records) or the API: /batches/:batchId/failures
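As an illustration of the API route above, the following Python sketch builds (but does not send) a GET request for the failed records of a batch. Only the path /batches/:batchId/failures comes from this page; the base URL, the bearer-token header and the function name failures_request are assumptions made for the example and may differ on a real installation.

```python
import urllib.request
from urllib.parse import quote


def failures_request(base_url: str, batch_id: str, token: str) -> urllib.request.Request:
    """Build a GET request for /batches/:batchId/failures.

    base_url and the bearer-token authentication are assumptions;
    adapt them to the installation being used.
    """
    # Escape the batch id so it is safe to use as a single path segment.
    url = f"{base_url.rstrip('/')}/batches/{quote(str(batch_id), safe='')}/failures"
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="GET",
    )


# Hypothetical host and batch id, for illustration only.
req = failures_request("https://example.org/api/", "42", "secret-token")
print(req.get_method(), req.full_url)
```

The request object can then be passed to urllib.request.urlopen once the base URL and credentials are known to be correct.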
Status
Status | Meaning |
---|---|
Waiting | A batch has been created but no page has been picked up yet |
Processing | At least one page of the batch is already processing |
Completed | All records have been processed and there was no failure for any record |
CompletedWithErrors | All records have been processed and there was at least 1 failure for a record |
PostBatchFailed | All records have been processed but in the post-batch step an exception occurred. |
TooManyFailed | At least 20% of the records encountered a failure and the batch was aborted as a consequence |
Cancelling | A request has been sent to cancel the batch. No new jobs will be created. |
Cancelled | Status after all the existing jobs of the batch are finished after cancelling. |
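The relation between the failure counter and the end statuses in the table can be sketched as follows. This is a simplified reading of the rules on this page, not the actual implementation: it assumes the 20% threshold is evaluated against the total number of records, and it ignores the post-batch step and cancellation. The function name final_status is hypothetical.

```python
def final_status(total: int, failed: int) -> str:
    """Return the end status of a batch from its record counters.

    Simplified sketch: PostBatchFailed, Cancelling and Cancelled are
    not modelled here.
    """
    if total < 0 or failed < 0 or failed > total:
        raise ValueError("counters must satisfy 0 <= failed <= total")
    # failed / total >= 20%, written without floating point.
    if total > 0 and failed * 5 >= total:
        return "TooManyFailed"
    if failed > 0:
        return "CompletedWithErrors"
    return "Completed"


print(final_status(100, 0))
print(final_status(100, 19))
print(final_status(100, 20))
```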
API Permissions
POST batches/
Any user can create batches for the index of their organisation; a created batch searches as the user who created it.
GET batches/
The returned batches depend on the functions of the user:
Function | Effect |
---|---|
No | Can read the batches created by this user |
| Can read all the batches from the index of the organisation of this user |
| Can read all batches from all indices |
PATCH batches/ 24.3+
The following functions are needed to update a batch partially:
Function | When needed? |
---|---|
| Own batches |
| Batches from the same organisation |
| Batches from other organisations |
DELETE batches/ 24.3+
Requires the function ADMIN_BACKEND_SERVICES.
For cancelling a batch you need the same functions as for PATCH.
Cleanup 24.2+
Completed batches are cleaned up 30 days after their finish date.
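As a small worked example of the retention rule, the sketch below computes the earliest moment a batch becomes eligible for cleanup. It assumes the 30 days are counted as exact 24-hour days from the finish timestamp; the real cleanup job may run later than this moment. The name earliest_cleanup is hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention period stated on this page


def earliest_cleanup(finished_at: datetime) -> datetime:
    """Return the earliest time a finished batch may be cleaned up."""
    return finished_at + RETENTION


finished = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(earliest_cleanup(finished).isoformat())
```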
Multi indices 22.2+
Normally, a batch is executed on the index of the organization to which the user belongs. When the zeticon@installation or system@installation user starts a batch, it will be executed across all indices available on the system.
Heavy batches 24.3+
Requires the function ADMIN_BACKEND_SERVICES.
To prevent heavy batches from clogging the system, the following actions can be taken:
Lower the priority of the batch using a PATCH
Change the worker daemon zone using a PATCH
Cancel the batch by sending a DELETE request, which sets the status to Cancelling. Cancellation is not instant: the existing jobs of the batch are allowed to finish and no new jobs are created. Once those jobs have finished, the status is set to Cancelled.
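Because cancellation is asynchronous, a client typically sends the DELETE request and then polls until the batch reaches its final status. The Python sketch below illustrates that pattern under several assumptions: the path /batches/{batchId} is inferred by analogy with the failures route, the bearer-token header is a guess, and fetch_status is a hypothetical callable that the caller supplies to read the current status of the batch.

```python
import time
import urllib.request
from typing import Callable
from urllib.parse import quote


def cancel_request(base_url: str, batch_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for a batch.

    The path and the authentication scheme are assumptions.
    """
    url = f"{base_url.rstrip('/')}/batches/{quote(str(batch_id), safe='')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
        method="DELETE",
    )


def wait_until_cancelled(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 60,
) -> bool:
    """Poll until the status is Cancelled; return False if polls run out."""
    for _ in range(max_polls):
        if fetch_status() == "Cancelled":
            return True
        time.sleep(poll_seconds)
    return False


# Simulated status sequence standing in for real API calls.
statuses = iter(["Cancelling", "Cancelling", "Cancelled"])
print(wait_until_cancelled(lambda: next(statuses), poll_seconds=0))
```

Injecting fetch_status keeps the polling loop independent of how the status is actually retrieved, which this page does not specify.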