Batches

Introduction

Batches were introduced in 20.2 to operate easily and safely on large amounts of data. A batch operates on a set of records selected by a filter, and that set is linked to one of the available tasks. Batches can be started manually through the MediaHaven REST API or created by a workflow process.

Error Handling & Reporting 24.2+

A batch handles every record matching the provided query. Depending on the outcome for each record, one of the following batch properties changes:

  • The record did change → Completed increments by 1

  • The record did not change → Skipped increments by 1

  • Failure → Failed increments by 1

The batch does not abort on a failure but keeps processing the subsequent records, unless at least 20% of the total number of records have failed. In that case, the batch is assigned the status TooManyFailed.
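The counting rules above can be sketched as a small simulation. This is only an illustration, not the MediaHaven implementation: the names `Outcome`, `BatchCounters` and `run_batch` are hypothetical, and it assumes the 20% threshold is measured against the total number of records in the batch and checked after each failure. It also ignores the post-batch step and parallel processing of pages.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    CHANGED = "changed"      # the record did change
    UNCHANGED = "unchanged"  # the record did not change
    FAILURE = "failure"      # handling the record failed


@dataclass
class BatchCounters:
    completed: int = 0
    skipped: int = 0
    failed: int = 0


FAILURE_THRESHOLD_PERCENT = 20  # taken from the text above


def run_batch(outcomes, total):
    """Tally per-record outcomes and return (counters, final status name)."""
    counters = BatchCounters()
    for outcome in outcomes:
        if outcome is Outcome.CHANGED:
            counters.completed += 1
        elif outcome is Outcome.UNCHANGED:
            counters.skipped += 1
        else:
            counters.failed += 1
            # Assumption: abort as soon as failures reach 20% of all records.
            # Integer arithmetic avoids floating-point rounding issues.
            if counters.failed * 100 >= FAILURE_THRESHOLD_PERCENT * total:
                return counters, "TooManyFailed"
    status = "CompletedWithErrors" if counters.failed else "Completed"
    return counters, status
```

For example, a batch of 10 records with a single failure would end as CompletedWithErrors under these assumptions, while a second failure would abort it as TooManyFailed.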

Failed records can be retrieved through monitoring (Batches → Failed records) or the API: /batches/:batchId/failures
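As a rough sketch, the failures endpoint could be called as follows. Only the path /batches/:batchId/failures comes from this page; the base URL, the bearer-token header and the assumption that the response body is JSON are placeholders that should be checked against the REST API documentation of your installation. The helper names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote


def failures_url(base_url: str, batch_id: str) -> str:
    """Build the URL of the failures endpoint for one batch."""
    return f"{base_url.rstrip('/')}/batches/{quote(str(batch_id), safe='')}/failures"


def fetch_failures(base_url: str, batch_id: str, token: str):
    """Fetch the failed records of a batch.

    Assumptions: bearer-token authentication and a JSON response body.
    """
    request = urllib.request.Request(
        failures_url(base_url, batch_id),
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `fetch_failures("https://example.org/rest-api", "abc123", token)` would then request `https://example.org/rest-api/batches/abc123/failures`.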

Status

  • Waiting: The batch has been created but no page has been picked up yet.

  • Processing: At least one page of the batch is being processed.

  • Completed: All records have been processed and no record failed.

  • CompletedWithErrors: All records have been processed and at least 1 record failed.

  • PostBatchFailed: All records have been processed, but an exception occurred in the post-batch step.

  • TooManyFailed: At least 20% of the records failed and the batch was aborted as a consequence.

  • Cancelling: A request has been sent to cancel the batch. No new jobs will be created.

  • Cancelled: All existing jobs of the batch have finished after the cancellation request.
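A client that polls a batch needs to know when to stop. The sketch below lists the statuses named in this section and treats Waiting, Processing and Cancelling as still in progress. That grouping is an inference from the descriptions above, not something the API states explicitly, and the names `BatchStatus` and `is_finished` are hypothetical.

```python
from enum import Enum


class BatchStatus(Enum):
    WAITING = "Waiting"
    PROCESSING = "Processing"
    COMPLETED = "Completed"
    COMPLETED_WITH_ERRORS = "CompletedWithErrors"
    POST_BATCH_FAILED = "PostBatchFailed"
    TOO_MANY_FAILED = "TooManyFailed"
    CANCELLING = "Cancelling"
    CANCELLED = "Cancelled"


# Assumption: these statuses mean that work may still be going on.
IN_PROGRESS = frozenset(
    {BatchStatus.WAITING, BatchStatus.PROCESSING, BatchStatus.CANCELLING}
)


def is_finished(status_name: str) -> bool:
    """Return True when the status indicates that no more work will happen."""
    return BatchStatus(status_name) not in IN_PROGRESS
```

Passing an unknown status name raises a `ValueError`, which makes it visible when a newer version introduces a status this sketch does not know.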

API Permissions

POST batches/

Any user can create batches for the index of their own organisation; a created batch searches as the user who created it.

GET batches/

The batches returned depend on the functions of the user:

  • No function: Can read the batches created by this user

  • ADMIN_BATCHES: Can read all batches from the index of the organisation of this user

  • ADMIN_BATCHES + ADMIN_VIEW_ALL_ORGANISATIONS: Can read all batches from all indices
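The read rules for GET can be expressed as a small decision function. This is a sketch for reasoning about the rules, not the server's code: `readable_scope` and its return values are hypothetical, and it assumes that ADMIN_VIEW_ALL_ORGANISATIONS has no effect on batches without ADMIN_BATCHES, a combination this page does not describe.

```python
from typing import Iterable


def readable_scope(functions: Iterable[str]) -> str:
    """Return which batches a user with the given functions can read."""
    granted = set(functions)
    if "ADMIN_BATCHES" not in granted:
        # Assumption: without ADMIN_BATCHES, other functions do not widen the scope.
        return "own"
    if "ADMIN_VIEW_ALL_ORGANISATIONS" in granted:
        return "all indices"
    return "own organisation"
```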

PATCH batches/ 24.3+

The following functions are needed to partially update a batch:

  • ADMIN_BACKEND_SERVICES: Own batches

  • ADMIN_BACKEND_SERVICES + ADMIN_BATCHES: Batches from the same organisation

  • ADMIN_BACKEND_SERVICES + ADMIN_BATCHES + ADMIN_EDIT_ALL_ORGANISATIONS: Batches from other organisations

DELETE batches/ 24.3+

Cancelling a batch requires the same functions as PATCH.
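The function requirements for PATCH and DELETE grow with the distance between the user and the batch. A minimal sketch of that rule follows. The helper names are hypothetical, and the two boolean parameters assume that "own batches" means batches created by the same user.

```python
from typing import Iterable


def required_functions(same_user: bool, same_organisation: bool) -> frozenset:
    """Return the functions needed to PATCH or DELETE a batch."""
    required = {"ADMIN_BACKEND_SERVICES"}
    if not same_user:
        required.add("ADMIN_BATCHES")
        if not same_organisation:
            required.add("ADMIN_EDIT_ALL_ORGANISATIONS")
    return frozenset(required)


def may_modify(user_functions: Iterable[str], same_user: bool, same_organisation: bool) -> bool:
    """Check whether the user's functions cover the required set."""
    return required_functions(same_user, same_organisation) <= set(user_functions)
```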

Cleanup 24.2+

Completed batches are cleaned up 30 days after their finish date.
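To estimate when a batch will disappear, the retention rule can be written as simple date arithmetic. The sketch assumes the finish date is available as a timezone-aware timestamp; the function names are hypothetical and the exact moment at which the cleanup job runs is not specified on this page.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # taken from the text above


def cleanup_due(finished_at: datetime) -> datetime:
    """Return the earliest moment at which the batch may be cleaned up."""
    return finished_at + RETENTION


def is_cleanup_due(finished_at: datetime, now: datetime) -> bool:
    """Return True once the retention period has elapsed."""
    return now >= cleanup_due(finished_at)
```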

Multi indices 22.2+

Normally, a batch is executed on the index of the organisation to which the user belongs. When a batch is started by the zeticon@installation or system@installation user, it is executed across all indices available on the system.

Heavy batches 24.3+

To prevent heavy batches from clogging the system, the following actions can be taken:

  • Lower the priority of the batch using a PATCH request

  • Change the worker daemon zone using a PATCH request

  • Cancel the batch by sending a DELETE request, which sets the status to Cancelling. Cancellation is not instant: the existing jobs of the batch are allowed to finish, but no new jobs are created. Once those jobs have finished, the status is set to Cancelled.
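The actions above can be sketched as HTTP requests built with the Python standard library. Several details here are assumptions rather than documented facts: that a single batch is addressed as /batches/:batchId (by analogy with the failures endpoint), that the PATCH body is JSON, and that a bearer token is used for authentication. The field names for priority and worker daemon zone are not given on this page, so the caller must supply them, and the helper names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote


def batch_url(base_url: str, batch_id: str) -> str:
    # Assumption: a single batch is addressed as <base>/batches/<batchId>.
    return f"{base_url.rstrip('/')}/batches/{quote(str(batch_id), safe='')}"


def patch_request(base_url: str, batch_id: str, changes: dict, token: str) -> urllib.request.Request:
    """Build a PATCH request carrying a partial update as JSON (assumed format)."""
    return urllib.request.Request(
        batch_url(base_url, batch_id),
        data=json.dumps(changes).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def cancel_request(base_url: str, batch_id: str, token: str) -> urllib.request.Request:
    """Build the DELETE request that asks the server to cancel the batch."""
    return urllib.request.Request(
        batch_url(base_url, batch_id),
        method="DELETE",
        headers={"Authorization": f"Bearer {token}"},
    )
```

A request built this way is sent with `urllib.request.urlopen(request)`. After a DELETE, the batch is expected to report Cancelling first and Cancelled only once its running jobs have finished, so a client should poll the status rather than assume the batch has stopped immediately.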
