Resumable Uploads

This feature is introduced in version 25.1

Introduction

The MediaHaven REST API already supports file uploads, but the existing method is not robust for large files or unstable network connections. Resumable uploads address this by offering a multipart, chunked upload.

Design

Business Logic

Metadata

  • The upload API accepts the record ID as an input parameter and uses it to look up the corresponding session ID of the S3 multipart upload:

    • The session ID is stored in a new metadata field RecordInformation.ResumableUpload.SessionId of type HiddenField

    • The workflow to start once the resumable upload completes is stored in RecordInformation.ResumableUpload.Workflow

    • The session ID is also kept in a memory cache to avoid contacting the REST API excessively
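The lookup described above can be sketched as a small read-through cache. This is only an illustration under stated assumptions: `SessionIdCache` and the `fetch_metadata` callable are hypothetical names standing in for whatever REST API call returns a record's metadata; only the metadata field name comes from the design.

```python
from typing import Callable, Dict

# Metadata field name taken from the design above.
SESSION_ID_FIELD = "RecordInformation.ResumableUpload.SessionId"


class SessionIdCache:
    """In-memory cache mapping record IDs to S3 multipart session IDs."""

    def __init__(self, fetch_metadata: Callable[[str], Dict[str, str]]) -> None:
        # fetch_metadata is a hypothetical stand-in for the REST API call
        # that returns the metadata of a record as a flat dictionary.
        self._fetch_metadata = fetch_metadata
        self._cache: Dict[str, str] = {}

    def get(self, record_id: str) -> str:
        """Return the session ID, contacting the REST API only on a cache miss."""
        if record_id not in self._cache:
            metadata = self._fetch_metadata(record_id)
            self._cache[record_id] = metadata[SESSION_ID_FIELD]
        return self._cache[record_id]

    def forget(self, record_id: str) -> None:
        """Drop the entry, e.g. once the upload is completed or rejected."""
        self._cache.pop(record_id, None)
```

With this shape, uploading many chunks for the same record triggers a single metadata request; a real implementation would probably also bound the cache size or expire entries.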

Record status

Error handling

  • When the multipart upload to S3 encounters an error (e.g. an incorrect checksum for a chunk), the record will be Rejected
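As a rough sketch of the checksum case, the check could look as follows. The design does not name the checksum algorithm, so SHA-256 is an assumption here, as are the function names and the "Uploading" status used in the usage note; only the Rejected status comes from the text above.

```python
import hashlib

REJECTED = "Rejected"  # status named in the design


def chunk_is_valid(chunk: bytes, expected_sha256_hex: str) -> bool:
    """Compare the chunk's SHA-256 digest with the expected hex digest.

    SHA-256 is an assumption; the design does not state the algorithm.
    """
    return hashlib.sha256(chunk).hexdigest() == expected_sha256_hex.lower()


def status_after_chunk(chunk: bytes, expected_sha256_hex: str, current_status: str) -> str:
    """Keep the current status for a valid chunk, otherwise reject the record."""
    if chunk_is_valid(chunk, expected_sha256_hex):
        return current_status
    return REJECTED
```

For example, `status_after_chunk(data, digest, "Uploading")` leaves a hypothetical "Uploading" status unchanged when the digest matches and returns "Rejected" when it does not.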

Storage

The S3 storage is treated as one of /wiki/spaces/CS/pages/20643843 in the system, so that many existing features can be reused:

  • A new cluster group “resumable_uploads” with the role TRANSIENT is created. It is very similar to the existing cluster group “ingest” and contains one shared storage pool for the S3 object store

  • The uploading record will be linked with this shared storage pool

  • The file is stored on the S3 object store using the standardized naming convention <Record ID>/<Record ID>.<Original Extension>

  • The garbage collection cleans up this record's data on the object store when either:

    • The ingest workflow has transferred the record to the definitive storage

    • The record is permanently deleted (manually or automatically after being inactive for 2 weeks or longer)

  • The used capacity will be tracked by the standard “Storage-free space” workflow
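The naming convention and the clean-up conditions listed above can be illustrated with a few helper functions. All function names are hypothetical, and deriving the extension with `os.path.splitext` is an assumption about how the original extension is determined.

```python
import os
from datetime import datetime, timedelta

# Inactivity period after which a record is deleted automatically (from the design).
INACTIVITY_LIMIT = timedelta(weeks=2)


def object_key(record_id: str, original_filename: str) -> str:
    """Build <Record ID>/<Record ID>.<Original Extension> for the S3 object store."""
    # splitext keeps the leading dot and returns only the last extension.
    _, extension = os.path.splitext(original_filename)
    return f"{record_id}/{record_id}{extension}"


def inactive_too_long(last_activity: datetime, now: datetime) -> bool:
    """True when the record has been inactive for 2 weeks or longer."""
    return now - last_activity >= INACTIVITY_LIMIT


def can_clean_up(transferred_to_definitive_storage: bool, permanently_deleted: bool) -> bool:
    """Either condition from the list above allows the garbage collection to run."""
    return transferred_to_definitive_storage or permanently_deleted
```

For instance, `object_key("abc123", "interview.mxf")` yields `abc123/abc123.mxf`. A file name without an extension would produce a key without the trailing dot, which a real implementation may want to handle explicitly.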
