The MediaHaven REST API already allows uploading files, but this method is not robust for large files or when the network connection is unstable. Resumable uploads address this case by offering a robust multipart chunked upload.

Design

[Diagram: resumable uploads]

API

Business Logic

Record status

Error handling

  • When the multipart upload to S3 encounters an error (e.g. an incorrect checksum of a chunk), the record will be Rejected with an appropriate error reason
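As an illustration of the chunk checksum case above, the following is a minimal sketch of how a client might split a file into parts and compute a checksum per part. It assumes MD5 in the base64 form that S3 accepts in the Content-MD5 header; this page does not state which checksum algorithm MediaHaven uses, and the names `CHUNK_SIZE`, `iter_chunks_with_md5` and `verify_chunk` are hypothetical.

```python
import base64
import hashlib
from typing import Iterator, Tuple

# Assumed chunk size. 5 MiB is the minimum S3 part size for all parts except the last.
CHUNK_SIZE = 5 * 1024 * 1024


def md5_base64(chunk: bytes) -> str:
    """Return the base64-encoded MD5 digest of a chunk (the Content-MD5 format)."""
    return base64.b64encode(hashlib.md5(chunk).digest()).decode("ascii")


def iter_chunks_with_md5(
    data: bytes, chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[int, bytes, str]]:
    """Yield (part_number, chunk, checksum) tuples; part numbers start at 1."""
    for part_number, offset in enumerate(range(0, len(data), chunk_size), start=1):
        chunk = data[offset:offset + chunk_size]
        yield part_number, chunk, md5_base64(chunk)


def verify_chunk(chunk: bytes, expected_checksum: str) -> bool:
    """Hypothetical receiver-side check: does the chunk match the declared checksum?"""
    return md5_base64(chunk) == expected_checksum
```

A chunk that fails `verify_chunk` corresponds to the error case described above, where the record would end up Rejected with an error reason.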

Storage

The S3 storage is treated as one of the /wiki/spaces/CS/pages/20643843 in the system to re-use many existing features

  • A new cluster group “resumable_uploads” with the role TRANSIENT is created, highly similar to the existing cluster group “ingest”, which contains 1 shared storage pool for the S3 object store

  • The uploading record will be linked with this shared storage pool and the session ID will also be stored as the distribution ID

  • The file is stored on the S3 object store using the standardized naming convention <Record ID>/<Record ID>.<Original Extension>

  • The garbage collection will clean up this object store for this record

    • When the ingest workflow has transferred the record to the definitive storage

    • When the record is permanently deleted (manually or automatically after being inactive for 2 weeks or longer)

    • When the upload is still in progress, in which case the session ID is used to delete it
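The naming convention and the cleanup conditions listed above can be sketched as follows. This is an illustration under stated assumptions, not MediaHaven's implementation: the `UploadRecord` fields and the function names are hypothetical, the extension is taken from the original file name, and a file without an extension simply gets no suffix.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import PurePosixPath

# From the page: records inactive for 2 weeks or longer are deleted automatically.
INACTIVITY_LIMIT = timedelta(weeks=2)


def object_key(record_id: str, original_filename: str) -> str:
    """Build the key '<Record ID>/<Record ID>.<Original Extension>'."""
    extension = PurePosixPath(original_filename).suffix  # includes the leading dot
    return f"{record_id}/{record_id}{extension}"


@dataclass
class UploadRecord:
    """Hypothetical view of a record, holding only what the cleanup rules need."""
    record_id: str
    transferred_to_definitive_storage: bool
    permanently_deleted: bool
    last_activity: datetime


def should_auto_delete(record: UploadRecord, now: datetime) -> bool:
    """True when the record has been inactive for 2 weeks or longer."""
    return now - record.last_activity >= INACTIVITY_LIMIT


def eligible_for_cleanup(record: UploadRecord) -> bool:
    """True when the object for this record may be removed from the S3 object store."""
    return record.transferred_to_definitive_storage or record.permanently_deleted
```

For example, `object_key("abc123", "movie.MXF")` yields `abc123/abc123.MXF`. The in-progress case, where the session ID is used for deletion, is left out of the sketch because the page does not describe how that call works.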

Activation

See https://mediahaven.atlassian.net/wiki/spaces/CS/pages/4606820353/Resumable+upload#Activation