Complex Objects 2.0 aka SIP
Introduction
We introduce an extension to the existing SIP model from Complex Objects 1.0. The file structure remains unchanged: a ZIP file with files organised in folders, containing a top-level METS file that describes the files in the ZIP. The structure of the METS has been changed to bring it in line with the METS standard and to make it expressive enough to describe record trees (https://mediahaven.atlassian.net/wiki/spaces/CS/pages/4064641076).
MediaHaven already supports exporting a complete record tree together with a METS XML file.
Record Trees
Complex Objects 2.0 can now describe any of the record trees (https://mediahaven.atlassian.net/wiki/spaces/CS/pages/4064641076) one can model in MediaHaven.
Example 1
Newspaper
  Newspaper page 1
    TIFF image
      Original representation
    JP2 image
      Original representation
  Newspaper page 2
    TIFF image
      Original representation
    JP2 image
      Original representation
  …
  ALTO XML for the entire newspaper
Example 2
Dossier
  Document 1 (e-mail)
    Original representation (e-mail)
    Email attachment 1
      Original representation
    Email attachment 2
      Original representation
  Document 2
    Original representation
  …
METS
SIP
The structure of the SIP is unchanged from the old Complex Objects 1.0.
Requirements
Unchanged from Complex Objects Reference
The Complex Ingest workflow imposes the following requirements in addition to well-formed XML and validation against the provided XSDs.
Every file in the archive must be referenced by the METS. If files are not referenced or if a referenced file is missing, the entire archive is rejected.
The MD5 checksums provided in the XML are compared against the calculated MD5 checksums. The entire archive is rejected if any check fails.
The file paths used in the METS are relative to the root of the accompanying archive.
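The three requirements above can be checked locally before uploading an archive. The following is a minimal sketch, not the actual Complex Ingest implementation. It assumes that the METS file is named mets.xml at the root of the ZIP, that each file is described by a standard mets:file element carrying CHECKSUM and CHECKSUMTYPE attributes with a mets:FLocat child holding the path in xlink:href, and that the METS file itself does not need to be referenced. The function name check_sip and the message texts are hypothetical.

```python
import hashlib
import io
import zipfile
import xml.etree.ElementTree as ET

# Standard METS and XLink namespace URIs.
METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def check_sip(zip_bytes, mets_name="mets.xml"):
    """Return a list of problems; an empty list means all three checks passed.

    mets_name is an assumption: the source only says the METS file is top-level.
    """
    problems = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Files in the archive, skipping directory entries and the METS itself
        # (assumption: the METS does not have to reference itself).
        present = {
            name for name in archive.namelist()
            if not name.endswith("/") and name != mets_name
        }
        root = ET.fromstring(archive.read(mets_name))

        referenced = set()
        for file_el in root.iter(f"{{{METS_NS}}}file"):
            flocat = file_el.find(f"{{{METS_NS}}}FLocat")
            if flocat is None:
                continue
            # Paths are interpreted relative to the root of the archive.
            path = flocat.get(f"{{{XLINK_NS}}}href")
            if path is None:
                continue
            referenced.add(path)

            if path not in present:
                problems.append(f"referenced file missing: {path}")
                continue

            expected = file_el.get("CHECKSUM")
            if file_el.get("CHECKSUMTYPE") == "MD5" and expected:
                actual = hashlib.md5(archive.read(path)).hexdigest()
                if actual != expected.lower():
                    problems.append(f"MD5 mismatch: {path}")

        # Every file in the archive must be referenced by the METS.
        for path in sorted(present - referenced):
            problems.append(f"file not referenced by METS: {path}")
    return problems
```

Calling check_sip(open("my-sip.zip", "rb").read()) on a hypothetical archive returns the list of detected problems. Since the ingest workflow rejects the entire archive on any single failure, a non-empty list would mean the archive should not be submitted as-is.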