Introduction
METS is an international standard maintained by the Library of Congress. It defines an XML metadata format for describing a structured set of physical files (files and their metadata) and non-physical objects (i.e. metadata-only containers). MediaHaven uses it to describe a structured set of files grouped together in a ZIP file. Adding the METS XML to the ZIP turns this combination of files and metadata into a Submission Information Package (SIP) as defined by the OAIS standard.
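As a rough illustration of the packaging step, the sketch below builds such a ZIP in memory with Python's standard zipfile module. The file name mets.xml, the helper name build_sip and the idea that ZIP entries use the same relative paths as the xlink:href values are assumptions made for this example, not documented MediaHaven requirements.

```python
import io
import zipfile


def build_sip(mets_xml: str, payload: dict[str, bytes], mets_name: str = "mets.xml") -> bytes:
    """Return the bytes of a ZIP that holds the METS XML plus the payload files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # mets_name is an assumed file name for the METS XML inside the ZIP.
        archive.writestr(mets_name, mets_xml)
        for relative_path, content in payload.items():
            # Assumption: entries use the same relative paths as xlink:href.
            archive.writestr(relative_path, content)
    return buffer.getvalue()


# Placeholder contents; a real SIP would carry the full METS XML and real files.
sip_bytes = build_sip(
    "<!-- METS XML goes here -->",
    {"tiff/pid_page1.tiff": b"fake tiff bytes", "jp2/pid_page1.jp2": b"fake jp2 bytes"},
)
with zipfile.ZipFile(io.BytesIO(sip_bytes)) as archive:
    print(archive.namelist())
```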
Complete Examples
Structure
Record Types in StructMap
The attribute TYPE in mets:div must refer to a valid record type. The set of record types is configurable, so additional ones can be added.
For the newspaper ingest the following record types are applicable:
Newspaper
NewspaperPage
Media
Representation
Sip
For DigiHaven the following record types are available (non-exhaustive list):
Dossier
Document
Representation
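A quick pre-flight check for the TYPE rule could look like the sketch below. It assumes the mets prefix is bound to the standard METS namespace http://www.loc.gov/METS/ and uses the newspaper record types listed above as the allowed set; the function name and the sample document are made up for the example, and the real set of valid types depends on the MediaHaven configuration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # standard METS namespace (assumed binding of "mets:")
NEWSPAPER_TYPES = {"Newspaper", "NewspaperPage", "Media", "Representation", "Sip"}


def unknown_record_types(mets_xml: str, allowed: set[str]) -> list[str]:
    """Return the TYPE values used on mets:div that are not in the allowed set."""
    root = ET.fromstring(mets_xml)
    # A missing TYPE attribute is reported as an empty string.
    found = {div.get("TYPE", "") for div in root.iter(f"{{{METS_NS}}}div")}
    return sorted(found - allowed)


# Trimmed sample: "Page" is deliberately not a configured record type.
sample = f"""<mets:mets xmlns:mets="{METS_NS}">
  <mets:structMap>
    <mets:div TYPE="Newspaper">
      <mets:div TYPE="Page"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""
print(unknown_record_types(sample, NEWSPAPER_TYPES))  # ['Page']
```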
StructMap
<!-- Describes the entire record tree: Newspaper > NewspaperPage > Digital Representations -->
<mets:structMap>
  <mets:div TYPE="Newspaper" DMDID="DMDID-NEWSPAPER" ADMID="PREMISID-171 PREMISID-172">
    <!-- Page 1 -->
    <mets:div TYPE="NewspaperPage" DMDID="DMDID-NEWSPAPER-PAGE1">
      <!-- Tiff -->
      <mets:div TYPE="Media" DMDID="DMDID-NEWSPAPER-PAGE1-TIFF">
        <mets:div TYPE="Representation" LABEL="Original" DMDID="DMDID-NEWSPAPER-PAGE1-TIFF-ORIGINAL">
          <mets:fptr FILEID="FILEID-NEWSPAPER-PAGE1-TIFF-ORIGINAL" />
        </mets:div>
      </mets:div>
      <!-- JP2 -->
      <mets:div TYPE="Media" DMDID="DMDID-NEWSPAPER-PAGE1-JP2">
        <mets:div TYPE="Representation" LABEL="Original" DMDID="DMDID-NEWSPAPER-PAGE1-JP2-ORIGINAL">
          <mets:fptr FILEID="FILEID-NEWSPAPER-PAGE1-JP2-ORIGINAL" />
        </mets:div>
      </mets:div>
    </mets:div>
    <!-- Page 2 -->
    <mets:div TYPE="NewspaperPage" DMDID="DMDID-NEWSPAPER-PAGE2">
      <!-- Tiff... -->
      <!-- JP2... -->
    </mets:div>
    <!-- ALTO -->
    <mets:div TYPE="Media" DMDID="DMDID-NEWSPAPER-ALTO">
      <mets:div TYPE="Representation" LABEL="Original" DMDID="DMDID-NEWSPAPER-ALTO-ORIGINAL">
        <mets:fptr FILEID="FILEID-NEWSPAPER-ALTO-ORIGINAL" />
      </mets:div>
    </mets:div>
    <!-- PDF -->
    <mets:div TYPE="Media" DMDID="DMDID-NEWSPAPER-PDF">
      <mets:div TYPE="Representation" LABEL="Original" DMDID="DMDID-NEWSPAPER-PDF-ORIGINAL">
        <mets:fptr FILEID="FILEID-NEWSPAPER-PDF-ORIGINAL" />
      </mets:div>
    </mets:div>
  </mets:div>
</mets:structMap>
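To inspect the record tree that a structMap like the one above describes, the nested mets:div elements can be walked recursively. The following sketch uses only Python's standard library and assumes the mets prefix is bound to the standard METS namespace; the function name and the shortened sample are illustrative.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # standard METS namespace (assumed binding of "mets:")
DIV = f"{{{METS_NS}}}div"


def outline(mets_xml: str) -> list[tuple[int, str, str]]:
    """Return (depth, TYPE, DMDID) for every mets:div, in document order."""
    root = ET.fromstring(mets_xml)
    rows: list[tuple[int, str, str]] = []

    def visit(div: ET.Element, depth: int) -> None:
        rows.append((depth, div.get("TYPE", ""), div.get("DMDID", "")))
        for child in div.findall(DIV):  # direct children only
            visit(child, depth + 1)

    for struct_map in root.iter(f"{{{METS_NS}}}structMap"):
        for top in struct_map.findall(DIV):
            visit(top, 0)
    return rows


# Shortened version of the newspaper tree.
sample = f"""<mets:mets xmlns:mets="{METS_NS}">
  <mets:structMap>
    <mets:div TYPE="Newspaper" DMDID="DMDID-NEWSPAPER">
      <mets:div TYPE="NewspaperPage" DMDID="DMDID-NEWSPAPER-PAGE1">
        <mets:div TYPE="Media" DMDID="DMDID-NEWSPAPER-PAGE1-TIFF"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""
for depth, record_type, dmdid in outline(sample):
    print("  " * depth + f"{record_type} ({dmdid})")
```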
FileSec
<!-- FileSec only provides the physically stored files as flat list -->
<mets:fileSec>
  <mets:fileGrp USE="Original">
    <mets:file ID="FILEID-NEWSPAPER-PAGE1-TIFF-ORIGINAL" USE="Tape" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
      <mets:FLocat LOCTYPE="URL" xlink:href="tiff/pid_page1.tiff"/>
    </mets:file>
    <mets:file ID="FILEID-NEWSPAPER-PAGE1-JP2-ORIGINAL" USE="Disk" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
      <mets:FLocat LOCTYPE="URL" xlink:href="jp2/pid_page1.jp2"/>
    </mets:file>
    <mets:file ID="FILEID-NEWSPAPER-PAGE2-TIFF-ORIGINAL" USE="Tape" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
      <mets:FLocat LOCTYPE="URL" xlink:href="tiff/pid_page2.tiff"/>
    </mets:file>
    <mets:file ID="FILEID-NEWSPAPER-PAGE2-JP2-ORIGINAL" USE="Disk" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
      <mets:FLocat LOCTYPE="URL" xlink:href="jp2/pid_page2.jp2"/>
    </mets:file>
    <mets:file ID="FILEID-NEWSPAPER-ALTO-ORIGINAL" USE="Disk" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
      <mets:FLocat LOCTYPE="URL" xlink:href="pid_alto.xml"/>
    </mets:file>
    <mets:file ID="FILEID-NEWSPAPER-PDF-ORIGINAL" USE="Disk" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
      <mets:FLocat LOCTYPE="URL" xlink:href="pid_pdf.pdf"/>
    </mets:file>
  </mets:fileGrp>
</mets:fileSec>
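Before submitting a package it can help to confirm that each CHECKSUM in the fileSec matches the file it points to. The sketch below does this for MD5 entries with Python's hashlib. It assumes the standard METS and XLink namespaces, takes the file contents as an in-memory mapping from xlink:href to bytes, and assumes hexadecimal checksum notation; the function name and sample data are invented for the example.

```python
import hashlib
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"       # standard METS namespace
XLINK_NS = "http://www.w3.org/1999/xlink"  # standard XLink namespace


def checksum_mismatches(mets_xml: str, payload: dict[str, bytes]) -> list[str]:
    """Return the IDs of MD5-checksummed mets:file entries that do not match payload."""
    root = ET.fromstring(mets_xml)
    bad = []
    for file_el in root.iter(f"{{{METS_NS}}}file"):
        if file_el.get("CHECKSUMTYPE") != "MD5":
            continue  # this sketch only handles MD5
        locat = file_el.find(f"{{{METS_NS}}}FLocat")
        href = locat.get(f"{{{XLINK_NS}}}href") if locat is not None else None
        content = payload.get(href)
        expected = file_el.get("CHECKSUM", "").lower()  # assumed to be hexadecimal
        if content is None or hashlib.md5(content).hexdigest() != expected:
            bad.append(file_el.get("ID", ""))
    return bad


content = b"fake jp2 bytes"
sample = f"""<mets:mets xmlns:mets="{METS_NS}" xmlns:xlink="{XLINK_NS}">
  <mets:fileSec>
    <mets:fileGrp USE="Original">
      <mets:file ID="FILEID-NEWSPAPER-PAGE1-JP2-ORIGINAL" USE="Disk" CHECKSUMTYPE="MD5"
                 CHECKSUM="{hashlib.md5(content).hexdigest()}">
        <mets:FLocat LOCTYPE="URL" xlink:href="jp2/pid_page1.jp2"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""
print(checksum_mismatches(sample, {"jp2/pid_page1.jp2": content}))      # []
print(checksum_mismatches(sample, {"jp2/pid_page1.jp2": b"corrupted"}))
```

A file that is missing from the mapping is reported in the same way as a file whose checksum differs.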
Storage Location
The attribute USE on the element mets:file inside the mets:fileSec controls the storage location of the file. The FileSec example above uses the values Tape and Disk.
Metadata
By using the attribute DMDID on the element mets:div in the section mets:structMap we can link it to metadata described in the sections mets:dmdSec. In the new format the embedded metadata uses the new Metadata Sidecar format.
The example below includes the field ExternalId.
<!-- Repeat this section for each object -->
<mets:dmdSec ID="DMDID-NEWSPAPER">
  <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="mhs:Sidecar">
    <mets:xmlData>
      <mhs:Sidecar xmlns:mh="https://zeticon.mediahaven.com/metadata/22.1/mh/"
                   xmlns:mhs="https://zeticon.mediahaven.com/metadata/22.1/mhs/"
                   version="22.1"
                   xsi:schemaLocation="https://zeticon.mediahaven.com/metadata/22.1/mhs/ https://zeticon.mediahaven.com/metadata/22.1/mhs.xsd https://zeticon.mediahaven.com/metadata/22.1/mh/ https://zeticon.mediahaven.com/metadata/22.1/mh.xsd">
        <mhs:Administrative>
          <mh:ExternalId>f47gq96g1g</mh:ExternalId>
        </mhs:Administrative>
        ...
      </mhs:Sidecar>
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>
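Because every DMDID on a mets:div has to point at the ID of a mets:dmdSec, a dangling reference is easy to introduce when sections are copied. The sketch below lists DMDID values that have no matching dmdSec. It assumes the standard METS namespace and treats DMDID as a space-separated list of IDs, as the METS schema allows; the function name and the trimmed sample (with an empty dmdSec) are illustrative only.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # standard METS namespace (assumed binding of "mets:")


def dangling_dmdids(mets_xml: str) -> list[str]:
    """Return DMDID references on mets:div that match no mets:dmdSec ID."""
    root = ET.fromstring(mets_xml)
    known = {sec.get("ID") for sec in root.iter(f"{{{METS_NS}}}dmdSec")}
    missing: list[str] = []
    for div in root.iter(f"{{{METS_NS}}}div"):
        # DMDID may hold several space-separated IDs.
        for ref in div.get("DMDID", "").split():
            if ref not in known and ref not in missing:
                missing.append(ref)
    return missing


# Trimmed sample: the page references a dmdSec that is not present.
sample = f"""<mets:mets xmlns:mets="{METS_NS}">
  <mets:dmdSec ID="DMDID-NEWSPAPER"/>
  <mets:structMap>
    <mets:div TYPE="Newspaper" DMDID="DMDID-NEWSPAPER">
      <mets:div TYPE="NewspaperPage" DMDID="DMDID-NEWSPAPER-PAGE1"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""
print(dangling_dmdids(sample))  # ['DMDID-NEWSPAPER-PAGE1']
```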
Events
By using the attribute ADMID on the element mets:div in the section mets:structMap we can link it to events described in the sections mets:amdSec.
The PREMIS event is validated against the PREMIS 3.0 XSD; PREMIS v2 is not supported
Provided events are stored in MediaHaven as actual PREMIS events for the object
Benefits
Provided events are visible in the monitoring for the object
Provided events are validated to be syntactically correct, which ensures the stability of OAI and webhooks
A future screen in MediaHaven 2.0 will show these events
Mapping of premis:event into MediaHaven
The parts of the PREMIS event recognised by MediaHaven (first agent, type, date, outcome, first comment) are extracted
The complete PREMIS event XML is stored in full in an extra column
When events are returned as XML by the API, the original PREMIS event XML is returned, enhanced with the extra identifier
Events
<mets:amdSec ID="ADMID-Newspaper">
  <mets:digiprovMD ID="PREMISID-171">
    <mets:mdWrap MDTYPE="PREMIS:EVENT">
      <mets:xmlData>
        <premis:event>
          <premis:eventIdentifier>
            <premis:eventIdentifierType>MEDIAHAVEN_EVENT</premis:eventIdentifierType>
            <premis:eventIdentifierValue>171</premis:eventIdentifierValue>
          </premis:eventIdentifier>
          <premis:eventType>RECORDS.CREATE</premis:eventType>
          <premis:eventDateTime>2021-09-27T10:40:11.495Z</premis:eventDateTime>
          <premis:eventDetail/>
          <premis:eventOutcomeInformation>
            <premis:eventOutcome>OK</premis:eventOutcome>
          </premis:eventOutcomeInformation>
          <premis:linkingAgentIdentifier>
            <premis:linkingAgentIdentifierType>MEDIAHAVEN_USER</premis:linkingAgentIdentifierType>
            <premis:linkingAgentIdentifierValue>informatiebeheerder-edp@hfb</premis:linkingAgentIdentifierValue>
          </premis:linkingAgentIdentifier>
          <premis:linkingObjectIdentifier>
            <premis:linkingObjectIdentifierType>MEDIAHAVEN_ID</premis:linkingObjectIdentifierType>
            <premis:linkingObjectIdentifierValue>74aefb862f2b459993ae81a617eb572e5110964a965842a7aa25005bfd3616e2ed3353b2db00455ab5f6d14f872acd72</premis:linkingObjectIdentifierValue>
          </premis:linkingObjectIdentifier>
        </premis:event>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:digiprovMD>
  <mets:digiprovMD ID="PREMISID-172">
    <mets:mdWrap MDTYPE="PREMIS:EVENT">
      <mets:xmlData>
        <premis:event>
          <premis:eventIdentifier>
            <premis:eventIdentifierType>MEDIAHAVEN_EVENT</premis:eventIdentifierType>
            <premis:eventIdentifierValue>172</premis:eventIdentifierValue>
          </premis:eventIdentifier>
          <premis:eventType>RECORDS.UPDATE.PUBLISH</premis:eventType>
          <premis:eventDateTime>2021-09-27T10:40:11.495Z</premis:eventDateTime>
          <premis:eventDetail/>
          <premis:eventOutcomeInformation>
            <premis:eventOutcome>OK</premis:eventOutcome>
          </premis:eventOutcomeInformation>
          <premis:linkingAgentIdentifier>
            <premis:linkingAgentIdentifierType>MEDIAHAVEN_USER</premis:linkingAgentIdentifierType>
            <premis:linkingAgentIdentifierValue>informatiebeheerder-edp@hfb</premis:linkingAgentIdentifierValue>
          </premis:linkingAgentIdentifier>
          <premis:linkingObjectIdentifier>
            <premis:linkingObjectIdentifierType>MEDIAHAVEN_ID</premis:linkingObjectIdentifierType>
            <premis:linkingObjectIdentifierValue>74aefb862f2b459993ae81a617eb572e5110964a965842a7aa25005bfd3616e2ed3353b2db00455ab5f6d14f872acd72</premis:linkingObjectIdentifierValue>
          </premis:linkingObjectIdentifier>
        </premis:event>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:digiprovMD>
  <!-- ... more events -->
</mets:amdSec>
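The mapping described earlier (type, date, outcome and first agent taken from each event) can be approximated on the client side to preview what would be extracted. This is only a sketch of the idea, not MediaHaven's actual extraction logic: it assumes the premis prefix is bound to the PREMIS 3 namespace http://www.loc.gov/premis/v3, and it leaves out the "first comment" because the source does not say which element holds it. The function name and the trimmed sample event are illustrative.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "http://www.loc.gov/premis/v3"  # PREMIS 3 namespace (assumed binding of "premis:")
NS = {"premis": PREMIS_NS}


def summarize_event(event: ET.Element) -> dict:
    """Pick the type, date, outcome and first agent out of one premis:event."""

    def text(path: str):
        # findtext returns the text of the first matching element, or None.
        return event.findtext(path, default=None, namespaces=NS)

    return {
        "type": text("premis:eventType"),
        "date": text("premis:eventDateTime"),
        "outcome": text("premis:eventOutcomeInformation/premis:eventOutcome"),
        "agent": text("premis:linkingAgentIdentifier/premis:linkingAgentIdentifierValue"),
    }


# Trimmed copy of event 171 from the example above.
sample = f"""<premis:event xmlns:premis="{PREMIS_NS}">
  <premis:eventType>RECORDS.CREATE</premis:eventType>
  <premis:eventDateTime>2021-09-27T10:40:11.495Z</premis:eventDateTime>
  <premis:eventOutcomeInformation>
    <premis:eventOutcome>OK</premis:eventOutcome>
  </premis:eventOutcomeInformation>
  <premis:linkingAgentIdentifier>
    <premis:linkingAgentIdentifierType>MEDIAHAVEN_USER</premis:linkingAgentIdentifierType>
    <premis:linkingAgentIdentifierValue>informatiebeheerder-edp@hfb</premis:linkingAgentIdentifierValue>
  </premis:linkingAgentIdentifier>
</premis:event>"""
summary = summarize_event(ET.fromstring(sample))
print(summary)
```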