Validation

METS XML forms a skeleton structure with additional embedded metadata from other standards. As such the validation of the METS XML using XSD requires several XSD files. In particular, the XSD validation of the METS inside the SIP requires the following 3 XSDs
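
Because the METS document embeds elements from several standards, its root element has to declare a namespace for every prefix used in the examples on this page. The sketch below shows such a root element. The namespace URIs are the ones published for METS, XLink and PREMIS 3; which prefixes and schema locations a given SIP actually needs is an assumption to check against your own setup.

```xml
<!-- Sketch only: root element declaring the prefixes used in the examples (mets, xlink, premis) -->
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           xmlns:premis="http://www.loc.gov/premis/v3">
  <!-- mets:dmdSec, mets:amdSec, mets:fileSec and mets:structMap go here, in that order -->
</mets:mets>
```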

Structure

Record Types in StructMap

The attribute TYPE in mets:div must refer to a valid record type. The record types are configurable, and additional ones can be added.

For the newspaper ingest the following record types are applicable:

  • Newspaper

  • NewspaperPage

  • Media

  • Representation

  • Sip

For DigiHaven, the following record types are available (non-exhaustive list):

  • Dossier

  • Document

  • Representation
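
As an illustration, the DigiHaven record types might be nested in a structMap as sketched below. The nesting Dossier > Document > Representation and every DMDID and FILEID value are assumptions made for this example, not a confirmed DigiHaven layout.

```xml
<!-- Hypothetical sketch: Dossier > Document > Representation (nesting and IDs are assumed) -->
<mets:structMap>
  <mets:div TYPE="Dossier" DMDID="DMDID-DOSSIER">
    <mets:div TYPE="Document" DMDID="DMDID-DOCUMENT1">
      <mets:div TYPE="Representation" LABEL="Original" DMDID="DMDID-DOCUMENT1-ORIGINAL">
        <!-- FILEID must match the ID of a mets:file in mets:fileSec -->
        <mets:fptr FILEID="FILEID-DOCUMENT1-ORIGINAL" />
      </mets:div>
    </mets:div>
  </mets:div>
</mets:structMap>
```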

StructMap

<!-- Describes the entire record tree: Newspaper > NewspaperPage > Digital Representations -->
<mets:structMap>
  <mets:div TYPE="Newspaper" DMDID="DMDID-NEWSPAPER" AMDID="PREMISID-171 PREMISID-172">
    <!-- Page 1 -->
    <mets:div TYPE="NewspaperPage" DMDID="DMDID-NEWSPAPER-PAGE1">
      <!-- Tiff -->
      <mets:div TYPE="Media" DMDID="DMDID-NEWSPAPER-PAGE1-TIFF">
        <mets:div TYPE="Representation" LABEL="Original" DMDID="DMDID-NEWSPAPER-PAGE1-TIFF-ORIGINAL">
          <mets:fptr FILEID="FILEID-NEWSPAPER-PAGE1-TIFF-ORIGINAL" />
        </mets:div>
      </mets:div>
      
      <!-- JP2 -->
      <mets:div TYPE="Media" DMDID="DMDID-NEWSPAPER-PAGE1-JP2">
        <mets:div TYPE="Representation" LABEL="Original" DMDID="DMDID-NEWSPAPER-PAGE1-JP2-ORIGINAL">
          <mets:fptr FILEID="FILEID-NEWSPAPER-PAGE1-JP2-ORIGINAL" />
        </mets:div>
      </mets:div>
    </mets:div>
    
    <!-- Page 2 -->
    <mets:div TYPE="NewspaperPage" DMDID="DMDID-NEWSPAPER-PAGE2">
      <!-- Tiff... -->
      <!-- JP2... -->
    </mets:div>
   
    <!-- ALTO -->   
    <mets:div TYPE="Media" DMDID="DMDID-NEWSPAPER-ALTO">
      <mets:div TYPE="Representation" LABEL="Original" DMDID="DMDID-NEWSPAPER-ALTO-ORIGINAL">
        <mets:fptr FILEID="FILEID-NEWSPAPER-ALTO-ORIGINAL" />
      </mets:div>
    </mets:div> 
    
    <!-- PDF -->   
    <mets:div TYPE="Media" DMDID="DMDID-NEWSPAPER-PDF">
      <mets:div TYPE="Representation" LABEL="Original" DMDID="DMDID-NEWSPAPER-PDF-ORIGINAL">
        <mets:fptr FILEID="FILEID-NEWSPAPER-PDF-ORIGINAL" />
      </mets:div>
    </mets:div> 
    
  </mets:div>
</mets:structMap>

FileSec

<!-- FileSec only provides the physically stored files as flat list -->
<mets:fileSec>
    <mets:fileGrp USE="Original">
        <mets:file ID="FILEID-NEWSPAPER-PAGE1-TIFF-ORIGINAL" USE="Tape" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
          <mets:FLocat LOCTYPE="URL" xlink:href="tiff/pid_page1.tiff"/>
        </mets:file>
        <mets:file ID="FILEID-NEWSPAPER-PAGE1-JP2-ORIGINAL" USE="Disk" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
          <mets:FLocat LOCTYPE="URL" xlink:href="jp2/pid_page1.jp2"/>
        </mets:file>
        <mets:file ID="FILEID-NEWSPAPER-PAGE2-TIFF-ORIGINAL" USE="Tape" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
          <mets:FLocat LOCTYPE="URL" xlink:href="tiff/pid_page2.tiff"/>
        </mets:file>
        <mets:file ID="FILEID-NEWSPAPER-PAGE2-JP2-ORIGINAL" USE="Disk" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
          <mets:FLocat LOCTYPE="URL" xlink:href="jp2/pid_page2.jp2"/>
        </mets:file>
        <mets:file ID="FILEID-NEWSPAPER-ALTO-ORIGINAL" USE="Disk" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
          <mets:FLocat LOCTYPE="URL" xlink:href="pid_alto.xml"/>
        </mets:file>
        <mets:file ID="FILEID-NEWSPAPER-PDF-ORIGINAL" USE="Disk" CHECKSUMTYPE="MD5" CHECKSUM="..." CREATED="2019-10-10T15:42:15Z">
          <mets:FLocat LOCTYPE="URL" xlink:href="pid_pdf.pdf"/>
        </mets:file>
    </mets:fileGrp>
</mets:fileSec>

Metadata

Using the attribute DMDID on the element mets:div in the section mets:structMap, we can link the div to metadata described in the mets:dmdSec sections. In the new format, the embedded metadata uses the new Metadata Sidecar format.
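
A minimal sketch of this link, assuming the sidecar is wrapped in mets:mdWrap: the DMDID on the mets:div repeats the ID of a mets:dmdSec. The MDTYPE and OTHERMDTYPE values are assumptions, and a placeholder comment stands in for the sidecar content because the Metadata Sidecar format itself is not shown here.

```xml
<!-- The ID of the dmdSec is what the div refers to through DMDID -->
<mets:dmdSec ID="DMDID-NEWSPAPER-PAGE1">
  <!-- MDTYPE and OTHERMDTYPE values are assumed for this sketch -->
  <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="MetadataSidecar">
    <mets:xmlData>
      <!-- Metadata Sidecar XML for page 1 goes here -->
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>

<!-- Other sections omitted -->

<mets:structMap>
  <mets:div TYPE="NewspaperPage" DMDID="DMDID-NEWSPAPER-PAGE1">
    <!-- Media and Representation divs omitted -->
  </mets:div>
</mets:structMap>
```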

...

  • Using the attribute AMDID on the element mets:div in the section mets:structMap, we can link the div to events described in the mets:amdSec sections.

  • Each PREMIS event is validated against the PREMIS 3.0 XSD; PREMIS v2 is not supported

  • Provided events are stored in MediaHaven as actual PREMIS events for the object

  • Benefits

    • Provided events are visible in the monitoring for the object

    • Provided events are validated for syntactic correctness, which keeps OAI and webhooks stable

    • A future screen in MediaHaven 2.0 will show these events

  • Mapping of premis:event into MediaHaven

    • The parts of the PREMIS event recognised by MediaHaven (first agent, type, date, outcome, first comment) are extracted

    • The complete PREMIS event XML is stored in full in an extra column

    • When events are returned in XML by the API, the original PREMIS event XML, enhanced with the extra identifier, is returned

Events

<mets:amdSec ID="ADMID-Newspaper">
   <mets:digiprovMD ID="PREMISID-171">
      <mets:mdWrap MDTYPE="PREMIS:EVENT">
         <mets:xmlData>
            <premis:event>
               <premis:eventIdentifier>
                  <premis:eventIdentifierType>MEDIAHAVEN_EVENT</premis:eventIdentifierType>
                  <premis:eventIdentifierValue>171</premis:eventIdentifierValue>
               </premis:eventIdentifier>
               <premis:eventType>RECORDS.CREATE</premis:eventType>
               <premis:eventDateTime>2021-09-27T10:40:11.495Z</premis:eventDateTime>
               <premis:eventDetail/>
               <premis:eventOutcomeInformation>
                  <premis:eventOutcome>OK</premis:eventOutcome>
               </premis:eventOutcomeInformation>
               <premis:linkingAgentIdentifier>
                  <premis:linkingAgentIdentifierType>MEDIAHAVEN_USER</premis:linkingAgentIdentifierType>
                  <premis:linkingAgentIdentifierValue>informatiebeheerder-edp@hfb</premis:linkingAgentIdentifierValue>
               </premis:linkingAgentIdentifier>
               <premis:linkingObjectIdentifier>
                  <premis:linkingObjectIdentifierType>MEDIAHAVEN_ID</premis:linkingObjectIdentifierType>
                  <premis:linkingObjectIdentifierValue>74aefb862f2b459993ae81a617eb572e5110964a965842a7aa25005bfd3616e2ed3353b2db00455ab5f6d14f872acd72</premis:linkingObjectIdentifierValue>
               </premis:linkingObjectIdentifier>
            </premis:event>
         </mets:xmlData>
      </mets:mdWrap>
   </mets:digiprovMD>
   <mets:digiprovMD ID="PREMISID-172">
      <mets:mdWrap MDTYPE="PREMIS:EVENT">
         <mets:xmlData>
            <premis:event>
               <premis:eventIdentifier>
                  <premis:eventIdentifierType>MEDIAHAVEN_EVENT</premis:eventIdentifierType>
                  <premis:eventIdentifierValue>172</premis:eventIdentifierValue>
               </premis:eventIdentifier>
               <premis:eventType>RECORDS.UPDATE.PUBLISH</premis:eventType>
               <premis:eventDateTime>2021-09-27T10:40:11.495Z</premis:eventDateTime>
               <premis:eventDetail/>
               <premis:eventOutcomeInformation>
                  <premis:eventOutcome>OK</premis:eventOutcome>
               </premis:eventOutcomeInformation>
               <premis:linkingAgentIdentifier>
                  <premis:linkingAgentIdentifierType>MEDIAHAVEN_USER</premis:linkingAgentIdentifierType>
                  <premis:linkingAgentIdentifierValue>informatiebeheerder-edp@hfb</premis:linkingAgentIdentifierValue>
               </premis:linkingAgentIdentifier>
               <premis:linkingObjectIdentifier>
                  <premis:linkingObjectIdentifierType>MEDIAHAVEN_ID</premis:linkingObjectIdentifierType>
                  <premis:linkingObjectIdentifierValue>74aefb862f2b459993ae81a617eb572e5110964a965842a7aa25005bfd3616e2ed3353b2db00455ab5f6d14f872acd72</premis:linkingObjectIdentifierValue>
               </premis:linkingObjectIdentifier>
            </premis:event>
         </mets:xmlData>
      </mets:mdWrap>
   </mets:digiprovMD>
   <!-- ... more events -->
</mets:amdSec>

Storage Location

The attribute USE on the element mets:file inside mets:fileSec controls where the object is stored:

  • USE="Tape": store the object on tape

  • USE="Disk": store the object on disk(s)

Common mistakes

Tip

The value of the attribute ID must be a valid XML tag name, which has the following rules:

  • Must be unique within the METS XML

  • Must start with a letter or underscore

  • Cannot start with the letters xml (or XML, or Xml, etc)

  • Can contain letters, digits, hyphens, underscores, and periods

  • Cannot contain spaces
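
The rules above can be illustrated with a few ID values. The second valid ID and all of the invalid ones are made up for this sketch; the invalid values are listed in a comment rather than as live attributes.

```xml
<!-- Valid: starts with a letter or underscore; only letters, digits, hyphens, underscores, periods -->
<mets:file ID="FILEID-NEWSPAPER-PAGE1-TIFF-ORIGINAL" />
<mets:file ID="_page.1" />

<!-- Invalid values according to the rules above:
     ID="1-PAGE"     starts with a digit
     ID="PAGE 1"     contains a space
     ID="xml-page1"  starts with the letters xml
-->
```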

Tip

Because the value of the attribute xlink:href must be a valid URI, you have to URI-encode (percent-encode) it if the path to the file contains characters that are reserved or not allowed in a URI, such as ? or %.
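
As a short sketch of what this encoding looks like, the fragment below shows three hypothetical file names (not taken from a real SIP) and the xlink:href value each would need. A space becomes %20, a literal % becomes %25 and a ? becomes %3F.

```xml
<!-- File on disk: tiff/pid page1.tiff -->
<mets:FLocat LOCTYPE="URL" xlink:href="tiff/pid%20page1.tiff"/>

<!-- File on disk: tiff/100%_page1.tiff -->
<mets:FLocat LOCTYPE="URL" xlink:href="tiff/100%25_page1.tiff"/>

<!-- File on disk: tiff/page1?.tiff -->
<mets:FLocat LOCTYPE="URL" xlink:href="tiff/page1%3F.tiff"/>
```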