Herbert Van de Sompel
Technical Staff Member, Research Library
Los Alamos National Laboratory
OAI-PMH-compliant institutional repository software exposes Dublin Core metadata about stored content, allowing service providers to create discovery services across repositories by recurrent metadata harvesting. Next-generation service providers will also be interested in collecting the actual content stored in those repositories, because that would facilitate the creation of more elaborate and attractive services. The OAI-PMH could still function as the framework through which such service providers remain in sync with the evolving collections stored in the repositories, provided that those repositories can expose a more complex “metadata format”. To be transportable through the OAI-PMH, such a complex “metadata format” needs to be an XML-based representation of a complex document model. A variety of such formats exist, and the Research Library of the Los Alamos National Laboratory (LANL) has focused its attention on the MPEG-21 Digital Item Declaration Language (DIDL). As LANL also operates a DSpace server, research has been conducted aimed at exposing this format through the DSpace OAI-PMH interface. Each exposed record wraps the DSpace Dublin Core metadata for an item and its associated content (bundles) in a single XML document that can be harvested from a DSpace repository through the OAI-PMH. Related research has resulted in the creation of an OAI-PMH Federator that provides a single point of access, making it straightforward to harvest these documents from a distributed federation of institutional repositories.
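As a rough illustration of the harvesting pattern described above, the following Python sketch issues OAI-PMH ListRecords requests for a complex-object format and follows resumption tokens until the list is complete. It is a minimal sketch under stated assumptions, not the LANL implementation: the metadata prefix "didl", the repository URL, and the helper names (list_records_url, parse_list_records, harvest) are all hypothetical, and the fetch argument stands in for whatever HTTP client a harvester would use. Only the request arguments (verb, metadataPrefix, from, resumptionToken) and the response namespace come from the OAI-PMH 2.0 protocol itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of OAI-PMH 2.0 response elements.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix=None, from_date=None,
                     resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH base URL."""
    params = {"verb": "ListRecords"}
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: it replaces all others.
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
        if from_date is not None:
            # Selective harvesting: only records changed since from_date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return ([(identifier, datestamp, metadata_element)], next_token)."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        header = record.find(OAI_NS + "header")
        records.append((
            header.findtext(OAI_NS + "identifier"),
            header.findtext(OAI_NS + "datestamp"),
            # None for deleted records; otherwise wraps the XML document
            # in the requested format (here, assumed to be a DIDL document).
            record.find(OAI_NS + "metadata"),
        ))
    token_el = root.find(OAI_NS + "ListRecords/" + OAI_NS + "resumptionToken")
    token = None
    if token_el is not None and token_el.text and token_el.text.strip():
        token = token_el.text.strip()
    return records, token


def harvest(base_url, metadata_prefix, fetch, from_date=None):
    """Yield all records, following resumption tokens.

    fetch is any callable mapping a URL to the response body as text
    (an assumption of this sketch, so the HTTP layer can be swapped out).
    """
    token = None
    while True:
        url = list_records_url(base_url, metadata_prefix, from_date, token)
        records, token = parse_list_records(fetch(url))
        yield from records
        if token is None:
            break
```

A harvester could call harvest("http://repository.example.org/oai", "didl", fetch, from_date="2004-01-01") on a schedule, passing the date of its previous run, to stay in sync with a repository; both the URL and the prefix here are placeholders. Because a federator of the kind described would itself present an OAI-PMH interface, the same loop could in principle be pointed at it instead of at each repository in turn.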
This briefing will describe and demonstrate a prototype DSpace plug-in that allows harvesting of complex objects compliant with the MPEG-21 DIDL specification from a DSpace repository. It will also touch upon the problems faced in mapping the DSpace data model to the complex object format. Furthermore, the presentation will describe and demonstrate the OAI-PMH Federator that allows harvesters to collect complex objects from a federation of institutional repositories through a single point of access, and that provides the capability to crosswalk between various complex object formats.
Federations of Institutional Repositories and OAI-PMH Harvesting (PDF)