Federations of Institutional Repositories and OAI-PMH Harvesting: Beyond Dublin Core
Herbert Van de Sompel
Technical Staff Member, Research Library
Los Alamos National Laboratory
OAI-PMH-compliant Institutional Repository software exposes Dublin Core
metadata about stored content, allowing service providers to create
discovery services across repositories by recurrent metadata harvesting.
Next-generation service providers will also be interested in collecting
the actual content stored in those repositories, because that would
facilitate the creation of more elaborate and attractive services. The
OAI-PMH could still function as the framework through which such service
providers would remain in sync with the evolving collections stored
in the repositories, provided that those repositories can expose a more
complex "metadata format". In order to be transportable through
the OAI-PMH, such a complex "metadata format" needs to be
an XML-based representation of a complex document model. A variety of
such formats exist, and the Research Library of the Los Alamos National
Laboratory (LANL) has focused its attention on the MPEG-21 Digital Item
Declaration Language (DIDL). As LANL also operates a DSpace server, the
Research Library has investigated exposing this format through the DSpace
OAI-PMH interface. Exposed records wrap the DSpace Dublin Core metadata
for an item and its associated content (bundles) in a single XML document
that can be harvested from a DSpace repository through the OAI-PMH.
Related research has resulted in the creation of an OAI-PMH Federator
that makes OAI-PMH harvesting of these documents from a distributed
federation of institutional repositories straightforward by providing
a single point of access.
This briefing will describe and demonstrate a prototype
DSpace plug-in that allows harvesting of complex objects compliant with
the MPEG-21 DIDL specification from a DSpace repository. It will also
touch upon the problems faced in mapping the DSpace data model to the
complex object format. Furthermore, the presentation will describe and
demonstrate the OAI-PMH Federator that allows harvesters to collect
complex objects from a federation of institutional repositories through
a single point of access, and that provides the capability to crosswalk
between various complex object formats.
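The harvesting pattern described above can be illustrated with a minimal sketch. It is not the LANL plug-in or Federator code. It only shows how a harvester might request records in a non-Dublin Core format through the standard OAI-PMH ListRecords verb and follow resumption tokens. The base URL and the metadataPrefix value "didl" are assumptions chosen for illustration. A real repository advertises its supported prefixes through the ListMetadataFormats verb.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix=None, from_date=None,
                     resumption_token=None):
    """Build a ListRecords request URL.

    Per the protocol, resumptionToken is an exclusive argument: when it
    is present, no other argument except the verb is sent.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            # Selective harvesting: only records changed since from_date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        header.findtext(OAI_NS + "identifier")
        for header in root.iter(OAI_NS + "header")
    ]
    token_el = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    # An empty resumptionToken element marks the end of the list.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hypothetical endpoint and prefix, for illustration only.
url = list_records_url("http://repo.example.org/oai", "didl",
                       from_date="2004-01-01")

# A hand-written sample response; the metadata payload is left empty here.
sample = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
    "<record><header><identifier>oai:repo.example.org:1</identifier>"
    "<datestamp>2004-02-01</datestamp></header><metadata/></record>"
    "<resumptionToken>abc</resumptionToken>"
    "</ListRecords></OAI-PMH>"
)
ids, token = parse_list_records(sample)
```

A harvester would repeat the request with each returned token until none is returned, and use the from argument on later runs to stay in sync with the repository.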
Web Link:
http://www.dlib.org/dlib/february04/bekaert/02bekaert.html
Presentation:
Federations of Institutional Repositories and OAI-PMH Harvesting (PDF)