Subject: European Metadata Engine Project
NINCH-ANNOUNCE (david@ninch.org)
Date: Tue, 27 Feb 2001 10:21:39 -0500
Message-Id: <v04210102b6c1744a92ed@[192.100.21.23]>
To: ninch-announce@cni.org
NINCH ANNOUNCEMENT
News on Networking Cultural Heritage Resources
from across the Community
February 27, 2001
The Metadata Engine Project (METAe)
http://meta-e.uibk.ac.at/
METAe Newsletter Now Available
http://meta-e.uibk.ac.at/newsletter/news.htm
A promising European project, The METADATA ENGINE, is described
below, along with the first issue of the project's newsletter.
Essentially, the project is developing software that will
automatically generate metadata during the digitization of printed
material, with the aim of making "large scale digitisation of printed
material, such as books and journals, more reliable in terms of
digital preservation, more cost-effective in terms of automation, and
more user-oriented in terms of future applications."
David Green
===========
>Date: Mon, 26 Feb 2001 10:06:02 +0000
>Sender: Digital Libraries Research mailing list
><DIGLIB@INFOSERV.NLC-BNC.CA>
>From: Simon Tanner <S.G.Tanner@HERTS.AC.UK>
>Subject: Metadata Engine - Newsletter now available
*** Apologies for cross-postings ***
The Metadata Engine Project (METAe) - Newsletter now available.
The first issue of the METAe Newsletter is now available from:
http://meta-e.uibk.ac.at/newsletter/news.htm
(for an introduction to METAe, see the end of this email)
In this first issue we introduce our project and report on
progress to date. Our next issue, due out in April 2001, will have
even more detail and information. The METAe homepage
has further information and of course the METAe team welcome contact
at any time: http://meta-e.uibk.ac.at/. The METAe Project is funded
under the European Union IST Programme.
In this issue, Günter Mühlberger, from the Project Co-ordination team
at the University of Innsbruck, explains the genesis of the idea that led
to the Metadata Engine Project. Also, the influence of the METAe
project is already being felt on the international scene and
Alexander Eggar explains why METAe have been invited to attend the
next MOA2 DTD meeting in New York.
We also introduce the 14 partners that make up the Metadata Engine
project. In future issues, two partners per issue will showcase their
expertise and involvement in METAe. This will give a good opportunity
to find out more about the backgrounds of our various partners.
We will endeavour to keep you up to date with the METAe project's
progress and to give details of forthcoming events that METAe
organises or at which it will be presenting. The newsletter may
also include reports on meetings attended by METAe partners - as this
issue does, with an article by Gerd Prasthofer on the
SCHEMAS-workshop held in Bonn during November 2000.
We hope you will find this newsletter useful and informative. Any
feedback can be directed to Simon Tanner, Editor of the METAe
Newsletter at mailto:s.g.tanner@herts.ac.uk
Best regards,
Simon Tanner
Senior Digitisation Consultant (HEDS)
Higher Education Digitisation Service
Web: http://heds.herts.ac.uk
Some further information about METAe:
The METADATA ENGINE Project
"Metadata" are playing a significant role in "digital preservation".
Firstly, they are, in conjunction with emerging standards (such as
XML, EAD, Dublin Core or RDF), among the most promising ways to keep
digital material "alive" over the years and decades. Secondly,
metadata are needed for all kinds of resource discovery, i.e. using
and accessing digital collections in a user-friendly way. The
METADATA ENGINE project picks up these considerations and will
develop software modules in order to automate metadata capturing by
introducing layout and document analysis as a key technology for
digitisation software. METAe will dramatically enhance the quality of
creating and maintaining digital collections of printed material such
as books and journals.
Objectives
The METAe project will address the need for automated generation
of metadata during the conversion of printed documents and thus be
able to make large scale digitisation of printed material, such as
books and journals, more reliable in terms of digital preservation,
more cost-effective in terms of automation, and more user-oriented in
terms of future applications.
In order to achieve these aims the METADATA ENGINE project will
(1) introduce layout and document analysis to be employed as a key
technology in future digitisation software,
(2) develop capturing and conversion tools for the automated
recording and generation of administrative and descriptive metadata,
(3) develop an omnifont OCR-engine specialising in processing old
European typefaces of the 19th century,
(4) strictly obey emerging standards in the fields of digital
preservation and resource description, such as XML, EAD, TEI, or ISO
12083,
(5) develop an XML search engine capable of retrieving the tagged
full text and the images.
Description of work
The METAe project will develop a software package which extensively
automates and improves the generation of metadata by applying new
technologies for character, layout and document recognition, and
converts the captured information into XML documents. These XML files
will serve as a basis for a variety of applications, such as new XML
search engines, navigation tools, electronic books, audio books, or
the automated production of HTML, XHTML, PDF or PS files.
The METAe package consists of
(1) an input module for scanning printed material and importing
existing bibliographic metadata,
(2) an omnifont character recognition module (OCR-engine) specialising
in typefaces of the 19th century,
(3) a document analysis module capable of classifying pages according
to their physical and logical structure (items such as title pages,
table of contents pages, etc., will be recognised automatically),
(4) a page layout analysis module capable of analysing and segmenting
page elements such as page numbers, headings, captions, footnotes,
pictures, highlighted phrases, or graphical separators,
(5) a knowledge base providing a controlled vocabulary and rules for
the recognition process (for example, the table of contents is, in
most cases, called "contents"),
(6) a conversion module assembling an XML document containing all
recognised metadata, and
(7) an export module for the XML enriched document and the scanned
image.
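As a rough illustration of how modules (3), (5) and (6) might fit
together, the Python sketch below classifies pages by matching a
recognised heading against a small controlled vocabulary and assembles
the result into an XML document. The vocabulary entries, element names
and function names are hypothetical and are not taken from the METAe
software; OCR and layout analysis are assumed to have already produced
the heading and text of each page.

```python
import xml.etree.ElementTree as ET

# Hypothetical knowledge base: heading keywords mapped to logical page types.
# The real METAe vocabulary and rules are not described in this announcement.
KNOWLEDGE_BASE = {
    "contents": "table_of_contents",
    "inhalt": "table_of_contents",
    "index": "index",
    "preface": "preface",
}


def classify_page(heading):
    """Return a logical page type for a recognised heading, or 'body'."""
    key = heading.strip().lower()
    return KNOWLEDGE_BASE.get(key, "body")


def build_document(pages):
    """Assemble an XML tree from (page_number, heading, text) tuples."""
    root = ET.Element("document")
    for number, heading, text in pages:
        page = ET.SubElement(
            root, "page", n=str(number), type=classify_page(heading)
        )
        ET.SubElement(page, "heading").text = heading
        ET.SubElement(page, "text").text = text
    return root


# Invented sample input standing in for the output of OCR and layout analysis.
pages = [
    (1, "Contents", "Chapter I, page 3"),
    (3, "Chapter I", "It was a dark and stormy night."),
]
doc = build_document(pages)
print(ET.tostring(doc, encoding="unicode"))
```

A lookup table is the simplest possible stand-in for a knowledge base;
a real system would presumably combine vocabulary matches with layout
cues such as position and typeface.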
The XML documents will be generated according to emerging standards
for digital preservation and the electronic interchange of
information such as RDF, DC, EAD, TEI, or ISO 12083.
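To give a sense of what descriptive metadata in one of these standards
looks like, the following Python sketch serialises a few bibliographic
fields as Dublin Core elements. The Dublin Core element namespace is
the published one; the wrapping "record" element, the function name and
the sample values are assumptions made for illustration, not METAe
output.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a <record> element with one dc:* child per (name, value) pair."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


# Sample values are invented for the example.
record = dublin_core_record([
    ("title", "An Example Journal, Volume 1"),
    ("date", "1850"),
    ("language", "de"),
    ("type", "Text"),
])
print(ET.tostring(record, encoding="unicode"))
```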
In order to introduce a wide public to the new features of accessing
and browsing images and XML-marked full texts, a METAe search engine
and web application will be developed as well.
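The kind of structured query that XML-marked full text makes possible
can be sketched in a few lines of Python. The markup below (page and
text elements with a type attribute) is a made-up example, not the
METAe schema, and the linear scan stands in for whatever indexing a
real search engine would use.

```python
import xml.etree.ElementTree as ET

# Invented sample markup; the actual METAe document structure may differ.
SAMPLE = """
<document>
  <page n="1" type="table_of_contents"><text>Chapter I</text></page>
  <page n="3" type="body"><text>The printing press changed Europe.</text></page>
  <page n="4" type="body"><text>Libraries preserve printed books.</text></page>
</document>
"""


def search(root, term, page_type=None):
    """Return page numbers whose text contains term, optionally by page type."""
    hits = []
    for page in root.iter("page"):
        # Skip pages of the wrong logical type when a filter is given.
        if page_type is not None and page.get("type") != page_type:
            continue
        text = page.findtext("text", default="")
        if term.lower() in text.lower():
            hits.append(page.get("n"))
    return hits


root = ET.fromstring(SAMPLE)
print(search(root, "print", page_type="body"))
```

Restricting a query by logical page type, as the optional argument does
here, is something plain full-text search over untagged OCR output
cannot offer.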
============================================================
Simon Tanner
Senior Digitisation Consultant (HEDS)
Higher Education Digitisation Service
University of Hertfordshire
Phone: +44 (0) 1707 286078
Fax: +44 (0) 1707 286079
Web: http://heds.herts.ac.uk
METAe Project: http://meta-e.uibk.ac.at/
******************************************************************
Sun Microsystems, Inc. has published the second edition of its
popular "Digital Library Toolkit", a valuable resource for anyone
planning a digital collection. To download a free copy, go to:
http://www.sun.com/products-n-solutions/edu/libraries/digitaltoolkit.html
******************************************************************
==============================================================
NINCH-Announce is an announcement listserv, produced by the National
Initiative for a Networked Cultural Heritage (NINCH). The subjects of
announcements are not the projects of NINCH, unless otherwise noted;
neither does NINCH necessarily endorse the subjects of announcements.
We attempt to credit all re-distributed news and announcements and
appreciate reciprocal credit.
For questions, comments or requests to un-subscribe, contact the editor:
<mailto:david@ninch.org>
==============================================================
See and search back issues of NINCH-ANNOUNCE at
<http://www.cni.org/Hforums/ninch-announce/>.
==============================================================