Adding Value to Digitization with GIS

Marianne Stowell Bracke
Agricultural Sciences Information Specialist, Associate Professor of Library Science
Purdue University
Christopher C. Miller
Geographic Information Systems (GIS) Specialist, Assistant Professor of Library Science
Purdue University

Digitization projects that scan and abandon (that simply generate static, digital copies of analog pages and then move on) might be missing out on some of the more relevant capabilities of the Semantic Web, with its creeping XML veins and mashed-up interfaces. Purdue University Libraries is resurrecting a 1906 Soil Survey and mashing it with itself in order to add value, access, and interaction beyond the more traditional scan/describe/store model of collection recovery. It is an attempt to leverage the evolving technologies of librarianship (description, yes, but also text markup, web application-building, and geographic information systems [GIS]) toward the benefit of our users’ maturing needs and expectations. Soil surveys document soil types and locations via prose and maps, typically published as a single document. Modern soil surveys are born digital and used largely in electronic contexts (including GIS), but there are decades of rich comparative data being left to atrophy in the undigitized copies of aging paper surveys. This project digitizes both components of the 1906 survey, the text and the map, and extracts both into usable, modern data formats. In addition to OCR’d, fully indexed, fully searchable full text, the soil zone data from the map will also be extracted into a usable, queryable, studyable GIS dataset. Search and query will be available for each component of the original document and each will be able to link and query the other, allowing the user to jump seamlessly between text and map and back based on common semantic elements. The session will update the status of the project, demonstrate a working draft of the map and document interface, and discuss the more macro implications of librarians recognizing new expectations for usability by applying new technologies to their collections.

Presentation (PDF)

 

Building an Online, Cross-Disciplinary Community-based Research (CBR) Learning Community: Initial Observations

Deanna Cooke
Director of Research, Center for Social Justice Research, Teaching and Service
Georgetown University
Joan Cheverie
Head, Digital Library Services
Georgetown University

At Georgetown University many students are engaged in collaborative research with community organizations that support social change. However, there is little interaction between students and faculty from different disciplines, and these researchers rarely have opportunities to document and share their work. Through support from the Provost’s Undergraduate Learning Initiative grant that supports teaching and learning, the Center for Social Justice and the University Library are collaborating to develop an online CBR community.

The online CBR learning community seeks to augment the community-based research being conducted by allowing students to do the following:

  • Share research strategies and support each other’s projects. For example, psychology students can provide interviewing skills to chemists who want to assess residents’ attitudes about their research project on the polluted Anacostia River in Washington, DC.
  • Share resources across disciplines. For example, students who work with high school biology classes can learn about the housing community in which their students live from a sociology student working on affordable housing research.
  • Publicize their work through online research posters that give details about both the challenges and successes they face in their work.
  • Publish findings through the development of an online, peer reviewed CBR journal.

In this session, the principal investigators will describe the components of the online CBR learning community, present the model they have developed to illustrate their vision for integrating scholarship, collaborative learning, and publication into the undergraduate research experience, and share observations gained from the initial cohort.

Handout (MS Word)


 

Business Transformation: Building the Museum of the 21st Century

Bonnie Szirtes
Business Architect, Research and Business Intelligence
Canadian Heritage Information Network (CHIN)

Canadian museums, more than ever before, are responding to the challenges of adapting to an ever-changing environment to remain relevant to their communities, stakeholders and funders. They are responsible for engaging new audiences, dealing with repatriation issues, looking for new funding opportunities, competing with other venues for visitors and visitor dollars, and creating experiences for these visitors. New and emerging technology opportunities can help museums transform their business practices to adapt to this new environment. This session will examine what business transformation is and why it is important to museums. The results from two research studies commissioned by CHIN will be presented: 1) Transformational Technology Opportunities in the Next Decade and 2) Competitive Environment Scan.

 

Can We Afford to Preserve Large Databases?

David S.H. Rosenthal
Chief Scientist, LOCKSS Program
Stanford University
Victoria Reich
Director, LOCKSS Program
Stanford University

A single replica of a large database, such as the petabyte-scale protein database, may cost millions of dollars. Minimizing the number of replicas needed to assure adequate preservation therefore becomes the dominant design goal. As preserving large scientific datasets becomes a focus of the National Science Foundation’s cyber-infrastructure program, how well prepared are we to make rational investment decisions about systems in this area?

Drawing on research by the LOCKSS research team and others, this session surveys the state of engineering knowledge and points out the gaps that need to be filled by digital preservation research. These gaps include needs for better specification and characterization of media performance (recent papers show that everything you know about disk reliability is wrong), better models of the threats (the most frequent cause of data loss at large sites is operator error), better models of fault tolerance (recent papers show that both RAID and Byzantine Fault Tolerance are inappropriate models) and better ways of formulating the relationship between a preservation service and its customers (disclaiming all responsibility for the preserved data is not a suitable service level agreement).

PowerPoint Presentation (PDF)


 

Case Western Reserve University Digital Case: Repository of Digital Collections

Ben Bykowski
Senior Interactive Developer
Optiem, LLC
Nicholas Fischio
Development Manager
Case Western Reserve University

Digital Case is an electronic repository that archives, stores, disseminates, and preserves faculty research in digital formats (both “born digital” and materials of historic interest that have been digitized). Organizing and maintaining published materials has long been the domain of research libraries. With the Digital Case project, the Kelvin Smith Library assumes a more active role in the scholarly communication process, providing expertise in the form of a set of services (metadata tagging, authority control, secure environment, preservation over time) for access and distribution of Case’s collective intellectual product.

There are several unique aspects of the Digital Case project:

1. Exploring the potential for combining commercial software with open source software: Digital Case pairs a commercial content management system (Ektron), used as the administrative interface, with the open source Fedora repository software. Since neither Ektron nor Fedora was designed to communicate with the other, it took some creativity to integrate the two systems. Discussing the design process as well as demonstrating the live implementation should spark some creative thinking about building a complete, functioning digital library.

2. Designing a digital library / institutional repository from the customer perspective: Most organizations seem to start with what they are given (e.g., Fedora or DSpace) and attempt to turn that into a complete system. With the help of a local professional design firm (Optiem), the Case Western team started the process by asking users how they would envision a digital library. Because it started with no limits or preconceived notions, the team feels Digital Case breaks new ground, particularly since institutional repositories are technical in nature and technical staff typically drive the process. This project approaches the process from a different angle.

http://library.case.edu/digitalcase

Handout (PDF)

PowerPoint Presentation

 

Considering Community and Open Source: Decision Frameworks for Selecting Software

Lois Brooks
Director, Academic Computing
Stanford University
Terry Ryan
Associate University Librarian for the UCLA Electronic Library
University of California, Los Angeles

Community and open source software is sparking interest with promises of new choices for enterprise applications and a higher education-focused approach, while simultaneously raising questions for administrators about when and whether to adopt or devote resources to software development projects. Case studies of community and open source decisions by UCLA and Stanford demonstrate approaches for considering this emerging area of technology development, including product and architectural evaluations, campus engagement, understanding institutional priorities and campus readiness, interoperability with IT infrastructure and library systems, sustainability, and fit with the IT culture and environment.

http://www.oit.ucla.edu/ccle/default.htm
http://coursework-pilot.stanford.edu

Handout (MS Word)

PowerPoint Presentation

Crossing Boundaries: Exploring Ways to Share Content Across Educational Institutions

Gretchen Wagner
General Counsel & Secretary
ARTstor
Maureen Whalen
Associate General Counsel
J. Paul Getty Trust

Social software sites like Flickr and YouTube demonstrate the significant appetite among the community at large to engage in collective efforts, as well as the transformative potential of the web in terms of linking people and ideas. Flickr alone has over 1 billion photos, with an estimated 11,547 images “served” per second on busy days.

Yet despite these broader trends toward sharing on the open Internet, educational institutions often are hesitant to engage in online sharing of content because of fear and uncertainty about infringing third-party copyrights. Frequently this means that educational institutions try to build their own resources, replicating efforts undertaken at other institutions and creating redundant content, rather than sharing that content with other educational users. This has also meant that some unique materials that would be of value to the broader educational community are inaccessible because of the uncertainties surrounding copyright.

Through this project briefing, we will explore how the educational community might take a more collaborative approach to sharing content (and in particular visual images) for teaching and study. The session will include a discussion of how the broader copyright disputes in the commercial context are creating norms and laws that are being applied to the educational community, and how the laws – and the educational community – are failing to distinguish between teaching and study, on the one hand, and commercial uses, on the other. The session will also include a discussion of the current unwillingness among educational institutions to rely on the copyright doctrine of fair use in sharing their content with other educational institutions, and how some of those concerns might be addressed collectively. The final part of the project briefing will be a collaborative discussion with attendees exploring specific next steps.

Handout (MS Word)

 

DLF: Architectures of Collaboration

Peter Brantley
Executive Director
Digital Library Federation

The collective expertise of digital libraries in making available the diverse literatures of science and artistic expression, in concert with the increasing sophistication of commercial partners and the development of distributed, interactive forms of publishing, requires libraries to chart the engineering of new architectures for teaching, learning, and research. Digital libraries must work to forge the new collaborations required to enable and build these services.

http://www.diglib.org

 

DSpace’s Next Generation

John Mark Ockerbloom
Digital Library Architect and Planner
University of Pennsylvania
Ann J. Wolpert
Director of Libraries
Massachusetts Institute of Technology

Since its initial release five years ago, hundreds of research and educational institutions have adopted the DSpace open source repository software to collect, provide, and preserve their digital content. In order to meet the evolving needs of this growing community, DSpace’s architecture and governance need to adapt accordingly. In this session, we review the recently approved revised technical architecture for DSpace that will support a wide variety of uses and extensions, as well as an improved data model, while still supporting DSpace’s well-established “out of the box” functionality. We will also give an update on DSpace’s open source software community governance plan, including a separate 501(c)(3) non-profit corporation, to support the implementation of this architecture and help sustain DSpace adoption, use, and development in the community.

http://www.dspace.org

PowerPoint Presentation Univ of Penn

PowerPoint Presentation MIT

 

Edition Production & Presentation Technology (EPPT)

Kevin Kiernan
Emeritus Professor of English
University of Kentucky
Ionut Emil Iacob
Edition Production & Presentation Technology
University of Kentucky

Edition Production & Presentation Technology (EPPT) is an integrated set of XML tools designed to help humanities editors prepare image-based electronic editions. EPPT is a free standalone application that editors can install and use on their own individual computers. EPPT makes image-based encoding, the laborious process of linking descriptive markup to material evidence through XML, a relatively easy and error-proof task.  Using automatically generated templates based on the data of each individual project, humanities scholars and their students, who typically have little or no prior knowledge of XML/TEI markup or encoding, can set to work with EPPT with very little training.  Prevalidation techniques alert encoders whenever markup is wrong, missing, or otherwise invalid, so that their markup operates seamlessly even in the presence of multiple or conflicting hierarchies. Following emerging standards (XSLT, XPath, XQuery), EPPT is testing its broad application to external projects that preserve texts in Old English, Middle English, Old French, Old Slovene, ancient Assyrian, Greek and Latin, on parchment, vellum, paper, papyrus, clay and stone.

All that one needs to start an image-based electronic edition using EPPT is a set of relevant images and the corresponding plain text, preferably but not necessarily with a project-specific DTD.  Thus scholars using EPPT are able to create (with permission of the online repositories) new image-based electronic editions using images and related data available through completely independent online archives.  EPPT can also help scholars working on established projects to prepare, collate, and search variant manuscript versions of texts preserved in many manuscripts.  EPPT does not in any way, however, take ownership of independent projects nor use any proprietary format for encoding.  Editors of these projects simply use EPPT to help them accomplish on their own computers highly complex image-based encoding to make it possible to search or display whatever they encode.  The presentation functionality of EPPT will let editors in their work-in-progress dynamically search and display any image details linked by this transparent encoding.  After encoding with EPPT, each project can use its own preferred way of publishing the results.

EPPT’s powerful generic workbench can serve a very wide range of projects.  EPPT is programmed to run on both Mac and PC platforms.  While anyone following our detailed installation guidelines may download, test, and use EPPT for their own image-based encoding, we currently provide free support only for the specified Trial projects on the EPPT website.  More guides and tutorials will be added to this website as time permits.  A preliminary explanation of EPPT’s technology and editing methods is available on the Electronic Boethius project website.

http://www.eppt.org