The United States End-of-Term Web Archive

 

Abbie Grotke
Web Archiving Team Lead
Library of Congress
Kathleen Murray
Post-Doctoral Research Fellow
University of North Texas

In the spring of 2008 an ad-hoc collaboration was formed to build a comprehensive archive of the United States Federal Government Web domain before, during, and immediately after the transition to a new presidency. The Library of Congress, the Internet Archive, the California Digital Library, the University of North Texas and the Government Printing Office collaborated to assemble a comprehensive list of sites, provide a nomination tool to engage federal documents experts in site selection, and distribute the work of harvesting content. This presentation will include discussion of various aspects of the ongoing collaboration, including recent work to provide researchers access to the archive, which consists of over 3000 sites, and plans which are underway for collecting in 2012 and 2013. The archive will be demonstrated at this session. The speakers will also discuss a two-year grant from the Institute of Museum and Library Services (IMLS) funding research into comparing machine clustering of Web pages to classification by subject matter experts.

http://eotarchive.cdlib.org/index.html

 Presentation (PDF)

Oral History, METS and Fedora: Building a Standards-Compliant Audio Preservation Infrastructure

Janet Gertz
Director, Preservation and Digital Conversion
Columbia University

Stephen Paul Davis
Director, Libraries Digital Program
Columbia University

From 2008 to 2010 Columbia University Libraries preserved 1,200 hours of seriously endangered, high value, analog oral history recordings, in a project generously funded by the Andrew W. Mellon Foundation.  Challenges in the project included:

  • Working with older reel-to-reel and cassette recordings that were not well-inventoried or preserved
  • Reassembling longitudinal, multipart, not-necessarily-contiguous audio content
  • Working with an outside audio preservation vendor to develop effective workflows and standards-compliant metadata (including METS, MODS, and AES-X098B-draft)
  • Ingesting the digital files and metadata into our Fedora repository for asset management, preservation and access

The successful outcomes of this project have provided a standard, replicable approach to digitizing historic audio collections that other institutions can also use.

https://www1.columbia.edu/sec/cu/libraries/bts/mellon_audio/index.html

Presentation

High Performance Computing Conference, Rensselaer Polytechnic, Oct 26-28

There’s a very interesting multidisciplinary conference on High Performance Computing being held at RPI on October 26-28 that may be of interest to CNI-announce subscribers in the New York region. For more details see

http://www.rpi.edu/hpcw/index.html

Clifford Lynch
Director, CNI

2001-2002 Program Plan

CNI PROGRAM ACTIVITIES, 2001-2002
Overview
Background and History

The Coalition was founded in 1990 by the Association of Research Libraries (ARL), CAUSE and Educom. ARL represents the research libraries of North America. CAUSE and Educom were organizations concerned with the use of information technology in higher education. In 1998, CAUSE and Educom merged to create the EDUCAUSE organization, which has broad membership from the higher education community and their technology partners.

In establishing CNI, these sponsor organizations recognized the need to broaden the community’s thinking beyond issues of network connectivity and bandwidth to encompass networked information content and applications. Reaping the benefits of the Internet for scholarship, research, and education demands new partnerships, new institutional roles, and new technologies and infrastructure. The Coalition seeks to further these collaborations, to explore these new roles, and to catalyze the development and deployment of the necessary technology base.

Paul Evan Peters was the founding Executive Director of the Coalition, and served until his untimely death in 1996. Joan Lippincott, now CNI’s Associate Director, served as Interim Executive Director until the appointment of Clifford Lynch as the Executive Director in July 1997.

The Coalition is supported by a task force of about 200 dues-paying member institutions representing higher education, publishing, networking and telecommunications, information technology, and libraries and library organizations. Membership in the Coalition’s Task Force is open to all organizations — both for-profit and not-for-profit — that share CNI’s commitment to furthering the development of networked information.

The Task Force will meet twice in 2001-2002: once in San Antonio, Texas, on November 29-30, 2001, and again in Washington, DC, on April 15-16, 2002 in conjunction with the EDUCAUSE Net 2002 meeting.

The Coalition’s program is guided by a steering committee chaired by Richard West of the California State University system. As sponsor organizations, ARL and EDUCAUSE each appoint three representatives to the steering committee drawn from their member leadership; the steering committee is supplemented by “at-large” representatives providing additional perspectives.


Program Themes

The work of the Coalition is structured around three central themes that we believe are the essential foundations of the vision of advancing scholarship and intellectual productivity:

 

  • Developing and Managing Networked Information Content. A network that will play an integral role in scholarly discourse and productivity must be rich with content and information resources. The Coalition seeks to mobilize and bring together the many diverse communities that create and manage content. It works with these communities to develop methods of creating, organizing, evaluating, managing and preserving networked information resources. The Coalition also furthers the development of economic, policy, social, and legal frameworks that sustain the creation and management of networked information and facilitate its access.
  • Transforming Organizations, Professions, and Individuals. The use of networked information will transform institutions, professions, and the practices of learning and scholarship. For academic institutions, success in the new environment will require an unprecedented degree of collaboration among libraries, information technology groups, faculty, instructional technologists, museums, university presses, and other units; it will call for new alliances and partnerships with publishers, information technology and network service providers, scholarly societies, government, and other sectors. Organizations will need to develop and share new strategies, policies and best practices. Of equal importance is the need to assess and measure the impacts of the new environment on institutions and their activities as the transformation progresses. Professions will need to develop new competencies and enter into new dialogs that cross traditional disciplinary boundaries. The Coalition seeks to facilitate these collaborations and dialogs, and to help professions and institutions to work together both in program strategy formulation and impact assessment.
  • Building Technology, Standards, and Infrastructure. The networked information environment relies extensively on the development and deployment of standards and infrastructure components in order to enable the discovery, use, and management of networked information. The ability to use collections of resources in a unified, consistent fashion is essential: This requires a continuing focus on interoperability of services. At the same time, promising new technologies are constantly appearing that need to be explored, assessed and tested, and sometimes adapted to the needs of the CNI community. No one institution acting alone can build the needed infrastructure, or explore the full range of new technologies as they become available. Accomplishing these goals requires a coordinated community-wide effort; CNI seeks to provide leadership in this undertaking, to offer a context for collaborative experiments and testbeds, and to serve as a focal point for sharing knowledge about new technologies.

The specific program initiatives that further these themes evolve from year to year. The initiatives and strategies planned for 2001-2002 are described below; most build upon and continue earlier efforts already underway. Many of the initiatives seek to make strategic progress relevant to more than one theme. It is important to recognize that the networked information environment is changing very rapidly; CNI is continually adapting its activities in response to new developments and opportunities. Indeed, the Coalition believes agility is essential in the current environment and invites a continuous dialog with the members of the Task Force on the need for additional program initiatives. Because of this, the 2001-2002 program plan should be viewed as a snapshot of our thinking about priorities and opportunities as of November 2001 that will inevitably develop further during the coming year.

Advocacy and Consultative Activities

In addition to specific initiatives to address these overarching themes, the Coalition actively conducts an ongoing program of collaboration and advocacy to advance the development of networked information and its role in transforming organizations and scholarly activities. This is accomplished through:

 

  • Facilitation of print-based and network publications.
  • Participation in various conferences, meetings, workshops and committees on an institutional, regional, national and international basis.
  • Contributions to standards efforts, and collaboration with key funding agencies such as the National Science Foundation, the Institute of Museum and Library Services, the National Endowment for the Humanities, the Department of Education, and the Andrew W. Mellon Foundation.
  • Participation in organizations such as the World Wide Web Consortium and the Internet Society.

Of particular note in this area are our contributions to the Library of Congress’s efforts to map out a National Digital Preservation Program, and to various studies and programs conducted by the US National Research Council. On an international level, we collaborate with other national organizations concerned with networked information, such as the UK Office of Library Networking (UKOLN) and the Joint Information Systems Committee (JISC) in the UK, DINI in Germany, and the newly formed Swedish Networked Information Association.

As well as contributing to the programs of our sponsor organizations, the Association of Research Libraries and EDUCAUSE, we also support, contribute to, and collaborate closely with other organizations that share in specific aspects of our programmatic interests and priorities as a strategic part of our own program work. These include:

 

  • The National Initiative for a Networked Cultural Heritage (NINCH). This broad coalition of arts, humanities and social science groups was founded by CNI, the American Council of Learned Societies (ACLS) and the Getty Information Institute in 1996. CNI is represented on its Board. NINCH initiatives of particular relevance include its Building Blocks conference program and the development of its guide to good practice for digitization of cultural heritage materials.
  • The University Corporation for Advanced Internet Development (UCAID). This organization manages the Internet2 initiative to promote advanced networking and applications within the higher education community. CNI is represented on the Internet2 Applications Strategy Council and works with UCAID on numerous interests, including video and multimedia applications and standards, and high-bandwidth content-intensive applications.
  • The Computer Interchange of Museum Information (CIMI). This project is focused on standards, pilot projects, and research to support network-based access and exchange of museum and cultural heritage information. CNI is a CIMI member and is represented on CIMI’s executive committee.
  • The Council on Library and Information Resources (CLIR). CLIR addresses a broad range of issues involving the scholarly communication system, higher education and libraries. The Digital Library Federation (DLF) is a CLIR program focused on the use of digital library technologies within research libraries. CNI collaborates extensively with CLIR and DLF on issues ranging from digital preservation to metadata.

The Coalition also contributes to the development of the networked information community by hosting electronic discussion groups, such as the CNI-COPYRIGHT forum, and acting as a distribution point for materials via its website and the CNI-ANNOUNCE e-mail list.

Meetings

The Coalition’s twice-annual Task Force meetings, scheduled for November 29-30, 2001, in San Antonio and April 15-16, 2002, in Washington, DC, not only allow CNI to highlight activities related to its program themes and to focus attention on significant new thinking and technology developments, but also provide a major opportunity for the membership to showcase and discuss a wide range of emerging issues and developments in networked information. For member organizations, which are invited to send two delegates (typically a senior information technologist and a librarian), these meetings offer a unique opportunity to remain informed about new developments that may reshape institutional plans, and a forum in which to establish collaborations and dialogs with others sharing common interests.

On June 26-27, 2002, CNI will co-sponsor a conference in Edinburgh, Scotland, in partnership with the UK Joint Information Systems Committee (JISC) and the UK Office of Library Networking (UKOLN) as part of our ongoing collaboration with these programs.

In addition, CNI occasionally convenes invitational or public workshops to advance specific elements of its program plan, and acts as a sponsor or co-sponsor for other meetings relevant to the CNI agenda, such as the EDUCAUSE Net 2002 meeting, to be held in Washington, DC, on April 17-18, 2002, immediately following the spring 2002 CNI Task Force meeting, or the ACM/IEEE Joint Conference on Digital Libraries scheduled for June 14-18, 2002, in Portland, Oregon.


Developing and Managing Networked Information Content
Digital Preservation

Preservation and long-term management of digital information has emerged as a central issue in the shift to network-based scholarly publishing, and more recently as a broad and fundamental social and public policy question for our society. CNI continues to work with ARL and other partner organizations such as the Council on Library and Information Resources (CLIR) and the Digital Library Federation (DLF) in developing economic, business and organizational models for preservation; in exploring technologies to manage the archiving of digital content; and in identifying priorities for preservation action. We are also collaborating with the Library of Congress in its efforts to develop a national digital preservation strategy. During 1999-2000 most of our work focused on strategies for preserving scholarly journals in digital form, which led to a number of pilot projects involving CNI member institutions (many funded through the Andrew W. Mellon Foundation). In the 2001-2002 program year the results of these projects will be available; we will offer reports at both the Fall and Spring Task Force meetings, and help the community to consider how to proceed based on these results. We will also focus strongly on economic and legal issues involved in digital archiving during the coming year.

A second group of activities addresses the management of intellectual and institutional assets within higher education in the digital environment.

Management of Media Assets

In 2000, CNI started a project to try to understand the emerging practices and organizational issues in the management of non-instructional audio and video assets produced by institutions; this includes content that might be captured as part of special events like performances or symposia, or that might be generated through broadcasting activities that now may be moving to the net. In August 2001 we sponsored a workshop on the management of video materials jointly with Internet2, the ViDe project, and SURA. We are now planning follow-on workshops, scheduling sessions at our upcoming Task Force meetings to report on and explore these developments, and preparing a paper on institutional issues and strategies in this area.

Electronic Theses and Dissertations

Theses and dissertations are another key part of the content created by the higher education community; also, because the process of their creation is so integral to the process of higher education, they offer a unique opportunity to train new scholars in the creation of digital documents, and for institutions to formalize their management. Further, these materials represent a significant body of important information that has not historically been readily accessible. CNI is a member of the Networked Digital Library of Theses and Dissertations (NDLTD) program, and serves on the steering committee of this enterprise. The initiative, which is now finding broad international acceptance, seeks to improve graduate education by allowing students to produce electronic theses and dissertations, and to understand issues in publishing while increasing the availability of student research for scholars, and preserving these electronic materials. NDLTD is now maturing as an initiative, and is in the process of mapping its future, which is expected to include several areas of collaboration with CNI, such as a stronger emphasis on standards-related activities.

Learning Support and Management Systems

CNI will prepare a paper on learning support and management systems as information resources that will outline and frame policy issues raised by the large scale deployment of these systems in higher education institutions, with particular focus on records management, intellectual property and scholarly publishing issues. CNI believes that it is important to begin to view the content in these learning management systems as institutional scholarly assets.

Content from the arts, the humanities, and the cultural heritage community represents an important scholarly resource for the networked environment; indeed, making much of this information available in digital form should greatly increase its accessibility and usefulness. CNI has had a long-standing commitment to the development of such resources. While our program in this area relies heavily on collaborations and partnerships as described earlier, two particular program initiatives are highlighted here.

Computing and Humanities

CNI is participating with NINCH, the US National Research Council, and ACLS in a Steering Committee for Computer Science and the Humanities that seeks to promote the application of the information sciences to the understanding of the human record; currently, the work of this committee is focusing on knowledge representation and humanities informatics. The Steering Committee has obtained funding from the Carnegie Corporation for the first in a series of major conferences, to take place in 2002, that will bring together computer scientists and humanists to advance the use of information technologies in humanities research through collaborations between these disciplines.

Strategies for Creating Large Scale Digital Content Resources

Several nations are investing heavily in the creation of digital content in the public interest; of particular note are programs in the United Kingdom and Canada, as well as initiatives taking place within the European Union framework. To date, similar investments in the United States have been extremely modest by comparison. However, proposals such as the “Digital Gift to the Nation” (discussed at the Spring 2001 Task Force meeting) have begun to raise the question of what the priorities for such investment might be in the United States – for example, how to balance the creation of new digital content against the retrospective digitization of existing materials. There is also a great deal to be learned from the experience of other nations in areas ranging from economic models and sustainability to best practices and technical standards. CNI will work with organizations such as the Institute of Museum and Library Services and with our international colleagues to pursue an exploration of these issues.

Metadata

Metadata to describe networked information resources is now recognized as a key component in organizing content to facilitate its discovery and use; it is an essential component in a wide range of other programmatic activities. CNI has been a partner in the OCLC Dublin Core Descriptive Metadata program on a continuing basis and recently helped to sponsor the 9th International Dublin Core Meeting in Tokyo, Japan in late 2001. Working with partners such as the National Information Standards Organization (NISO) and the Council on Library and Information Resources (CLIR), we will also continue our efforts to move work on metadata beyond the descriptive information that supports resource discovery; this includes work on metadata and supporting infrastructure to address authenticity and provenance, to support rights management, and to document the digitization or capture processes for electronic information.


Transforming Organizations, Professions, and Individuals
One of the greatest challenges facing the CNI community is the development of effective strategies for collaboration among librarians, information technologists, instructional technologists and faculty to address new teaching and learning opportunities opened up by the networked digital medium. This calls for new organizational arrangements for service and support delivery. Many of our member organizations are also struggling with questions of how to create or renovate physical spaces to house these activities, and more broadly how to design spaces to support changing practices of scholarship in a digital world.

Collaborative Facilities

In partnership with Dartmouth College, CNI has developed a website featuring plans and related materials for collaborative facilities. A number of institutions are beginning to offer public service points or facilities where library and information technology staff share responsibilities to serve users; other institutions are establishing teaching and learning support centers that bring together instructional technologists, faculty, information technologists, and librarians. Typically, these service points and centers are developed in conjunction with building renovation, expansion, or new building projects.

There is great interest in sharing experiences and plans in this area, and the website hosted at Dartmouth includes planning documents, layouts, programmatic descriptions, and equipment information contributed by higher education institutions. In addition, project briefings at the Fall and Spring Task Force meetings and at the EDUCAUSE annual conference will highlight particular campus facilities and the experiences being gained through their operation.

 

Working Together

A fundamental goal of CNI is to foster dialog and collaboration among information professionals from all disciplinary backgrounds. The Coalition has offered Working Together, a structured workshop experience to help groups of professionals improve their ability to collaborate and build partnerships with colleagues, particularly on projects related to networked information resources and services. Over the years, these workshops have been repeatedly redesigned to focus on different types of inter-disciplinary collaborations.

Supporting the new initiative on collaborations for joint service points and teaching and learning centers, CNI will explore a new focus for Working Together workshops, one that will engage institutional teams involved in developing and delivering online instructional materials in higher education institutions. Such collaborations often involve faculty, students, instructional designers, information technologists, and librarians. CNI worked with instructional teams in its pioneering New Learning Communities program, and we believe that this will provide a valuable base for the development of a new Working Together program.

Transformative Assessment Project

Measuring the impacts and value of networking and networked information has been an important theme for CNI. In 2001-2002 we will also focus this work around joint service points and online teaching and learning. We are offering a Transformative Assessment Program, developed jointly with the EDUCAUSE National Learning Infrastructure Initiative (NLII) and the TLT Group. The program focuses on using assessment to assist in transforming teaching and learning using technology, and it consists of an in-person, team-oriented workshop (scheduled for early 2002), an online learning experience, and an online community of practice. Institutional teams will develop and implement assessment plans for their home institutions.

Electronic Records Management

In 2000-2001, CNI completed a Working Together workshop series designed to address electronic records management issues by promoting institutional projects undertaken by teams of information technologists, records managers, and archivists. As a conclusion to this workshop series, CNI will develop a paper summarizing lessons learned by workshop participants, and highlighting some of the subsequent implementation experiences of the attendee teams. We continue to be concerned with electronic records management issues, and to work with Arizona State University on its Electronic College and University Records (ECURE) program and conference series.


Building Technology, Standards, and Infrastructure
CNI continues to be actively engaged in key areas of standards and infrastructure development. The Coalition is particularly concerned with facilitating the difficult and delicate transition of standards and technologies into operational infrastructure within the CNI community. As well as the major program initiatives described here, CNI is closely tracking a wide range of technology and standards developments in areas as diverse as identifiers, digital books, efforts related to the realization of the semantic web, and recommender systems and personalization.

Architectural Contexts for New Academic Platforms

During the past year, the Association of Research Libraries has provided a focus for renewed interest from the library community in a cluster of ideas variously called “scholar’s portals,” “academic platforms,” or “scholar’s toolkits” to assist information seekers in locating, using, and contributing to the ever-growing diversity of academic and scholarly information resources. As these ideas have been refined, their proponents have come to recognize the limitations of services such as commercial web search engines and of traditional library automation tools like online catalogs and stand-alone abstracting and indexing databases, as well as the need to integrate with the emerging technologies of learning management systems.

The Coalition believes that it is now time to consider architectural and standards frameworks that can facilitate the development of interoperable and complementary prototype systems in this area, and contribute to the development of a vibrant marketplace in such systems as they are created by the private sector, by university-industry collaborations, or by university-based projects. We will sponsor a workshop in collaboration with ARL and other partner organizations to bring together not-for-profit organizations, including groups working on learning management system architectures like the Open Knowledge Initiative and the Instructional Management System effort. An important input to this work will be the excellent architectural and service modeling developed by groups such as JISC and UKOLN in the United Kingdom.

Open Archives Metadata Harvesting Initiative

In 2000 CNI launched a major new initiative in the infrastructure and standards area with its investment (jointly with the Digital Library Federation) in the Open Archives Initiative. The goal of this work, which grew out of a meeting held in Santa Fe in 1999 to federate e-print archives, is to develop the necessary standards and infrastructure to permit repository sites to expose metadata for harvesting and subsequent reuse by upper-layer applications. This can be used to federate e-print archives, publisher web sites, or collections of digital objects created from special collections or museum holdings, for example. A clearinghouse for the project was established at Cornell University under the management of Carl Lagoze, and a steering committee and technical committee have been set up to guide the work. The first release of the revised OAI technical specifications took place in December 2000, with meetings in the US and Europe in early 2001 to review this work. There are now a large number of implementation projects underway, including a group sponsored by the Andrew W. Mellon Foundation in the United States and several European projects. The plan for the remainder of this initiative, which will conclude in late 2002, involves a review and updating of the technical specifications based on the implementation experience gained in 2001, and the distribution of these revised technical specifications.

CNI believes that this effort will not only yield critical infrastructure and standards to support a wide range of networked information applications, but will also stimulate the development of novel applications that build upon the growing body of digital content available to support scholarship.
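
As an illustration of what "exposing metadata for harvesting" can look like in practice, the following is a minimal sketch of the harvester side. It assumes the request parameters and XML namespaces of the OAI-PMH 2.0 specification, which postdates this plan and may differ from the specifications under review at the time. The repository URL, the function names, and the sample response are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# XML namespaces assumed from OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build the ListRecords request URL a harvester would fetch."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Invented, trimmed sample standing in for a repository's response.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample e-print</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url(BASE_URL, from_date="2001-01-01"))
print(extract_titles(SAMPLE_RESPONSE))
```

An upper-layer service such as a federated search index would run requests like this against many repositories and merge the harvested records; the repositories themselves only need to answer the metadata requests.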

Authentication, Authorization and Access Management

Authentication and authorization have emerged as essential infrastructure requirements for network-based access to information, and have become a particularly critical need as institutions enter into site-license arrangements with publishers and other information providers, implement online and distance education initiatives, or form consortia for resource sharing. The Coalition has been pursuing a program to define technology approaches, standards, best practices, and policy and business issues for such an inter-organizational authentication and authorization infrastructure, and to help early adopter Task Force member organizations share implementation experiences and explore interoperability issues.

Working in partnership with Internet2, EDUCAUSE’s Net@EDU, and the Digital Library Federation, we will continue to seek to illuminate many of the planning, operational and budgetary issues involved in implementing public key infrastructure (PKI). The year 2002 may be a watershed for efforts in authentication and access management. One of the key findings of our early work, which was motivated by the need to better manage access to information resources, was that PKI systems within higher education needed to be considered institutional rather than library infrastructure, and thus would represent complex, long-term, organization-wide initiatives. Over the past few years considerable progress has been made in this area, and several of the projects are maturing to the point where it should be possible to launch access management pilots within the coming year. We will be participating in an invitational CREN and Mellon Foundation sponsored workshop to explore readiness for such projects immediately following the Fall Task Force meeting, and also working closely with the Internet2 middleware efforts. Another high priority for CNI in this area is to update our paper on authentication and access management to reflect current developments and provide our community with an accessible summary of the state of the art.

The Future of Search Standards and Architectures

The Z39.50 Information Retrieval standard is currently undergoing its five-year reaffirmation review through the National Information Standards Organization (NISO). While Z39.50 has a well-established user community and plays an important role in the networked information infrastructure, many of its fundamental design assumptions are more than a decade old. There have been a number of other efforts related to search standards development during the past few years, though most have not achieved wide adoption. Recently, we have seen a number of new initiatives related to search standards, including the World Wide Web Consortium XML query language work, “next-generation” Z39.50 experiments, and the Open Archives metadata harvesting initiative. CNI will partner with other interested organizations to host a workshop to look at the longer-term, higher-level issues involved in search architecture, functional requirements and search standards development as a means of focusing community thinking on key ideas that should guide future standards development.

Image Retrieval Benchmark Database

Another infrastructure initiative, launched in late 2000, addresses current problems involved in image retrieval systems for scholarly content. The Council on Library and Information Resources is underwriting this work, and CNI chairs the planning group. The fundamental problem is that there is a wide range of proposed metadata approaches for image content (many of which are very expensive to use), and many prototype systems for retrieving images based on metadata, on content analysis, or on some combination of the two strategies. What seems to be needed is a benchmark database (including metadata) that allows system developers to explore both the retrieval effectiveness and the cost-performance tradeoffs involved in various metadata approaches and system designs. The goal is to design a benchmark database resource that might serve as infrastructure for the communities that develop image databases and retrieval systems in much the same way as the TREC databases have served the text retrieval community.
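As a rough illustration of what "retrieval effectiveness" could mean against such a benchmark, the sketch below computes precision and recall for one query. It assumes the benchmark supplies a set of images judged relevant for that query, as TREC-style collections do; the function name, image IDs, and the two sample runs are hypothetical and are not drawn from the planning group's work.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: list of image IDs returned by a system under test
    relevant:  set of image IDs judged relevant in the benchmark
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    # Guard against empty inputs so the ratios are always defined.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical relevance judgments and system outputs for a single query.
relevant = {"img-01", "img-04", "img-07", "img-09"}
metadata_run = ["img-01", "img-02", "img-04", "img-05"]
content_run = ["img-01", "img-04", "img-07", "img-08", "img-10"]

for name, run in [("metadata", metadata_run), ("content", content_run)]:
    p, r = precision_recall(run, relevant)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Running the same measures over systems built on different metadata schemes, alongside the cost of creating that metadata, is one way the cost-performance tradeoff could be examined.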

During 2000-2001 we convened a workshop to explore design alternatives; we expect that this project will conclude in the spring of 2002 with presentation and discussion of a draft report at the Spring Task Force meeting, and the subsequent distribution of a final report.

German Information Infrastructure for Research Program

Earlier this week I was fortunate to be able to attend a briefing on a series of awards that the German DFG has made to support Information Infrastructure for Research. These are discipline-specific and span the sciences, the social sciences, engineering and the humanities, since the DFG funding scope covers all of these disciplines. This is a very impressive program and I think it will be of interest to many CNI-announce readers, both as an example of a national approach to the challenges of data-intensive scholarship and as a possible source of future collaborations. The announcement, which links to a number of other documents, can be found at

http://www.dfg.de/en/research_funding/announcements_proposals/info_wissenschaft_11_18/index.html

I am hoping that we may be able to have a session exploring some of these activities at the fall CNI member meeting.

Clifford Lynch
Director, CNI

National Hosting and Interoperability: The LuKII Project in Germany

Michael Seadle
PI and Dean
Humboldt University of Berlin

David S. H. Rosenthal
Chief Scientist, LOCKSS
Stanford University

Interoperability gives digital archiving the chance to combine successful features of existing systems. The requirements for digital archiving have evolved differently in different regions. The last decade has seen multiple vendors offering their own proprietary solutions, often using commercial software, and often resulting in a lack of transparency about key technical aspects. Interoperation is easiest to establish between open source systems.

The LuKII (LOCKSS und kopal: Infrastruktur und Interoperabilität) project is building a network in Germany using LOCKSS (Lots of Copies Keep Stuff Safe), which is 100% open source, and koLibRI, the software that contains the open source elements of kopal. The Deutsche Forschungsgemeinschaft (DFG) funded this project in order to:

  1. establish a cost-effective Private LOCKSS Network (PLN) within Germany
  2. implement interoperation with koLibRI (especially its metadata features)
  3. test it using open access materials from German institutional repositories

Shortly after LuKII began, Germany embarked on a study about potential national hosting solutions for scholarly data. A substantial study by Charles Beagrie Ltd helped to focus the choice between LOCKSS (LuKII) and Portico. The DFG explicitly involved the LuKII team in providing expert information about the German PLN that is being built as part of the project. A paper discussing technical questions posed within subcommittee (“Archiving in the Networked World: LOCKSS and National Hosting”) is available in Library Hi Tech. The requirements for a national hosting solution continue to evolve, but important elements are emerging from the discussion, including the ability to host all materials within Germany to simplify possible copyright issues involving Germany’s national licensing scheme. The ability to deliver usable content on the fly if originals become unavailable has also grown in importance, as has the awareness that bitstream maintenance is complex and matters at least as much as migration for the future use of digital content.
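To make "bitstream maintenance" concrete, here is a deliberately simplified sketch of one ingredient: comparing checksums across several copies of the same object and flagging the copy that disagrees with the majority. This is an illustration only and is not the LOCKSS polling protocol, which is considerably more elaborate; the node names, the sample content, and the function names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """SHA-256 hex digest of one stored copy."""
    return hashlib.sha256(content).hexdigest()


def find_damaged(copies: dict) -> list:
    """Return the names of nodes whose copy disagrees with the majority.

    copies maps a (hypothetical) node name to the bytes it holds.
    Raises ValueError when no digest is held by a strict majority.
    """
    digests = {node: digest(data) for node, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        raise ValueError("no majority; cannot decide which copy is intact")
    return sorted(node for node, d in digests.items() if d != winner)


# Three hypothetical nodes; one copy has a simulated corrupted byte.
copies = {
    "node-a": b"article text",
    "node-b": b"article text",
    "node-c": b"article tex\x00",
}
print(find_damaged(copies))
```

A damaged copy identified this way could then be repaired from one of the agreeing nodes, which is why keeping several independent copies matters.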

This presentation will discuss the policy and technical issues involved in this interoperability project and its implications for Germany’s national hosting decision.

https://docs.google.com/viewer?url=http://www.allianzinitiative.de/fileadmin/hosting_studie_e.pdf
http://www.emeraldinsight.com/journals.htm?issn=0737-8831&volume=28&issue=4&articleid=1886769&show=abstract
PROJECT WEBSITE

Handout (MS Word)

EDUCAUSE Live! Webcast 9/1/2010 on Princeton DataSpace for Research Data

Serge Goldstein of Princeton will be speaking on the institution’s DataSpace model for preserving and sharing research data on an EDUCAUSE webcast on September 1, 2010. This will offer a look at another major research university’s evolving strategy for addressing data stewardship and the emerging requirements from research funders. I’ve reproduced the EDUCAUSE announcement below; note that they require registration, and (virtual) space on these sessions is limited and often fills up. These sessions are also archived for replay.

The EDUCAUSE Live! webcasts are a wonderful resource that should be of very broad interest to CNI News readers; normally, we don’t cross-post their announcements unless they are very closely related to CNI’s program as this one is, so you may want to directly subscribe to their announcement list.

Clifford Lynch
Director, CNI
****************

EDUCAUSE Live! Web Seminar

September 1: DataSpace: A Funding and Operational Model for Long-Term Preservation and Sharing of Research Data

Speakers:

Serge Goldstein, Associate CIO and Director of Academic Services, Princeton University

Date: September 1, 2010
Time: 1:00 p.m. ET (12:00 p.m. CT, 11:00 a.m. MT, 10:00 a.m. PT)
International participants: You may wish to visit an external time-conversion website to calculate the start time in your time zone.

Abstract: Princeton University has developed a business model for managing research data on a long-term basis, a capability soon to be required for all NSF grants. Join us to learn about the model, including how you can replicate it on your campus, during this free, hour-long web seminar, “DataSpace: A Funding and Operational Model for Long-Term Preservation and Sharing of Research Data.”

REGISTER NOW-virtual seating is limited.

M-libraries – new publication and next conference

This international group has taken the lead in organizing programs related to use of mobile devices in library and information-oriented applications. The proceedings from last year’s conference are now available – see below – and plans are underway for the 2011 conference in Australia – save the date! I am on the international organizing committee for the conference.
Joan

The book of proceedings from the second International Conference has just been launched (http://www.facetpublishing.co.uk/title.php?id=696-1), and the publishers have given us permission to make the book from the first conference freely available for download. It can be accessed via the m-libraries website: http://www.usq.edu.au/m-libraries

Plans are progressing well for the third conference in Brisbane, 11-13 May 2011, and we hope to add details of keynote speakers and other information to the website in the near future.
http://library.open.ac.uk/mLibraries/2011/index.html

Finding One Piece of the Digital Preservation Puzzle

William Lund
Assistant University Librarian for Information Technology
Brigham Young University

Randy Olsen
University Librarian
Brigham Young University

Chris L. Erickson
Digital Preservation Officer
Brigham Young University

How to preserve locally produced digital content is one of the most perplexing problems facing research libraries today. This session will report on an approach being tested by Brigham Young University in collaboration with Millenniata Inc. Presenters will review Brigham Young University’s digital preservation strategy, explain how Millenniata technology fits into that strategy, and report on early test results of Millenniata M-Writers and M-Arc discs. Millenniata Inc. has developed Write Once, Read Forever technology using discs made of rock-like materials. Once etched, the M-Arc discs are compatible with current DVD readers but are expected to last for hundreds of years.

http://www.millenniata.com/index.html

http://www.lib.byu.edu/

http://lib.byu.edu/sites/scholarsarchive/

http://www.lib.byu.edu/digital/

Handout

Geographic Tools & Digital Collections

Natasha Smith
Head, Digital Publishing Group
Carolina Digital Library and Archives
University of North Carolina, Chapel Hill

Richard Szary
Director, Louis Round Wilson Library
and Associate University Librarian for Special Collections
University of North Carolina, Chapel Hill

Scott Eldredge
Digital Initiatives Program Manager
Brigham Young University

Spatial and Temporal History of North Carolina: Using GIS Technology in Digital Library Collections (Smith & Szary)

Carolina Digital Library and Archives (CDLA) and Documenting the American South (DocSouth) form a digital library laboratory that creates, develops, and maintains online digital collections on the history of the American South, drawn primarily from the outstanding archival holdings of the University of North Carolina (UNC) library. Our recent experimental work with GIS technology helps us better understand how the use of digital technologies changes the way we do research in the humanities. In this historiographic experiment, collaborators endeavor to use digital technologies in a variety of innovative ways to collect, organize, and display data and materials that illuminate the temporal and spatial unfolding of historic events.

A wide array of issues (digitizing and geo-referencing of Sanborn and other historic maps; use of Google’s open-source map API for zooming and hotspot addition; layering and geo-tagging scholarly content) will be presented, based on several completed and in-progress collections built in close collaboration with UNC scholars:

  • “Going to the Show” documents and illuminates the experience of movies and movie-going in North Carolina between the introduction of projected motion pictures (1896) and the end of the silent film era (1930).
  • “North Carolina Maps” is a comprehensive, online collection of historic maps of the Tar Heel State, with options to view selected maps as Historic Overlay Maps, layered directly on top of current road maps or satellite images.
  • “Main Street, Carolina” (private funding and NEH SUG) is a unique map-based digital history resource and framework that will allow a wide range of local organizations to preserve, document, interpret, display, and share the history of their downtowns.
  • “Driving through Time: The Digital Blue Ridge Parkway in North Carolina” will present an innovative visually and spatially based model for illustrating North Carolina’s key role in creating the Parkway, for representing the twentieth-century history of a seventeen-county section of the North Carolina mountains, and for understanding crucial elements of the development of the American National Park system.

http://cdla.unc.edu/index.html

http://docsouth.unc.edu/

http://docsouth.unc.edu/gtts/

http://www.lib.unc.edu/dc/ncmaps/

Process and Application for Geocoding and Presenting Digital Resources in an Academic Environment (Eldredge)

This project, based at Brigham Young University, focuses on processes and an application for locating and viewing digital objects from the institution’s CONTENTdm collections. The application, currently named MappifY, geographically arranges a variety of items from the digital collections including photographs, historical maps, and travel diaries. The photographs and diary pages are accurately pinpointed on Google™ maps and are found by navigating the map, browsing by geographical location or chronological order, or by performing simple keyword searches based on harvested metadata. Historical maps are overlaid on top of their corresponding locations in Google™ maps. This presentation will demonstrate the MappifY application, outline the process for integrating digital objects, review the lessons learned, and highlight a range of philosophical issues arising from the process.
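The kinds of lookup described above (navigating the map, chronological browsing, keyword search on harvested metadata) can be sketched as a single filter over geocoded records. This is a minimal illustration under stated assumptions, not MappifY's implementation: the `Record` fields, the `search` function, the sample records, and the approximate Utah bounding box are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    lat: float   # decimal degrees, north positive
    lon: float   # decimal degrees, east positive
    year: int


def search(records, bbox=None, keyword=None):
    """Filter records by map bounding box and/or keyword, oldest first.

    bbox is (south, west, north, east) in decimal degrees; this simple
    check assumes the box does not cross the 180th meridian.
    """
    results = []
    for rec in records:
        if bbox is not None:
            south, west, north, east = bbox
            if not (south <= rec.lat <= north and west <= rec.lon <= east):
                continue
        if keyword is not None and keyword.lower() not in rec.title.lower():
            continue
        results.append(rec)
    return sorted(results, key=lambda rec: rec.year)


# Hypothetical records with approximate, illustrative coordinates.
records = [
    Record("Photograph of campus", 40.25, -111.65, 1925),
    Record("Travel diary page", 40.76, -111.89, 1869),
    Record("Campus map", 40.25, -111.65, 1910),
    Record("Photograph of courthouse", 35.91, -79.06, 1900),
]
utah_bbox = (37.0, -114.05, 42.0, -109.04)  # rough bounding box, assumed

for rec in search(records, bbox=utah_bbox, keyword="campus"):
    print(rec.year, rec.title)
```

Passing only `bbox` corresponds to panning the map, and passing neither argument gives a purely chronological listing.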

http://lib.byu.edu/DigitalMaps/

http://lib.byu.edu/digital/

Handout