CNI: Coalition for Networked Information

Can We Afford to Preserve Large Databases?


April 14, 2007

David S.H. Rosenthal
Chief Scientist, LOCKSS Program
Stanford University
Victoria Reich
Director, LOCKSS Program
Stanford University

A single replica of a large database, such as the petabyte-scale protein database, may cost millions of dollars. Minimizing the number of replicas needed to assure adequate preservation therefore becomes the dominant design goal. As preserving large scientific datasets becomes a focus of the National Science Foundation’s cyberinfrastructure program, how well prepared are we to make rational investment decisions about systems in this area?

Drawing on research by the LOCKSS research team and others, this session surveys the state of engineering knowledge and points out the gaps that digital preservation research needs to fill. These gaps include the need for better specification and characterization of media performance (recent papers show that everything you know about disk reliability is wrong), better models of the threats (the most frequent cause of data loss at large sites is operator error), better models of fault tolerance (recent papers show that both RAID and Byzantine Fault Tolerance are inappropriate models), and better ways of formulating the relationship between a preservation service and its customers (disclaiming all responsibility for the preserved data is not a suitable service level agreement).
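The trade-off between replica count and cost can be illustrated with a minimal sketch. It is not taken from the presentation: the function names, the per-replica loss probability, and the cost figure are all hypothetical, and the model assumes replicas fail independently. That assumption is exactly the kind of simplification the abstract questions, since correlated threats such as operator error can damage several replicas at once, so the result is best read as an optimistic lower bound on the number of replicas.

```python
def replicas_needed(p_loss_per_replica: float, target_loss: float) -> int:
    """Smallest replica count n such that p**n <= target_loss.

    ASSUMPTION (hypothetical model): each replica is lost independently
    with the same probability over the period of interest.
    """
    if not (0.0 < p_loss_per_replica < 1.0):
        raise ValueError("p_loss_per_replica must be strictly between 0 and 1")
    if not (0.0 < target_loss < 1.0):
        raise ValueError("target_loss must be strictly between 0 and 1")
    n = 1
    # Data is lost only if every replica is lost, which has probability p**n
    # under the independence assumption.
    while p_loss_per_replica ** n > target_loss:
        n += 1
    return n


def total_cost(n_replicas: int, cost_per_replica: float) -> float:
    """Total cost when every replica costs the same (hypothetical flat pricing)."""
    return n_replicas * cost_per_replica


if __name__ == "__main__":
    # Illustrative numbers only; none of them come from the briefing.
    n = replicas_needed(p_loss_per_replica=0.1, target_loss=0.005)
    print(n, total_cost(n, cost_per_replica=2_000_000))
```

Under this model each additional replica buys a multiplicative reduction in loss probability at a fixed additive cost, which is why the per-replica loss estimate matters so much when a single replica costs millions of dollars.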

PowerPoint Presentation (PDF)


 


Filed Under: CNI Spring 2007 Project Briefings
Tagged With: CNI2007spring, Project Briefings & Plenary Sessions

Last updated:  Friday, March 1st, 2013

 
