CNI: Coalition for Networked Information


Blockchain Can Not Be Used To Verify Replayed Archived Web Pages


December 7, 2018

Michael L. Nelson
Professor of Computer Science
Old Dominion University

As the number of public web archives grows, so does our interest in verifying the integrity of archived web pages replayed from those archives. When two web archives disagree in their replay of the same web page, it is unclear how to resolve the discrepancy. Adapting Segal’s law to web archives: “The person with an archive knows what the page looked like. The person with two archives is never sure.” At first glance, a distributed public ledger such as blockchain would seem like a good way to detect damage to, or tampering with, archived web pages: third parties could replay the pages and store their cryptographic hashes and time stamps in the blockchain. However, by continuously replaying over 17,000 web pages sampled from 20 different public web archives over the course of one year, we found that approximately 75% of the replayed web pages underwent some kind of change that would cause them to hash to a different value. Some changes are significant, affecting the semantics of the page itself, but most would not be noticed by regular users. Nonetheless, if blockchain or other hash-based techniques were used to detect tampering, the number of false positives generated by the normal operation of web archives would make detecting actual tampering almost impossible. We review the different kinds of changes with examples drawn from each of the 20 public web archives.
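The false-positive problem can be illustrated with a minimal sketch. It assumes, purely for illustration, that an archive injects a replay-time comment into an otherwise unchanged page; the page content, the comment format, and the helper names `replay_hash` and `same_fixity` are hypothetical and are not taken from the study or from any particular archive.

```python
import hashlib


def replay_hash(html: str) -> str:
    """Return the SHA-256 hex digest of a replayed page's bytes."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def same_fixity(replay_a: str, replay_b: str) -> bool:
    """True only if two replays are byte-for-byte identical."""
    return replay_hash(replay_a) == replay_hash(replay_b)


# Hypothetical archived content that never changes.
ARCHIVED_BODY = "<html><body><h1>Example headline</h1></body></html>"

# Hypothetical replays of the same capture at two different times.
# The only difference is a comment added at replay time (an assumption
# made for this sketch, invisible to a regular user).
replay_january = "<!-- replayed 2018-01-15 -->" + ARCHIVED_BODY
replay_june = "<!-- replayed 2018-06-15 -->" + ARCHIVED_BODY

print(same_fixity(replay_january, replay_january))  # True
print(same_fixity(replay_january, replay_june))     # False
```

A hash recorded in a ledger for the January replay would no longer match the June replay, even though the archived content itself is untouched. A hash comparison alone cannot tell this harmless replay-time difference apart from actual tampering.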

Presentation


Filed Under: CNI Fall 2018 Project Briefing, Emerging Technologies, Information Access & Retrieval, Project Briefing Pages
Tagged With: cni2018fall, Project Briefings & Plenary Sessions

Last updated:  Tuesday, October 29th, 2019

 
