CNI: Coalition for Networked Information


Downloading Millions of Files from Internet Archive: Two Approaches


March 25, 2026

Karen Knox
Head of Library Technology Services
Washington University

Leonard Augsburger
Newman Numismatic Portal Project Coordinator
Washington University

Eric Weig
Web Development Librarian
University of Kentucky 

Mitch Sumner
Head of Digital Preservation, Processing, and Reformatting
Washington University 

Washington University (WashU) Libraries and the University of Kentucky Libraries recently finished downloading millions of previously uploaded digitized files from the Internet Archive. The downloads were part of larger migration projects at both institutions: the 76,000 assets of WashU’s Newman Numismatic Portal are migrating to AM Quartex, and the 86,000 assets of the University of Kentucky Libraries’ Kentucky Digital Newspaper Program were recently copied to a new local online discovery interface powered by Ex Libris Primo. This briefing will give background on both migration projects and describe the tools used: the Internet Archive API, the Internet Archive command line interface, and ChatGPT, which assisted with writing the Python code for the downloads. Staff from both libraries will explain how they independently developed their workflows and discuss the goals that motivated the work, including preservation and access.
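The abstract does not include the presenters' scripts. As a rough illustration of the kind of workflow it describes, the following is a minimal Python sketch that reads one item's record from the Internet Archive metadata endpoint and plans which files still need to be downloaded. The function names, the choice to keep only files whose `source` field is `original`, and the skip-if-present resume logic are assumptions made for this example, not the workflows developed at either institution.

```python
import json
import shutil
import urllib.parse
import urllib.request
from pathlib import Path

# Public Internet Archive URL patterns for item metadata and file downloads.
METADATA_URL = "https://archive.org/metadata/{identifier}"
DOWNLOAD_URL = "https://archive.org/download/{identifier}/{filename}"


def fetch_item_metadata(identifier):
    """Fetch the JSON metadata record for one Internet Archive item."""
    url = METADATA_URL.format(identifier=urllib.parse.quote(identifier))
    with urllib.request.urlopen(url, timeout=60) as response:
        return json.load(response)


def plan_downloads(identifier, metadata, dest_root, originals_only=True):
    """Return (url, local_path) pairs for files not yet present on disk.

    Assumption for this sketch: only files marked source == "original"
    are wanted, and a file that already exists locally is skipped so an
    interrupted run can be resumed.
    """
    plan = []
    for entry in metadata.get("files", []):
        name = entry["name"]
        if originals_only and entry.get("source") != "original":
            continue
        local_path = Path(dest_root) / identifier / name
        if local_path.exists():
            continue
        url = DOWNLOAD_URL.format(
            identifier=urllib.parse.quote(identifier),
            filename=urllib.parse.quote(name),
        )
        plan.append((url, local_path))
    return plan


def download(url, local_path):
    """Stream one file to disk, creating parent directories as needed."""
    local_path.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=300) as response:
        with open(local_path, "wb") as out:
            shutil.copyfileobj(response, out)
```

For a single item, one would call `fetch_item_metadata`, pass the result to `plan_downloads`, and loop over the returned pairs with `download`. At the scale of millions of files, a real workflow would also need retries, rate limiting, and checksum verification, which this sketch omits.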

https://kdnp.uky.edu
https://journal.code4lib.org/articles/18510
https://nnp.wustl.edu/


Filed Under: CNI Spring 2026 Project Briefing, Digital Libraries, Digital Preservation, Information Access & Retrieval, Metadata, Personal Archives, Project Briefing Pages, Repositories, Scholarly Communication
Tagged With: cni2026spr, Project Briefings & Plenary Sessions

Last updated:  Wednesday, March 25th, 2026

 
