CNI: Coalition for Networked Information

Supporting Computational Research on Large Digital Collections

November 16, 2022

Jefferson Bailey
Director, Archiving & Data Services
Internet Archive

Nick Ruest
Associate Librarian, Digital Scholarship Infrastructure Department
York University

Abigail Potter
Senior Innovation Specialist
Library of Congress

Meghan Ferriter
Senior Innovation Specialist
Library of Congress

Each year, more scholars apply computational methods such as data mining, natural language processing, and machine learning (ML) to terabytes and even petabytes of digital library and archive collections, which poses many challenges for the research libraries that support this work. In 2020, Internet Archive Research Services and Archives Unleashed received funding to combine their tools for computational analysis of web and digital archives in support of joint technology development, community building, and selected research projects by sponsored cohort teams. This session will feature programs that are building technologies, resources, and communities to support data-driven research. It will review the beta platform, Archives Research Compute Hub, and discuss working with researchers in the digital humanities, social sciences, and computer science, as well as with industry partners, in support of large-scale digital research methods.

Concurrently, LC Labs is investigating computational research service models and infrastructure requirements for cloud-based access to data packages through Computing Cultural Heritage in the Cloud (CCHC), supported by the Mellon Foundation. Processing and analyzing large digital collections often relies on ML and other automated methods. LC Labs has summarized four years of applied research into the applications of ML in library and archival contexts and developed a proposed framework for analyzing the risks, benefits, and performance of artificial intelligence (AI) and ML with cultural and historic collections.

Large-scale computational research can be transparent, practical, responsible, and coherent when practitioners consider and document the implications of AI and ML methods at the dataset, model, task, system, organizational, or sector level, and develop standards of quality and shared technical frameworks for using AI/ML in libraries, archives, and museums.

https://archivesunleashed.org/arch/
https://webservices.archive.org/pages/arch
https://labs.loc.gov/work/experiments/?st=gallery

Presentation

Filed Under: Artificial Intelligence, CNI Fall 2022 Project Briefings, Cyberinfrastructure, Digital Humanities, Digital Libraries, Emerging Technologies, Project Briefing Pages, Scholarly Communication
Tagged With: cni2022fall, Project Briefings & Plenary Sessions, Videos

Last updated:  Sunday, November 30th, 2025

 
