CNI: Coalition for Networked Information

A Research Agenda for Historical and Multilingual OCR

April 5, 2019

Ryan Cordell
Associate Professor of English
Northeastern University

This talk will outline the primary findings and recommendations of a report written for The Andrew W. Mellon Foundation that seeks to describe the current state of optical character recognition (OCR) for large-scale humanities collections and suggest the most fruitful avenues for future research in this domain. The report surveys the current state of OCR for historical documents and recommends concrete steps that researchers, implementers, and funders can take to improve the quality and use of OCR collections over the next five to ten years. We find, for instance, that advances in artificial intelligence for image recognition, natural language processing, and machine learning will drive significant progress in this area. More importantly, however, we describe how sharing goals, techniques, and data among researchers in computer science, in book and manuscript studies, and in library and information sciences will open up exciting new problems and allow a broad community, including cohorts who rarely collaborate, to allocate resources and measure progress in improving OCR for historical typography and multilingual documents. This presentation will briefly outline the report’s findings about the current state of the art for humanistic OCR, but will devote most of its time to detailing the report’s nine primary recommendations for future, collaborative OCR research.

https://ocr.northeastern.edu

Filed Under: Assessment, CNI Spring 2019 Project Briefing, Digital Humanities, Emerging Technologies, Project Briefing Pages
Tagged With: cni2019spring, Project Briefings & Plenary Sessions, Videos

Last updated:  Tuesday, October 29th, 2019

 
