Head of Digital Preservation
Associate University Librarian for Discovery and Access
Copyright Advisor and Program Manager
The valuable, and often unique, collections stewarded by research libraries and other memory institutions are critical to successful pedagogy, public policy, economic innovation, and persistence of the cultural record. Widespread access to this material, however, is often restricted due to uncertainty regarding its intellectual property rights status. For example, of the Harvard Library’s 25 million bibliographic items, 7 million have been digitized but only 93,000 are shared through the Digital Public Library of America, even though 3 million items were published or created over 130 years ago and are likely in the public domain. Broader release is limited by the constraints of manual rights review. This presentation reports on Harvard’s investigation of automated determination of rights status at scale. The process reviews pertinent catalog metadata (publication status and date, individual or corporate authorship, creator birth and death dates, etc.) to assign each item a standardized rights statement from rightsstatements.org. Assessment criteria were developed in collaboration with legal counsel and external consultants and are parameterized so that they can be tailored to an institution’s risk tolerance. An initial pilot project examining 60,000 representative items has validated the ability of the algorithmic determination to match that of human review. Confident assignment of collection-wide rights status, particularly public domain status, holds the promise of unlocking significant new bodies of stewarded material for free use and innovative reuse.
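To make the idea of parameterized assessment criteria concrete, the following is a minimal sketch of how a metadata-driven rule might be expressed. It is not Harvard's actual rule set, which the abstract does not describe: the `CatalogRecord` fields, the `RiskProfile` parameters, and the `assign_rights_statement` function are all hypothetical names, and the thresholds are illustrative placeholders (the 130-year default simply echoes the figure quoted above). Only the two rightsstatements.org URIs are taken from the published vocabulary.

```python
from dataclasses import dataclass
from typing import Optional

# URIs from the rightsstatements.org vocabulary.
NOC_US = "http://rightsstatements.org/vocab/NoC-US/1.0/"        # No Copyright - United States
UNDETERMINED = "http://rightsstatements.org/vocab/UND/1.0/"     # Copyright Undetermined


@dataclass
class CatalogRecord:
    """Hypothetical subset of catalog metadata used by the rule."""
    publication_year: Optional[int] = None
    corporate_author: bool = False
    creator_death_year: Optional[int] = None


@dataclass
class RiskProfile:
    """Hypothetical tunable thresholds; larger values are more cautious."""
    min_age_years: int = 130          # assumed default, echoing the abstract's figure
    min_years_since_death: int = 100  # assumed placeholder, not a legal standard


def assign_rights_statement(record: CatalogRecord,
                            profile: RiskProfile,
                            current_year: int) -> str:
    """Return a rights statement URI for one record under the given profile."""
    # Without a date there is nothing to reason from.
    if record.publication_year is None:
        return UNDETERMINED

    # Rule 1: the item is old enough for this institution's risk tolerance.
    if current_year - record.publication_year >= profile.min_age_years:
        return NOC_US

    # Rule 2: an individual creator died long enough ago.
    if (not record.corporate_author
            and record.creator_death_year is not None
            and current_year - record.creator_death_year
                >= profile.min_years_since_death):
        return NOC_US

    # Everything else is left for human review.
    return UNDETERMINED
```

Under these assumptions, the same record can receive different statements depending on the profile: an institution with a lower risk tolerance would raise the thresholds and route more items to manual review, while the rule logic itself stays unchanged.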