CNI: Coalition for Networked Information

Text Data Mining (TDM) Research Using Copyrighted and Use-Limited Text Data Sets: Developing an Agenda to Support Scholarly Use

March 13, 2018

Beth Sandore Namachchivaya
University Librarian
University of Waterloo

Digital text corpora and text data mining (TDM) tools are enabling new discoveries through computational analytics. A high percentage of the texts, however, are protected by copyright or subject to license agreements that limit access and use. These restrictions can complicate a researcher’s efforts to access texts and perform computational analysis, as well as to communicate the output and related methods to a broader audience. Increasingly, libraries are acting as intermediaries between content providers and scholars to facilitate access to text datasets. Still, the process of interpreting or obtaining rights to perform computational analysis is arduous, and the results of the research often cannot be documented well enough to support reproducibility in a scholarly climate focused on evidence. The perceived high barrier to entry for TDM can lead to one of two outcomes: a scholar either abandons the research project or moves ahead using unsanctioned approaches, such as screen-scraping, to assemble and mine the corpus. This project briefing provides an update on research funded by the Institute of Museum and Library Services. The funding supports a national forum with key stakeholders, whose aim is to develop a research and implementation agenda for libraries that work with scholars and content providers to enable streamlined access to copyrighted and licensed texts for data mining research. In particular, we focus on the perspectives and the SWOT (Strengths, Weaknesses, Opportunities, and Threats) analyses provided by the National Forum attendees.

Project team:
Bertram Ludäscher, PI, School of Information Sciences, University of Illinois at Urbana-Champaign
Beth Sandore Namachchivaya, co-PI, University Library, University of Waterloo
Megan Senseney, co-PI, School of Information Sciences, University of Illinois at Urbana-Champaign
Eleanor Dickson, University Library, University of Illinois at Urbana-Champaign

http://publish.illinois.edu/limitedaccess-tdm/about-the-project

Presentation

Filed Under: CNI Spring 2018 Membership Meeting Project Briefings, Information Access & Retrieval, Intellectual Property, Project Briefing Pages, User Services
Tagged With: cni2018spring, Project Briefings & Plenary Sessions, Videos

Last updated:  Sunday, November 30th, 2025

 
