Report on Access to Born-Digital Archival Collections

I wanted to share this announcement of an interesting preliminary report from a broad-based study on access practices for born-digital collections in cultural memory organizations, which I think will be of interest to CNI-announce readers.

Clifford Lynch

Director, CNI


Dear colleagues,

For the past year, a research team has been working on a project to map the landscape of born-digital access. The team surveyed over 200 cultural heritage institutions regarding their access policies and procedures.

The team is preparing to share initial findings at a session at the Society of American Archivists Annual Meeting, and we thought that you might be interested, too. The document outlining our research is available here:

We welcome any feedback from your membership, and many thanks to those of you who participated in the survey. If you’d like to follow SAA and the session on Twitter, please keep an eye on the hashtags #saa15 #s110 next Thursday!



Wendy Hagenmaier

Digital Collections Archivist

Georgia Tech

Video: Challenges Presented by Institutional IDs

Institutions wish to enhance and promote their reputation to attract funders and faculty and to increase their ranking. Since universities change their official names as part of branding activities, academic departments change their names to reflect new curricular emphasis, and schools merge with or separate from parent institutions, institutional identifiers are crucial to accurately represent scholars’ affiliations both on their output and on grant applications. Institutions may not realize they already have such an institutional identifier, ISNI, and that this identifier has already been disseminated, used by ORCID and included in VIAF and Wikidata. In this presentation from CNI’s spring 2015 meeting, Karen Smith-Yoshimura of OCLC Research summarizes the current work of a task force on use cases and challenges of representing organizations in the ISNI database.

Challenges Presented by Institutional Identifiers is now available online:

and on Vimeo:

To see all CNI videos, visit our channels on YouTube and Vimeo.

Building Expertise to Support Digital Scholarship: A Global Perspective

In this presentation from CNI’s spring 2015 meeting, Jon Cawthorne (West Virginia), Vivian Lewis (McMaster) and Lisa Spiro (Rice) present key results from a pilot global benchmarking study on digital scholarship expertise. The project involved visiting leading digital humanities and digital social science organizations in several countries and conducting interviews with research staff, faculty, graduate students, and administrators in order to understand the core skills required for digital scholarship and the characteristics of organizations that cultivate these skills.

Building Expertise to Support Digital Scholarship: A Global Perspective is now available online:

and on Vimeo:

To see all CNI videos, visit our channels on YouTube and Vimeo.

PLoS ONE Paper on Sizing Discovery and Access Challenges for Datasets

I want to share a pointer to a paper published in PLoS ONE on July 24, 2015, titled “Sizing the Problem of Improving Discovery and Access to NIH-Funded Data: A Preliminary Study” by Kevin Read et al.

This is an excellent example of work that is badly needed to help us better understand the scale of the challenge of managing research data to facilitate its discovery and reuse by other scholars, and to illuminate the roles that repositories of various types may play in this effort. I’ve reproduced the abstract below.

Clifford Lynch

Director, CNI



This study informs efforts to improve the discoverability of and access to biomedical datasets by providing a preliminary estimate of the number and type of datasets generated annually by research funded by the U.S. National Institutes of Health (NIH). It focuses on those datasets that are “invisible” or not deposited in a known repository.


We analyzed NIH-funded journal articles that were published in 2011, cited in PubMed and deposited in PubMed Central (PMC) to identify those that indicate data were submitted to a known repository. After excluding those articles, we analyzed a random sample of the remaining articles to estimate how many and what types of invisible datasets were used in each article.


About 12% of the articles explicitly mention deposition of datasets in recognized repositories, leaving 88% that are invisible datasets. Among articles with invisible datasets, we found an average of 2.9 to 3.4 datasets, suggesting there were approximately 200,000 to 235,000 invisible datasets generated from NIH-funded research published in 2011. Approximately 87% of the invisible datasets consist of data newly collected for the research reported; 13% reflect reuse of existing data. More than 50% of the datasets were derived from live human or non-human animal subjects.


In addition to providing a rough estimate of the total number of datasets produced per year by NIH-funded researchers, this study identifies additional issues that must be addressed to improve the discoverability of and access to biomedical research data: the definition of a “dataset,” determination of which (if any) data are valuable for archiving and preservation, and better methods for estimating the number of datasets of interest. Lack of consensus amongst annotators about the number of datasets in a given article reinforces the need for a principled way of thinking about how to identify and characterize biomedical datasets.
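
A rough sanity check on the abstract's figures, offered as a back-of-envelope sketch rather than the authors' method: dividing each end of the estimated dataset range by the matching per-article average gives the number of articles the estimate implies. The abstract does not state that article count, so the value computed here is an inference, and the function name is purely illustrative.

```python
# Back-of-envelope check of the abstract's estimate (not the authors' method).
# The dataset totals and per-article averages come from the abstract; the
# article count is NOT stated there and is inferred here by simple division.


def implied_article_count(total_datasets: float, datasets_per_article: float) -> float:
    """Return the number of articles implied by a dataset total and a per-article average."""
    return total_datasets / datasets_per_article


# Low end of the estimate: 200,000 datasets at 2.9 datasets per article.
low = implied_article_count(200_000, 2.9)
# High end of the estimate: 235,000 datasets at 3.4 datasets per article.
high = implied_article_count(235_000, 3.4)

print(f"Implied articles (low estimate):  {low:,.0f}")
print(f"Implied articles (high estimate): {high:,.0f}")
```

Both ends of the range imply roughly 69,000 articles with invisible datasets, which suggests that the spread in the estimate comes from the per-article average rather than from the article count.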

US National Strategic Computing Initiative

Last week, the Obama administration issued an executive order creating a National Strategic Computing Initiative to “maximize the benefits of high-performance computing research, development and deployment”. The executive order, which is not lengthy, is well worth reading; it both establishes a series of objectives and defines roles and responsibilities among a large number of government agencies involved in the program.

The executive order is here:

And there is also a blog post from Tom Kalil and Jason Miller providing additional context here:

One point I found particularly interesting: while the “objectives” section of the order speaks, as one might expect, of exascale computing systems, it also specifically identifies as an objective “Increasing coherence between the technology base used for modelling and simulation and that used for data analytic computing.” This is a disconnect that has been growing increasingly evident with the rise of “big data” and “data analytics” in recent years.

Clifford Lynch

Director, CNI

Jisc work on learning analytics code of practice

In the UK, Jisc has been doing some great work on learning analytics that doesn’t seem to have gotten wide visibility beyond the UK yet; I wanted to particularly share their “Code of Practice for Learning Analytics,” which addresses privacy and other ethical issues involved in the deployment of learning analytics. While of course some of this work is adapted for specific UK legal requirements, the broader principles are highly relevant. See

There’s also a very helpful literature review that they developed as part of the effort, which is at:

For a broad overview of Jisc’s work in the learning analytics area, and pointers to other material, see

Clifford Lynch

Director, CNI

Designing Libraries IV Program and Speakers Available

The program and list of speakers are now available for the 4th Designing Libraries for the 21st Century Conference, which will be held at the James B. Hunt Library in Raleigh, NC on September 20-22, 2015. North Carolina State University Libraries will host the conference, and CNI is very pleased to be a co-sponsor along with the University of Calgary. We have a stimulating program planned, and attendees will be able to tour the Hunt and Hill Libraries at NCSU. I encourage you to visit the conference website for registration and additional information.

We are nearing the maximum number of attendees we can accommodate, so if you are interested, I urge you to register as soon as possible. We will start a waiting list once registration exceeds capacity.

–Joan Lippincott, CNI

4th Designing Libraries Conference Registration

NMC Survey on Online Professional Development Needs

The New Media Consortium (NMC) was recently awarded a planning grant from the Institute of Museum and Library Services (IMLS), under the Laura Bush 21st Century Librarian Program. For this Collaborative Planning Grant, the NMC is to assess the need for online professional development for academic and research library professionals. The goal is to identify sector-wide needs for in-service training for academic and research libraries that could be met with high-quality online offerings. If the need is clear, the next steps are to develop a plan to deliver such training, and to seek funding to design curricula and approaches that can be delivered to participants in any US academic or research library for free. CNI is actively involved in this work.

We hope that you will lend your perspectives to this survey and give us feedback about your professional development needs. The survey is available at

We encourage you to complete the survey whether you have professional development needs of your own or, if you are a library administrator, can identify needs for your professional staff.

Please complete the survey by Friday, August 14.

If you would prefer to review the survey questions in a separate document before diving into the official survey here online, please view this link:

We estimate that it will take about 20-25 minutes to complete this survey.


Joan Lippincott, CNI

Personal Digital Archiving 2015 Videos and Presentations

Personal Digital Archiving 2015 was held in New York City April 24-25, 2015. The presentations from this meeting are now available at the conference web site, where they are linked to the individual day agendas, at

Video from the sessions can be found at the Internet Archive, at

CNI was delighted to serve as a collaborating organization for this latest in the series of Personal Digital Archiving Conferences, and we hope to continue to do so in the future; I’ll share a hold-the-date message for the 2016 meeting when that information is available.

Clifford Lynch
Director, CNI

Last updated:  Friday, February 1st, 2013