Absolute Relevance? Ranking in the Scholarly Domain

Tamar Sadeh
Director of Marketing
Ex Libris Group

The greatest challenge for discovery systems is providing users with the most relevant search results, given the immense landscape of available content. Much as two people in conversation adjust to each other in tone, language, and subject matter, discovery systems would ideally be sophisticated and flexible enough to adjust their algorithms to individual users and each user’s information needs. When evaluating the relevance of an item to a specific user in a specific context, relevance-ranking algorithms need to take into account not only the degree to which the item matches the query but also information that is not embodied in the item itself.

Such information, which includes the item’s scholarly value, the type of search that the user is conducting (e.g., an exploratory search or a known-item search), and other factors, enables a discovery system to fulfill user expectations that have been shaped by experience with Web search engines. This session will focus on the challenges of developing and evaluating relevance-ranking algorithms for the scholarly domain. Examples will be drawn mainly from the relevance-ranking technology deployed by the Ex Libris Primo discovery solution.

Presentation (PDF)

Advances in Discovery: An EBSCO Service

Michael Gorrell
Executive Vice President, CIO
EBSCO Publishing

Discovery services have emerged as a key element of libraries’ efforts to help their patrons satisfy their research needs. Harvesting and indexing millions of scholarly journal articles, books, biographies, reviews, and a vast array of other content types from thousands of sources, allowing users to find the best matches for their needs, and presenting this information in a clear and understandable way is a tall order. Challenges include determining relevance for search results, providing users with ways to understand the depth and breadth of the collection being searched, and overall site usability. EBSCO has taken a data-driven approach to solving these problems by testing various aspects of its Discovery Service and using other data mining techniques. This session will describe the methodologies that have been used and the ways in which the service has evolved based on these efforts.

Handout (PDF)

Archiving Large Swaths of User-Contributed Digital Content: Lessons from Archiving the Occupy Movement

Howard Besser
Director, Moving Image Archiving & Preservation MA Program
New York University

David Millman
Director, Digital Library Technology Services
New York University

Sharon M. Leon
Director of Public Projects, Center for History & New Media
George Mason University

Archiving born-digital content from the “Occupy” movement can serve as a prototype for archiving all kinds of user-contributed content. In this presentation, several organizations will discuss the tools and methods they have developed for ingesting, preserving, and offering discovery services for large numbers of digital works whose contributors cannot be relied on to follow standards or assign metadata. Topics covered will range from automatic extraction of time-stamp and location metadata (and an empirical analysis of which upload services strip these out), to app development for uploading content along with permission forms, to maintaining lists of frequently changing URL nodes for web crawling, to issues in educating content creators in best practices. Speakers will also discuss issues in trying to document a social movement while it is happening.



Presentation (Besser PPT)
Presentation (Millman PPT)
Presentation (Leon PDF)
Presentation (Hanna PPTX)

Building the Grateful Dead Archive Online: The Golden Road to Unlimited Devotion

Virginia Steel
University Librarian
University of California, Santa Cruz
Robin Chandler
Project Manager
University of California, Santa Cruz

The University of California, Santa Cruz (UCSC) Libraries, recipient of a 2009 two-year Institute of Museum and Library Services (IMLS) grant, is building the socially constructed Grateful Dead Archive Online (GDAO) website using Omeka open source software. The Grateful Dead Archive (GDA) represents one of the most significant popular culture collections of the 20th century and documents the band’s activity and influence in contemporary music from 1965 to 1995.

Donated to the UCSC Library in 2008, the GDA contains over 600 linear feet of material, including business records, photographs, posters, fan envelopes, tickets, video, audio (oral histories and interviews), and three-dimensional objects such as stage props and band merchandise. With the release of GDAO in July 2012, the Archive will begin actively collecting artifacts from an enthusiastic community of Grateful Dead fans.

This presentation will discuss the donation of the collection to UCSC; the challenges of merging a traditional archive with a socially constructed one; rights clearance issues and the intellectual property strategy; crawling and harvesting strategies employed for collecting web resources; plugins and workflows supporting data exchange between CONTENTdm and Omeka; and integrating “the crowd” in the curation of user-submitted content preserved by the California Digital Library’s Merritt repository. Future directions, such as the integration and development of better curation tools and what the Libraries hope to learn from opening the archive to contributions from a large community of fans, will also be discussed.



Handout (PDF)
Presentation (PDF)

The California Digital Library and the Public Knowledge Project Partnership: A New Model of Collaborative Institutional Repository Publishing Services Development

Lisa Schiff
Technical Lead, Access & Publishing Services
California Digital Library 
Brian Owen
Associate University Librarian
Simon Fraser University 
Catherine Mitchell
Director, Access & Publishing Services
California Digital Library

The California Digital Library (CDL) and the Public Knowledge Project (PKP) have recently joined forces, with the CDL signing on as a major PKP development partner. This relationship has grown out of CDL’s recent work to incorporate a customized version of PKP’s Open Journal Systems (OJS) into the back-end submission and publishing system for eScholarship, the University of California’s open access institutional repository and publishing platform. This development work marks an important step toward fully integrated, open-source institutional repository and journal publication services, and the CDL and PKP have ambitious plans for extending this work to the larger PKP community. This panel will describe:

• How OJS was customized to meet the needs of eScholarship journals (including user interface modifications, the extension of a single OJS instance to support almost 50 independent journals, PDF generation, and more)
• Which of these and other features may be available in a future release of OJS as a result of this new partnership
• How the PKP development partnership program is shaping the direction of OJS and other PKP scholarly communication services
• How the relationship with PKP is likely to affect future development and service directions for eScholarship
• How this work fits into the larger effort of both of these organizations to refine their services in support of new practices and opportunities within the scholarly publishing environment



Presentation (PowerPoint)