Todd Grappone
Associate University Librarian for Digital Initiatives and Information Technology
University of California, Los Angeles
Peter Broadwell
Academic Project Developer
University of California, Los Angeles
Martin Klein
Programmer Analyst
University of California, Los Angeles
Sharon E. Farb
Associate University Librarian for Scholarly Communication
University of California, Los Angeles
The University of California, Los Angeles Library is building an event-based global news service with archival depth. One key component of this vision is the development of social media collections that record local perspectives on world events. To provide such multi-perspective histories, we envisage a service model in which integrated tools for collecting and aggregating information are coupled with archival functions and software for analysis and data mining, as well as semantic cross-collection topic mapping. As a key element of the proposed news platform, we introduce “Snatch,” a global news archiving and analysis service for Twitter data. Currently deployed tools for collecting social media data, such as Social Feed Manager and twarc, are limited in application and not well suited for use by non-technologists. In contrast to such implementations, Snatch is based on the following specifications:
- It operates on demand, meaning that trusted clients specify parameters such as the temporal boundaries of the tweets to be collected as well as the search terms, hashtags, and user handles that will identify topically relevant tweets. This allows for near real-time collection building, a feature that is essential for capturing reactions to emerging world events as they are reflected in the similarly fast-paced information environments of social media.
- It allows for rapid analysis of collection contents at varying levels of detail. Examples include tweet frequency counts, language and sentiment analysis, and graph analytics. This integrated feature is unique to Snatch and, more importantly, enables research on the collections across disciplines, regardless of researchers' technical expertise.
- It fosters collection-agnostic linking of related resources. Rather than creating ever more numerous data silos, Snatch provides a means of connecting tweets and their embedded content to relevant resources from presently disjoint collections, such as Web archives, recorded television news media, and other digital ephemera. To enable multimodal research and storytelling based upon such connections, Snatch proactively archives all tweets and their embedded resources upon collection, thereby preserving such important contextual linkages for the long term.
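
To make the first two specifications more concrete, the following Python sketch shows how an on-demand collection request and a simple tweet frequency count might be modeled. It is illustrative only and does not reflect the actual Snatch implementation: the `CollectionRequest` class, its field names, the `daily_counts` function, and the dictionary layout of a tweet (`created_at`, `text`, `user`) are all assumptions introduced for this example.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class CollectionRequest:
    """Hypothetical on-demand request; names are illustrative, not Snatch's API."""
    start: datetime                                # inclusive lower time bound
    end: datetime                                  # exclusive upper time bound
    terms: set = field(default_factory=set)        # free-text search terms
    hashtags: set = field(default_factory=set)     # hashtags without the leading '#'
    handles: set = field(default_factory=set)      # user handles without the leading '@'

    def matches(self, tweet: dict) -> bool:
        """Return True if the tweet is inside the time window and hits any criterion."""
        if not (self.start <= tweet["created_at"] < self.end):
            return False
        text = tweet["text"].lower()
        # Naive whitespace tokenization; punctuation attached to a hashtag is not handled.
        words = set(text.split())
        if any(term.lower() in text for term in self.terms):
            return True
        if any("#" + tag.lower() in words for tag in self.hashtags):
            return True
        return tweet["user"].lower() in {h.lower() for h in self.handles}


def daily_counts(tweets, request):
    """Count matching tweets per calendar day, keyed by ISO date string."""
    return Counter(
        t["created_at"].date().isoformat() for t in tweets if request.matches(t)
    )
```

In this sketch a client would build one `CollectionRequest` per event of interest and pass it, together with an iterable of tweet dictionaries, to `daily_counts`. Language detection, sentiment scoring, and graph analytics would require additional components that are not shown here.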