CNI - Coalition for Networked Information; <http://www.cni.org/>
 

 

Project Briefing: Spring 2003 Task Force Meeting
----------

 
Managing Unstructured Data with Latent Semantic Indexing

Maciej Ceglowski
Lead Developer
National Institute for Technology and Liberal Education

Clara Yu
Director
National Institute for Technology and Liberal Education/CET

John L. Cuadrado
Consultant
National Institute for Technology and Liberal Education


Much of the digital content becoming available online lacks meaningful metadata descriptors, but metadata creation is both time-consuming and expensive. Using latent semantic indexing (LSI) techniques, the National Institute for Technology and Liberal Education (NITLE) has developed a search and archiving tool that infers document similarity from patterns of word use across a collection. These similarity values, in turn, allow the tool to assign documents to categories based on their content. The procedure is language-neutral and fully automatic. Although the tool can make use of existing metadata, it can also sort and organize raw documents with a high degree of accuracy, across databases, in centralized or distributed mode.
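
To make the idea of inferring similarity from word-use patterns concrete, the following is a minimal sketch of generic LSI, not the NITLE tool itself. It assumes NumPy is available, uses raw word counts with whitespace tokenization (real systems typically apply term weighting and stop-word removal), and picks a rank of 2 only because the toy corpus has two obvious topics. The function names, the sample documents, and the seed labels are all hypothetical.

```python
import numpy as np

def build_term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i, j] counts term i in document j."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.lower().split():
            matrix[index[word], j] += 1.0
    return vocab, matrix

def lsi_document_vectors(matrix, rank):
    """Project documents into a rank-k latent space via truncated SVD."""
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Rows of the result are documents; columns are latent dimensions.
    return (np.diag(s[:rank]) @ vt[:rank, :]).T

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between row vectors."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty documents
    unit = vectors / norms
    return unit @ unit.T

def assign_categories(similarity, labeled):
    """Give each document the label of its most similar labeled document."""
    seeds = list(labeled)
    return [
        labeled[seeds[int(np.argmax(similarity[i, seeds]))]]
        for i in range(similarity.shape[0])
    ]

# Hypothetical toy collection: three library-related and two sports-related texts.
docs = [
    "library catalog metadata records",
    "metadata records for digital archives",
    "digital archives and library catalog",
    "soccer match goal score",
    "goal score in the soccer final",
]
_, matrix = build_term_document_matrix(docs)
doc_vectors = lsi_document_vectors(matrix, rank=2)
similarity = cosine_similarity_matrix(doc_vectors)

# Only documents 0 and 3 carry a (hypothetical) category label.
categories = assign_categories(similarity, {0: "libraries", 3: "sports"})
print(categories)
```

In this sketch only two documents are labeled by hand; the remaining three inherit a category purely from their similarity in the reduced space, which mirrors the abstract's point that raw documents can be organized without per-document metadata. Nothing in the code depends on the language of the text beyond splitting on whitespace.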

Web Link:
http://www.nitle.org/lsi.php

Handout:
Managing Unstructured Data with Latent Semantic Indexing


