ResearchIndex (CiteSeer): Autonomous Citation Indexing of Web Publication


Subject: ResearchIndex (CiteSeer): Autonomous Citation Indexing of Web Publication
Gerry Mckiernan (GMCKIERN@gwgate.lib.iastate.edu)
Date: Sat, 19 Jun 1999 15:29:15 -0500


Message-Id: <s76bb769.090@gwgate.lib.iastate.edu>
From: "Gerry Mckiernan" <GMCKIERN@gwgate.lib.iastate.edu>
To: arl-ejournal@arl.org

Posted on behalf of Steve Lawrence. Apologies for cross-posting

/Gerry McKiernan
Iowa State University
Ames IA 50011
<gmckiern@gwgate.lib.iastate.edu>

On 06/13/99, Steve Lawrence <lawrence@research.nj.nec.com> wrote:
>
> ResearchIndex (formerly CiteSeer), a digital library of scientific
> literature that automatically performs citation indexing, is available
> at:
>
> http://researchindex.com/
>
> ResearchIndex aims to improve the dissemination and feedback of
> scientific literature, and to provide improvements in functionality,
> usability, availability, cost, comprehensiveness, efficiency, and
> timeliness.
>
> The ResearchIndex software is available without cost for
> non-commercial use. The demonstration service indexes over 200,000
> computer science articles (containing over 2 million citations).
>
> Many digital libraries of scientific literature are available
> (e.g. LANL e-Print archive, ACM DL, IEEE DL, UCSTRI, CORR, ML Papers,
> NCSTRL, LTRS, HP Bib, CS Bibliographies, NZDL, etc.). These services
> offer varying degrees of functionality, comprehensiveness, and
> freshness.
>
> Rather than creating just another digital library, ResearchIndex
> provides algorithms, techniques, and software that can be used in
> other digital libraries.
>
> ResearchIndex indexes PostScript and PDF research articles and
> provides:
>
> - Autonomous Citation Indexing (ACI). ResearchIndex uses ACI to
> autonomously create a citation index that can be used for
> literature search and evaluation. Compared to traditional
> citation indices, ACI provides improvements in cost,
> availability, comprehensiveness, efficiency, and
> timeliness.
>
> - Information on all cited documents, not just indexed documents.
> ResearchIndex computes citation statistics and related documents
> for all articles cited in the database, not just the indexed
> articles.
>
> - Reference linking. As with many online publishers, ResearchIndex allows
> browsing the database using citation links.
>
> - Citation context - ResearchIndex can show the context of
> citations to a given paper, allowing a researcher to quickly
> and easily see what other researchers have to say about an
> article of interest (useful for literature search and
> evaluation).
>
> - Awareness and tracking - ResearchIndex provides automatic
> notification of new citations to given papers, and new
> papers matching a user profile. Machine learning is used
> to automatically learn user profiles.
>
> - Related documents - ResearchIndex locates related documents
> using citation and word frequency measures and displays an
> active and continuously updated bibliography for each
> document.
>
> - Similar documents - ResearchIndex computes the percentage of
> matching sentences between documents, allowing, for
> example, the detection of minor revisions to a paper.
>
> - Full-text indexing - ResearchIndex indexes the full text of
> entire articles and citations. Full Boolean, phrase, and
> proximity search is supported.
>
> - Query-sensitive summaries - ResearchIndex provides the context
> of how query terms are used in articles, instead of a generic
> summary, improving the efficiency of search.
>
> - Citation graph analysis - ResearchIndex analyzes the graph of
> citations, e.g. to identify authoritative and review-style
> articles.
>
> - Page images - ResearchIndex allows quick and easy viewing of
> page images.
>
> - Up-to-date - ResearchIndex is continuously updated 24 hours
> a day.
>
> - Powerful search - for example, ResearchIndex allows author
> initials to be used to narrow a citation search.
>
> - Autonomous location of articles - ResearchIndex uses search engines,
> crawling, and mailing list monitoring to efficiently locate
> papers on the Web. ResearchIndex can also be used on
> existing digital libraries.
>
> - Source code available - The full source code of ResearchIndex is
> available without cost for non-commercial use.
>
> A demonstration service is at: http://researchindex.com/
> For more details or to obtain the software see
> http://www.neci.nec.com/~lawrence/researchindex.html
> http://www.neci.nec.com/~lawrence/aci.html
>
> The following papers contain details of the system:
>
> "Digital libraries and Autonomous Citation Indexing", IEEE Computer,
> Volume 32, Number 6, 67-71, 1999.
>
> "CiteSeer: An automatic citation indexing system", Digital Libraries,
> June 1998 [shortlisted for best paper].
>
> "CiteSeer: An autonomous Web agent for automatic retrieval and
> identification of interesting publications", Autonomous Agents, May
> 1998.
>
> "CiteSeer: Autonomous Citation Indexing and Literature Browsing Using
> Citation Context", Technical Report, NEC Research, 1997.
>
> We currently have only a small-capacity machine on our external
> network for demonstration. The demonstration service indexes over
> 200,000 computer science articles.
>
> Credits: We would like to thank Joshua Alspector, Jose Nelson Amaral,
> Anders Ardo, Shumeet Baluja, Arunava Banerjee, Eric Baum, Robert
> Cameron, Rich Caruana, Ingemar Cox, Scott Fahlman, Gary Flake, Bill
> Gear, Paul Ginsparg, Eric Glover, Alan Gottlieb, Steve Hanson, Haym
> Hirsh, Steve Hitchcock, Paul Kantor, Jon Kleinberg, Bob Krovetz,
> Andrea LaPaugh, Michael Lesk, Andrew McCallum, Steve Minton, Tom
> Mitchell, Michael Nelson, Craig Nevill-Manning, Andrew Ng, Max Ott,
> Brian Pinkerton, Alexandrin Popescul, Ben Schafer, Bruce Schatz,
> Terrence Sejnowski, Warren Smith, Dagobert Soergel, Amanda Spink,
> Harold Stone, Valerie Tucci, Lyle Ungar, David Waltz, Ian Witten, and
> Peter Yianilos for useful comments and suggestions.
>
> --
> Steve Lawrence - http://www.neci.nec.com/~lawrence/

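The "Similar documents" item in the announcement above describes computing the percentage of matching sentences between two documents, for example to detect minor revisions of a paper. The announcement does not say how sentences are split or compared, so the following is only a minimal Python sketch under stated assumptions: a naive splitter on sentence-ending punctuation, exact matching after case and whitespace normalization, and hypothetical function names that are not taken from the ResearchIndex software.

```python
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Lowercase and collapse whitespace so trivial formatting changes are ignored.
    return [" ".join(p.lower().split()) for p in parts if p.strip()]


def matching_sentence_percentage(doc_a, doc_b):
    """Percentage of the sentences in doc_a that also occur in doc_b."""
    sentences_a = split_sentences(doc_a)
    sentences_b = set(split_sentences(doc_b))
    if not sentences_a:
        return 0.0
    matches = sum(1 for s in sentences_a if s in sentences_b)
    return 100.0 * matches / len(sentences_a)


# Hypothetical example texts: the revision changes only the last sentence.
draft = "We index papers. We parse citations. Results are good."
revised = "We index papers. We parse citations. Results are excellent."
print(matching_sentence_percentage(draft, revised))
```

A high percentage suggests that one text is a light revision of the other. The measure is asymmetric as written: it counts sentences of the first document found in the second.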

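The "Citation graph analysis" item mentions identifying authoritative and review-style articles from the graph of citations. The announcement does not name the method used. One well-known technique that fits this description is Kleinberg's hubs-and-authorities iteration (HITS), swapped in here purely as an illustration: papers cited by good hubs get high authority scores, and papers that cite many good authorities (as reviews tend to) get high hub scores. The function name, the dictionary input format, and the sample graph are all hypothetical.

```python
def hits(citations, iterations=50):
    """Hubs-and-authorities scores for a citation graph.

    `citations` maps each paper to the list of papers it cites.
    Returns (hub, authority) dictionaries keyed by paper.
    """
    papers = set(citations)
    for cited_list in citations.values():
        papers.update(cited_list)

    hub = {p: 1.0 for p in papers}
    authority = {p: 1.0 for p in papers}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of the papers citing this one.
        authority = {p: 0.0 for p in papers}
        for citing, cited_list in citations.items():
            for cited in cited_list:
                authority[cited] += hub[citing]
        # Hub score: sum of the authority scores of the papers this one cites.
        hub = {p: sum(authority[c] for c in citations.get(p, [])) for p in papers}
        # Normalize both score vectors to unit length to keep values bounded.
        for scores in (authority, hub):
            norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, authority


# Hypothetical graph: a survey citing three papers, plus two ordinary papers.
graph = {
    "survey": ["a", "b", "c"],
    "paper1": ["a"],
    "paper2": ["a", "b"],
}
hub, authority = hits(graph)
print(max(authority, key=authority.get))  # most authoritative paper
print(max(hub, key=hub.get))              # most review-like paper
```

In this small graph the most-cited paper receives the top authority score and the survey receives the top hub score, which is the kind of distinction the announcement describes.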

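The "Query-sensitive summaries" item describes showing how query terms are used in an article instead of a generic summary. A minimal keyword-in-context sketch in Python follows. It assumes simple case-insensitive substring matching and a fixed character window around each hit; the function name, the `window` parameter, and the sample text are hypothetical, and the actual ResearchIndex behavior may differ.

```python
import re


def query_snippets(text, query_terms, window=40):
    """Return the text surrounding each occurrence of each query term."""
    snippets = []
    for term in query_terms:
        # re.escape treats the term as literal text rather than a pattern.
        for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            start = max(0, match.start() - window)
            end = min(len(text), match.end() + window)
            # Collapse line breaks and repeated spaces inside the fragment.
            fragment = " ".join(text[start:end].split())
            snippets.append("..." + fragment + "...")
    return snippets


# Hypothetical article text.
article = (
    "Autonomous citation indexing builds a citation index automatically. "
    "The index can then be used for literature search."
)
for snippet in query_snippets(article, ["index"], window=15):
    print(snippet)
```

Because matching is by substring, the query "index" also hits "indexing"; a real system would likely match on word boundaries or stems.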