Intelligent Information and Communication Systems (IICS)
University of Hagen
IRSAW is a semantically based question-answering framework being developed within the Deutsche Forschungsgemeinschaft (DFG)-funded project of the same name. IRSAW integrates modules for several tasks: deep natural language analysis that produces semantic network representations of questions and documents (based on the MultiNet knowledge representation paradigm), the combination of different data streams containing answer candidates, logical answer validation, and natural language generation.
The project will result in two software components accessible via the Internet: the question-answering (QA) system IRSAW and a web service for the semantic annotation of texts. The QA system processes user questions in three phases, each accessing a different kind of resource: two information retrieval (IR) phases, in which web search engines and local databases are accessed, and a QA phase, in which a semantic network knowledge base is accessed. During the first phase, the user question is transformed into an IR query, which is sent to dedicated web search engines and web portals. Results obtained from the web typically consist of pages with lists of URLs. The web documents referenced by these URLs are retrieved and converted into text. In the second phase, the text passages from the web are segmented and indexed in local databases, which provide access to units of textual information of certain types (chapters, paragraphs, sentences, or phrases). In the third phase, different modules are employed to create answer streams: Question Answering by Pattern (QAP) matching, the Modified Information Retrieval Approach (MIRA), and, most prominently, the InSicht system. InSicht uses a linguistic parser to analyze the text segments and to represent the meaning of each text as a semantic network. It finds answers by applying logical inference and textual entailment to the semantic networks with which questions and documents are annotated.
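The second phase (segmenting retrieved text and indexing it by unit type) can be illustrated with a minimal Python sketch. This is not IRSAW's implementation: the names `Segment`, `segment`, and `build_index` are hypothetical, the sentence splitter is a naive punctuation rule, only two of the unit types (paragraphs and sentences) are covered, and an in-memory dictionary stands in for the local databases.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Segment:
    """One unit of text (hypothetical record, not an IRSAW data structure)."""
    doc_id: str
    unit: str   # "paragraph" or "sentence"
    text: str


def segment(doc_id: str, text: str) -> List[Segment]:
    """Split plain text into paragraph and sentence units using naive rules."""
    segments: List[Segment] = []
    # Assumption: paragraphs are separated by one or more blank lines.
    for para in re.split(r"\n\s*\n", text):
        para = " ".join(para.split())  # normalize internal whitespace
        if not para:
            continue
        segments.append(Segment(doc_id, "paragraph", para))
        # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
        for sent in re.split(r"(?<=[.!?])\s+", para):
            if sent:
                segments.append(Segment(doc_id, "sentence", sent))
    return segments


def build_index(segments: List[Segment]) -> Dict[str, List[Segment]]:
    """Group segments by unit type; a stand-in for the local databases."""
    index: Dict[str, List[Segment]] = defaultdict(list)
    for seg in segments:
        index[seg.unit].append(seg)
    return index


if __name__ == "__main__":
    sample = "Hagen is a city. It lies in Germany.\n\nIRSAW answers questions!"
    index = build_index(segment("doc-1", sample))
    for unit, segs in index.items():
        print(unit, [s.text for s in segs])
```

Keeping each unit type under its own key lets a later answer-stream module request only the granularity it needs, for example sentences for parsing or paragraphs for passage retrieval.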
In contrast to other QA systems, IRSAW aims to:
a) Provide a full semantic interpretation of questions and documents on which logical inferences are based
b) Treat linguistic phenomena in questions and documents (e.g. idioms, metonymy, and temporal and spatial aspects)
c) Generate natural language answers (instead of extracting answers from the text)
The first prototype of IRSAW, which combined the answer streams produced by QAP and InSicht, was evaluated in 2006 at the Cross-Language Evaluation Forum (CLEF), where it achieved one of the best results in the monolingual German question-answering track. Future work will include further evaluations and the implementation of the web service for semantic annotation of web pages.