6 | | iDocument is an [http://en.wikipedia.org/wiki/Information_extraction| information extraction] system. It uses existing background knowledge such as business databases, concept maps, or other information models for improving extraction results. |
| 4 | iDocument is a generic ontology-based information extraction (OBIE) system that uses ontological background knowledge in
| 5 | the form of existing vocabularies and instance knowledge. This knowledge comes from personal or business domains (e.g. relational databases, concept maps, or taxonomies). Following Semantic Web practice, iDocument exchanges and extracts
| 6 | knowledge using the W3C standard RDF. Existing knowledge serves as input to a serial IE pipeline of extraction tasks, which extracts possible answers to user-specified ad hoc queries over a given text collection.
10 | | [[Image(WikiStart:scenario.png, center)]] |
| 10 | * Domain ontologies are exchangeable as long as they are written in RDFS. |
| 11 | * The MOBIE mapping vocabulary defines which classes, attributes, and relations are relevant for extraction.
| 12 | * Existing instance knowledge is reused for information extraction.
| 13 | * Extracted results are formalized in the same RDF schema as the input domain ontology.
| 14 | * SPARQL queries are used for defining extraction templates. |
| 15 | * All intermediate and final extraction results are hypotheses weighted according to Dempster–Shafer belief functions.
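To illustrate the last point, here is a minimal Python sketch of how two weighted hypothesis sets could be merged with Dempster's rule of combination. The page does not say how iDocument combines its weights, so the `combine` function, the mass-function layout (a dict from sets of candidate classes to weights), and the example values are all assumptions made for illustration, not iDocument's implementation.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of candidate
    hypotheses to a weight; the weights of one function sum to 1.
    (Hypothetical data layout, not taken from iDocument.)
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Both sources support the hypotheses in the intersection.
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            # The sources contradict each other for this pair.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; cannot combine")
    # Redistribute the conflicting mass over the remaining hypotheses.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


# Hypothetical example: two extraction tasks weigh whether the mention
# "Athens" is an instance of the class City or the class Person.
CITY = frozenset({"City"})
EITHER = frozenset({"City", "Person"})  # "don't know" mass

task_a = {CITY: 0.6, EITHER: 0.4}
task_b = {CITY: 0.7, EITHER: 0.3}

combined = combine(task_a, task_b)
for hypothesis, weight in combined.items():
    print(sorted(hypothesis), round(weight, 2))
```

In this example both tasks lean towards `City`, so the combined weight for `City` (0.88) is higher than either input weight, while the undecided mass shrinks to 0.12.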
| 16 | |
| 17 | |
| 18 | = Table of Contents = |
| 19 | |
| 20 | * [http://idocument.opendfki.de/ System Summary] |
| 21 | * [http://idocument.opendfki.de/wiki/Evaluation/Corpus/OlympicGames2004 Olympic Corpus and Annotation Scheme (OCAS)] |
| 22 | * [http://idocument.opendfki.de/ Publications] |
| 23 | * [http://idocument.opendfki.de/ Reference Projects] |