= iDocument: Intelligent Document Information Extraction =
[[Image(WikiStart:logo.png, width=100px, right)]]

iDocument is a generic ontology-based information extraction (OBIE) system that uses ontological background knowledge in the form of existing vocabularies and instance knowledge. It draws on existing knowledge from personal or business domains (e.g. relational databases, concept maps, taxonomies). Following the Semantic Web approach, iDocument exchanges and extracts knowledge based on the W3C standard RDF. Existing knowledge serves as input to a serial IE pipeline of extraction tasks, which extracts possible answers to user-specified ad hoc queries on a given text collection.

= Unique Features =

 * Domain ontologies are exchangeable as long as they are written in RDFS.
 * The MOBIE mapping vocabulary lets users define which classes, attributes and relations are relevant for extraction.
 * Existing instance knowledge is reused for information extraction.
 * Extracted results are formalized in the same RDF schema as the input domain ontology.
 * SPARQL queries are used to define extraction templates.
 * All intermediate and final extraction results are weighted hypotheses according to the Dempster-Shafer belief function.
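
The last point can be illustrated with a small sketch. This page does not say how iDocument assigns or combines weights, so the code below only shows the textbook Dempster-Shafer belief function and Dempster's rule of combination. The function names `combine` and `belief`, the two evidence sources and all mass values are hypothetical and not taken from iDocument.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    A mass function maps a frozenset of hypotheses to a mass in [0, 1];
    the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1 again.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


def belief(m, hypothesis):
    """Belief in a set: total mass of all focal sets contained in it."""
    return sum(w for h, w in m.items() if h <= hypothesis)


# Hypothetical example: two extraction steps weigh candidate cities.
ATHENS = frozenset({"Athens"})
SYDNEY = frozenset({"Sydney"})
FRAME = ATHENS | SYDNEY  # "either one", i.e. remaining uncertainty

step1 = {ATHENS: 0.6, FRAME: 0.4}
step2 = {ATHENS: 0.5, SYDNEY: 0.2, FRAME: 0.3}

result = combine(step1, step2)
print(belief(result, ATHENS))
```

Mass left on the whole frame expresses ignorance rather than support for a specific answer, which is what distinguishes a belief function from a plain probability distribution.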

[[Image(WikiStart:szenario.png, width=400px, center)]]


= Table of Contents =

 * [http://idocument.opendfki.de/ System Summary]
 * [http://idocument.opendfki.de/wiki/Evaluation/Corpus/OlympicGames2004 DFKI Olympic Corpus and Annotation Scheme 2008 (DFKI OCAS 2008)]
 * [wiki:Publications]
 * [http://idocument.opendfki.de/ Reference Projects]
 * [http://www.dfki.uni-kl.de/~adrian/2009/03/27/AMD2009.ppt Poster]
 * [http://www.dfki.uni-kl.de/~adrian/2009/idoc_flyerx.pdf Flyer]

For further information please contact [mailto:benjamin.adrian@dfki.de].

This project is developed at the [http://www.dfki.de/km DFKI Knowledge Management Department].
 
[[Image(http://www.dfki.de/web/logo.jpg, width=200px, right)]]