Version 1 (modified by horak, 18 years ago)
This page describes criteria for creating an evaluation corpus for a document- and ontology-based information system.
Requirements
These are the basic requirements:
- document corpus
  - single domain
  - different lengths (pages)
  - different types (news ticker, article, book, website)
  - at least 100 documents
  - different creation dates (time aware)
 
- domain ontology
  - describes the domain of the document corpus
  - contains a taxonomy of classes
  - contains a taxonomy of possible relations between classes
  - allows the creation of complex but meaningful queries
 
- instance base
  - contains annotations of the document corpus
  - high density of relations between instances
  - high and uniform coverage of classes and relations
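
Some of the requirements above can be checked mechanically. The following is a minimal sketch, not a definitive implementation. The `Document` record, its field names, and the helper functions `check_corpus` and `class_coverage` are hypothetical and were introduced only for illustration. "Coverage" is assumed here to mean the share of ontology classes that have at least one instance; the page itself does not fix a definition.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    # Hypothetical record; all field names are assumptions for illustration.
    doc_id: str
    domain: str
    doc_type: str  # e.g. "news ticker", "article", "book", "website"
    pages: int
    created: date


def check_corpus(docs, min_docs=100):
    """Return a list of unmet corpus requirements (empty if all are met)."""
    problems = []
    if len(docs) < min_docs:
        problems.append(f"only {len(docs)} documents, need at least {min_docs}")
    if len({d.domain for d in docs}) != 1:
        problems.append("documents do not share a single domain")
    if len({d.doc_type for d in docs}) < 2:
        problems.append("only one document type present")
    if len({d.pages for d in docs}) < 2:
        problems.append("all documents have the same length")
    if len({d.created for d in docs}) < 2:
        problems.append("all documents share one creation date")
    return problems


def class_coverage(ontology_classes, instance_classes):
    """Fraction of ontology classes that have at least one instance.

    Assumed definition of coverage; it says nothing about uniformity.
    """
    classes = set(ontology_classes)
    if not classes:
        return 0.0
    return len(classes & set(instance_classes)) / len(classes)
```

Calling `check_corpus` on a candidate corpus returns the requirements it fails, so an empty list means the basic corpus criteria are met. `class_coverage` takes the class names from the ontology and the class of each annotated instance. A comparable measure could be computed for relations.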
 
 
