Version 1 (modified by horak, 17 years ago)
This page describes criteria for creating an evaluation corpus for a document- and ontology-based information system.
Requirements
These are the basic requirements:
- document corpus
  - single domain
  - different lengths (pages)
  - different types (news ticker, article, book, website)
  - at least 100 documents
  - different creation dates (time-aware)
- domain ontology
  - describes the domain of the document corpus
  - contains a taxonomy of classes
  - contains a taxonomy of possible relations between classes
  - allows creation of complex yet self-explanatory queries
- instance base
  - contains annotations of the document corpus
  - high density of relations between instances
  - high and uniform coverage of classes and relations
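
Some of the requirements above can be checked mechanically once a candidate corpus exists. The following is a minimal sketch in Python, not part of the original page: the `Document` record, the function names, and the chosen measures (relations per instance for "density", fraction of ontology labels used for "coverage") are all assumptions made for illustration. Only the threshold of 100 documents comes from the list above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical metadata record for one corpus document."""
    doc_id: str
    doc_type: str  # e.g. "news ticker", "article", "book", "website"
    pages: int
    year: int      # creation year


def check_corpus(docs, min_docs=100):
    """Map each document-corpus requirement to True/False."""
    return {
        "at_least_min_docs": len(docs) >= min_docs,
        "different_lengths": len({d.pages for d in docs}) > 1,
        "different_types": len({d.doc_type for d in docs}) > 1,
        "different_dates": len({d.year for d in docs}) > 1,
    }


def relation_density(instances, relations):
    """Average number of relations per instance (one possible density measure)."""
    return len(relations) / len(instances) if instances else 0.0


def coverage(used_labels, ontology_labels):
    """Return (fraction of ontology labels used at least once, usage counts).

    Works for class labels as well as relation labels. The counts show
    how uniformly the labels are used across the instance base.
    """
    counts = Counter(label for label in used_labels if label in ontology_labels)
    fraction = len(counts) / len(ontology_labels) if ontology_labels else 0.0
    return fraction, counts
```

For example, `coverage(["Person", "Person", "City"], {"Person", "City", "Company"})` reports that two of the three classes are used and that `Person` occurs twice as often as `City`. What counts as "high" density or "uniform" coverage is left open here, since the page does not give thresholds for either.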