| 1 | This page describes the criteria for creating an evaluation corpus for a document- and ontology-based information system. |
| 2 | |
| 3 | == Requirements == |
| 4 | |
| 5 | These are the basic requirements: |
| 6 | |
| 7 | * '''document corpus''' |
| 8 | ** single domain |
| 9 | ** different lengths (pages) |
| 10 | ** different types (news ticker, article, book, website) |
| 11 | ** at least 100 documents |
| 12 | ** different creation dates (time aware) |
| 13 | * '''domain ontology''' |
| 14 | ** describes the domain of the '''document corpus''' |
| 15 | ** contains a taxonomy of classes |
| 16 | ** contains a taxonomy of possible relations between classes |
| 17 | ** allows the creation of complex yet self-explanatory queries |
| 18 | * '''instance base''' |
| 19 | ** contains annotations of the document corpus |
| 20 | ** high density of relations between instances |
| 21 | ** high and uniform coverage of classes and relations |
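
The requirements above could be checked mechanically once a candidate corpus exists. The following is only a minimal sketch in Python: the `Document` record, its field names, and the helper functions are hypothetical and not part of any existing tool. The threshold of 100 documents comes from the list above. Reading "different" as "at least two distinct values" is an assumption, and since the page gives no numeric target for "high" density or coverage, the sketch only computes these values and does not judge them.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    """Hypothetical corpus record; all field names are illustrative."""
    doc_id: str
    domain: str
    doc_type: str   # e.g. "news ticker", "article", "book", "website"
    pages: int
    created: date


def check_corpus(docs, min_docs=100):
    """Return the list of violated corpus requirements (empty if none)."""
    problems = []
    if len(docs) < min_docs:
        problems.append(f"fewer than {min_docs} documents")
    if len({d.domain for d in docs}) != 1:
        problems.append("not a single domain")
    # Assumption: "different" means at least two distinct values.
    if len({d.doc_type for d in docs}) < 2:
        problems.append("only one document type")
    if len({d.pages for d in docs}) < 2:
        problems.append("all documents have the same length")
    if len({d.created for d in docs}) < 2:
        problems.append("all documents share one creation date")
    return problems


def relation_density(instances, relations):
    """Average number of relation assertions per instance.

    instances: mapping of instance id -> class name
    relations: list of (subject, relation name, object) triples
    """
    return len(relations) / len(instances) if instances else 0.0


def coverage(used_labels, ontology_labels):
    """Return (covered fraction, uniformity) for classes or relations.

    covered fraction: share of ontology labels used at least once.
    uniformity: min/max usage count, a crude indicator (1.0 = uniform,
    0.0 if any label is unused).
    """
    counts = Counter(label for label in used_labels if label in ontology_labels)
    covered = len(counts) / len(ontology_labels) if ontology_labels else 0.0
    if not counts or len(counts) < len(ontology_labels):
        uniformity = 0.0
    else:
        uniformity = min(counts.values()) / max(counts.values())
    return covered, uniformity
```

For example, `coverage` can be called once with the class of every instance and the set of ontology classes, and a second time with the relation name of every triple and the set of ontology relations. The min/max ratio is a deliberately simple uniformity measure. An entropy-based measure would be an alternative if finer distinctions are needed.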