This page describes criteria for creating an evaluation corpus for a document- and ontology-based information system.

== Requirements ==

The basic requirements are:

 * '''document corpus'''
   * single domain
   * different lengths (pages)
   * different types (news ticker, article, book, website)
   * at least 100 documents
   * different creation dates (time-aware)
 * '''domain ontology'''
   * describes the domain of the '''document corpus'''
   * contains a taxonomy of classes
   * contains a taxonomy of possible relations between classes
   * allows the creation of complex but self-explanatory queries
 * '''instance base'''
   * contains annotations of the document corpus
   * high density of relations between instances
   * high and uniform coverage of classes and relations
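
The measurable requirements above could be checked automatically once a candidate corpus exists. The following is a minimal sketch in Python, not part of any existing tool: the `Document` record, its field names, and the helpers `check_corpus` and `class_coverage` are hypothetical names chosen for illustration. Only the threshold of 100 documents comes from the list above. What counts as "high" or "uniform" coverage is not defined here, so the sketch only reports instance counts per class.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


# Hypothetical record type; the field names are assumptions for illustration.
@dataclass(frozen=True)
class Document:
    doc_id: str
    domain: str
    doc_type: str  # e.g. "news ticker", "article", "book", "website"
    pages: int
    created: date


def check_corpus(docs, min_docs=100):
    """Return the violated document-corpus requirements (empty list if none)."""
    problems = []
    if len(docs) < min_docs:
        problems.append(f"fewer than {min_docs} documents")
    if len({d.domain for d in docs}) > 1:
        problems.append("more than one domain")
    if len({d.doc_type for d in docs}) < 2:
        problems.append("only one document type")
    if len({d.pages for d in docs}) < 2:
        problems.append("all documents have the same length")
    if len({d.created for d in docs}) < 2:
        problems.append("all documents share one creation date")
    return problems


def class_coverage(instance_classes, ontology_classes):
    """Count instances per ontology class, including classes with no instances."""
    counts = Counter(instance_classes)
    return {c: counts.get(c, 0) for c in ontology_classes}
```

Calling `check_corpus` on a list of `Document` records returns an empty list when all checks pass. `class_coverage` takes the class name of every annotated instance plus the full list of ontology classes, so classes without any instance show up with a count of 0.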