This page provides a subset of the OCAS 2008 corpus.
The data can be downloaded [ here] as a zip file.

The archive is structured as follows:
 * '''annotations''': RDF annotations of ontology instances and facts that were manually annotated in the text.
 * '''ontology''': The RDFS schema and the RDF instance base. This folder also contains a Protege 3.2 project file.
 * '''rdf''': RDF annotations of ontology instances and facts that were automatically inferred from the manual annotations and the ontology.
 * '''txt''': The text documents. These documents were originally published by the BBC and ABC. Please observe the copyright notice at the end of each text file.
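
The folder layout above can be checked after downloading with a few lines of Python. This is only a minimal sketch: the four folder names come from the list above, while the function name `group_by_folder` and the assumption that the folders may sit beneath a top-level directory inside the zip file are illustrative and not confirmed by this page.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Folder names taken from the archive description above.
EXPECTED_FOLDERS = ("annotations", "ontology", "rdf", "txt")


def group_by_folder(archive):
    """Map each expected folder name to the archive members stored beneath it.

    `archive` is a path or a binary file-like object. Members that are not
    inside one of the expected folders are ignored.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            # Zip member names always use forward slashes.
            parent_dirs = PurePosixPath(name).parts[:-1]
            for folder in EXPECTED_FOLDERS:
                # Assumption: the folder may be nested under a top-level directory.
                if folder in parent_dirs:
                    groups[folder].append(name)
                    break
    return dict(groups)
```

Calling `group_by_folder` with the path of the downloaded zip file returns a dictionary keyed by folder name, which makes it easy to see whether all four folders are present and how many files each one holds.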

Please cite the following publication when using this data set.

Grothkast, Alexander; [ Adrian, Benjamin]; [ Schumacher, Kinga]; [ Dengel, Andreas]; Sebastian Blohm (ed.); Ulf Brefeld (ed.); Felix Jungermann (ed.); Roman Yangarber (ed.) [ OCAS: Ontology-Based Corpus and Annotation Scheme;] Proceedings of the High-level Information Extraction Workshop 2008;

This paper presents strategies and lessons learned from the creation of a corpus. It suggests a gold standard for evaluating ontology-based information extraction (OBIE) systems. This OBIE gold stan...
