Dataset. 2023

PANACEA English V-SUBCAT gold-standard for ENV domain

CORA.Repositori de Dades de Recerca
doi:10.34810/data371
  • University of Cambridge. Department of Theoretical and Applied Linguistics
This is a domain-specific gold standard for English subcategorization frames (SCFs), in this case for the environment (ENV) domain. It was developed manually from a set of 28 verbs, with 200 sentences selected for each verb. For each sentence, the SCFs present for the studied verb were manually annotated. The sentences were selected from crawled Web pages that were automatically detected to be in English and automatically classified as relevant to the ENV domain. Data collection took place in the summer of 2011. This gold standard was created in the context of PANACEA (http://www.panacea-lr.eu), an EU FP7-funded project under Grant Agreement 248064.
 
DOI: https://doi.org/10.34810/data371

CORA.Repositori de Dades de Recerca
doi:10.34810/data370
Dataset. 2011

PANACEA ENGLISH V-SUBCAT GOLD-STANDARD FOR LAB DOMAIN

CORA.Repositori de Dades de Recerca
  • University of Cambridge. Department of Theoretical and Applied Linguistics
This is a domain-specific gold standard for English subcategorization frames (SCFs), in this case for the labour (LAB) domain. It was developed manually from a set of 29 verbs, with 200 sentences selected for each verb. For each sentence, the SCFs present for the studied verb were manually annotated. The sentences were selected from crawled Web pages that were automatically detected to be in English and automatically classified as relevant to the LAB domain. Data collection took place in the summer of 2011. This gold standard was created in the context of PANACEA (http://www.panacea-lr.eu), an EU FP7-funded project under Grant Agreement 248064.



