Total results (including duplicates): 6
1 page found
CORA.Repositori de Dades de Recerca
doi:10.34810/data363
Dataset. 2023

PANACEA ENGLISH AUTOMATICALLY ACQUIRED LEXICON FOR ENV DOMAIN: SUBCATEGORIZATION FRAMES (V-SUBCAT)

  • University of Cambridge. Department of Theoretical and Applied Linguistics

DOI: https://doi.org/10.34810/data363

CORA.Repositori de Dades de Recerca
doi:10.34810/data364
Dataset. 2023

PANACEA ENGLISH AUTOMATICALLY ACQUIRED LEXICON FOR LAB DOMAIN: SUBCATEGORIZATION FRAMES (V-SUBCAT)

  • University of Cambridge. Department of Theoretical and Applied Linguistics
This lexicon was produced using an inductive SCF classifier, the tpc_subcat_inductive webservice in the PANACEA project. The lexicon was automatically produced from the PANACEA MCv2 crawled corpus, by parsing the data with the RASP parser (Third Release, Open-Source Version, February 2001, available from http://ilexir.co.uk; see also E. Briscoe, J. Carroll, and R. Watson, 2006, The Second Release of the RASP System, in Proceedings of COLING/ACL Interactive Presentation Sessions), and then processing the parsed data with tpc_subcat_inductive. Only verb lemmas with at least 200 instances in MCv2 were retained.
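The frequency cut-off mentioned in the description above (only verb lemmas with at least 200 instances are kept) can be illustrated with a minimal sketch. This is not the PANACEA implementation: the function name, the input shape (one lemma string per parsed verb instance), and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable

# Threshold stated in the dataset description.
MIN_INSTANCES = 200


def frequent_verb_lemmas(
    verb_lemmas: Iterable[str], min_instances: int = MIN_INSTANCES
) -> set[str]:
    """Return the lemmas that occur at least `min_instances` times.

    `verb_lemmas` is assumed to yield one lemma per verb instance found
    in the parsed corpus (hypothetical input format).
    """
    counts = Counter(verb_lemmas)
    return {lemma for lemma, n in counts.items() if n >= min_instances}


# Toy data standing in for parsed corpus output (hypothetical).
toy_instances = ["employ"] * 250 + ["negotiate"] * 200 + ["dismiss"] * 199
retained = frequent_verb_lemmas(toy_instances)
```

With this toy input, "employ" and "negotiate" pass the threshold while "dismiss", at 199 instances, is dropped.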

DOI: https://doi.org/10.34810/data364

CORA.Repositori de Dades de Recerca
doi:10.34810/data370
Dataset. 2011

PANACEA ENGLISH V-SUBCAT GOLD-STANDARD FOR LAB DOMAIN

  • University of Cambridge. Department of Theoretical and Applied Linguistics
This is a domain-specific gold standard for English subcategorization frames (SCFs), in this case for the labour (LAB) domain. It was developed manually from a set of 29 verbs with 200 sentences per verb. For each sentence, the SCFs present for the target verb were manually annotated. The sentences were selected from crawled Web pages that were automatically detected to be in English and automatically classified as relevant to the LAB domain. Data collection took place in the summer of 2011. This gold standard was created in the context of PANACEA (http://www.panacea-lr.eu), an EU-FP7 funded project under Grant Agreement 248064.

DOI: https://doi.org/10.34810/data370

CORA.Repositori de Dades de Recerca
doi:10.34810/data371
Dataset. 2023

PANACEA ENGLISH V-SUBCAT GOLD-STANDARD FOR ENV DOMAIN

  • University of Cambridge. Department of Theoretical and Applied Linguistics
This is a domain-specific gold standard for English subcategorization frames (SCFs), in this case for the environment (ENV) domain. It was developed manually from a set of 28 verbs with 200 sentences per verb. For each sentence, the SCFs present for the target verb were manually annotated. The sentences were selected from crawled Web pages that were automatically detected to be in English and automatically classified as relevant to the ENV domain. Data collection took place in the summer of 2011. This gold standard was created in the context of PANACEA (http://www.panacea-lr.eu), an EU-FP7 funded project under Grant Agreement 248064.

DOI: https://doi.org/10.34810/data371

CORA.Repositori de Dades de Recerca
doi:10.34810/data375
Dataset. 2013

PANACEA ENGLISH AUTOMATICALLY ACQUIRED LEXICON FOR ENV DOMAIN: SUBCATEGORIZATION FRAMES AND LEXICAL SEMANTIC CLASSES FOR NOUNS

  • University of Cambridge. Department of Theoretical and Applied Linguistics
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is a domain-specific English lexicon for the environment (ENV) domain. It contains both subcategorization frames for verbs and lexical semantic classes for nouns. The lexicon was created automatically with PANACEA webservices from crawled data. The data was obtained by crawling Web pages that were automatically detected to be in English and automatically classified as relevant to the ENV domain. Data collection took place in the summer of 2011.

DOI: https://doi.org/10.34810/data375

CORA.Repositori de Dades de Recerca
doi:10.34810/data378
Dataset. 2013

PANACEA ENGLISH AUTOMATICALLY ACQUIRED LEXICON FOR LAB DOMAIN: SUBCATEGORIZATION FRAMES AND LEXICAL SEMANTIC CLASSES FOR NOUNS

  • University of Cambridge. Department of Theoretical and Applied Linguistics
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is a domain-specific English lexicon for the labour (LAB) domain. It contains both subcategorization frames for verbs and lexical semantic classes for nouns. The lexicon was created automatically with PANACEA webservices from crawled data. The data was obtained by crawling Web pages that were automatically detected to be in English and automatically classified as relevant to the LAB domain. Data collection took place in the summer of 2011.

DOI: https://doi.org/10.34810/data378
