Search

Total results (including duplicates): 13
2 page(s) found

CORA.Repositori de Dades de Recerca

doi:10.34810/data266

Dataset. 2012

GRAF VERSION OF CATALAN PORTIONS OF WIKIPEDIA CORPUS

Universitat Politècnica de Catalunya. Research Group on Natural Language Processing
Gemma Boleda
Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

This is the stand-off GrAF version of the Catalan portions of Wikipedia (based on a 2006 dump). This Wikipedia Catalan Corpus contains 122,052 articles comprising about 47.3 million words in raw text format. It has been cleaned by erasing disambiguation pages, removing some XML tags and homogenizing list end tags. The corpus has then been processed to add structural tagging (head, paragraph, sentence, list, etc.) and morphosyntactic information.

Project: //

DOI: https://doi.org/10.34810/data266

CORA.Repositori de Dades de Recerca

doi:10.34810/data279

Dataset. 2011

ENGLISH-CATALAN LMF APERTIUM BILINGUAL DICTIONARY

Universitat d'Alacant. Grup Transducens
Breen, Paul
O'Regan, Jimmy
Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

This is the LMF version of the Apertium bilingual dictionary for English and Catalan. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For each Apertium bilingual correspondence, the corresponding source and target monolingual entries (LexicalEntry) were generated in addition to the bilingual correspondence (SenseAxis) element. Apertium is a free/open-source machine translation platform, initially aimed at related-language pairs but recently expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides: a language-independent machine translation engine; tools to manage the linguistic data necessary to build a machine translation system for a given language pair; and linguistic data for a growing number of language pairs.
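
The conversion described above (one LexicalEntry per source and target lemma, plus one SenseAxis per bilingual correspondence) might be sketched roughly as follows. This is a minimal illustration, not the procedure actually used to build the dataset: the function names `feat` and `build_lmf`, the identifier scheme, the sample word pairs, and the exact element layout are all assumptions, loosely following the LMF convention of `feat` elements with `att`/`val` attributes.

```python
import xml.etree.ElementTree as ET


def feat(parent, att, val):
    """Attach an LMF-style <feat att=... val=.../> child (hypothetical helper)."""
    return ET.SubElement(parent, "feat", att=att, val=val)


def build_lmf(pairs, src_lang="en", tgt_lang="ca"):
    """Build a LexicalResource element from (source_lemma, target_lemma) pairs.

    The layout here is an assumption for illustration, not the dataset's schema.
    """
    root = ET.Element("LexicalResource")
    lexicons = {}
    for lang in (src_lang, tgt_lang):
        lexicon = ET.SubElement(root, "Lexicon")
        feat(lexicon, "language", lang)
        lexicons[lang] = lexicon

    sense_ids = {}  # (lang, lemma) -> sense id, so each lemma gets one entry

    def entry_for(lang, lemma):
        key = (lang, lemma)
        if key not in sense_ids:
            n = len(sense_ids) + 1
            entry = ET.SubElement(lexicons[lang], "LexicalEntry", id=f"le_{lang}_{n}")
            feat(ET.SubElement(entry, "Lemma"), "writtenForm", lemma)
            sense_ids[key] = f"s_{lang}_{n}"
            ET.SubElement(entry, "Sense", id=sense_ids[key])
        return sense_ids[key]

    # One SenseAxis per bilingual correspondence, linking the two senses.
    for i, (src, tgt) in enumerate(pairs, start=1):
        src_sense = entry_for(src_lang, src)
        tgt_sense = entry_for(tgt_lang, tgt)
        ET.SubElement(root, "SenseAxis", id=f"sa_{i}", senses=f"{src_sense} {tgt_sense}")
    return root


if __name__ == "__main__":
    resource = build_lmf([("house", "casa"), ("dog", "gos"), ("house", "llar")])
    print(ET.tostring(resource, encoding="unicode"))
```

In this sketch a lemma that appears in several correspondences (here "house") yields a single monolingual entry that is referenced by more than one SenseAxis; whether the published dictionaries share entries this way is not stated in the record.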

Project: //

DOI: https://doi.org/10.34810/data279

CORA.Repositori de Dades de Recerca

doi:10.34810/data280

Dataset. 2011

FRENCH-CATALAN LMF APERTIUM BILINGUAL DICTIONARY

Universitat d'Alacant. Grup Transducens
Eleka Ingenieritza Linguistikoa S.L
Prompsit Language Engineering, S.L
Jimmy O'Regan
Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

This is the LMF version of the Apertium bilingual dictionary for French and Catalan. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For each Apertium bilingual correspondence, the corresponding source and target monolingual entries (LexicalEntry) were generated in addition to the bilingual correspondence (SenseAxis) element. Apertium is a free/open-source machine translation platform, initially aimed at related-language pairs but recently expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides: a language-independent machine translation engine; tools to manage the linguistic data necessary to build a machine translation system for a given language pair; and linguistic data for a growing number of language pairs.

Project: //

DOI: https://doi.org/10.34810/data280

CORA.Repositori de Dades de Recerca

doi:10.34810/data285

Dataset. 2023

CATALAN LMF APERTIUM DICTIONARY

Universitat d'Alacant. Grup Transducens
Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

Project: //

DOI: https://doi.org/10.34810/data285

CORA.Repositori de Dades de Recerca

doi:10.34810/data288

Dataset. 2023

ITALIAN-CATALAN LMF APERTIUM BILINGUAL DICTIONARY

Toral, Antonio
Ginestí Rosell, Mireia
Tyers, Francis M.
Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

Project: //

DOI: https://doi.org/10.34810/data288

CORA.Repositori de Dades de Recerca

doi:10.34810/data289

Dataset. 2023

CATALAN LMF FREELING SENSE

Universitat Politècnica de Catalunya. TALP Research Center
Universitat d'Alacant. InterNostrum
Universitat de Barcelona. Centre de Llenguatge i Computació
Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

This is the LMF version of the Catalan FreeLing Sense dictionary. FreeLing is a developer-oriented library providing language analysis services. It is designed to be used as an external library by any application requiring these kinds of services. Nevertheless, a simple main program is also provided as a basic interface to the library, which enables the user to analyze text files from the command line. The original Catalan and Spanish sense dictionaries are extracted from EuroWordNet, and the reduced subsets included in this FreeLing package are distributed under the GNU GPL license.

Project: //

DOI: https://doi.org/10.34810/data289

CORA.Repositori de Dades de Recerca

doi:10.34810/data290

Dataset. 2023

OCCITAN-CATALAN LMF APERTIUM BILINGUAL DICTIONARY

Universitat d'Alacant. Grup Transducens
Prompsit Language Engineering, S.L
Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

This is the LMF version of the Apertium bilingual dictionary for Occitan and Catalan. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For each Apertium bilingual correspondence, the corresponding source and target monolingual entries (LexicalEntry) were generated in addition to the bilingual correspondence (SenseAxis) element. Apertium is a free/open-source machine translation platform, initially aimed at related-language pairs but recently expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides: a language-independent machine translation engine; tools to manage the linguistic data necessary to build a machine translation system for a given language pair; and linguistic data for a growing number of language pairs.

Project: //

DOI: https://doi.org/10.34810/data290

CORA.Repositori de Dades de Recerca

doi:10.34810/data297

Dataset. 2023

CATALAN LMF PAROLE/SIMPLE LEXICON

Institut d'Estudis Catalans
Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

Project: //

DOI: https://doi.org/10.34810/data297

CORA.Repositori de Dades de Recerca

doi:10.34810/data302

Dataset. 2012

LMF VERSION OF THE SENSEM CATALAN DATA BASE

Grup de Recerca Interuniversitari en Aplicacions Lingüístiques (GRIAL)
Fernandez Montraveta, Ana
Castellón, Irene
Vázquez, Glòria
Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

This is the LMF version of the SenSem database created by the Spanish Inter-University Research Group GRIAL. As part of the SenSem project, a corpus of sentences annotated at the semantic and syntactic levels was created. The source corpus is made up of around 13 million words extracted from the online versions of a Spanish newspaper. From this corpus, 25,000 sentences have been randomly selected, 100 for each of the 250 most frequent verbs in current Spanish. Each sentence has been labeled according to the verb sense it exemplifies, the type of complements it takes (arguments or adjuncts), and their syntactic category and function; finally, each argument has been labeled with a semantic role. Each sentence has also been annotated for its semantics, both in terms of aspectual information and the type of construction being expressed. From this annotated corpus a lexical database of verbs was created, in which all the previous information is collected. The unit of description of the verbs is the sense. The description of each verb includes argument structure, incorporating subcategorization patterns with their frequencies, semantic roles and information regarding sentence semantics. The lexicon and the corpus are associated at the sense level and together make up what we call the data bank of the sentential semantics of Spanish verbs. Both resources are available via the web and will form a very important source of linguistic information, which we hope will be useful in different areas of natural language processing and in linguistic research in general. The LMF conversion has been done by the Universitat Pompeu Fabra.

Project: //

DOI: https://doi.org/10.34810/data302
