Total results (including duplicates): 13
2 page(s) found
CORA.Repositori de Dades de Recerca
doi:10.34810/data266
Dataset. 2012

GRAF VERSION OF CATALAN PORTIONS OF WIKIPEDIA CORPUS

  • Universitat Politècnica de Catalunya. Research Group on Natural Language Processing
  • Gemma Boleda
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is the stand-off GrAF version of the Catalan portion of Wikipedia (based on a 2006 dump). The Wikipedia Catalan Corpus contains 122,052 articles totalling about 47.3 million words in raw text format. It was cleaned by deleting disambiguation pages, removing some XML tags, and homogenizing list closing tags. The corpus was then processed to add structural tagging (head, paragraph, sentence, list, etc.) and morphosyntactic information.

DOI: https://doi.org/10.34810/data266

CORA.Repositori de Dades de Recerca
doi:10.34810/data279
Dataset. 2011

ENGLISH-CATALAN LMF APERTIUM BILINGUAL DICTIONARY

  • Universitat d'Alacant. Grup Transducens
  • Breen, Paul
  • O'Regan, Jimmy
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is the LMF version of the Apertium bilingual dictionary for English and Catalan. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For each Apertium bilingual correspondence, the corresponding source and target monolingual entries (LexicalEntry) were generated in addition to the bilingual correspondence (SenseAxis) element. Apertium is a free/open-source machine translation platform, initially aimed at related-language pairs but recently expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides: a language-independent machine translation engine; tools to manage the linguistic data necessary to build a machine translation system for a given language pair; and linguistic data for a growing number of language pairs.
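The conversion described above (one source LexicalEntry, one target LexicalEntry, and one SenseAxis per bilingual correspondence) can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used to build the dataset: the sample entries, the identifier scheme, the `senses` attribute, and the helper names `side_to_entry` and `convert` are all assumptions, and only simple single-word entries are handled.

```python
import xml.etree.ElementTree as ET

# Tiny inline sample in Apertium bilingual .dix style.
# Illustrative data only; it is not taken from the dataset.
SAMPLE_DIX = """<dictionary>
  <section id="main" type="standard">
    <e><p><l>dog<s n="n"/></l><r>gos<s n="n"/><s n="m"/></r></p></e>
    <e><p><l>house<s n="n"/></l><r>casa<s n="n"/><s n="f"/></r></p></e>
  </section>
</dictionary>"""


def side_to_entry(side, entry_id):
    """Build a hypothetical LMF-style LexicalEntry from one <l> or <r> side."""
    lemma = (side.text or "").strip()  # simplification: ignores multiword blanks
    tags = [s.get("n") for s in side.findall("s")]
    entry = ET.Element("LexicalEntry", id=entry_id)
    if tags:
        # Assumption: the first symbol is treated as the part of speech.
        ET.SubElement(entry, "feat", att="partOfSpeech", val=tags[0])
    lemma_el = ET.SubElement(entry, "Lemma")
    ET.SubElement(lemma_el, "feat", att="writtenForm", val=lemma)
    ET.SubElement(entry, "Sense", id=entry_id + "_s1")
    return entry


def convert(dix_xml, src_lang="en", tgt_lang="ca"):
    """Turn each <p> pair into two LexicalEntry elements plus one SenseAxis."""
    root = ET.fromstring(dix_xml)
    resource = ET.Element("LexicalResource")
    src_lex = ET.SubElement(resource, "Lexicon")
    ET.SubElement(src_lex, "feat", att="language", val=src_lang)
    tgt_lex = ET.SubElement(resource, "Lexicon")
    ET.SubElement(tgt_lex, "feat", att="language", val=tgt_lang)
    axes = []
    for i, pair in enumerate(root.iter("p"), start=1):
        left, right = pair.find("l"), pair.find("r")
        if left is None or right is None:
            continue  # skip entries without both sides
        src_lex.append(side_to_entry(left, f"{src_lang}_{i}"))
        tgt_lex.append(side_to_entry(right, f"{tgt_lang}_{i}"))
        # Assumed linking scheme: the axis lists the two sense ids it connects.
        axes.append(ET.Element(
            "SenseAxis",
            id=f"axis_{i}",
            senses=f"{src_lang}_{i}_s1 {tgt_lang}_{i}_s1",
        ))
    resource.extend(axes)
    return resource


if __name__ == "__main__":
    print(ET.tostring(convert(SAMPLE_DIX), encoding="unicode"))
```

Running the script prints a single LexicalResource with two Lexicon elements and two SenseAxis links. The published files may organise features and identifiers differently, so the dataset itself should be consulted for the actual layout.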

DOI: https://doi.org/10.34810/data279

CORA.Repositori de Dades de Recerca
doi:10.34810/data280
Dataset. 2011

FRENCH-CATALAN LMF APERTIUM BILINGUAL DICTIONARY

  • Universitat d'Alacant. Grup Transducens
  • Eleka Ingenieritza Linguistikoa S.L
  • Prompsit Language Engineering, S.L
  • Jimmy O'Regan
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is the LMF version of the Apertium bilingual dictionary for French and Catalan. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For each Apertium bilingual correspondence, the corresponding source and target monolingual entries (LexicalEntry) were generated in addition to the bilingual correspondence (SenseAxis) element. Apertium is a free/open-source machine translation platform, initially aimed at related-language pairs but recently expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides: a language-independent machine translation engine; tools to manage the linguistic data necessary to build a machine translation system for a given language pair; and linguistic data for a growing number of language pairs.

DOI: https://doi.org/10.34810/data280

CORA.Repositori de Dades de Recerca
doi:10.34810/data285
Dataset. 2023

CATALAN LMF APERTIUM DICTIONARY

  • Universitat d'Alacant. Grup Transducens
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

DOI: https://doi.org/10.34810/data285

CORA.Repositori de Dades de Recerca
doi:10.34810/data288
Dataset. 2023

ITALIAN-CATALAN LMF APERTIUM BILINGUAL DICTIONARY

  • Toral, Antonio
  • Ginestí Rosell, Mireia
  • Tyers, Francis M.
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

DOI: https://doi.org/10.34810/data288

CORA.Repositori de Dades de Recerca
doi:10.34810/data289
Dataset. 2023

CATALAN LMF FREELING SENSE

  • Universitat Politècnica de Catalunya. TALP Research Center
  • Universitat d'Alacant. InterNostrum
  • Universitat de Barcelona. Centre de Llenguatge i Computació
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is the LMF version of the Catalan FreeLing Sense. FreeLing is a developer-oriented library providing language analysis services. It is designed to be used as an external library by any application requiring such services. A simple main program is also provided as a basic interface to the library, which lets the user analyze text files from the command line. The original Catalan and Spanish sense dictionaries are extracted from EuroWordNet, and the reduced subsets included in this FreeLing package are distributed under the GNU GPL license.

DOI: https://doi.org/10.34810/data289

CORA.Repositori de Dades de Recerca
doi:10.34810/data290
Dataset. 2023

OCCITAN-CATALAN LMF APERTIUM BILINGUAL DICTIONARY

  • Universitat d'Alacant. Grup Transducens
  • Prompsit Language Engineering, S.L
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is the LMF version of the Apertium bilingual dictionary for Occitan and Catalan. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For each Apertium bilingual correspondence, the corresponding source and target monolingual entries (LexicalEntry) were generated in addition to the bilingual correspondence (SenseAxis) element. Apertium is a free/open-source machine translation platform, initially aimed at related-language pairs but recently expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides: a language-independent machine translation engine; tools to manage the linguistic data necessary to build a machine translation system for a given language pair; and linguistic data for a growing number of language pairs.

DOI: https://doi.org/10.34810/data290

CORA.Repositori de Dades de Recerca
doi:10.34810/data297
Dataset. 2023

CATALAN LMF PAROLE/SIMPLE LEXICON

  • Institut d'Estudis Catalans
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

DOI: https://doi.org/10.34810/data297

CORA.Repositori de Dades de Recerca
doi:10.34810/data302
Dataset. 2012

LMF VERSION OF THE SENSEM CATALAN DATA BASE

  • Grup de Recerca Interuniversitari en Aplicacions Lingüístiques (GRIAL)
  • Fernandez Montraveta, Ana
  • Castellón, Irene
  • Vázquez, Glòria
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is the LMF version of the SenSem database created by the Spanish Inter-University Research Group GRIAL. As part of the SenSem project, a corpus of sentences annotated at the semantic and syntactic levels was created. The source corpus is made up of around 13 million words extracted from the online versions of a Spanish newspaper. From this corpus, 25,000 sentences were randomly selected, 100 for each of the 250 most frequent verbs in current Spanish. Each sentence has been labelled according to the verb sense it exemplifies, the type of complements it takes (arguments or adjuncts), and their syntactic category and function; each argument has also been labelled with a semantic role. Each sentence has further been annotated for its semantics, covering both aspectual information and the type of construction expressed. From this annotated corpus, a lexical database of verbs was created that collects all of the above information. The unit of description of the verbs is the sense. The description of each verb includes its argument structure, incorporating subcategorization patterns with their frequencies, semantic roles, and information about sentence semantics. The lexicon and the corpus are linked at the sense level, and together they form what the authors call the data bank of the sentential semantics of Spanish verbs. Both resources are available via the web and are intended as a source of linguistic information for natural language processing and for linguistic research in general. The LMF conversion was done by the Universitat Pompeu Fabra.

DOI: https://doi.org/10.34810/data302

CORA.Repositori de Dades de Recerca
doi:10.34810/data303
Dataset. 2023

PORTUGUESE-CATALAN LMF APERTIUM BILINGUAL DICTIONARY

  • Universitat Politècnica de Catalunya
  • Universitat d'Alacant. Grup Transducens
  • Carmen Armentano Oller
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)

DOI: https://doi.org/10.34810/data303
