Total results (including duplicates): 12
Found 2 page(s)
CORA.Repositori de Dades de Recerca
doi:10.34810/data169
Dataset. 2012

COLONOMICS: INTEGRATIVE OMICS DATA OF ONE HUNDRED PAIRED NORMAL-TUMORAL SAMPLES FROM COLON CANCER PATIENTS

  • Moreno Aguado, Víctor
  • Sanz Pamplona, Rebeca
  • Díez Villanueva, Anna
Colonomics (https://colonomics.org) is a multi-omics dataset comprising 250 samples: 100 paired tumor and adjacent normal samples from colon cancer patients (200 samples in total) and 50 samples from healthy colon mucosa donors. The data provided for these samples include genotyping, DNA methylation, gene expression, and microRNA (miRNA) expression. Copy number variation (CNV) data are also included for the tumor samples, and whole exome sequencing is available for a subset of 42 tumors. The data can be visualized in our browsers (https://colonomics.org/data-browser).

DOI: https://doi.org/10.34810/data169

CORA.Repositori de Dades de Recerca
doi:10.34810/data265
Dataset. 2012

IULA PENN TREEBANK

  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This treebank consists of Spanish and English sentences that have been manually annotated with syntactic information. The sentences were chosen from the Penn Treebank corpus, a resource containing texts from the Wall Street Journal originally compiled by the University of Pennsylvania. It contains 805 sentences that have been translated into Spanish by human translators. The original English and the translated Spanish sentences share the same identification number. Sentences in both languages have been processed using the DELPH-IN environment (http://www.delph-in.net/).

DOI: https://doi.org/10.34810/data265

CORA.Repositori de Dades de Recerca
doi:10.34810/data268
Dataset. 2012

IULA SPANISH-ENGLISH TECHNICAL CORPUS

  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
The corpus consists of specialized texts (Law, Economics, Medicine, Environment and Computer Science domains) available in both Spanish and English. This LSP (language for specific purposes) corpus has been compiled from articles in specialized publications, PhD theses, etc. It contains a total of about 2.1 million words in 127 documents in each language.

DOI: https://doi.org/10.34810/data268

CORA.Repositori de Dades de Recerca
doi:10.34810/data269
Dataset. 2012

ENGLISH-GALICIAN CLUVI DICTIONARY

  • Universidade de Vigo. Grupo de investigación TALG
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is the LMF version of the English-Galician CLUVI Dictionary developed under the direction of Xavier Gómez Guinovart (2005-2012) from parallel texts in the CLUVI Corpus of the University of Vigo.

DOI: https://doi.org/10.34810/data269

CORA.Repositori de Dades de Recerca
doi:10.34810/data287
Dataset. 2012

TERMOTECA

  • Universidade de Vigo. Grupo de investigación TALG
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This lexical resource is the LMF version of the Termoteca, a multilingual terminological database based on the monolingual and parallel specialized texts collected in the corpora of the University of Vigo, namely the CLUVI Corpus and the Galician Technical Corpus.

DOI: https://doi.org/10.34810/data287

CORA.Repositori de Dades de Recerca
doi:10.34810/data293
Dataset. 2012

ENGLISH-SPANISH LMF APERTIUM BILINGUAL DICTIONARY

  • Universitat d'Alacant. Grup Transducens
  • Universitat Politècnica de Catalunya
  • O'Regan, Jimmy
  • Breen, Paul
  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This is the LMF version of the Apertium bilingual dictionary for the English-Spanish language pair. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For each Apertium bilingual correspondence, the corresponding source and target monolingual entries (LexicalEntry) were generated in addition to the bilingual correspondence (SenseAxis) element. Apertium is a free/open-source machine translation platform, initially aimed at related-language pairs but later expanded to deal with more divergent language pairs (such as English-Catalan). The platform provides a language-independent machine translation engine; tools to manage the linguistic data necessary to build a machine translation system for a given language pair; and linguistic data for a growing number of language pairs.
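The mapping described above (one source entry, one target entry, and one linking element per bilingual correspondence) can be sketched roughly in Python. This is only an illustration, not the converter used to build the dataset: the function names, the identifier scheme (`src_1`, `tgt_1`, `axis_1`), and the exact attribute layout of the generated elements are assumptions. Only the element names LexicalEntry and SenseAxis come from the description. Multiword lemmas and other dix features are not handled.

```python
import xml.etree.ElementTree as ET


def _side_to_entry(side, entry_id):
    """Build a LexicalEntry from one side (<l> or <r>) of a dix pair.

    Assumption: the lemma is the text before the first <s> tag and the
    first <s n="..."/> tag is the part of speech.
    """
    lemma = (side.text or "").strip()
    tags = [s.get("n") for s in side.findall("s")]
    entry = ET.Element("LexicalEntry", id=entry_id)
    if tags:
        ET.SubElement(entry, "feat", att="partOfSpeech", val=tags[0])
    lemma_el = ET.SubElement(entry, "Lemma")
    ET.SubElement(lemma_el, "feat", att="writtenForm", val=lemma)
    # Hypothetical sense identifier, used only to link both sides.
    ET.SubElement(entry, "Sense", id=entry_id + "_s1")
    return entry


def convert_pair(e_xml, index):
    """Turn one bilingual <e> element into two entries and a SenseAxis."""
    e = ET.fromstring(e_xml)
    left, right = e.find("p/l"), e.find("p/r")
    if left is None or right is None:
        raise ValueError("expected an <e> element with <p><l/><r/></p>")
    source = _side_to_entry(left, f"src_{index}")
    target = _side_to_entry(right, f"tgt_{index}")
    axis = ET.Element(
        "SenseAxis",
        id=f"axis_{index}",
        senses=f"src_{index}_s1 tgt_{index}_s1",
    )
    return source, target, axis


if __name__ == "__main__":
    pair = '<e><p><l>house<s n="n"/></l><r>casa<s n="n"/><s n="f"/></r></p></e>'
    for element in convert_pair(pair, 1):
        print(ET.tostring(element, encoding="unicode"))
```

Keeping the two monolingual entries separate from the linking element mirrors the description: the same LexicalEntry could in principle take part in several correspondences.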

DOI: https://doi.org/10.34810/data293

CORA.Repositori de Dades de Recerca
doi:10.34810/data311
Dataset. 2012

APERTIUM MONOLINGUAL LEXICON TO LMF CONVERTER

  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
This tool generates the LMF version of Apertium monolingual lexicons. The script takes as input an expanded monolingual Apertium lexicon (generated using: lt-expand apertium.dix > apertium.expanded) and generates the corresponding LMF version. In the Apertium expanded lexicons, the first tag corresponds to the part of speech. The remaining tags (all enclosed in angle brackets) encode additional information depending on the lemma and PoS tag. Run "perl ApertiumMonolingual2LMF.pl --help" to get more information.
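To illustrate the tag layout described above, the following Python sketch splits one line of an expanded lexicon into surface form, lemma, part of speech, and remaining tags. It is not the code of ApertiumMonolingual2LMF.pl: the `surface:lemma<tag>...` line shape (with an optional `:>:` or `:<:` direction marker) is assumed from typical lt-expand output, and the names `LexiconEntry` and `parse_expanded_entry` are hypothetical.

```python
import re
from typing import NamedTuple, Tuple


class LexiconEntry(NamedTuple):
    surface: str
    lemma: str
    pos: str
    tags: Tuple[str, ...]


# Assumed line shape: surface:lemma<pos><tag>...  (optionally surface:>:... or surface:<:...)
_LINE = re.compile(r"^(?P<surface>[^:]*):(?:[<>]:)?(?P<analysis>.*)$")
_TAG = re.compile(r"<([^<>]+)>")


def parse_expanded_entry(line: str) -> LexiconEntry:
    """Split one expanded-lexicon line; the first tag is taken as the PoS."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    analysis = match.group("analysis")
    tags = _TAG.findall(analysis)
    if not tags:
        raise ValueError(f"no tags found in: {line!r}")
    lemma = analysis.split("<", 1)[0]
    return LexiconEntry(match.group("surface"), lemma, tags[0], tuple(tags[1:]))


if __name__ == "__main__":
    print(parse_expanded_entry("cats:cat<n><pl>"))
```

A real converter would also need to decide how to treat multiword entries and direction-restricted lines; the sketch simply ignores the direction marker.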

DOI: https://doi.org/10.34810/data311

CORA.Repositori de Dades de Recerca
doi:10.34810/data332
Dataset. 2012

PANACEA ENVIRONMENT BILINGUAL GLOSSARY EL-EN (GREEK-ENGLISH)

  • Dublin City University. School of Computing
This folder contains files for bilingual glossary creation from factored phrase tables that include part-of-speech tagged text for the EL-EN language pair. The tables are first filtered using part-of-speech tag sequences for each language, so that entries with unsuitable part-of-speech sequences are removed. Then, feature scores from the phrase table are combined in a log-linear model to score each entry. The user specifies how large the output glossary should be (relative to the input), and the lowest-ranking entries are discarded to produce a glossary of the desired size.
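The three steps above (part-of-speech filtering, log-linear scoring, and size-based pruning) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scripts shipped in the folder: the `PhraseEntry` structure, the tag names, the weights, and the reading of "relative to the input" as a fraction of the unfiltered table are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Set, Tuple


@dataclass(frozen=True)
class PhraseEntry:
    source: str
    target: str
    source_pos: Tuple[str, ...]
    target_pos: Tuple[str, ...]
    features: Tuple[float, ...]  # positive scores, e.g. translation probabilities


def log_linear_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of log feature values (a log-linear combination)."""
    return sum(w * math.log(f) for w, f in zip(weights, features))


def build_glossary(
    entries: Iterable[PhraseEntry],
    allowed_source: Set[Tuple[str, ...]],
    allowed_target: Set[Tuple[str, ...]],
    weights: Sequence[float],
    keep_ratio: float,
) -> List[PhraseEntry]:
    entries = list(entries)
    # Step 1: drop entries whose tag sequence is not allowed on either side.
    filtered = [
        e for e in entries
        if e.source_pos in allowed_source and e.target_pos in allowed_target
    ]
    # Step 2: rank the remaining entries by their log-linear score.
    ranked = sorted(
        filtered,
        key=lambda e: log_linear_score(e.features, weights),
        reverse=True,
    )
    # Step 3: keep the top entries; size is a fraction of the input table (assumption).
    keep = int(len(entries) * keep_ratio)
    return ranked[:keep]
```

Setting `keep_ratio` to 0.5 on a four-entry table, for example, keeps at most the two best-scoring entries that pass the tag filter.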

DOI: https://doi.org/10.34810/data332

CORA.Repositori de Dades de Recerca
doi:10.34810/data338
Dataset. 2012

PANACEA ENGLISH GOLD STANDARD FOR LEXICAL SEMANTIC CLASSIFICATION

  • Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
We present a set of English gold standards for different noun classes created in PANACEA to train and test automatic classifiers. To create these gold standards we used the data from the SemEval 2007 workshop Task 07: Coarse Grained English All-Words (Navigli et al., 2007). The words used in this task were first tagged automatically with a clustering method (Navigli, 2006) using senses based on the WordNet sense inventory and later manually validated by expert lexicographers. For our experiments, we extracted from this inventory all of the words whose first sense corresponded to one of the lexical semantic classes, e.g. “people” in the case of the class HUMAN. These gold standards were created in the context of PANACEA (http://www.panacea-lr.eu), an EU-FP7 Funded Project under Grant Agreement 248064.
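The first-sense selection rule described above can be sketched in a few lines of Python. The inventory layout (a mapping from each word to its ordered list of sense labels) and the function name are assumptions made for illustration; the actual SemEval data format is not reproduced here.

```python
from typing import Dict, List, Set


def select_gold_standard(
    inventory: Dict[str, List[str]],
    class_senses: Set[str],
) -> List[str]:
    """Return the words whose first listed sense belongs to the target class."""
    return sorted(
        word
        for word, senses in inventory.items()
        if senses and senses[0] in class_senses
    )


if __name__ == "__main__":
    # Toy inventory with hypothetical sense labels.
    toy_inventory = {
        "teacher": ["people", "profession"],
        "bank": ["institution", "people"],
        "crowd": ["people"],
    }
    print(select_gold_standard(toy_inventory, {"people"}))
```

Only the first sense is consulted, so a word such as "bank" in the toy inventory is not selected for HUMAN even though a later sense matches.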

DOI: https://doi.org/10.34810/data338

CORA.Repositori de Dades de Recerca
doi:10.34810/data341
Dataset. 2012

PANACEA LABOUR LEGISLATION CORPUS N-GRAMS EN (ENGLISH)

  • Dublin City University. School of Computing
This data set contains English word n-grams and English word/tag/lemma n-grams in the "Labour Legislation" (LAB) domain. N-grams are accompanied by their observed frequency counts. The length of the n-grams ranges from unigrams (single words) to five-grams. The data were collected in the context of PANACEA (http://www.panacea-lr.eu), an EU-FP7 Funded Project under Grant Agreement 248064. The n-gram counts were generated from crawled Web pages that were automatically detected to be in English and automatically classified as relevant to the LAB domain. The LAB domain collection used consisted of approximately 46.4 million tokens. Data collection took place in the summer of 2011.
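As a rough illustration of what such a resource contains, the Python sketch below counts n-grams of length one to five over tokenized sentences. It is not the PANACEA pipeline: the function name, the assumption that counting does not cross sentence boundaries, and the input format are all hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Sequence


def count_ngrams(
    sentences: Iterable[Sequence[Hashable]],
    max_n: int = 5,
) -> Dict[int, Counter]:
    """Count n-grams of length 1..max_n, never crossing sentence boundaries."""
    counts: Dict[int, Counter] = {n: Counter() for n in range(1, max_n + 1)}
    for tokens in sentences:
        for n in range(1, max_n + 1):
            # A sentence with fewer than n tokens contributes no n-grams.
            for start in range(len(tokens) - n + 1):
                counts[n][tuple(tokens[start:start + n])] += 1
    return counts


if __name__ == "__main__":
    corpus = [
        ["the", "employer", "shall", "pay"],
        ["the", "employer", "may", "pay"],
    ]
    for n, counter in count_ngrams(corpus).items():
        print(n, counter.most_common(3))
```

Because tokens only need to be hashable, the same function also works on (word, tag, lemma) triples, which corresponds to the word/tag/lemma n-grams mentioned in the description.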

DOI: https://doi.org/10.34810/data341
