Dataset.

Dataset for "Correlated genetic effects on reproduction define a domestication syndrome in a forest tree"

citaREA. Repositorio Institucional del CITA
oai:citarea.cita-aragon.es:10532/2800
  • Santos del Blanco, Luis
  • Alía Miranda, Ricardo
  • Notivol Paíno, Eduardo
  • González Martínez, Santiago C.
  • Sampedro, L.
  • Lario, F.
  • Climent, José
Data to be published in the article.
 
DOI: http://hdl.handle.net/10532/2800

HANDLE: http://hdl.handle.net/10532/2800
 
View at: http://hdl.handle.net/10532/2800

Biblos-e Archivo. Repositorio Institucional de la UAM
oai:repositorio.uam.es:10486/712941
Scientific article (article). 2023

A DATASET FOR BENCHMARKING NEOTROPICAL ANURAN CALLS IDENTIFICATION IN PASSIVE ACOUSTIC MONITORING

  • Cañas, Juan Sebastián
  • Toro-Gómez, María Paula
  • Sugai, Larissa Sayuri Moreira
  • Benítez Restrepo, Hernán Darío
  • Rudas, Jorge
  • Posso Bautista, Breyner
  • Toledo, Luís Felipe
  • Dena, Simone
  • Domingos, Adão Henrique Rosa
  • de Souza, Franco Leandro
  • Neckel-Oliveira, Selvino
  • da Rosa, Anderson
  • Carvalho-Rocha, Vítor
  • Bernardy, José Vinícius
  • Sugai, José Luiz Massao Moreira
  • dos Santos, Carolina Emília
  • Bastos, Rogério Pereira
  • Llusia Genique, Diego
  • Ulloa, Juan Sebastian
Global change is predicted to induce shifts in anuran acoustic behavior, which can be studied through passive acoustic monitoring (PAM). Understanding changes in calling behavior requires automatic identification of anuran species, which is challenging due to the particular characteristics of neotropical soundscapes. In this paper, we introduce a large-scale multi-species dataset of anuran amphibian calls recorded by PAM that comprises 27 hours of expert annotations for 42 different species from two Brazilian biomes. We provide open access to the dataset, including the raw recordings, experimental setup code, and a benchmark with a baseline model of the fine-grained categorization problem. Additionally, we highlight the challenges of the dataset to encourage machine learning researchers to solve the problem of anuran call identification in support of conservation policy. All our experiments and resources have been made available at https://soundclim.github.io/anuraweb/. The authors acknowledge financial support from the intergovernmental Group on Earth Observations (GEO) and Microsoft, under the GEO-Microsoft Planetary Computer Programme (October 2021); São Paulo Research Foundation (FAPESP #2016/25358–3; #2019/18335–5); the National Council for Scientific and Technological Development (CNPq #302834/2020–6; #312338/2021–0, #307599/2021–3); National Institutes for Science and Technology (INCT) in Ecology, Evolution, and Biodiversity Conservation, supported by MCTIC/CNPq (proc. 465610/2014–5), FAPEG (proc. 201810267000023); CNPQ/MCTI/CONFAP-FAPS/PELD No 21/2020 (FAPESC 2021TR386); Comunidad de Madrid (2020-T1/AMB-20636, Atracción de Talento Investigador, Spain) and research projects funded by the European Commission (EAVESTROP–661408, Global Marie S. Curie fellowship, program H2020, EU); and the Ministerio de Economía, Industria y Competitividad (CGL2017–88764-R, MINECO/AEI/FEDER, Spain).
We also thank Tom Denton for machine learning evaluation suggestions, dataset revision, and comments on the manuscript.





CORA.Repositori de Dades de Recerca
doi:10.34810/data863
Dataset. 2023

ERC ARTSOUNDSCAPES PROJECT – DATASET FOR "ARTE RUPESTRE: UNA PERSPECTIVA DESDE LOS PAISAJES SONOROS"

  • Moreno Iglesias, Diego
  • Álvarez Morales, Lidia
  • Santos da Rosa, Neemias
  • Díaz-Andreu, Margarita
Data collected in the framework of the ERC project "The sound of special places: exploring rock art soundscapes and the sacred" (acronym: Artsoundscapes, Grant Agreement No. 787842). The project focuses on the relationship between archaeoacoustics and rock art. The attached files comprise 1) the audio files of the impulse responses collected at the rock art sites of Sandra shelter (Kamberg, South Africa), and Bamboo Hollow III and Wilcox shelter (Giant’s Castle, South Africa), all of them in the Drakensberg mountain range, 2) the corresponding ambient sounds recorded at each of the three sites, 3) the resulting audio tracks when the ambient sound is reproduced in the immersive audio laboratory of the Faculty of Psychology, and 4) a Matlab script for extracting the psychoacoustic parameters presented in the publication.




idUS. Depósito de Investigación de la Universidad de Sevilla
oai:idus.us.es:11441/153620
Dataset. 2024

DATASET FOR ARTICLE RELATION BETWEEN HLB NUMBER AND PREDOMINANT DESTABILIZATION PROCESS FOR MICROFLUIDIZED NANOEMULSIONS FORMULATED WITH LEMON ESSENTIAL OIL [DATASET]

  • Santos García, Jenifer
  • Alfaro Rodríguez, María del Carmen
  • Vega, Lilliam
  • Muñoz García, José
Lemon essential oil (LEO) is associated with a multitude of health benefits due to its anticancer, antioxidant, antiviral, anti-inflammatory and bactericidal properties. Its drawback is that it is very sensitive to oxidation by heat. For this reason, researchers are increasingly investigating the use of LEO in nanoemulsions. In this work, we used laser diffraction, rheology and multiple light scattering techniques to study the effects of different HLB numbers (indicating different mixtures of Tween 80 and Span 20) on the physical stability of nanoemulsions formulated with LEO. We found that different HLB numbers induced different destabilization mechanisms in these emulsions. An HLB number lower than 12 resulted in Ostwald ripening; an HLB number higher than 12 resulted in coalescence. In addition, all the developed nanoemulsions exhibited Newtonian behavior, which could favor creaming. All emulsions exhibited not only growth in droplet size but also creaming with aging time. These findings highlight the importance of selecting the right surfactant to stabilize nanoemulsions, with potential applications in the food industry.
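
The abstract above ties each HLB number to a particular Tween 80 / Span 20 mixture. As a rough illustration, the HLB of a two-surfactant blend is usually estimated as the mass-fraction-weighted average of the individual HLB values. The sketch below assumes the commonly cited values of 15.0 for Tween 80 and 8.6 for Span 20; these figures and the function names are assumptions for illustration and are not taken from the dataset record.

```python
# Commonly cited HLB values (assumed here; not stated in the dataset record).
HLB_TWEEN_80 = 15.0
HLB_SPAN_20 = 8.6


def blend_hlb(tween_fraction: float) -> float:
    """Estimate the HLB of a Tween 80 / Span 20 blend.

    tween_fraction is the mass fraction of Tween 80 in the surfactant
    mixture; the remainder is Span 20.
    """
    if not 0.0 <= tween_fraction <= 1.0:
        raise ValueError("tween_fraction must lie in [0, 1]")
    return tween_fraction * HLB_TWEEN_80 + (1.0 - tween_fraction) * HLB_SPAN_20


def tween_fraction_for(target_hlb: float) -> float:
    """Invert blend_hlb: mass fraction of Tween 80 needed for target_hlb."""
    if not HLB_SPAN_20 <= target_hlb <= HLB_TWEEN_80:
        raise ValueError("target HLB is outside the range reachable by the blend")
    return (target_hlb - HLB_SPAN_20) / (HLB_TWEEN_80 - HLB_SPAN_20)


if __name__ == "__main__":
    # Hypothetical target values on either side of the HLB 12 threshold.
    for hlb in (10.0, 12.0, 14.0):
        print(f"HLB {hlb}: Tween 80 mass fraction = {tween_fraction_for(hlb):.3f}")
```

Under these assumed values, the HLB 12 threshold mentioned in the abstract would correspond to a blend of slightly more than half Tween 80 by mass.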




idUS. Depósito de Investigación de la Universidad de Sevilla
oai:idus.us.es:11441/162526
Dataset. 2024

DATASET FOR THE ARTICLE MODELING COPPER LEACHING FROM NON-PULVERIZED PRINTED CIRCUIT BOARDS AT HIGH CONCENTRATIONS OF BIOREGENERATED FERRIC SULFATE

  • Ramírez del Amo, Pablo
  • Iglesias González, María Nieves
  • Dorado Castaño, Antonio D.
Experimental data on copper leaching kinetics from non-pulverized printed circuit boards (PCBs) are compiled. They consist of the leached copper values as a function of time for the different tests performed. The copper concentrations were measured by atomic absorption spectrophotometry. Scanning electron microscope photomicrographs and energy dispersive X-ray spectroscopy data of a PCB sample are also included.




idUS. Depósito de Investigación de la Universidad de Sevilla
oai:idus.us.es:11441/153617
Dataset. 2022

DATASET FOR ARTICLE EFFECT OF A CHANGE IN THE CACL2/PECTIN MASS RATIO ON THE PARTICLE SIZE, RHEOLOGY AND PHYSICAL STABILITY OF LEMON ESSENTIAL OIL/W EMULGELS

  • Muñoz García, José
  • Prieto Vargas, Paula
  • García González, María del Carmen
  • Alfaro Rodríguez, María del Carmen
A three-step (rotor-stator, microfluidization, rotor-stator) protocol was used to prepare 15% lemon essential oil in water emulgels using a mixture of Tween 80 and Span 20 surfactants as low molecular mass emulsifiers and 0.4% low-methoxyl citrus peel pectin as a gelling agent. Ca2+ was used as a gel-promoting agent. CaCl2/pectin mass ratio values from 0.3 to 0.7 were used. Emulgels showed a microstructure consisting of oil droplets embedded in a sheared gel matrix, as demonstrated by bright field optical microscopy. Laser diffraction tests showed multimodal particle size distributions due to the coexistence of oil droplets and gel-like particles. Multiple light scattering tests revealed that the emulgels remained physically stable for longer as the CaCl2/pectin mass ratio decreased and that different destabilization mechanisms took place. Thus, incipient syneresis became more important with increasing CaCl2 concentration, but a parallel creaming mechanism was detected for CaCl2/pectin mass ratio values above 0.5. Dynamic viscoelastic and steady shear flow properties of the emulgels with the lowest and highest CaCl2/pectin mass ratio values were compared as a function of aging time. The lowest ratio yielded an emulgel with enhanced connectivity among fluid units, as indicated by its wider linear viscoelastic region, higher storage modulus, loss modulus and viscosity values, and more pronounced shear thinning than the emulgel formulated with the highest CaCl2/pectin mass ratio. The evolution of the dynamic viscoelastic properties with aging time was consistent with the information provided by monitoring scans of backscattering as a function of sample height.




Addi. Archivo Digital para la Docencia y la Investigación
oai:addi.ehu.eus:10810/71313
Conference contribution (ConferenceOutput). 2021

HSI-DRIVE: A DATASET FOR THE RESEARCH OF HYPERSPECTRAL IMAGE PROCESSING APPLIED TO AUTONOMOUS DRIVING SYSTEMS

  • Basterrechea Oyarzabal, Koldobika
  • Martínez González, María Victoria
  • Echanove Arias, Francisco Javier
  • Gutiérrez Zaballa, Jon
  • Del Campo Hagelstrom, Inés Juliana
We present a structured dataset for the research and development of automated driving systems (ADS) supported by hyperspectral imaging (HSI). The dataset contains per-pixel manually annotated images selected from videos recorded in real driving conditions that have been organized according to four environment parameters: season, daytime, road type, and weather conditions. The aim is to provide high data diversity and facilitate the automatic generation of data subsets for the evaluation of machine learning (ML) techniques applied to the research of ADS in different driving scenarios and environmental conditions. The video sequences have been captured with a small-size 25-band VNIR (Visible-NearInfraRed) snapshot hyperspectral camera mounted on a moving automobile. The current selection of classes for image annotation aims to provide reliable data for the spectral analysis of the items in the scenes; it is thus based on material surface reflectance patterns (spectral signatures). It is foreseen that future versions of the dataset will also incorporate alternative dense semantic labeling of the annotated images. The first version of the dataset, named HSI-Drive v1.0, is publicly available for download. This work was supported by the Basque Government under grant PIBA-2018-1-0054 and partially supported by the University of the Basque Country under grant GIU 18/122.




Addi. Archivo Digital para la Docencia y la Investigación
oai:addi.ehu.eus:10810/56730
Scientific article (JournalArticle). 2022

NOVEL PIXELWISE CO-REGISTERED HEMATOXYLIN-EOSIN AND MULTIPHOTON MICROSCOPY IMAGE DATASET FOR HUMAN COLON LESION DIAGNOSIS

  • Picón Ruiz, Artzai
  • Terradillos Fernández, Elena
  • Sánchez Peralta, Luisa F.
  • Mattana, Sara
  • Cicchi, Riccardo
  • Blover, Benjamin J.
  • Arbide del Río, Nagore
  • Velasco Arteche, Jacques
  • Etxezarraga Zuluaga, María Carmen
  • Pavone, Francesco S.
  • Garrote Contreras, Estíbaliz
  • López Saratxaga, Cristina
Colorectal cancer has one of the highest incidences of any cancer worldwide. Colonoscopy relies on histopathology analysis of hematoxylin-eosin (H&E) images of the removed tissue. Novel techniques such as multi-photon microscopy (MPM) show promising results for performing real-time optical biopsies. However, clinicians are not used to this imaging modality, and the correlation between MPM and H&E information is not clear. The objective of this paper is to describe and make publicly available an extensive dataset of fully co-registered H&E and MPM images that allows the research community to analyze the relationship between MPM and H&E histopathological images and the effect of the semantic gap that prevents clinicians from correctly diagnosing MPM images. The dataset provides fully scanned tissue images at 10x optical resolution (0.5 µm/px) from 50 samples of lesions obtained by colonoscopies and colectomies. The diagnostic capabilities of TPF and H&E images were compared. Additionally, TPF tiles were virtually stained into H&E images by means of a deep-learning model. A panel of 5 expert pathologists classified the different modalities into three classes (healthy, adenoma/hyperplastic, and adenocarcinoma). Results showed that the performance of the pathologists on MPM images was 65% of the H&E performance, while the virtual staining method achieved 90%. MPM imaging can provide appropriate information for diagnosing colorectal cancer without the need for H&E staining. However, the existing semantic gap among modalities needs to be corrected. This work was supported by the PICCOLO project. This project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under grant agreement No. 732111. The sole responsibility of this publication lies with the authors. The European Union is not responsible for any use that may be made of the information contained therein.

Project: EC/H2020/732111



RUC. Repositorio da Universidade da Coruña
oai:ruc.udc.es:2183/25502
Scientific article (JournalArticle). 2020

A PUBLIC DOMAIN DATASET FOR REAL-LIFE HUMAN ACTIVITY RECOGNITION USING SMARTPHONE SENSORS

  • García-González, Daniel
  • Rivero, Daniel
  • Fernández-Blanco, Enrique
  • Rodríguez Luaces, Miguel
[Abstract] In recent years, human activity recognition has become a hot topic in the scientific community. It is under the spotlight because of its direct application in multiple domains, such as healthcare or fitness. Additionally, the current worldwide use of smartphones makes it particularly easy to get this kind of data from people in a non-intrusive and cheaper way, without the need for other wearables. In this paper, we introduce our orientation-independent, placement-independent and subject-independent human activity recognition dataset. The dataset contains measurements from the accelerometer, gyroscope, magnetometer, and GPS of the smartphone. Additionally, each measurement is associated with one of the four possible registered activities: inactive, active, walking and driving. This work also proposes a support vector machine (SVM) model to perform some preliminary experiments on the dataset. Considering that this dataset was taken from smartphones in their actual use, unlike other datasets, the development of a good model on such data is an open problem and a challenge for researchers. Solving it would close the gap between the model and a real-life application. This research was partially funded by Xunta de Galicia/FEDER-UE (ConectaPeme, GEMA: IN852A 2018/14), MINECO-AEI/FEDER-UE (Flatcity: TIN2016-77158-C4-3-R) and Xunta de Galicia/FEDER-UE (AXUDAS PARA A CONSOLIDACION E ESTRUTURACION DE UNIDADES DE INVESTIGACION COMPETITIVAS.GRC: ED431C 2017/58 and ED431C 2018/49). Xunta de Galicia; IN852A 2018/14, Xunta de Galicia; ED431C 2017/58, Xunta de Galicia; ED431C 2018/49




RUC. Repositorio da Universidade da Coruña
oai:ruc.udc.es:2183/26190
Scientific article (JournalArticle). 2019

POPULATION SUBSET SELECTION FOR THE USE OF A VALIDATION DATASET FOR OVERFITTING CONTROL IN GENETIC PROGRAMMING

  • Rivero, Daniel
  • Fernández-Blanco, Enrique
  • Fernández-Lozano, Carlos
  • Pazos, A.
[Abstract] Genetic Programming (GP) is a technique that can solve different problems through the evolution of mathematical expressions. However, its tendency to overfit the data is one of the main obstacles to applying it. The use of a validation dataset is a common way to prevent overfitting in many Machine Learning (ML) techniques, including GP. However, one key point differentiates GP from other ML techniques: instead of training a single model, GP evolves a population of models. Therefore, the validation dataset can be used in several ways, because any of those evolved models could be evaluated on it. This work explores the possibility of using the validation dataset not only on the training-best individual but also on a subset containing the training-best individuals of the population. The study has been conducted with 5 well-known databases on regression or classification tasks. In most cases, the results of the study point to an improvement when the validation dataset is used on a subset of the population instead of only on the training-best individual, which also brings a reduction in the number of nodes and, consequently, lower complexity of the expressions. Xunta de Galicia; ED431G/01, Xunta de Galicia; ED431D 2017/16, Xunta de Galicia; ED431C 2018/49, Xunta de Galicia; ED431D 2017/23, Instituto de Salud Carlos III; PI17/01826
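
The selection idea in the abstract above can be sketched in a few lines: rank the population by training error, keep the best few individuals, and return the one with the lowest validation error. This is only an illustration under stated assumptions, not the authors' implementation. Plain Python functions stand in for evolved GP expressions, and the names `mse`, `select_with_validation` and the toy data are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[float], float]
Data = Tuple[Sequence[float], Sequence[float]]


def mse(model: Model, data: Data) -> float:
    """Mean squared error of a model on (inputs, targets)."""
    xs, ys = data
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_with_validation(
    population: List[Model], train: Data, valid: Data, subset_size: int
) -> Model:
    """Keep the subset_size training-best models, then pick the validation-best."""
    ranked = sorted(population, key=lambda m: mse(m, train))
    subset = ranked[:subset_size]
    return min(subset, key=lambda m: mse(m, valid))


# Hypothetical stand-ins for evolved expressions.
def linear(x: float) -> float:
    return x


def quadratic(x: float) -> float:
    # Passes through the noisy training points, so it overfits them.
    return 0.1 * x * x + 0.9 * x


# Toy data: the underlying relation is y = x, with noise on one training point.
train_data: Data = ([0.0, 1.0, 2.0], [0.0, 1.0, 2.2])
valid_data: Data = ([3.0, 4.0], [3.0, 4.0])
population: List[Model] = [linear, quadratic]

# Validating only the training-best individual keeps the overfitted model.
only_best = select_with_validation(population, train_data, valid_data, subset_size=1)
# Validating a subset of training-best individuals recovers the simpler model.
from_subset = select_with_validation(population, train_data, valid_data, subset_size=2)
```

In this toy case the training-best individual is the overfitted quadratic, while validating a subset of two selects the linear model, which is the behavior the abstract attributes to subset-based validation.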




RUC. Repositorio da Universidade da Coruña
oai:ruc.udc.es:2183/34142
Scientific article (JournalArticle). 2023

DSUALMH-A NEW HIGH-RESOLUTION DATASET FOR NILM

  • Rodríguez-Navarro, C.
  • Alcayde, A.
  • Isanbaev, V.
  • Castro-Santos, Laura
  • Filgueira-Vizoso, Almudena
  • Montoya, F.G.
[Abstract]: The optimisation of energy consumption requires reasonably accurate measurement, so an appropriate and advanced monitoring system for the relevant electrical variables in electrical installations is of paramount importance. In this context, interoperable and highly configurable devices play a crucial role. A clear example is the OpenZMeter (OZM), an open-source, open-hardware, multi-purpose precision smart meter that can measure a wide range of electrical variables at a high sampling rate and provide processed data on power quality. The aim of this work is to show the use and possible applications of the new high sampling frequency data provided by the OZM device, which are much richer and more accurate than those obtained with other low-cost electrical meters. For this purpose, the open-source tool NILMTK has been used and adapted. Two of the best known and most widely used algorithms, Combinatorial Optimisation (CO) and the Factorial Hidden Markov Model (FHMM), have been considered. The results of the experimental study are analysed, and the performance of the two disaggregation algorithms is compared in detail using metrics for the different cases, including the incorporation of transients and a comparison with other public datasets. Ministerio de Ciencia, Innovación y Universidades; PGC2018-098813-B-C33, Universidad de Almería; UAL2020-TIC-A2080
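
As a rough illustration of the Combinatorial Optimisation (CO) approach named in the abstract above: CO is commonly described as searching for the combination of appliance states whose summed power best matches the aggregate meter reading. The sketch below is a brute-force version of that idea for a single reading. It does not use NILMTK or its API, and the appliance names, power levels and the `disaggregate` function are hypothetical.

```python
from itertools import product
from typing import Dict, List


def disaggregate(aggregate_w: float, appliance_states: Dict[str, List[float]]) -> Dict[str, float]:
    """Assign each appliance the state (power level in watts) such that the
    sum over all appliances is as close as possible to the aggregate reading.

    Every combination of states is enumerated, so the cost grows
    exponentially with the number of appliances.
    """
    names = list(appliance_states)
    best_combo = min(
        product(*(appliance_states[name] for name in names)),
        key=lambda combo: abs(aggregate_w - sum(combo)),
    )
    return dict(zip(names, best_combo))


# Hypothetical appliances with an "off" state (0 W) and one "on" state each.
states = {
    "fridge": [0.0, 120.0],
    "kettle": [0.0, 2000.0],
    "tv": [0.0, 80.0],
}

# A hypothetical aggregate reading of 2125 W is best explained by
# fridge + kettle (2120 W), with the TV off.
estimate = disaggregate(2125.0, states)
```

Because the search treats every reading independently and enumerates all state combinations, it is easy to state but scales poorly, which is one reason it is usually compared against sequence models such as FHMM.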




