PROBABILISTIC GRAPHICAL MODELS IN MACHINE LEARNING AND OPTIMIZATION: EFFICIENT IMPLEMENTATIONS AND APPLICATIONS

TIN2010-14931

Funding agency name: Ministerio de Ciencia e Innovación
Funding agency acronym: MICINN
Programme: Programa Nacional de Investigación Fundamental
Subprogramme: Investigación fundamental no-orientada
Call: Investigación fundamental no-orientada
Call year: 2010
Managing unit: Subdirección General de Proyectos de Investigación
Beneficiary institution: UNIVERSIDAD DEL PAIS VASCO EUSKAL HERRIKO UNIBERTSITATEA
Host institution: EUSKAL HERRIKO UNIBERTSITATEA (EHU) / UNIVERSIDAD DEL PAÍS VASCO (UPV)
Persistent identifier: http://dx.doi.org/10.13039/501100004837

Publications

Total results (including duplicates): 6

Classification of neocortical interneurons using affinity propagation

Archivo Digital UPM
  • Santana Hermida, Roberto
  • McGarry, Laura M.
  • Bielza Lozoya, María Concepción
  • Larrañaga Múgica, Pedro María
  • Yuste, Rafael
In spite of over a century of research on cortical circuits, it is still unknown how many classes of cortical neurons exist. In fact, neuronal classification is a difficult problem because it is unclear how to designate a neuronal cell class and what are the best characteristics to define them. Recently, unsupervised classifications using cluster analysis based on morphological, physiological, or molecular characteristics, have provided quantitative and unbiased identification of distinct neuronal subtypes, when applied to selected datasets. However, better and more robust classification methods are needed for increasingly complex and larger datasets. Here, we explored the use of affinity propagation, a recently developed unsupervised classification algorithm imported from machine learning, which gives a representative example or exemplar for each cluster. As a case study, we applied affinity propagation to a test dataset of 337 interneurons belonging to four subtypes, previously identified based on morphological and physiological characteristics. We found that affinity propagation correctly classified most of the neurons in a blind, non-supervised manner. Affinity propagation outperformed Ward's method, a current standard clustering approach, in classifying the neurons into 4 subtypes. Affinity propagation could therefore be used in future studies to validly classify neurons, as a first step to help reverse engineer neural circuits.
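The exemplar-based clustering that the abstract describes can be illustrated with a minimal sketch of affinity propagation in plain Python. It is not the authors' code. It assumes negative squared Euclidean distance as the similarity, the median similarity as the shared preference, and a fixed number of damped message-passing iterations. The function name, parameters and toy points are hypothetical and are not the interneuron dataset from the paper.

```python
from statistics import median


def affinity_propagation(points, damping=0.7, iterations=300):
    """Sketch of affinity propagation; returns (exemplar indices, label per point).

    Assumes at least two points and that at least one exemplar emerges.
    """
    n = len(points)
    # Similarity s(i, k): negative squared Euclidean distance (an assumption).
    S = [[-sum((a - b) ** 2 for a, b in zip(points[i], points[k]))
          for k in range(n)] for i in range(n)]
    # Shared preference on the diagonal: median of the off-diagonal similarities.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = pref

    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(v for j, v in enumerate(scores) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # a(k, k) = sum of positive r(i', k) over i' != k
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k) over i' not in {i, k})
        for k in range(n):
            positive = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(positive) - positive[k]
            for i in range(n):
                if i == k:
                    update = total
                else:
                    update = min(0.0, R[k][k] + total - positive[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * update

    # A point is an exemplar when its self-responsibility plus self-availability is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Hypothetical one-dimensional toy data with two well-separated groups.
toy_points = [(0.0,), (0.3,), (0.5,), (5.0,), (5.2,), (5.7,)]
exemplars, labels = affinity_propagation(toy_points)
print("exemplars:", exemplars)
print("labels:", labels)
```

Each label is the index of the exemplar that represents the point, which is what distinguishes this method from approaches such as Ward's that return cluster memberships only.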




Network measures for information extraction in evolutionary algorithms

Archivo Digital UPM
  • Santana Hermida, Roberto
  • Armañanzas Arnedillo, Ruben
  • Bielza Lozoya, María Concepción
  • Larrañaga Múgica, Pedro María
Problem domain information extraction is a critical issue in many real-world optimization problems. Increasing
the repertoire of techniques available in evolutionary algorithms with this purpose is fundamental
for extending the applicability of these algorithms. In this paper we introduce a unifying information
mining approach for evolutionary algorithms. Our proposal is based on a division of the stages where
structural modelling of the variables interactions is applied. Particular topological characteristics induced
from different stages of the modelling process are identified. Network theory is used to harvest problem
structural information from the learned probabilistic graphical models (PGMs). We show how different
statistical measures, previously studied for networks from different domains, can be applied to mine the
graphical component of PGMs. We provide evidence that the computed measures can be employed for
studying problem difficulty, classifying different problem instances and predicting the algorithm behavior.
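As a rough illustration of mining the graphical component of a learned model, the sketch below computes a few standard network measures (degree, density, average clustering coefficient) from an edge list. It assumes the structure is treated as an undirected graph without duplicate edges. The function name and the example edges are hypothetical, and the abstract does not say which measures the paper actually uses.

```python
from itertools import combinations


def network_measures(n_nodes, edges):
    """Basic measures of an undirected graph given as a list of (a, b) edges."""
    neighbors = {v: set() for v in range(n_nodes)}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    degrees = [len(neighbors[v]) for v in range(n_nodes)]
    # Density: fraction of the n * (n - 1) / 2 possible edges that are present.
    density = sum(degrees) / (n_nodes * (n_nodes - 1))

    clustering = []
    for v in range(n_nodes):
        d = degrees[v]
        if d < 2:
            clustering.append(0.0)  # undefined for degree < 2; use 0 by convention
            continue
        # Count neighbor pairs of v that are themselves connected.
        links = sum(1 for a, b in combinations(neighbors[v], 2) if b in neighbors[a])
        clustering.append(2.0 * links / (d * (d - 1)))

    return {
        "degrees": degrees,
        "density": density,
        "avg_clustering": sum(clustering) / n_nodes,
    }


# Hypothetical structure over four variables: a triangle plus one pendant node.
measures = network_measures(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
print(measures)
```

Measures of this kind could be computed for the structure learned at each generation and then compared across problem instances.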




A review on evolutionary algorithms in Bayesian network learning and inference tasks

Archivo Digital UPM
  • Larrañaga Múgica, Pedro María
  • Karshenas, Hossein
  • Bielza Lozoya, María Concepción
  • Santana Hermida, Roberto
Thanks to their inherent properties, probabilistic graphical models are one of the prime candidates for machine learning and decision making tasks, especially in uncertain domains. Their capabilities, like representation, inference and learning, if used effectively, can greatly help to build intelligent systems that are able to act accordingly in different problem domains. Bayesian networks are one of the most widely used classes of these models. Some of the inference and learning tasks in Bayesian networks involve complex optimization problems that require the use of meta-heuristic algorithms. Evolutionary algorithms, as successful problem solvers, are promising candidates for this purpose. This paper reviews the application of evolutionary algorithms for solving some NP-hard optimization tasks in Bayesian network inference and learning.




Regularized continuous estimation of distribution algorithms

Archivo Digital UPM
  • Karshenas, Hossein
  • Santana Hermida, Roberto
  • Bielza Lozoya, María Concepción
  • Larrañaga Múgica, Pedro María
Regularization is a well-known technique in statistics for model estimation which is used to improve the generalization ability of the estimated model. Some of the regularization methods can also be used for variable selection that is especially useful in high-dimensional problems. This paper studies the use of regularized model learning in estimation of distribution algorithms (EDAs) for continuous optimization based on Gaussian distributions. We introduce two approaches to the regularized model estimation and analyze their effect on the accuracy and computational complexity of model learning in EDAs. We then apply the proposed algorithms to a number of continuous optimization functions and compare their results with other Gaussian distribution-based EDAs. The results show that the optimization performance of the proposed RegEDAs is less affected by the increase in the problem size than other EDAs, and they are able to obtain significantly better optimization values for many of the functions in high-dimensional settings.
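The abstract does not spell out the paper's two regularization approaches, so the sketch below does not reproduce RegEDA. It swaps in a simple stand-in, shrinking the sample covariance toward its diagonal, to show where a regularized model estimate would sit inside a Gaussian EDA loop. All names, parameter values and the sphere test function are assumptions made for illustration.

```python
import math
import random


def shrink_covariance(cov, lam):
    """Stand-in regularizer: (1 - lam) * cov + lam * diag(cov)."""
    n = len(cov)
    return [[cov[i][j] if i == j else (1.0 - lam) * cov[i][j]
             for j in range(n)] for i in range(n)]


def cholesky(m):
    """Lower-triangular factor L with L * L^T = m (m assumed positive definite)."""
    n = len(m)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            # Small floor guards against a collapsed variance.
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L


def reg_gaussian_eda(f, dim, pop_size=60, generations=25, lam=0.3, seed=0):
    """Minimize f with a Gaussian EDA; returns the best-so-far value per generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    best, history = float("inf"), []
    for _ in range(generations):
        pop.sort(key=f)
        best = min(best, f(pop[0]))
        history.append(best)
        elite = pop[:pop_size // 2]  # truncation selection
        m = len(elite)
        mean = [sum(x[d] for x in elite) / m for d in range(dim)]
        cov = [[sum((x[a] - mean[a]) * (x[b] - mean[b]) for x in elite) / (m - 1)
                for b in range(dim)] for a in range(dim)]
        L = cholesky(shrink_covariance(cov, lam))  # regularized model estimate
        pop = []
        for _ in range(pop_size):
            z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            pop.append([mean[a] + sum(L[a][k] * z[k] for k in range(a + 1))
                        for a in range(dim)])
    return history


def sphere(x):
    return sum(v * v for v in x)


history = reg_gaussian_eda(sphere, dim=4, seed=1)
print("first and last best values:", history[0], history[-1])
```

Shrinking the off-diagonal entries keeps the estimated covariance positive definite even when the selected sample is small relative to the dimension, which is the kind of high-dimensional setting the abstract refers to.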




Wrapper positive Bayesian network classifiers

Archivo Digital UPM
  • Calvo Molinos, Borja
  • Inza Cano, Iñaki
  • Lozano Alonso, José Antonio
  • Larrañaga Múgica, Pedro María
In the information retrieval framework, there are problems where the goal is to recover objects of a particular class from big sets of unlabelled objects. In some of these problems, only examples from the class we want to recover are available. For such problems, the machine learning community has developed algorithms that are able to learn binary classifiers in the absence of negative examples. Among them, we can find the positive Bayesian network classifiers, algorithms that induce Bayesian network classifiers from positive and unlabelled examples. The main drawback of these algorithms is that they require some previous knowledge about the a priori probability distribution of the class. In this paper, we propose a wrapper approach to tackle the learning when no such information is available, setting this probability at the optimal value in terms of the recovery of positive examples. The evaluation of classifiers in positive unlabelled learning problems is a non-trivial question. We have also worked on this problem, and we have proposed a new guiding metric to be used in the search for the optimal a priori probability of the positive class that we have called the pseudo F. We have empirically tested the proposed metric and the wrapper classifiers on both synthetic and real-life datasets. The results obtained in this empirical comparison show that the wrapper Bayesian network classifiers provide competitive results, particularly when the actual a priori probability of the positive class is high.




Peakbin selection in Mass Spectrometry data using a consensus approach with estimation of distribution algorithms

Archivo Digital UPM
  • Armañanzas Arnedillo, Ruben
  • Saeys, Yvan
  • Inza Cano, Iñaki
  • García Torres, Miguel
  • Bielza Lozoya, María Concepción
  • Peer, Yves van de
  • Larrañaga Múgica, Pedro María
Progress is continuously being made in the quest for stable biomarkers linked to complex diseases. Mass spectrometers are one of the devices for tackling this problem. The data profiles they produce are noisy and unstable. In these profiles, biomarkers are detected as signal regions (peaks), where control and disease samples behave differently. Mass spectrometry (MS) data generally contain a limited number of samples described by a high number of features. In this work, we present a novel class of evolutionary algorithms, estimation of distribution algorithms (EDA), as an efficient peak selector in this MS domain. There is a trade-off between the reliability of the detected biomarkers and the low number of samples for analysis. For this reason, we introduce a consensus approach, built upon the classical EDA scheme, that improves stability and robustness of the final set of relevant peaks. An entire data workflow is designed to yield unbiased results. Four publicly available MS data sets (two MALDI-TOF and another two SELDI-TOF) are analyzed. The results are compared to the original works, and a new plot (peak frequential plot) for graphically inspecting the relevant peaks is introduced. A complete online supplementary page, which can be found at http://www.sc.ehu.es/ccwbayes/members/ruben/ms, includes extended info and results, in addition to Matlab scripts and references.
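One plausible reading of the consensus idea is to repeat a stochastic EDA-based selection several times and keep only the features chosen in a sufficient fraction of runs. The sketch below follows that reading with a univariate binary EDA (UMDA). The paper's actual consensus scheme may differ. The toy fitness, the "informative" indices, the threshold and all function names are hypothetical, and no mass spectrometry data is involved.

```python
import random
from collections import Counter

INFORMATIVE = {2, 5, 7}  # hypothetical "true" peak indices for the toy fitness


def toy_fitness(bits):
    """Reward selecting informative features, penalize subset size."""
    hits = sum(bits[j] for j in INFORMATIVE)
    return 2.0 * hits - 0.5 * sum(bits)


def umda_select(fitness, n_features, pop_size=40, generations=30, seed=0):
    """Univariate EDA over bit strings; returns the set of selected feature indices."""
    rng = random.Random(seed)
    probs = [0.5] * n_features
    best = None
    for _ in range(generations):
        pop = [[1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        if best is None or fitness(pop[0]) > fitness(best):
            best = pop[0]
        elite = pop[:pop_size // 2]
        # Re-estimate each marginal from the elite, clamped to keep some diversity.
        probs = [min(0.95, max(0.05, sum(ind[j] for ind in elite) / len(elite)))
                 for j in range(n_features)]
    return {j for j, bit in enumerate(best) if bit}


def consensus(runs, threshold=0.5):
    """Keep features selected in at least `threshold` of the runs."""
    counts = Counter(j for selected in runs for j in selected)
    return sorted(j for j, c in counts.items() if c / len(runs) >= threshold)


# Five independent runs with different seeds, then a 60% consensus.
runs = [umda_select(toy_fitness, n_features=10, seed=s) for s in range(5)]
print("consensus features:", consensus(runs, threshold=0.6))
```

The per-feature selection counts gathered in `consensus` are also the kind of quantity that a frequency-style plot of relevant peaks could display.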