APPLICATION OF SOFT COMPUTING TECHNIQUES IN DATA MINING. NEW APPROACHES

TIN2011-28488

Funding agency name Ministerio de Ciencia e Innovación
Funding agency acronym MICINN
Program Programa Nacional de Investigación Fundamental
Subprogram Investigación fundamental no-orientada
Call Investigación Fundamental No-Orientada
Call year 2011
Management unit Not reported
Beneficiary center UNIVERSIDAD DE GRANADA
Performing center DPTO. CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL
Persistent identifier http://dx.doi.org/10.13039/501100004837

Publications

A survey of fingerprint classification Part I: taxonomies on feature extraction methods and learning models

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • 0000-0003-2865-6549
  • Derrac, Joaquín
  • Peralta, Daniel
  • Triguero, Isaac
  • 0000-0002-5845-887X
  • 0000-0002-0904-9834
  • García, Salvador
  • Benítez, José Manuel
  • 0000-0003-4764-5298
  • 0000-0001-6657-948X
  • 0000-0002-1279-6195
  • Herrera, Francisco
This paper reviews the fingerprint classification literature from a double perspective. We first deal with feature extraction methods, including the different models considered for singular point detection and for orientation map extraction. Then, we focus on the different learning models considered to build the classifiers used to label new fingerprints. Taxonomies and classifications for the feature extraction, singular point detection, orientation extraction and learning methods are presented. A critical view of the existing literature has led us to present a discussion of the existing methods and their drawbacks, such as difficulty in their reimplementation, lack of details, or major differences in their evaluation procedures. On this account, an experimental analysis of the most relevant methods is carried out in the second part of this paper, and a new method based on their combination is presented.
This work was supported by the Research Projects CAB (CDTI), TIN2011-28488, and TIN2013-40765-P.




A survey of fingerprint classification Part II: experimental analysis and ensemble proposal

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • 0000-0003-2865-6549
  • Derrac, Joaquín
  • Peralta, Daniel
  • Triguero, Isaac
  • 0000-0002-5845-887X
  • 0000-0002-0904-9834
  • García, Salvador
  • Benítez, José Manuel
  • 0000-0003-4764-5298
  • 0000-0001-6657-948X
  • 0000-0002-1279-6195
  • Herrera, Francisco
In the first part of this paper we reviewed the fingerprint classification literature from two different perspectives: feature extraction and classifier learning. In trying to determine which of the reviewed methods would perform best in a real implementation, we ended up with a discussion that showed how difficult this question is to answer. No previous comparison exists in the literature, and comparisons among papers are made with different experimental frameworks. Moreover, published methods are difficult to implement because their descriptions and parameters lack detail and no source code is shared. For this reason, in this paper we carry out an in-depth experimental study following the proposed double perspective. To do so, we have carefully implemented some of the most relevant feature extraction methods according to the explanations found in the corresponding papers, and we have tested their performance with different classifiers, including the specific proposals made by the authors. Our aim is to develop an objective experimental study in a common framework, which has not been done before and which can serve as a baseline for future work on the topic. In this way, we test not only their quality but also their reusability by other researchers, and we are able to indicate which proposals could be considered for future developments. Furthermore, we show that combining different feature extraction models in an ensemble can lead to superior performance, significantly improving on the results obtained by individual models.
This work was supported by the Research Projects CAB (CDTI), TIN2011-28488, and TIN2013-40765-P.




A survey on fingerprint minutiae-based local matching for verification and identification: taxonomy and experimental evaluation

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • Peralta, Daniel
  • 0000-0003-2865-6549
  • Triguero, Isaac
  • 0000-0002-5845-887X
  • García, Salvador
  • 0000-0001-6657-948X
  • Benítez, José Manuel
  • 0000-0002-1279-6195
  • Herrera, Francisco
Fingerprint recognition has found a reliable application for verification or identification of people in biometrics. Globally, fingerprints can be viewed as valuable traits because of several properties observed by experts, such as their distinctiveness and permanence in humans and their performance in real applications. Among the main stages of fingerprint recognition, the automated matching phase has received much attention from the early years to the present. This paper is devoted to reviewing and categorizing the vast number of fingerprint matching methods proposed in the specialized literature. In particular, we focus on local minutiae-based matching algorithms, which provide good performance with an excellent trade-off between efficacy and efficiency. We identify the main properties and differences of existing methods. Then, we include an experimental evaluation involving the most representative local minutiae-based matching models in both verification and identification tasks. The results obtained are discussed in detail, supporting the description of future directions.
This work was supported by the Research Projects CAB (CDTI), TIN2011-28488, and TIN2013-40765-P.




INFFC: an iterative class noise filter based on the fusion of classifiers with noise sensitivity control

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • Sáez, José Antonio
  • 0000-0003-2865-6549
  • Luengo, Julián
  • Herrera, Francisco
In classification, noise may deteriorate system performance and increase the complexity of the models built. To mitigate its consequences, several approaches have been proposed in the literature. Among them, noise filtering, which removes noisy examples from the training data, is one of the most widely used techniques. This paper proposes a new noise filtering method that combines several filtering strategies in order to increase the accuracy of the classification algorithms used after the filtering process. The filtering is based on the fusion of the predictions of several classifiers used to detect the presence of noise. We translate the idea behind multiple classifier systems, where the information gathered from different models is combined, to noise filtering. In this way, we consider a combination of classifiers instead of only one to detect noise. Additionally, the proposed method follows an iterative noise filtering scheme that avoids using the noisy examples already detected in each new iteration of the filtering process. Finally, we introduce a noisy score to control the filtering sensitivity, so that the number of noisy examples removed in each iteration can be adapted to the needs of the practitioner. The first two strategies (use of multiple classifiers and iterative filtering) are used to improve the filtering accuracy, whereas the last one (the noisy score) controls how conservative the filter is when removing potentially noisy examples. The validity of the proposed method is assessed in an exhaustive experimental study. We compare the new filtering method against several state-of-the-art methods for dealing with datasets with class noise and study their efficacy with three classifiers with different sensitivity to noise.
Supported by the projects TIN2011-28488, TIN2013-40765-P, P10-TIC-06858 and P11-TIC-7765. J. A. Sáez was supported by EC under FP7, Coordination and Support Action, Grant Agreement Number 316097, ENGINE European Research Centre of Network Intelligence for Innovation Enhancement (http://engine.pwr.wroc.pl/).
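
The three ideas named in the abstract above (fusing several classifiers, filtering iteratively, and thresholding a score) can be illustrated with a small sketch. This is not the INFFC algorithm itself: the three toy classifiers, the score (here simply the fraction of classifiers that disagree with an example's label) and the names `noise_score`, `iterative_filter` and `threshold` are all assumptions chosen for illustration.

```python
from collections import Counter
import math

def knn_predict(train, x, k):
    """Majority label among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def centroid_predict(train, x):
    """Label of the closest class centroid."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    centroids = {
        label: [sum(col) / len(pts) for col in zip(*pts)]
        for label, pts in groups.items()
    }
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

# Hypothetical ensemble; stands in for whatever classifiers a real filter would fuse.
CLASSIFIERS = [
    lambda train, x: knn_predict(train, x, 1),
    lambda train, x: knn_predict(train, x, 3),
    centroid_predict,
]

def noise_score(data, i):
    """Illustrative score: fraction of classifiers, built without example i,
    whose prediction disagrees with the label of example i."""
    x, y = data[i]
    rest = data[:i] + data[i + 1:]
    wrong = sum(clf(rest, x) != y for clf in CLASSIFIERS)
    return wrong / len(CLASSIFIERS)

def iterative_filter(data, threshold=0.5, max_iter=10):
    """Drop examples whose score exceeds the threshold, then repeat on the
    cleaned set so removed examples no longer influence later iterations."""
    data = list(data)
    for _ in range(max_iter):
        kept = [ex for i, ex in enumerate(data) if noise_score(data, i) <= threshold]
        if len(kept) == len(data):
            break
        data = kept
    return data

toy = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((0.15, 0.15), "b"),  # deliberately mislabeled point inside the "a" cluster
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
]
cleaned = iterative_filter(toy)
```

Raising `threshold` makes this sketch more conservative (fewer removals), which is the role the abstract assigns to the score-based sensitivity control.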




IIVFDT: ignorance functions based interval-valued fuzzy decision tree with genetic tuning

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • 0000-0002-1427-9909
  • Fernández, Alberto
  • 0000-0002-1279-6195
  • Herrera, Francisco
Electronic version of an article published as International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems Vol. 20, Suppl. 2 (October 2012) 1–30 DOI: 10.1142/S0218488512400132 © World Scientific Publishing Company http://www.worldscientific.com/worldscinet/ijufks

The choice of membership functions plays an essential role in the success of fuzzy systems. This is a complex problem because knowledge may be lacking when point values are assigned as membership degrees. To address this handicap, we propose a methodology called Ignorance functions based Interval-Valued Fuzzy Decision Tree with genetic tuning, IIVFDT for short, which improves the performance of fuzzy decision trees by taking the ignorance degree into account. This ignorance degree is the result of a weak ignorance function applied to the point value set as the membership degree.

Our IIVFDT proposal is composed of four steps: (1) the base fuzzy decision tree is generated using the fuzzy ID3 algorithm; (2) the linguistic labels are modeled with Interval-Valued Fuzzy Sets. To do so, a new parametrized construction method of Interval-Valued Fuzzy Sets is defined, whose length represents such ignorance degree; (3) the fuzzy reasoning method is extended to work with this representation of the linguistic terms; (4) an evolutionary tuning step is applied for computing the optimal ignorance degree for each Interval-Valued Fuzzy Set.

The experimental study shows that the IIVFDT method outperforms the initial fuzzy ID3 both with and without Interval-Valued Fuzzy Sets. The suitability of the proposed methodology is shown with respect to both several state-of-the-art fuzzy decision trees and C4.5. Furthermore, we analyze the quality of our approach versus two methods that learn the fuzzy decision tree using genetic algorithms. Finally, we show that superior performance can be achieved through the positive synergy obtained when applying the well-known genetic tuning of the lateral position after the application of the IIVFDT method.
This work was supported in part by the Spanish Ministry of Science and Technology under projects TIN2011-28488 and TIN2010-15055.
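
As a rough illustration of step (2) in the abstract above, the sketch below turns a point membership degree into an interval whose length grows with an ignorance degree. The weak ignorance function g(x) = 2·min(x, 1 − x) is a commonly used example of such a function; the symmetric widening rule and the parameter name `delta` are assumptions made for this sketch, not the construction defined in the paper.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with support [a, c] and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def weak_ignorance(mu):
    """g(mu) = 2 * min(mu, 1 - mu): zero at mu = 0 or 1, maximal (1) at mu = 0.5."""
    return 2.0 * min(mu, 1.0 - mu)

def interval_membership(mu, delta):
    """Hypothetical construction: widen mu symmetrically so that the interval
    length equals delta * g(mu), with delta in [0, 1] acting as a tunable parameter."""
    half = 0.5 * delta * weak_ignorance(mu)
    return max(0.0, mu - half), min(1.0, mu + half)

mu = triangular(2.5, 0.0, 5.0, 10.0)         # point membership degree
lower, upper = interval_membership(mu, 0.4)  # interval-valued membership degree
```

In this sketch a fully certain degree (0 or 1) keeps a zero-length interval, while a degree of 0.5 gets the widest one. A tuning step such as step (4) would then search over one `delta` per fuzzy set.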




IVTURS: A linguistic fuzzy rule-based classification system based on a new interval-valued fuzzy reasoning method with tuning and rule selection

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • 0000-0002-1427-9909
  • Fernández, Alberto
  • 0000-0002-1279-6195
  • Herrera, Francisco
Interval-valued fuzzy sets have been shown to be a useful tool for dealing with the ignorance related to the definition of the linguistic labels. Specifically, they have been successfully applied to solve classification problems, performing simple modifications on the fuzzy reasoning method to work with this representation and making the classification based on a single number. In this paper we present IVTURS, a new linguistic fuzzy rule-based classification method based on a new completely interval-valued fuzzy reasoning method. This inference process uses interval-valued restricted equivalence functions to increase the relevance of the rules in which the equivalence of the interval membership degrees of the patterns and the ideal membership degrees is greater, which is a desirable behaviour. Furthermore, their parametrized construction allows the optimal function for each variable to be computed, which could improve the system's behaviour. Additionally, we combine this tuning of the equivalence with rule selection in order to decrease the complexity of the system. In this paper we name our method IVTURS-FARC, since we use the FARC-HD method to accomplish the fuzzy rule learning process. The experimental study is developed in three steps in order to ascertain the quality of our new proposal. First, we determine both the essential role that interval-valued fuzzy sets play in the method and the need for the rule selection process. Next, we show the improvements achieved by IVTURS-FARC with respect to the tuning of the degree of ignorance, both when it is applied in isolation and when it is combined with the tuning of the equivalence. Finally, the significance of IVTURS-FARC is further shown by a comparison in which it outperforms FARC-HD and FURIA, two high-performing fuzzy classification algorithms.
This work was supported in part by the Spanish Ministry of Science and Technology under projects TIN2011-28488 and TIN2010-15055 and the Andalusian Research Plan P10-TIC-6858 and P11-TIC-7765.




A compact evolutionary interval-valued fuzzy rule-based classification system for the modeling and prediction of real-world financial applications with imbalanced data

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • 0000-0002-1427-9909
  • Bernardo, Darío
  • Herrera, Francisco
  • 0000-0002-1279-6195
  • Hagras, Hani
The current financial crisis has stressed the need for more accurate prediction models in order to decrease the risk when investing money in economic opportunities. In addition, the transparency of the process followed to make decisions in financial applications is becoming an important issue. Furthermore, there is a need to handle real-world imbalanced financial data sets without using sampling techniques, which might introduce noise into the data used. In this paper, we present a compact evolutionary interval-valued fuzzy rule-based classification system, which is based on IVTURSFARC-HD (Interval-Valued fuzzy rule-based classification system with TUning and Rule Selection) [22], for the modeling and prediction of real-world financial applications. The proposed system obtains good prediction accuracies using a small set of short fuzzy rules, implying a high degree of interpretability of the generated linguistic model. Furthermore, the proposed system deals with imbalanced financial datasets with no need for any preprocessing or sampling method, thus avoiding the accidental introduction of noise into the data used in the learning process. The system is also provided with a mechanism to handle examples that are not covered by any fuzzy rule in the generated rule base. To test the quality of our proposal, we present an experimental study including eleven real-world financial datasets. We show that the proposed system outperforms the original C4.5 decision tree, the type-1 and interval-valued fuzzy counterparts that use the SMOTE sampling technique to preprocess data, and the original FURIA, which is a fuzzy approximative classifier. Furthermore, the proposed method enhances the results achieved by the cost-sensitive C4.5 and gives competitive results when compared with FURIA using SMOTE, while our proposal avoids pre-processing techniques and provides interpretable models that allow more accurate results to be obtained.
This work was supported in part by the Spanish Ministry of Science and Technology under Project TIN2011-28488 and Project TIN2013-40765.




Enhancing multi-class classification in FARC-HD fuzzy classifier: on the synergy between n-dimensional overlap functions and decomposition strategies

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • 0000-0001-7261-7868
  • 0000-0003-2865-6549
  • 0000-0002-1427-9909
  • Fernández, Alberto
  • 0000-0001-6657-948X
  • Herrera, Francisco
  • 0000-0002-1279-6195
There are many real-world classification problems involving multiple classes, e.g., in bioinformatics, computer vision or medicine. These problems are generally more difficult than their binary counterparts. In this scenario, decomposition strategies usually improve the performance of classifiers. Hence, in this paper we aim to improve the behaviour of the FARC-HD fuzzy classifier in multi-class classification problems using decomposition strategies, and more specifically the One-vs-One (OVO) and One-vs-All (OVA) strategies. However, when these strategies are applied to FARC-HD, a problem emerges due to the low confidence values provided by the fuzzy reasoning method. This undesirable condition comes from the application of the product t-norm when computing the matching and association degrees, which yields low values that also depend on the number of antecedents of the fuzzy rules. As a result, robust aggregation strategies in OVO such as weighted voting obtain poor results with this fuzzy classifier. In order to solve these problems, we propose to adapt the inference system of FARC-HD by replacing the product t-norm with overlap functions. To do so, we define n-dimensional overlap functions. The usage of these new functions allows one to obtain more adequate outputs from the base classifiers for the subsequent aggregation in OVO and OVA schemes. Furthermore, we propose a new aggregation strategy for OVO to deal with the problem of weighted voting derived from the inappropriate confidences provided by FARC-HD for this aggregation method. The quality of our new approach is analyzed using twenty datasets, and the conclusions are supported by a proper statistical analysis. In order to check the usefulness of our proposal, we carry out a comparison against some of the state-of-the-art fuzzy classifiers. Experimental results show the competitiveness of our method.
This work was supported in part by the Spanish Ministry of Science and Technology under projects TIN2011-28488, TIN-2012-33856 and TIN-2013-40765-P and the Andalusian Research Plan P10-TIC-6858 and P11-TIC-7765.
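
The dependence of the product on the number of antecedents, described in the abstract above, is easy to see numerically. The sketch below compares the product with the geometric mean, which is one standard example of an n-dimensional overlap function; the abstract does not say which overlap functions the paper evaluates, so this choice and the function names are assumptions made for illustration.

```python
import math

def product(degrees):
    """Product t-norm over the membership degrees of a rule's antecedents."""
    return math.prod(degrees)

def geometric_mean(degrees):
    """Geometric mean: zero if any degree is zero, one only if all degrees are one,
    and otherwise independent of how many equal degrees are combined."""
    return math.prod(degrees) ** (1.0 / len(degrees))

# The same per-antecedent match (0.8) in rules with one, two and three antecedents.
for n in (1, 2, 3):
    degrees = [0.8] * n
    print(n, round(product(degrees), 3), round(geometric_mean(degrees), 3))
```

With the product, the matching degree shrinks as antecedents are added even though each one is matched equally well; with the geometric mean it stays at 0.8. Comparable scales like this are what an aggregation such as weighted voting needs from the base classifiers.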