LEARNING TECHNIQUES FOR SOLVING DEFORMABLE RECONSTRUCTION AND REGISTRATION APPLIED TO LAPAROSCOPY IMAGES
PID2020-115995RB-I00
Funding agency name Agencia Estatal de Investigación
Funding agency acronym AEI
Program Programa Estatal de I+D+i Orientada a los Retos de la Sociedad
Subprogram Programa Estatal de I+D+i Orientada a los Retos de la Sociedad
Call Proyectos I+D
Call year 2020
Managing unit Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020
Beneficiary institution UNIVERSIDAD DE ALCALA
Persistent identifier http://dx.doi.org/10.13039/501100011033
Publications
Total results (including duplicates): 16
Found 1 page(s)
A proposal for automatic evaluation of human functional limitations in activities of daily living
e_Buah Biblioteca Digital Universidad de Alcalá
- Nieva Suárez, Álvaro|||0000-0003-3414-6721
- Guardiola Luna, Irene|||0000-0002-5218-4361
- Melino Carrero, Alessandro
- Murillo Teruel, Marina
- Calvo Del Castillo, Pablo
- Fuente Lasa, Santiago de la
- Fuentes Jiménez, David|||0000-0001-6424-4782
- Martín Sánchez, José Luis|||0000-0001-9311-3511
- Palazuelos Cagigas, Sira Elena|||0000-0002-2311-441X
- Losada Gutiérrez, Cristina|||0000-0001-9545-327X
- Marrón Romera, Marta|||0000-0001-7723-2262
- Macías Guarasa, Javier|||0000-0002-3303-3963
- Obeso Benítez, Paula
- Martínez Piédrola, Rosa María
- Pérez de Heredia Torres, Marta
This work presents a novel proposal aimed at automating the assessment of human functional limitations. It represents an interdisciplinary effort to develop and clinically evaluate an automated system for objective functional assessment, and is being carried out within the EYEFUL research project. Functional evaluation is a complex task that requires a holistic approach, considering various components like cognition, motor skills, and executive functions. The EYEFUL project has created an intelligent space within a fully equipped apartment, which includes the deployment of advanced technology (cameras, microphones, and wearable devices), to capture data from subjects undergoing assessment. In this paper, we provide an extensive overview of the project's architecture, including the selection of Activities of Daily Living (ADLs), the recruitment of subjects, the validation methodology, and the experimental setup. We describe the specialized apartment and sensor hardware deployed for data collection, along with the initial machine learning systems for data analysis. Additionally, we analyze the existing state-of-the-art research in automatic functional performance evaluation and present our proposal. This encompasses ADL selection, database design, recording considerations, and an overview of technological modules to be implemented. We also discuss the current status of select modules, including quantitative and qualitative results and references to ongoing work. In conclusion, the EYEFUL project is pioneering automated functional assessment, integrating advanced technology and interdisciplinary collaboration. It aims to provide healthcare professionals with accurate, comprehensive, and clinically validated evaluations of human functionality during ADLs, addressing the complex nature of functional limitations., Agencia Estatal de Investigación, Universidad de Alcalá
Preliminary Unknown Appliance Detection using Convolutional Variational Auto-Encoders for AAL
e_Buah Biblioteca Digital Universidad de Alcalá
- Diego Otón, Laura De|||0000-0002-4939-2987
- Fuentes Jiménez, David|||0000-0001-6424-4782
- Pizarro Pérez, Daniel|||0000-0003-0622-4884
- Hernández Alonso, Álvaro|||0000-0001-9308-8133
- Mari, Simone
- Nieto Capuchino, Rubén|||0000-0002-8293-9665
2024 IEEE International Conference on Omni-layer Intelligent Systems (COINS), 29-31 July 2024, London, United Kingdom., The healthcare landscape is evolving, emphasising the importance of Active Assisted Living (AAL) in addressing challenges presented by the ageing population. Integrating Non-Intrusive Load Monitoring provides insights into individual appliance usage patterns without additional sensors, enabling the detection of changes in daily activities that may reflect a person's health status. For this approach to remain effective, it must adapt as households evolve, discerning variations in energy consumption patterns caused by new or replaced electrical devices. For this reason, this work introduces a hybrid approach combining deep learning and classic machine learning to classify household electrical loads and detect unknown loads using high-frequency electrical current signal data. Finally, while experimental results highlight areas for improvement, the findings suggest promising applications in household settings, marking a step forward in facing hurdles within the AAL context., Agencia Estatal de Investigación, Universidad de Alcalá
Towards dense people detection with deep learning and depth images
e_Buah Biblioteca Digital Universidad de Alcalá
- Fuentes Jiménez, David|||0000-0001-6424-4782
- Losada Gutiérrez, Cristina|||0000-0001-9545-327X
- Casillas Pérez, David|||0000-0002-5721-1242
- Macías Guarasa, Javier|||0000-0002-3303-3963
- Pizarro Pérez, Daniel|||0000-0003-0622-4884
- Martín López, Roberto|||0000-0002-5172-9403
- Luna Vázquez, Carlos Andrés
This paper describes a novel DNN-based system, named PD3net, that detects multiple people from a single depth image in real time. The proposed neural network processes a depth image and outputs a likelihood map in image coordinates, where each detection corresponds to a Gaussian-shaped local distribution centered at each person's head. This likelihood map encodes both the number of detected people and their position in the image, from which the 3D position can be computed. The proposed DNN includes spatially separated convolutions to increase performance, and runs in real time on low-budget GPUs. We use synthetic data for initially training the network, followed by fine-tuning with a small amount of real data. This allows adapting the network to different scenarios without needing large, manually labeled image datasets. As a result, the people detection system presented in this paper has numerous potential applications in different fields, such as capacity control, automatic video surveillance, analysis of the behavior of people or groups, healthcare, or monitoring and assistance of elderly people in ambient assisted living environments. In addition, depth information does not allow recognizing the identity of people in the scene, thus enabling their detection while preserving their privacy. The proposed DNN has been experimentally evaluated and compared with other state-of-the-art approaches, including both classical and DNN-based solutions, under a wide range of experimental conditions. The results allow us to conclude that the proposed architecture and the training strategy are effective, and that the network generalizes to scenes different from those used during training. We also demonstrate that our proposal outperforms existing methods and can accurately detect people in scenes with significant occlusions., Ministerio de Economía y Competitividad, Universidad de Alcalá, Agencia Estatal de Investigación
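The likelihood-map output described in this abstract can be illustrated with a small sketch. It is not PD3net itself: the function names, the map size, the `sigma` value and the 0.5 threshold are all assumptions chosen for illustration. It only shows how a map with one Gaussian bump per head encodes both the count and the image positions of people, and how both can be read back by looking for local maxima.

```python
import math

def gaussian_map(width, height, heads, sigma=1.5):
    """Build a likelihood map with one Gaussian bump per head position (x, y).

    'heads' and 'sigma' are hypothetical inputs used only for this sketch.
    """
    grid = [[0.0] * width for _ in range(height)]
    for cx, cy in heads:
        for y in range(height):
            for x in range(width):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Keep the strongest response where bumps overlap.
                grid[y][x] = max(grid[y][x], value)
    return grid

def find_peaks(grid, threshold=0.5):
    """Return (x, y) of strict local maxima above 'threshold' (8-neighbourhood)."""
    height, width = len(grid), len(grid[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            v = grid[y][x]
            if v < threshold:
                continue
            neighbours = [
                grid[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (nx, ny) != (x, y)
            ]
            if all(v > n for n in neighbours):
                peaks.append((x, y))
    return peaks

# Two hypothetical head positions in a 16x16 map.
likelihood = gaussian_map(16, 16, [(3, 4), (12, 9)])
detections = find_peaks(likelihood)
```

Here `len(detections)` gives the number of people and each entry gives an image position; a real network output would be noisier and would need a more careful peak search.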
Estimating energy consumption in households for non-intrusive elderly monitoring
e_Buah Biblioteca Digital Universidad de Alcalá
- Hernández Alonso, Álvaro|||0000-0001-9308-8133
- Diego Otón, Laura De|||0000-0002-4939-2987
- Pizarro Pérez, Daniel|||0000-0003-0622-4884
- Pérez Rubio, María Del Carmen|||0000-0001-8271-6843
- Villadangos Carrizo, José Manuel|||0000-0001-5900-0978
- Nieto Capuchino, Rubén|||0000-0002-8293-9665
2023 IEEE International Workshop on Metrology for Living Environment (MetroLivEnv), 29-31 May 2023, Milano, Italy., Population ageing has become a key social issue in recent decades, particularly in Western countries, where this fact, together with the increase in life expectancy, has put a significant strain on public finances and health services. In this context, many technological developments are proposed to promote and support the independent living of elderly people in their own homes, thus avoiding or postponing possible entries into residential care. Among them, smart meters provide a non-intrusive way to monitor and estimate the tenants' daily activities, using only a single-point measurement in the mains at the entrance of the household. This work describes a regression approach to estimate the energy consumption of a house by means of an LSTM neural network. For that purpose, a pilot has been run in a house for six months in order to collect the electrical data used to train the neural network. After training, the network estimates the energy consumption every 15 minutes, so any deviation between the predicted sample and the measured one might be used to detect anomalies in the daily routine of the tenant., Agencia Estatal de Investigación, Universidad de Alcalá
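The last step of this abstract, comparing the predicted and measured consumption for each 15-minute interval, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the sample values and the fixed threshold are hypothetical, and the abstract does not say how the deviation would actually be thresholded.

```python
def flag_anomalies(predicted, measured, threshold):
    """Return the indices of intervals whose absolute residual exceeds 'threshold'.

    'predicted' would come from the trained regressor and 'measured' from the
    smart meter; both are plain lists here, one value per 15-minute interval.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    return [
        i
        for i, (p, m) in enumerate(zip(predicted, measured))
        if abs(m - p) > threshold
    ]

# Hypothetical energy values (kWh per interval) and threshold.
predicted = [0.20, 0.30, 0.25, 0.40]
measured = [0.21, 0.90, 0.24, 0.05]
suspicious = flag_anomalies(predicted, measured, threshold=0.3)
```

In practice a single large residual is weak evidence, so a deployed system would probably look at runs of flagged intervals rather than isolated ones.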
Audiovisual tracking of multiple speakers in smart spaces
e_Buah Biblioteca Digital Universidad de Alcalá
- Sanabria Macías, Frank
- Marrón Romera, Marta|||0000-0001-7723-2262
- Macías Guarasa, Javier|||0000-0002-3303-3963
This paper presents GAVT, a highly accurate audiovisual 3D tracking system based on particle filters and a probabilistic framework, employing a single camera and a microphone array. Our first contribution is a complex visual appearance model that accurately locates the speaker's mouth. It transforms a Viola & Jones face detector classifier kernel into a likelihood estimator, leveraging knowledge from multiple classifiers trained for different face poses. Additionally, we propose a mechanism to handle occlusions based on the new likelihood's dispersion. The audio localization proposal utilizes a probabilistic steered response power, representing cross-correlation functions as Gaussian mixture models. Moreover, to prevent tracker interference, we introduce a novel mechanism for associating Gaussians with speakers. The evaluation is carried out using the AV16.3 and CAV3D databases for Single- and Multiple-Object Tracking tasks (SOT and MOT, respectively). GAVT significantly improves the localization performance over audio-only and video-only modalities, with up to 50.3% average relative improvement in 3D when compared with the video-only modality. When compared to the state of the art, our audiovisual system achieves up to 69.7% average relative improvement for the SOT and MOT tasks in the AV16.3 dataset (2D comparison), and up to 18.1% average relative improvement in the MOT task for the CAV3D dataset (3D comparison)., Agencia Estatal de Investigación, Universidad de Alcalá
Comparative analysis of neural network implementations for NILM applications
e_Buah Biblioteca Digital Universidad de Alcalá
- Martín Catalán, Jorge
- Diego Otón, Laura De|||0000-0002-4939-2987
- Tapiador Luque, Miguel
- Hernández Alonso, Álvaro|||0000-0001-9308-8133
- Nieto Capuchino, Rubén|||0000-0002-8293-9665
2023 38th Conference on Design of Circuits and Integrated Systems (DCIS), 15-17 November 2023, Málaga, Spain., Non-Intrusive Load Monitoring (NILM) comprises a set of techniques that try to disaggregate the energy consumption in a household or building, based on the measurements coming from a single-point smart meter. In this process, a key step is the load identification of the different appliances that may be switched on/off during a certain interval under analysis. For that purpose, machine-learning techniques, including deep neural networks, have recently been applied. These classification algorithms often imply a high computational cost that might compromise a possible edge-computing implementation on the local smart meters. In this context, this work presents a preliminary comparison between two different real-time implementations of two neural networks, a dense one and a convolutional one, applied to load classification. The implementations are based on an FPGA approach and on a processor. In general terms, preliminary results show that the FPGA solution provides lower latencies than the processor one, at the expense of a higher design effort and a quantization error coming from the fixed-point representation., Agencia Estatal de Investigación
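The quantization error mentioned at the end of this abstract comes from storing values with a fixed number of fractional bits. The sketch below shows that effect in isolation; the 16-bit word, the 8 fractional bits and the function names are assumptions for illustration and are not taken from the paper.

```python
def to_fixed(value, frac_bits, total_bits=16):
    """Quantize a float to a signed fixed-point integer, saturating on overflow."""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    raw = round(value * scale)
    return max(lowest, min(highest, raw))

def from_fixed(raw, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return raw / (1 << frac_bits)

def quantization_error(value, frac_bits, total_bits=16):
    """Absolute error introduced by a round trip through fixed point."""
    raw = to_fixed(value, frac_bits, total_bits)
    return abs(value - from_fixed(raw, frac_bits))

# A hypothetical weight stored with 8 fractional bits.
weight = 0.3
error = quantization_error(weight, frac_bits=8)
```

For values inside the representable range, the error is bounded by half of the least significant bit, that is 2^-(frac_bits + 1); outside that range saturation dominates.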
Using Perspective-n-Point Algorithms for a Local Positioning System Based on LEDs and a QADA Receiver
e_Buah Biblioteca Digital Universidad de Alcalá
- Aparicio Esteve, Elena|||0000-0001-7886-312X
- Ureña Ureña, Jesús|||0000-0003-1408-6039
- Hernández Alonso, Álvaro|||0000-0001-9308-8133
- Pizarro Pérez, Daniel|||0000-0003-0622-4884
- Moltó Orozco, David|||0000-0002-2790-3758
Research interest in location-based services has increased in recent years, since 3D centimetre-level accuracy inside intelligent environments became attainable. This work proposes an indoor local positioning system based on LED lighting, transmitted from a set of beacons to a receiver. The receiver is based on a quadrant photodiode angular diversity aperture (QADA) plus an aperture placed over it. This configuration can be modelled as a perspective camera, where the image position of the transmitters can be used to recover the receiver's 3D pose. This process is known as the perspective-n-point (PnP) problem, which is well known in computer vision and photogrammetry. This work investigates the use of different state-of-the-art PnP algorithms to localize the receiver in a large space of 2 × 2 m² based on four co-planar transmitters and with a distance from transmitters to receiver of up to 3.4 m. Encoding techniques are used to permit the simultaneous emission of all the transmitted signals and their processing in the receiver. In addition, correlation techniques (matched filtering) are used to determine the image points projected from each emitter on the QADA. This work uses Monte Carlo simulations to characterize the absolute errors for a grid of test points under noisy measurements, as well as the robustness of the system when varying the 3D location of one transmitter. The IPPE algorithm obtained the best performance in this configuration. The proposal has also been experimentally evaluated in a real setup. The estimation of the receiver's position at three particular points for receiver roll angles of γ = {0°, 120°, 210°, 300°} using the IPPE algorithm achieves average absolute errors of 4.33 cm, 3.51 cm and 28.90 cm, and standard deviations of 1.84 cm, 1.17 cm and 19.80 cm, in the coordinates x, y and z, respectively.
These positioning results are in line with those obtained in previous work using triangulation techniques, with the added benefit that the complete pose of the receiver (x, y, z, α, β, γ) is obtained in this proposal., Agencia Estatal de Investigación, Universidad de Alcalá
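The perspective-camera model that this abstract relies on can be written down in a few lines. The sketch shows only the forward projection that a PnP solver inverts: given a receiver pose, where does each transmitter land on the image plane? The function name, the identity rotation, the 2 m distance and the focal value (standing in for the aperture height) are assumptions for illustration, not values from the paper.

```python
def project(point_world, rotation, translation, focal):
    """Project a 3D point onto the image plane of a pinhole (perspective) model.

    rotation is a 3x3 matrix (list of rows) and translation a 3-vector, so the
    point in receiver coordinates is R * X + t. 'focal' plays the role of the
    aperture height in this sketch.
    """
    xc = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if xc[2] <= 0:
        raise ValueError("point is behind the aperture plane")
    return (focal * xc[0] / xc[2], focal * xc[1] / xc[2])

# Four hypothetical co-planar transmitters (z = 0) seen from 2 m away.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
transmitters = [(1.0, 1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0)]
image_points = [project(p, identity, [0.0, 0.0, 2.0], focal=2.0) for p in transmitters]
```

A PnP algorithm takes `transmitters` and the measured `image_points` and recovers the rotation and translation; OpenCV's `solvePnP`, for instance, offers an IPPE mode for planar point sets.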
SoC Architecture for High-Frequency Acquisition of Household Electric Signals
e_Buah Biblioteca Digital Universidad de Alcalá
- Navarro Pérez, Víctor Manuel
- Diego Otón, Laura de|||0000-0002-4939-2987
- Nieto Capuchino, Rubén|||0000-0002-8293-9665
- Tapiador Luque, Miguel
- Ureña Ureña, Jesús|||0000-0003-1408-6039
- Hernández Alonso, Álvaro|||0000-0001-9308-8133
2024 IEEE International Instrumentation and Measurement Technology Conference (I2MTC 2024), 20-23 May 2024, Glasgow, United Kingdom., Accurate monitoring of current and voltage in residential power grids is crucial for effective energy management and the implementation of smart technologies for non-intrusive monitoring of persons, especially those with cognitive or physical impairments. The emergence of high-speed analog-to-digital converters (ADCs) and wireless Internet-of-Things (IoT) devices has significantly improved measurement rates and real-time data collection capabilities, but the acquisition rates for the correct recognition of some devices in a non-intrusive load monitoring (NILM) application might still be insufficient. In order to address this issue, this work describes a System-on-Chip (SoC) architecture based on a Field-Programmable Gate Array (FPGA) and a dedicated analog conditioning circuit for household electric signal acquisition. The architecture simultaneously captures voltage signals with a bandwidth ranging from 500 Hz to 100 kHz, and current signals from 300 Hz to 100 kHz. It is based on a Zybo Z7-10 development platform along with an AD7476A ADC and a conditioning circuit, which consists of a filtering stage, an isolation transformer and a voltage-level adapter. The system is adequate for real-time household electric signal acquisition and processing, being able to capture high-frequency harmonics correctly, while enabling the possible implementation of further processing, such as the integration of artificial neural networks and/or machine learning algorithms for event classification, load identification, and the implementation of an alarm system for behaviour anomalies., Agencia Estatal de Investigación
Implementing a CNN in FPGA programmable logic for NILM application
e_Buah Biblioteca Digital Universidad de Alcalá
- Tapiador Luque, Miguel
- Diego Otón, Laura De|||0000-0002-4939-2987
- Hernández Alonso, Álvaro|||0000-0001-9308-8133
- Nieto Capuchino, Rubén|||0000-0002-8293-9665
2023 38th Conference on Design of Circuits and Integrated Systems (DCIS), 15-17 November 2023, Málaga, Spain., Non-Intrusive Load Monitoring (NILM) techniques are gaining popularity in the field of energy savings. Generally implemented through the use of smart meters, the main challenge with these devices is that they operate at very low sampling rates. To address this issue, FPGA-based systems have been proposed to capture instantaneous currents and voltages at higher sampling frequencies in the kHz range. However, the limitation of these architectures lies in the fact that the acquired windows are often transmitted upstream to the cloud for the application of load classification algorithms based on machine learning, relying on high-bandwidth communications available onsite. This work proposes an alternative approach by implementing the classification algorithms in the same acquisition and processing system, by using custom convolutional neural networks on mid-range FPGA devices. Brevitas and FINN frameworks are used for the quantization-aware training, as well as for the generation of a peripheral that may be integrated into any FPGA-based SoC (System-on-Chip) architecture. The proposed approach allows the whole processing involved in NILM techniques to be integrated into a single embedded system. Preliminary experimental results demonstrate the effectiveness of the proposed approach., Agencia Estatal de Investigación
An Open-Source VLBI Digital Backend for Low-Cost FPGA-based SoCs
e_Buah Biblioteca Digital Universidad de Alcalá
- Cubero Vacas, Miguel
- Hernández Alonso, Álvaro|||0000-0001-9308-8133
- González, Javier
DCIS2024: 39th Conference on Design of Circuits and Integrated Systems. Catania, Italy, November 13-15, 2024., Very Long Baseline Interferometry (VLBI) techniques have become crucial in radio astronomy and geodesy due to the major improvements that they provide to the observation outcomes, at the expense of requiring a significant computational load to implement the corresponding processing in real time. The devices responsible for these digital signal processing tasks in the application field, mainly based on Field-Programmable Gate Arrays (FPGAs), are known as Digital Backends. Currently, state-of-the-art digital backends are costly, and their FPGA designs either require expensive software licenses or are released as proprietary solutions. This work proposes a novel SoC (System-on-Chip) architecture, based on a Xilinx Zynq device, for a digital backend focused on VLBI applications. The architecture is capable of managing the acquisition stage at the required data rates, and packing data in the well-known VDIF format to upload them. By offering a low-cost open-source architecture, it becomes a suitable solution for dissemination purposes, as well as for specific research fields. Furthermore, the system has been successfully validated through its implementation and testing in some preliminary experimental setups., Agencia Estatal de Investigación
Label augmentation to improve generalization of deep learning semantic segmentation of laparoscopic images
e_Buah Biblioteca Digital Universidad de Alcalá
- Monasterio Expósito, Leticia
- Pizarro Pérez, Daniel|||0000-0003-0622-4884
- Macías Guarasa, Javier|||0000-0002-3303-3963
Training Deep Neural Networks to solve semantic segmentation is a challenging problem with small-size labeled datasets, leading to overfitting. This is especially problematic in medical images, and in particular, in laparoscopic surgery images. In this context, ground-truth segmentation labels are available only for a small set of images from few patients. Besides, inter-patient variability is very high in practice. Models trained for a specific setup and a set of patients usually perform poorly when deployed in a new environment. This work proposes a new training strategy that improves the generalization accuracy of current state-of-the-art semantic segmentation methods applied to laparoscopic images. Our approach is based on training a discriminator network, which learns to detect segmentation errors, producing a dense segmentation error map. Unlike in adversarial networks, we train the discriminator offline by synthetically altering ground-truth segmentation labels with simple morphological and geometric operations. We then use the discriminator to train a segmentation neural network, by minimizing the discriminator's predicted error jointly with a standard segmentation loss. This strategy results in segmentation models that are significantly more accurate when tested on unseen images than those relying only on data augmentation. This technique is well suited to boosting the performance of any state-of-the-art segmentation network and can be combined with other data augmentation strategies. This paper evaluates and validates our proposal by training and testing common state-of-the-art segmentation models on publicly available semantic segmentation datasets, specialized in laparoscopic and endoscopic surgery.
The results show that our methods are effective, obtaining a significant improvement in terms of segmentation accuracy, especially in challenging small-size datasets., Agencia Estatal de Investigación, Ministerio de Economía y Competitividad, Universidad de Alcalá
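The offline label-alteration step described in this abstract can be illustrated with a toy example. The sketch is not the authors' pipeline: it assumes a single binary class, uses a one-pixel cross-shaped dilation as the only morphological operation, and the function names are hypothetical. It shows how altering a ground-truth mask yields, for free, the dense error map that a discriminator would be trained to predict.

```python
def dilate(mask):
    """Dilate a binary mask (list of rows) with a 3x3 cross-shaped element."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                for dy, dx in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        out[ny][nx] = 1
    return out

def error_map(ground_truth, altered):
    """Mark every pixel where the altered label disagrees with the ground truth."""
    return [
        [int(g != a) for g, a in zip(gt_row, alt_row)]
        for gt_row, alt_row in zip(ground_truth, altered)
    ]

# A hypothetical 3x3 ground-truth mask with a single labeled pixel.
ground_truth = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
altered = dilate(ground_truth)
target_errors = error_map(ground_truth, altered)
```

The pair (`altered`, `target_errors`) is one synthetic training sample for the discriminator; erosion, shifts or other geometric operations would be added in the same way.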
Object Detection for Functional Assessment Applications
e_Buah Biblioteca Digital Universidad de Alcalá
- Melino Carrero, Alessandro
- Nieva Suárez, Álvaro|||0000-0003-3414-6721
- Losada Gutiérrez, Cristina|||0000-0001-9545-327X
- Marrón Romera, Marta|||0000-0001-7723-2262
- Guardiola Luna, Irene|||0000-0002-5218-4361
- Baeza Mas, Javier
Engineering Applications of Neural Networks, EANN 2023, This paper presents a proposal for object detection as a first stage for the analysis of Human-Object Interaction (HOI) in the context of automated functional assessment. The proposed system is based on a two-step strategy. In the first stage, the people in the scene and large objects (table, chairs, etc.) are detected using a pre-trained YOLOv8. Then, a ROI is defined around each person and processed using a custom YOLO to detect small elements (forks, plates, spoons, etc.). Since there are no large image datasets that include all the objects of interest, a new dataset has also been compiled, including images from different sets and improving the available labels. The proposal has been evaluated on the novel dataset, and on different images acquired in the area in which the functional assessment is performed, obtaining promising results., Agencia Estatal de Investigación, Universidad de Alcalá
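The geometry of the two-step strategy in this abstract can be sketched without any detector. The code covers only the bookkeeping around the second stage: growing a person box into a ROI and mapping a small-object box found inside that ROI back to full-image coordinates. The function names, the 50% margin and the box values are assumptions for illustration; the detectors themselves are not reproduced.

```python
def expand_roi(box, margin, image_width, image_height):
    """Grow a person box (x1, y1, x2, y2) by 'margin' (a fraction of its size),
    clipped to the image bounds."""
    x1, y1, x2, y2 = box
    mx = (x2 - x1) * margin
    my = (y2 - y1) * margin
    return (
        max(0, int(x1 - mx)),
        max(0, int(y1 - my)),
        min(image_width, int(x2 + mx)),
        min(image_height, int(y2 + my)),
    )

def to_image_coords(box_in_roi, roi):
    """Translate a box detected inside the ROI crop back to full-image coordinates."""
    offset_x, offset_y = roi[0], roi[1]
    x1, y1, x2, y2 = box_in_roi
    return (x1 + offset_x, y1 + offset_y, x2 + offset_x, y2 + offset_y)

# Hypothetical person box in a 640x480 frame, and a fork found inside the crop.
roi = expand_roi((100, 50, 200, 250), margin=0.5, image_width=640, image_height=480)
fork_in_image = to_image_coords((10, 20, 30, 40), roi)
```

Running the small-object detector on the crop rather than on the full frame keeps small items such as cutlery at a usable resolution, which is presumably the motivation for the second stage.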
Deep Shape-from-Template: Single-image quasi-isometric deformable registration and reconstruction
e_Buah Biblioteca Digital Universidad de Alcalá
- Fuentes Jiménez, David|||0000-0001-6424-4782
- Pizarro Pérez, Daniel|||0000-0003-0622-4884
- Casillas Pérez, David|||0000-0002-5721-1242
- Collins, Toby
- Bartoli, Adrien
Shape-from-Template (SfT) solves 3D vision from a single image and a deformable 3D object model, called a template. Concretely, SfT computes registration (the correspondence between the template and the image) and reconstruction (the depth in camera frame). It constrains the object deformation to quasi-isometry. Real-time and automatic SfT represents an open problem for complex objects and imaging conditions. We present four contributions to address core unmet challenges to realise SfT with a Deep Neural Network (DNN). First, we propose a novel DNN called DeepSfT, which encodes the template in its weights and hence copes with highly complex templates. Second, we propose a semi-supervised training procedure to exploit real data. This is a practical solution to overcome the render gap that occurs when training only with simulated data. Third, we propose a geometry adaptation module to deal with different cameras at training and inference. Fourth, we combine statistical learning with physics-based reasoning. DeepSfT runs automatically and in real-time and we show with numerous experiments and an ablation study that it consistently achieves a lower 3D error than previous work. It outperforms in generalisation and achieves great performance in terms of reconstruction and registration error with wide-baseline, occlusions, illumination changes, weak texture and blur., Agencia Estatal de Investigación, Ministerio de Universidades
Real-time weakly supervised anomaly detection with attention mechanism
e_Buah Biblioteca Digital Universidad de Alcalá
- Sarker, Mohammad Ibrahim
- Marrón Romera, Marta|||0000-0001-7723-2262
- Losada Gutiérrez, Cristina|||0000-0001-9545-327X
2023 International Conference on Information Network and Computer Communications (INCC), 27-29 October 2023, Beijing, China., The use of surveillance cameras for ensuring the safety of people in public spaces, e.g., roads, intersections, banks, shopping malls, etc., has increased in recent years. To detect violence and other abnormal behaviors in these surveillance videos, monitoring and detection need to be intelligent and precise. Manual monitoring of these videos is tiring and prone to errors, so automatic detection of video anomalies is a very important research topic nowadays. Although many methods have been proposed, research is still ongoing for a reliable method for violence detection. Most of the earlier state-of-the-art works suggest that supervised detection performs better in anomaly detection than unsupervised detection. However, given the difficulty of manually annotating large datasets for supervised learning, weakly supervised methods provide an excellent alternative. Generally, in weakly supervised learning, anomaly detection is formulated as a Multiple Instance Learning (MIL) problem. The basis of that technique is to build the training model from labeled bags (comprising multiple instances) rather than labeled individual instances (in the computer vision branch: videos or frames). A bag is annotated as negative if all the instances are negative, while it is marked positive if at least one instance is positive. The problem is that the recognition of positive instances is largely biased by the dominance of negative ones, since in the real world abnormal events are very subtle and the difference between normal and abnormal instances is minimal. Addressing this issue, this paper proposes a weakly supervised learning algorithm with an attention mechanism for anomalous event detection.
Also, a novel ranking loss function is proposed to lessen the number of false negatives in the anomaly detection task and to extend the distance between classification scores of anomalous and normal videos in the learned model. Experimental results of this proposal within UCF-Crime and Shanghai-Tech datasets show that the proposed method outperforms state-of-the-art ones. Moreover, using an attention mechanism enhances feature extraction in the generated model inference. Finally, it is demonstrated that the proposal can be run in real-time, a contribution that can greatly impact real applications in the context of interest., Agencia Estatal de Investigación, Universidad de Alcalá
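The bag-labeling rule and the idea of a ranking loss can be made concrete with a short sketch. Note the swap: the loss below is the plain hinge-style MIL ranking loss that is common in the weakly supervised anomaly detection literature, used here as a stand-in, and not the novel loss proposed in the paper. Function names, scores and the margin are hypothetical.

```python
def bag_label(instance_labels):
    """A bag is positive (1) if at least one instance is positive, else negative (0)."""
    return int(any(instance_labels))

def mil_ranking_loss(anomalous_scores, normal_scores, margin=1.0):
    """Hinge ranking loss between the top-scoring instance of each bag.

    It pushes the highest score in the anomalous (positive) bag above the
    highest score in the normal (negative) bag by at least 'margin'.
    """
    if not anomalous_scores or not normal_scores:
        raise ValueError("both bags must contain at least one instance")
    return max(0.0, margin - max(anomalous_scores) + max(normal_scores))

# Hypothetical per-segment anomaly scores for one anomalous and one normal video.
loss = mil_ranking_loss([0.1, 0.9, 0.2], [0.1, 0.3])
```

Only the bag-level label is needed to compute this loss, which is why video-level annotation is enough for training.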
Weakly-Supervised Deep Shape-from-Template
e_Buah Biblioteca Digital Universidad de Alcalá
- Luengo Sánchez, Sara|||0000-0003-3942-3804
- Fuentes Jiménez, David|||0000-0001-6424-4782
- Losada Gutiérrez, Cristina|||0000-0001-9545-327X
- Pizarro Pérez, Daniel|||0000-0003-0622-4884
- Bartoli, Adrien
We propose WS-DeepSfT, a novel deep learning-based approach to the Shape-from-Template (SfT) problem, which aims at reconstructing the 3D shape of a deformable object from a single RGB image and a template. WS-DeepSfT addresses the limitations of existing SfT techniques by combining a weakly-supervised deep neural network (DNN) for registration and a classical As-Rigid-As-Possible (ARAP) algorithm for 3D reconstruction. Unlike previous deep learning-based SfT methods, which require extensive synthetic data and depth sensors for training, WS-DeepSfT only requires regular RGB video of the deforming object and a segmentation mask to discriminate the object from the background. The registration model is trained without synthetic data, using videos where the object undergoes deformations, while ARAP does not require training and infers the 3D shape in real time with minimal overhead. We show that WS-DeepSfT outperforms the state of the art, in both accuracy and robustness, without requiring depth sensors or synthetic data generation. WS-DeepSfT thus offers a robust, efficient, and scalable approach to SfT, bringing it closer to applications such as augmented reality., Ministerio de Ciencia e Innovación, Ministerio de Universidades
A hybrid cascade-parallel discriminative-generative model for pipeline integrity threat detection in a smart fiber optic surveillance system
Digital.CSIC. Repositorio Institucional del CSIC
- Tejedor, Javier
- Macias-Guarasa, Javier
- Martins, Hugo F.
- Martin-Lopez, Sonia
- Gonzalez-Herraez, Miguel
25 pags., 10 figs., 8 tabs., This paper presents an advanced system for the continuous monitoring of potential threats in a long gas pipeline. For signal acquisition, phase-sensitive optical time domain reflectometry (ϕ-OTDR) technology is employed. Then, pattern recognition strategies are incorporated, which are aimed at identifying threats. To do so, the system integrates a random forest-based approach on top of a multiple-layer perceptron (MLP)-based discriminative approach for feature extraction within a parallel Gaussian Mixture Model (GMM)-Hidden Markov Model (HMM) for pattern classification in a hybrid approach. Subsequently, a system combination strategy, which makes use of the decisions carried out by this hybrid approach, is also presented. This strategy is based on the so-called majority voting technique, which makes use of the output of the classification step from the different feature extraction strategies and the different number of states in the GMM-HMM-based classification. The system is tested on two tasks: (1) Identification of machine and activity, and (2) detection of threats for the pipeline. 
Compared with our previous system, the results of this advanced system show that the hybrid feature extraction and pattern classification achieve statistically significant improvements for both tasks (i.e., 5% of relative improvement for the machine and activity identification task, 1% of relative improvement in the threat detection rate, and 15% of relative improvement in the false alarm rate for the threat detection task)., This work was partially supported by the Ministry of Science, Innovation and Universities of Spain (grant number RTI2018-095324-B-I00). This work was also funded by the Spanish Ministry of Economy and Competitiveness with projects ARTEMISA (TIN2016-80939-R) and HEIMDAL-UAH (TIN2016-75982-C2-1-R), by the Spanish Ministry of Science and Innovation MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR program, with projects PSI (PLEC2021-007875), ATHENA (PID2020-115995RB-I00) and EYEFUL (PID2020-113118RB-C31), and by CAM and UAH under projects ARGOS+ (PIUAH21/IA-016) and CONDORDIA (CM/JIN/2021-015). The authors gratefully acknowledge the computer resources at Artemisa, funded by the European Union ERDF and Comunitat Valenciana as well as the technical support provided by the Instituto de Fisica Corpuscular, IFIC (CSIC-UV). The authors thank Sira E. Palazuelos-Cagigas for her participation in the writing and technical editing of the manuscript., Peer reviewed
DOI: http://hdl.handle.net/10261/373436, https://api.elsevier.com/content/abstract/scopus_id/85193752458
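The majority-voting combination described in the pipeline-monitoring abstract above can be sketched in a few lines. This is a generic illustration, not the authors' system: the function name, the class labels and the tie-breaking rule (the earliest decision among the tied labels wins) are assumptions.

```python
from collections import Counter

def majority_vote(decisions):
    """Return the most frequent label among the classifiers' decisions.

    Ties are broken in favour of the label that appears first in 'decisions'
    (an arbitrary choice made for this sketch).
    """
    if not decisions:
        raise ValueError("at least one decision is required")
    counts = Counter(decisions)
    best = max(counts.values())
    for label in decisions:
        if counts[label] == best:
            return label

# Hypothetical outputs of three classifiers (e.g. different feature sets or
# different numbers of GMM-HMM states) for the same acoustic event.
combined = majority_vote(["excavator", "excavator", "hammer"])
```

Each entry in the input list would come from one configuration of the feature extraction and GMM-HMM classification stages.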