STOCHASTIC MODELS FOR ESTIMATION AND PREDICTION IN MEDICINE AND ENVIRONMENTAL EXTREMES (MODELOS ESTOCASTICOS PARA ESTIMACION Y PREDICCION EN MEDICINA Y EXTREMOS MEDIOAMBIENTALES)

PID2020-116873GB-I00

Funding agency name: Agencia Estatal de Investigación
Funding agency acronym: AEI
Programme: Programa Estatal de Generación de Conocimiento y Fortalecimiento Científico y Tecnológico del Sistema de I+D+i
Subprogramme: Subprograma Estatal de Generación de Conocimiento
Call: Proyectos I+D
Call year: 2020
Managing unit: Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020
Beneficiary institution: UNIVERSIDAD DE ZARAGOZA
Persistent identifier: http://dx.doi.org/10.13039/501100011033

Publications


Designing experiments for estimating an appropriate outlet size for a silo type problem

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • López Fidalgo, Jesús
  • May, Caterina
  • Moler Cuiral, José Antonio
Jam formation is a problem that may occur when granular material is discharged by gravity from a silo. The estimation of the minimum outlet size, which guarantees that the time to the next jamming event is long enough, can be crucial in the industry. The time is modeled by an exponential distribution with two unknown parameters, and this goal translates to precise estimation of a nonlinear transformation of the parameters. We obtain c-optimum experimental designs with that purpose, applying the graphic Elfving method. Because the optimal experimental designs depend on the nominal values of the parameters, we conduct a sensitivity analysis on our dataset. Finally, a simulation study checks the performance of the approximations, first with the Fisher Information matrix, then with the linearization of the function to be estimated. The results are useful for experimenting in a laboratory and then translating the results to a real scenario. From the application we develop a general methodology for estimating a one-dimensional transformation of the parameters of a nonlinear model. The first author was sponsored by Ministerio de Ciencia e Innovación PID2020-113443RB-C21 and the third one by Ministerio de Ciencia e Innovación PID2020-116873GB-I00 and PID2020-114031RB-I00.




A multistate model and its standalone tool to predict hospital and ICU occupancy by patients with COVID-19

Academica-e. Repositorio Institucional de la Universidad Pública de Navarra
  • Lafuente, Miguel
  • López, Francisco Javier
  • Mateo, Pedro
  • Cebrián, Ana Carmen
  • Asín, Jesús
  • Moler Cuiral, José Antonio
  • Borque-Fernando, Ángel
  • Esteban, Luis Mariano
  • Pérez-Palomares, Ana
  • Sanz, Gerardo
Objective: This study aims to build a multistate model and describe a predictive tool for estimating the daily number of intensive care unit (ICU) and hospital beds occupied by patients with coronavirus disease 2019 (COVID-19). Material and methods: The estimation is based on the simulation of patient trajectories using a multistate model where the transition probabilities between states are estimated via competing risks and cure models. The input to the tool includes the dates of COVID-19 diagnosis, admission to hospital, admission to ICU, discharge from ICU and discharge from hospital or death of positive cases from a selected initial date to the current moment. Our tool is validated using 98,496 cases positive for severe acute respiratory syndrome coronavirus 2 extracted from the Aragón Healthcare Records Database from July 1, 2020 to February 28, 2021. Results: The tool demonstrates good performance for the 7- and 14-day forecasts using the actual positive cases, and shows good accuracy among three scenarios corresponding to different stages of the pandemic: 1) up-scenario, 2) peak-scenario and 3) down-scenario. Long-term predictions (two months) also show good accuracy, while those using Holt-Winters positive case estimates revealed acceptable accuracy from day 14 onwards, with relative errors of 8.8%. Discussion: In the era of the COVID-19 pandemic, hospitals must evolve in a dynamic way. Our prediction tool is designed to predict hospital occupancy to improve healthcare resource management without information about the clinical history of patients. Conclusions: Our easy-to-use and freely accessible tool (https://github.com/peterman65) shows good performance and accuracy for forecasting the daily number of hospital and ICU beds required for patients with COVID-19. This work was supported by Gobierno de Aragón [E46-20R] and Ministerio de Ciencia e Innovación [PID2020-116873GB-I00].
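
The idea of simulating patient trajectories to obtain bed occupancy can be sketched with a much simpler stand-in than the model in the abstract: a discrete-time Markov chain with fixed daily transition probabilities. Everything in this sketch is an assumption made for illustration (the state names, the probabilities in `TRANSITIONS`, and the function names); the paper estimates transitions with competing risks and cure models, which this code does not attempt.

```python
import random

# Hypothetical daily transition probabilities, invented for illustration.
# The paper estimates transitions with competing-risks and cure models.
TRANSITIONS = {
    "diagnosed": [("hospital", 0.05), ("recovered", 0.10)],
    "hospital": [("icu", 0.03), ("discharged", 0.10), ("dead", 0.01)],
    "icu": [("hospital", 0.06), ("dead", 0.02)],
}

def step(state, rng):
    """Move one patient forward by one day."""
    u, cumulative = rng.random(), 0.0
    for target, prob in TRANSITIONS.get(state, []):
        cumulative += prob
        if u < cumulative:
            return target
    return state  # no transition today; absorbing states always end up here

def simulate_occupancy(n_patients, n_days, seed=0):
    """Return daily counts of occupied hospital-ward and ICU beds."""
    rng = random.Random(seed)
    states = ["diagnosed"] * n_patients
    ward, icu = [], []
    for _ in range(n_days):
        states = [step(s, rng) for s in states]
        ward.append(states.count("hospital"))
        icu.append(states.count("icu"))
    return ward, icu

ward, icu = simulate_occupancy(n_patients=1000, n_days=30)
print(ward[-1], icu[-1])
```

Repeating the simulation with different seeds would give a spread of occupancy curves rather than a single forecast, which is the usual reason for simulating trajectories instead of computing expected counts directly.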




Teaching Urology to Undergraduates: A Prospective Survey of What General Practitioners Need to Know

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Borque-Fernando, Ángel
  • Redondo-Redondo, Cristina
  • Orna-Montesinos, Concepción
  • Esteban, Luis Mariano
  • Denizón-Arranz, Sophia
  • Tejero-Sánchez, Arlanza
  • García-Ruiz, Ramiro
  • Sanchez-Zalabardo, José Manuel
  • Gracia-Romero, Jesús
  • Monreal-Híjar, Antonio
  • Gil-Sanz, María Jesús
  • Sanz, Gerardo
  • Sanz-Pozo, Mónica
  • Romero-Fernández, Francisco
Background: Higher education training in Medicine has considerably evolved in recent years. One of its main goals has been to ensure the training of students as future adequately qualified general practitioners (GPs). Tools need to be developed to evaluate and improve the teaching of Urology at the undergraduate level. Our objective is to identify the knowledge and skills needed in Urology for the real clinical practice of GPs. Methods: An anonymous self-administered survey was carried out among GPs of Primary Care and Emergencies which sought to evaluate urological knowledge and necessary urological skills. The results of the survey were exported and descriptive statistics were performed using IBM SPSS Statistics version 19.0. Results and limitations: A total of 127 answers were obtained, in which ‘Urological infections’, ‘Renal colic’, ‘PSA levels and screening for prostate cancer’, ‘Benign prostatic hyperplasia’, ‘Hematuria’, ‘Scrotal pain’, ‘Prostate cancer diagnosis’, ‘Bladder cancer diagnosis’, ‘Urinary incontinence’, and ‘Erectile dysfunction’ were rated as Very high or High formative requirements (>75%). Regarding urological skills, ‘Abdominal examination’, ‘Interpretation of urinalysis’, ‘Digital rectal examination’, ‘Genital examination’, and ‘Transurethral catheterization’ were assessed as needing Very high or High training in more than 80% of the surveys. The relevance of urological pathology in clinical practice was viewed as Very high or High in more than 80% of the responses. Conclusions: This study has shown helpful results to establish a differentiated prioritization of urological knowledge and skills in Primary Care and Emergencies. Efforts should be aimed at optimizing the teaching in Urology within the Degree of Medicine which consistently ensures patients’ proper care by future GPs.




Spatio-temporal analysis of the extent of an extreme heat event

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Cebrián, Ana C.
  • Asín, Jesús
  • Gelfand, Alan E.
  • Schliep, Erin M.
  • Castillo-Mateo, Jorge
  • Beamonte, María A.
  • Abaurrea, Jesús
Evidence of global warming induced from the increasing concentration of greenhouse gases in the atmosphere suggests more frequent warm days and heat waves. The concept of an extreme heat event (EHE), defined locally based on exceedance of a suitable local threshold, enables us to capture the notion of a period of persistent extremely high temperatures. Modeling for extreme heat events is customarily implemented using time series of temperatures collected at a set of locations. Since spatial dependence is anticipated in the occurrence of EHE’s, a joint model for the time series, incorporating spatial dependence is needed. Recent work by Schliep et al. (J R Stat Soc Ser A Stat Soc 184(3):1070–1092, 2021) develops a space-time model based on a point-referenced collection of temperature time series that enables the prediction of both the incidence and characteristics of EHE’s occurring at any location in a study region. The contribution here is to introduce a formal definition of the notion of the spatial extent of an extreme heat event and then to employ output from the Schliep et al. (J R Stat Soc Ser A Stat Soc 184(3):1070–1092, 2021) modeling work to illustrate the notion. For a specified region and a given day, the definition takes the form of a block average of indicator functions over the region. Our risk assessment examines extents for the Comunidad Autónoma de Aragón in northeastern Spain. We calculate daily, seasonal and decadal averages of the extents for two subregions in this comunidad. We generalize our definition to capture extents of persistence of extreme heat and make comparisons across decades to reveal evidence of increasing extent over time.
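
The block-average definition of spatial extent mentioned above can be illustrated with a toy calculation: the extent for a day is the fraction of locations whose temperature exceeds the local threshold. The cell temperatures, the thresholds, and the function name `spatial_extent` are hypothetical; the paper applies the definition to output from a fitted space-time model, not to a handful of raw values.

```python
def spatial_extent(temps, thresholds):
    """Fraction of locations whose temperature exceeds its local threshold.

    temps and thresholds hold one value per location (e.g. grid cell).
    """
    if len(temps) != len(thresholds):
        raise ValueError("temps and thresholds must have the same length")
    # Indicator of an exceedance at each location.
    indicators = [t > u for t, u in zip(temps, thresholds)]
    # Block average of the indicators over the region.
    return sum(indicators) / len(indicators)

# Hypothetical values for six grid cells on one day (degrees Celsius).
temps = [36.2, 37.5, 34.0, 38.1, 33.9, 35.5]
thresholds = [36.0, 36.0, 35.0, 37.0, 35.0, 35.0]
extent = spatial_extent(temps, thresholds)
print(extent)
```

Averaging such daily extents over a season or a decade gives the kind of summaries the abstract describes.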




Machine learning algorithm to predict acidemia using electronic fetal monitoring recording parameters

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Esteban-Escaño, Javier
  • Castán, Berta
  • Castán, Sergio
  • Chóliz-Ezquerro, Marta
  • Asensio, César
  • Laliena, Antonio R.
  • Sanz-Enguita, Gerardo
  • Sanz, Gerardo
  • Esteban, Luis Mariano
  • Savirón, Ricardo
Background: Electronic fetal monitoring (EFM) is the universal method for the surveillance of fetal well-being in intrapartum. Our objective was to predict acidemia from fetal heart signal features using machine learning algorithms. Methods: A case–control 1:2 study was carried out comprising 378 infants, born in the Miguel Servet University Hospital, Spain. Neonatal acidemia was defined as pH < 7.10. Using EFM recordings, logistic regression, random forest and neural network models were built to predict acidemia. Validation of models was performed by means of discrimination, calibration, and clinical utility. Results: Best performance was attained using a random forest model built with 100 trees. The discrimination ability was good, with an area under the Receiver Operating Characteristic curve (AUC) of 0.865. The calibration showed a slight overestimation of acidemia occurrence for probabilities above 0.4. The clinical utility showed that for a 33% cutoff point, missing 5% of acidotic cases, 46% of unnecessary cesarean sections could be prevented. Logistic regression and neural networks showed similar discrimination ability but with worse calibration and clinical utility. Conclusions: The combination of the variables extracted from EFM recording provided a predictive model of acidemia that showed good accuracy and provides a practical tool to prevent unnecessary cesarean sections.




Testosterone recovery after androgen deprivation therapy in prostate cancer: building a predictive model

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Borque Fernando, Á.
  • Estrada-Domínguez, F.
  • Esteban, L. M.
  • Gil Sanz, M. J.
  • Sanz, G.
Purpose: To analyze the variability, associated factors, and the design of nomograms for individualized testosterone recovery after cessation of androgen deprivation therapy (ADT). Materials and Methods: A longitudinal study was carried out with 208 patients in the period 2003 to 2019. Castrated and normogonadic testosterone levels were defined as 0.5 and 3.5 ng/mL, respectively. The cumulative incidence curve described the recovery of testosterone. Univariate and multivariate analyses were performed to predict testosterone recovery, with the following candidate prognostic factors: prostate-specific antigen at diagnosis, clinical stage, Gleason score from biopsy, age at cessation of ADT, duration of ADT, primary therapy and use of LHRH (luteinizing hormone-releasing hormone) agonists. Results: The median follow-up duration in the study was 80 months (interquartile range, 49–99 mo). Twenty-five percent and 81% of patients did not recover the castrate and normogonadic levels, respectively. Duration of ADT and age at ADT cessation were significant predictors of testosterone recovery. We built two nomograms for testosterone recovery at 12, 24, 36, and 60 months. The castration recovery model had good calibration. The C-index was 0.677, with area under the receiver operating characteristic curve (AUC-ROC) of 0.736, 0.783, 0.782, and 0.780 at 12, 24, 36, and 60 months, respectively. The normogonadic recovery model overestimated the higher values of probability of recovery. The C-index was 0.683, with AUC values of 0.812, 0.711, 0.708 and 0.693 at 12, 24, 36, and 60 months, respectively. Conclusions: Depending on the age of the patient and the length of treatment, clinicians may stop ADT and the castrated testosterone level will be maintained or, if the course of treatment has been short, we can estimate if it will return to normogonadic levels.




A stepwise algorithm for linearly combining biomarkers under Youden Index maximisation

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Aznar-Gimeno, Rocío
  • Esteban, Luis M.
  • Hoyo-Alonso, Rafael del
  • Borque-Fernando, Ángel
  • Sanz, Gerardo
Combining multiple biomarkers to provide predictive models with a greater discriminatory ability is a discipline that has received attention in recent years. Choosing the probability threshold that corresponds to the highest combined marker accuracy is key in disease diagnosis. The Youden index is a statistical metric that provides an appropriate synthetic index for diagnostic accuracy and a good criterion for choosing a cut-off point to dichotomize a biomarker. In this study, we present a new stepwise algorithm for linearly combining continuous biomarkers to maximize the Youden index. To investigate the performance of our algorithm, we analyzed a wide range of simulated scenarios and compared its performance with that of five other linear combination methods in the literature (a stepwise approach introduced by Yin and Tian, the min-max approach, logistic regression, a parametric approach under multivariate normality and a non-parametric kernel smoothing approach). The results show that our proposed stepwise approach performs similarly to the other algorithms in normal simulated scenarios and outperforms all of them in non-normal simulated scenarios. In scenarios of biomarkers with the same means and a different covariance matrix for the diseased and non-diseased population, the min-max approach outperforms the rest. The methods were also applied to two real datasets (to discriminate Duchenne muscular dystrophy and prostate cancer); our algorithm showed a higher predictive ability on the prostate cancer dataset.
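
For readers unfamiliar with the criterion, the Youden index of a single marker is the maximum over cut-offs of sensitivity + specificity − 1. The sketch below computes it empirically for one marker; it does not implement the stepwise combination algorithm of the paper. The marker values are invented, and the convention that higher scores indicate disease is an assumption.

```python
def youden_index(scores_diseased, scores_healthy):
    """Return (J, cutoff) maximising sensitivity + specificity - 1.

    A subject is classified as diseased when score >= cutoff.
    """
    best_j, best_cut = -1.0, None
    # Only observed values need to be tried as cut-offs.
    for cut in sorted(set(scores_diseased) | set(scores_healthy)):
        sens = sum(s >= cut for s in scores_diseased) / len(scores_diseased)
        spec = sum(s < cut for s in scores_healthy) / len(scores_healthy)
        j = sens + spec - 1.0
        if j > best_j:
            best_j, best_cut = j, cut
    return best_j, best_cut

# Hypothetical marker values, for illustration only.
diseased = [2.1, 3.4, 3.9, 4.2, 5.0]
healthy = [1.0, 1.8, 2.5, 2.9, 3.0]
j, cutoff = youden_index(diseased, healthy)
print(j, cutoff)
```

A combination method then searches for coefficients such that the combined score, treated as a single marker, makes this quantity as large as possible.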




Spatial Modeling of Day-Within-Year Temperature Time Series: An Examination of Daily Maximum Temperatures in Aragon, Spain

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Castillo-Mateo, Jorge
  • Lafuente, Miguel
  • Asin, Jesus
  • Cebrian, Ana C.
  • Gelfand, Alan E.
  • Abaurrea, Jesus
Acknowledging a considerable literature on modeling daily temperature data, we propose a multi-level spatiotemporal model which introduces several innovations in order to explain the daily maximum temperature in the summer period over 60 years in a region containing Aragon, Spain. The model operates over continuous space but adopts two discrete temporal scales, year and day within year. It captures temporal dependence through autoregression on days within year and also on years. Spatial dependence is captured through spatial process modeling of intercepts, slope coefficients, variances, and autocorrelations. The model is expressed in a form which separates fixed effects from random effects and also separates space, years, and days for each type of effect. Motivated by exploratory data analysis, fixed effects to capture the influence of elevation, seasonality, and a linear trend are employed. Pure errors are introduced for years, for locations within years, and for locations at days within years. The performance of the model is checked using a leave-one-out cross-validation. Applications of the model are presented including prediction of the daily temperature series at unobserved or partially observed sites and inference to investigate climate change comparison. Supplementary materials accompanying this paper appear online.




Personalized Model to Predict Small for Gestational Age at Delivery Using Fetal Biometrics, Maternal Characteristics, and Pregnancy Biomarkers: A Retrospective Cohort Study of Births Assisted at a Spanish Hospital

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Dieste-Pérez, Peña
  • Savirón-Cornudella, Ricardo
  • Tajada-Duaso, Mauricio
  • Pérez-López, Faustino R.
  • Castán-Mateo, Sergio
  • Sanz, Gerardo
  • Esteban, Luis Mariano
Small for gestational age (SGA) is defined as a newborn with a birth weight for gestational age < 10th percentile. Routine third-trimester ultrasound screening for fetal growth assessment has detection rates (DR) from 50 to 80%. For this reason, the addition of other markers is being studied, such as maternal characteristics, biochemical values, and biophysical models, in order to create personalized combinations that can increase the predictive capacity of the ultrasound. With this purpose, this retrospective cohort study of 12,912 cases aims to compare the potential value of third-trimester screening, based on estimated weight percentile (EPW), by universal ultrasound at 35–37 weeks of gestation, with a combined model integrating maternal characteristics and biochemical markers (PAPP-A and β-HCG) for the prediction of SGA newborns. We observed that DR improved from 58.9% with the EPW alone to 63.5% with the predictive model. Moreover, the AUC for the multivariate model was 0.882 (95% C.I.: 0.873–0.891), showing a statistically significant difference with EPW alone (AUC 0.864 (95% C.I.: 0.854–0.873)). Although the improvements were modest, contingent detection models appear to be more sensitive than third-trimester ultrasound alone at predicting SGA at delivery.




Distribution-free changepoint detection tests based on the breaking of records

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Castillo Mateo, Jorge
The analysis of record-breaking events is of interest in fields such as climatology, hydrology or anthropology. In connection with the record occurrence, we propose three distribution-free statistics for the changepoint detection problem. They are CUSUM-type statistics based on the upper and/or lower record indicators observed in a series. Using a version of the functional central limit theorem, we show that the CUSUM-type statistics are asymptotically Kolmogorov distributed. The main results under the null hypothesis are based on series of independent and identically distributed random variables, but a statistic to deal with series with seasonal component and serial correlation is also proposed. A Monte Carlo study of size, power and changepoint estimate has been performed. Finally, the methods are illustrated by analyzing the time series of temperatures at Madrid, Spain. The R package RecordTest publicly available on CRAN implements the proposed methods.
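
The record indicators behind such statistics are easy to compute. For independent and identically distributed continuous observations, the probability that the observation at time t is an upper record is 1/t, so the cumulative sum of indicator minus 1/t should hover around zero when there is no change. The sketch below shows only this basic ingredient on invented data; it is not the test statistic of the paper, and it does not use the RecordTest package.

```python
def upper_record_indicators(series):
    """Return 1 where an observation strictly exceeds all previous ones, else 0."""
    flags, current_max = [], float("-inf")
    for x in series:
        is_record = x > current_max
        flags.append(int(is_record))
        if is_record:
            current_max = x
    return flags

def centred_cumsum(flags):
    """Cumulative sum of I_t - 1/t, where 1/t is the i.i.d. record probability."""
    out, total = [], 0.0
    for t, flag in enumerate(flags, start=1):
        total += flag - 1.0 / t
        out.append(total)
    return out

# Hypothetical series with an upward drift, for illustration only.
series = [3.1, 2.7, 3.5, 3.3, 4.0, 4.4, 4.1, 5.2]
flags = upper_record_indicators(series)
path = centred_cumsum(flags)
print(flags)
print(path[-1])
```

A path that drifts steadily upwards, as with this toy series, means more records than the stationary benchmark predicts.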




Study of entrepreneurial intention among students of the Degree in Social Work (Estudio sobre la intención emprendedora en estudiantes del grado en trabajo social)

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • García Uceda, Esperanza
This paper presents a study of the determinants of entrepreneurial intention (EI) among university students of the Degree in Social Work at the start of their entrepreneurship education. The data were obtained from a sample of students enrolled in the course “Gestión de las Organizaciones” (Organizational Management), taught in the third year of the degree at the Universidad de Zaragoza (Spain). Of the 184 students enrolled, 139 participated. A logistic regression model was used to analyse the relationship between these factors and the binary variable expressing the student's EI. Responsibility, Creativity and Self-knowledge were found to have a significant, positive effect.




Epidemiology, Diagnosis and Management of Penile Cancer: Results from the Spanish National Registry of Penile Cancer

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Borque-Fernando, Ángel
  • Gaya, Josep Maria
  • Esteban-Escaño, Luis Mariano
  • Gómez-Rivas, Juan
  • García-Baquero, Rodrigo
  • Agreda-Castañeda, Fernando
  • Gallioli, Andrea
  • Verri, Paolo
  • Ortiz-Vico, Francisco Javier
  • Amir-Nicolau, Balig Fawwaz
  • Osman-Garcia, Ignacio
  • Gil-Martínez, Pedro
  • Arrabal-Martín, Miguel
  • Gómez-Ferrer Lozano, Álvaro
  • Campos-Juanatey, Felix
  • Guerrero-Ramos, Félix
  • Rubio-Briones, José
Introduction: Penile cancer (PC) is a rare malignancy with an overall incidence in Europe of 1/100,000 males/year. In Europe, few studies report the epidemiology, risk factors, clinical presentation, and treatment of PC. The aim of this study is to present an updated outlook on the aforementioned factors of PC in Spain. Materials and Methods: A multicentric, retrospective, observational epidemiological study was designed, and patients with a new diagnosis of PC in 2015 were included. Patients were anonymously identified from the Register of Specialized Care Activity of the Ministry of Health of Spain. All Spanish hospitals recruiting patients in 2015 were invited to participate in the present study. We have followed a descriptive narration of the observed data. Continuous and categorical data were reported by median (p25th–p75th range) and absolute and relative frequencies, respectively. The incidence map shows differences between Spanish regions. Results: The incidence of PC in Spain in 2015 was 2.55/100,000 males per year. A total of 586 patients were identified, and 228 patients from 61 hospitals were included in the analysis. A total of 54/61 (88.5%) centers reported ≤ 5 new cases. The patients accessed the urologist for visually-assessed penile lesions (60.5%), mainly localized in the glans (63.6%). Local hygiene, smoking habits, sexual habits, HPV exposure, and history of penile lesions were reported in 48.2%, 59.6%, 25%, 13.2%, and 69.7% of cases, respectively. HPV-positive lesions were 18.1% (28.6% HPV-16). The majority of PC was squamous carcinoma (95.2%). PC was ≥cT2 in 45.2% (103/228) cases. At final pathology, PC was ≥pT2 in 51% of patients and ≥pN1 in 17% of cases. The most common local treatment was partial penectomy (46.9% cases). A total of 47/55 (85.5%) inguinal lymphadenectomies were open. Patients with ≥pN1 disease were treated with chemotherapy in 12/39 (40.8%) of cases. Conclusions: PC incidence is relatively high in Spain compared to other European countries. The risk factors for PC are usually misreported. The diagnosis and management of PC are suboptimal, encouraging the identification of referral centers for PC management.




Bayesian variable selection in generalized extreme value regression: modeling annual maximum temperature

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Castillo Mateo, Jorge
  • Asín, Jesús
  • Cebrián, Ana C.
  • Mateo Lázaro, Jesús
  • Abaurrea, Jesús
In many applications, interest focuses on assessing relationships between covariates and the extremes of the distribution of a continuous response. For example, in climate studies, a usual approach to assess climate change has been based on the analysis of annual maximum data. Using the generalized extreme value (GEV) distribution, we can model trends in the annual maximum temperature using the high number of available atmospheric covariates. However, there is typically uncertainty in which of the many candidate covariates should be included. Bayesian methods for variable selection are very useful to identify important covariates. However, such methods are currently very limited for moderately high dimensional variable selection in GEV regression. We propose a Bayesian method for variable selection based on a stochastic search variable selection (SSVS) algorithm proposed for posterior computation. The method is applied to the selection of atmospheric covariates in annual maximum temperature series in three Spanish stations.




A multistate model and its standalone tool to predict hospital and ICU occupancy by patients with COVID-19

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Lafuente, M.
  • López, F. J.
  • Mateo, P. M.
  • Cebrián, A. C.
  • Asín, J.
  • Moler, J. A.
  • Borque-Fernando, Á.
  • Esteban, L. M.
  • Pérez-Palomares, A.
  • Sanz, G.
Objective: This study aims to build a multistate model and describe a predictive tool for estimating the daily number of intensive care unit (ICU) and hospital beds occupied by patients with coronavirus disease 2019 (COVID-19). Material and methods: The estimation is based on the simulation of patient trajectories using a multistate model where the transition probabilities between states are estimated via competing risks and cure models. The input to the tool includes the dates of COVID-19 diagnosis, admission to hospital, admission to ICU, discharge from ICU and discharge from hospital or death of positive cases from a selected initial date to the current moment. Our tool is validated using 98,496 cases positive for severe acute respiratory syndrome coronavirus 2 extracted from the Aragón Healthcare Records Database from July 1, 2020 to February 28, 2021. Results: The tool demonstrates good performance for the 7- and 14-day forecasts using the actual positive cases, and shows good accuracy among three scenarios corresponding to different stages of the pandemic: 1) up-scenario, 2) peak-scenario and 3) down-scenario. Long-term predictions (two months) also show good accuracy, while those using Holt-Winters positive case estimates revealed acceptable accuracy from day 14 onwards, with relative errors of 8.8%. Discussion: In the era of the COVID-19 pandemic, hospitals must evolve in a dynamic way. Our prediction tool is designed to predict hospital occupancy to improve healthcare resource management without information about the clinical history of patients. Conclusions: Our easy-to-use and freely accessible tool (https://github.com/peterman65) shows good performance and accuracy for forecasting the daily number of hospital and ICU beds required for patients with COVID-19.




Comparing the Min–Max–Median/IQR Approach with the Min–Max Approach, Logistic Regression and XGBoost, maximising the Youden index

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Aznar-Gimeno, Rocío
  • Esteban, Luis M.
  • Sanz, Gerardo
  • Hoyo-Alonso, Rafael del
Although linearly combining multiple variables can provide adequate diagnostic performance, certain algorithms have the limitation of being computationally demanding when the number of variables is sufficiently high. Liu et al. proposed the min–max approach that linearly combines the minimum and maximum values of biomarkers, which is computationally tractable and has been shown to be optimal in certain scenarios. We developed the Min–Max–Median/IQR algorithm under Youden index optimisation which, although more computationally intensive, is still approachable and includes more information. The aim of this work is to compare the performance of these algorithms with well-known Machine Learning algorithms, namely logistic regression and XGBoost, which have proven to be efficient in various fields of applications, particularly in the health sector. This comparison is performed on a wide range of different scenarios of simulated symmetric or asymmetric data, as well as on real clinical diagnosis data sets. The results indicate which algorithms perform better in binary classification problems, depending on the scenario.
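
The min–max idea reduces each subject's biomarkers to two numbers, the maximum and the minimum, and combines them linearly. A minimal sketch follows, assuming the combined score is max + alpha × min and that alpha is chosen by a plain grid search to maximise the empirical Youden index. The grid search, the data, and the function names are assumptions for illustration, not the procedure or the results of the paper.

```python
def youden(pos, neg):
    """Maximum of sensitivity + specificity - 1 over all observed cut-offs."""
    best = 0.0
    for cut in set(pos) | set(neg):
        sens = sum(s >= cut for s in pos) / len(pos)
        spec = sum(s < cut for s in neg) / len(neg)
        best = max(best, sens + spec - 1.0)
    return best

def min_max_score(markers, alpha):
    """Combine one subject's biomarkers as max + alpha * min."""
    return max(markers) + alpha * min(markers)

def youden_for_alpha(alpha, diseased, healthy):
    """Youden index of the combined score for a given alpha."""
    pos = [min_max_score(m, alpha) for m in diseased]
    neg = [min_max_score(m, alpha) for m in healthy]
    return youden(pos, neg)

def fit_alpha(diseased, healthy, grid):
    """Pick the alpha on the grid that maximises the Youden index."""
    return max(grid, key=lambda a: youden_for_alpha(a, diseased, healthy))

# Hypothetical subjects with three biomarkers each, for illustration only.
diseased = [[2.0, 3.1, 1.5], [2.8, 2.9, 2.2], [3.5, 1.9, 2.4], [2.6, 3.3, 3.0]]
healthy = [[1.2, 2.0, 0.9], [2.1, 1.4, 1.1], [1.9, 2.6, 1.0], [1.5, 1.7, 1.6]]
grid = [k / 10 for k in range(-10, 11)]
best_alpha = fit_alpha(diseased, healthy, grid)
best_j = youden_for_alpha(best_alpha, diseased, healthy)
print(best_alpha, best_j)
```

Because only one coefficient is searched regardless of how many biomarkers there are, the cost stays low as the number of variables grows, which is the tractability advantage the abstract refers to.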




RecordTest: an R package to analyze non-stationarity in the extremes based on record-breaking events

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Castillo-Mateo, Jorge
  • Cebrián, Ana C.
  • Asín, Jesús
The study of non-stationary behavior in the extremes is important to analyze data in environmental sciences, climate, finance, or sports. As an alternative to the classical extreme value theory, this analysis can be based on the study of record-breaking events. The R package RecordTest provides a useful framework for non-parametric analysis of non-stationary behavior in the extremes, based on the analysis of records. The underlying idea of all the non-parametric tools implemented in the package is to use the distribution of the record occurrence under series of independent and identically distributed continuous random variables, to analyze if the observed records are compatible with that behavior. Two families of tests are implemented. The first only requires the record times of the series, while the second includes more powerful tests that join the information from different types of records: upper and lower records in the forward and backward series. The package also offers functions that cover all the steps in this type of analysis such as data preparation, identification of the records, exploratory analysis, and complementary graphical tools. The applicability of the package is illustrated with the analysis of the effect of global warming on the extremes of the daily maximum temperature series in Zaragoza, Spain.




Machine learning algorithms combining slope deceleration and fetal heart rate features to predict acidemia

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Esteban, Luis Mariano
  • Castán, Berta
  • Esteban-Escaño, Javier
  • Sanz-Enguita, Gerardo
  • Laliena, Antonio R.
  • Lou-Mercadé, Ana Cristina
  • Chóliz-Ezquerro, Marta
  • Castán, Sergio
  • Savirón-Cornudella, Ricardo
Electronic fetal monitoring (EFM) is widely used in intrapartum care as the standard method for monitoring fetal well-being. Our objective was to employ machine learning algorithms to predict acidemia by analyzing specific features extracted from the fetal heart signal within a 30 min window, with a focus on the last deceleration occurring closest to delivery. To achieve this, we conducted a case–control study involving 502 infants born at Miguel Servet University Hospital in Spain, maintaining a 1:1 ratio between cases and controls. Neonatal acidemia was defined as a pH level below 7.10 in the umbilical arterial blood. We constructed logistic regression, classification trees, random forest, and neural network models by combining EFM features to predict acidemia. Model validation included assessments of discrimination, calibration, and clinical utility. Our findings revealed that the random forest model achieved the highest area under the receiver operating characteristic curve (AUC) of 0.971, but logistic regression had the best specificity, 0.879, for a sensitivity of 0.95. In terms of clinical utility, implementing a cutoff point of 31% in the logistic regression model would prevent unnecessary cesarean sections in 51% of cases while missing only 5% of acidotic cases. By combining the extracted variables from EFM recordings, we provide a practical tool to assist in avoiding unnecessary cesarean sections.




Statistical analysis of extreme and record-breaking daily maximum temperatures in peninsular Spain during 1960–2021

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Castillo-Mateo, Jorge
  • Cebrián, Ana C.
  • Asín, Jesús
This work analyses the effects of global warming on the upper extremes of daily temperature series over Spain. This objective requires specific analysis, since the time evolution of the mean temperature is not always parallel to the evolution of the extremes. We propose the use of several record tests to study the behavior of the extreme and record-breaking events in different temperature signals, at different time and spatial scales. The underlying idea of the tests is to compare the occurrence of the extreme events in the observed series with the occurrence in a stationary climate. Given that under global warming an increasing trend, or an increasing variability, can be expected, the alternative is that the probability of the extremes is higher than in a stationary climate. Some of the tests, based on a permutation approach, can be applied to sets of correlated series, and this allows the analysis of short periods of time and regional analysis, where series are measured on nearby days and/or at nearby locations. Using these tests, we evaluate and compare the effects of climate change on temperature extreme and record-breaking events using 36 series of daily maximum temperature from 1960 to 2021, all over peninsular Spain. We also compare the behavior in different Spanish regions, in different periods of the year, and in different signals such as the annual maximum temperature. Significant evidence of the effect of an increasing trend on the occurrence of upper extremes is found in most of Spain. The effects are heterogeneous within the year: they are weakest in autumn and strongest in summer. Concerning spatial variability, the effects are clearest in the Mediterranean region and least clear in the North Atlantic region.




Spatial quantile autoregression for season within year daily maximum temperature data

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Castillo-Mateo, Jorge
  • Asín, Jesús
  • Cebrián, Ana C.
  • Gelfand, Alan E.
  • Abaurrea, Jesús
Regression is the most widely used modeling tool in statistics. Quantile regression offers a strategy for enhancing the regression picture beyond customary mean regression. With time-series data, we move to quantile autoregression and, finally, with spatially referenced time series, we move to space-time quantile regression. Here, we are concerned with the spatiotemporal evolution of daily maximum temperature, particularly with regard to extreme heat. Our motivating data set is 60 years of daily summer maximum temperature data over Aragón in Spain. Hence, we work with time on two scales—days within summer season across years—collected at geocoded station locations. For a specified quantile, we fit a very flexible, mixed-effects autoregressive model, introducing four spatial processes. We work with asymmetric Laplace errors to take advantage of the available conditional Gaussian representation for these distributions. Further, while the autoregressive model yields conditional quantiles, we demonstrate how to extract marginal quantiles with the asymmetric Laplace specification. Thus, we are able to interpolate quantiles for any days within years across our study region.
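The conditional Gaussian representation of asymmetric Laplace errors mentioned in this abstract can be sketched as follows. This is an illustration of the standard mixture representation for a unit-scale error, not the authors' spatial model; the function names are assumptions. An asymmetric Laplace error whose tau-quantile is zero can be written as theta * z + psi * sqrt(z) * w, with z exponential with mean 1, w standard normal, theta = (1 - 2 tau) / (tau (1 - tau)) and psi^2 = 2 / (tau (1 - tau)). Conditional on z, the error is Gaussian.

```python
import math
import random


def check_loss(u, tau):
    """Quantile (check) loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def draw_asymmetric_laplace(tau, rng):
    """Draw a unit-scale asymmetric Laplace error via the normal-exponential mixture."""
    theta = (1.0 - 2.0 * tau) / (tau * (1.0 - tau))
    psi = math.sqrt(2.0 / (tau * (1.0 - tau)))
    z = rng.expovariate(1.0)   # latent exponential variable with mean 1
    w = rng.gauss(0.0, 1.0)    # standard normal variable
    return theta * z + psi * math.sqrt(z) * w


def share_below_zero(tau, n_draws=20000, seed=0):
    """Empirical probability that the error is <= 0; it should be close to tau."""
    rng = random.Random(seed)
    below = sum(draw_asymmetric_laplace(tau, rng) <= 0.0 for _ in range(n_draws))
    return below / n_draws


print(share_below_zero(0.9))
```

Because the log-density of this error is, up to a constant, minus the check loss, maximizing an asymmetric Laplace likelihood in the location parameter is equivalent to minimizing the check loss. This is the usual link between the asymmetric Laplace specification and quantile regression.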




Assessing space and time changes in daily maximum temperature in the Ebro basin (Spain) using model-based statistical tools

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Cebrián, Ana C.
  • Asín, Jesús
  • Castillo-Mateo, Jorge
  • Gelfand, Alan E.
  • Abaurrea, Jesús
There is continuing interest in the investigation of change in temperature over space and time. For this analysis, we offer statistical tools to illuminate changes temporally, at desired temporal resolution, and spatially, using data generated from suitable space–time models. The proposed tools can be used with the output from any suitable model fitted to any set of spatially referenced time series data. The tools to assess space and time changes include spatial surfaces of probabilities and spatial extents for events defined by exceeding a threshold. The spatial surfaces capture the spatial variation in the probability or risk of an exceedance event, while the spatial extents capture the expected proportion of incidence of an event for a region of interest. This approach is used to analyse the changes in daily maximum temperature in an inland Mediterranean region (NE of Spain) in the period 1956–2015. The area is very heterogeneous in orography and climate, including the central Ebro valley and part of the Pyrenees. We use a collection of daily temperature series obtained from simulation under a Bayesian daily temperature model fitted to 18 stations in that area. The results for the summer period show that, although there is an increasing risk in all the events used to quantify the effects of climate change, it is not spatially homogeneous, with the largest increase arising in the centre of the Ebro valley and the Eastern Pyrenees area. The risk that the average daily maximum temperature increases by more than 1°C from 1966–1975 to 2006–2015 is higher than 0.5 over the whole region, and close to 1 in the aforementioned areas. The extent of daily maximum temperature higher than the reference mean has increased by 3.5% per decade. The mean of the extent indicates that 95% of the area under study has experienced a positive increment of the average temperature, and almost 70% an increment higher than 1°C.
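The two summaries described in this abstract, a surface of exceedance probabilities and a spatial extent, can be sketched from simulated replicates. This is a minimal illustration under stated assumptions, not the authors' implementation: the replicates are assumed to be stored as a list of lists (one inner list of site values per simulated replicate), all sites are assumed to represent equal areas, and the function names and toy numbers are hypothetical.

```python
def exceedance_probability(replicates, threshold):
    """Per-site share of replicates in which the value exceeds the threshold."""
    n_rep = len(replicates)
    n_sites = len(replicates[0])
    return [
        sum(rep[s] > threshold for rep in replicates) / n_rep
        for s in range(n_sites)
    ]


def expected_extent(replicates, threshold):
    """Average over replicates of the share of sites exceeding the threshold."""
    shares = [sum(v > threshold for v in rep) / len(rep) for rep in replicates]
    return sum(shares) / len(shares)


# Toy replicates (made up): temperature increments in degrees Celsius at 3 sites.
toy_replicates = [
    [0.5, 1.5, 2.0],
    [1.2, 0.8, 2.5],
]
print(exceedance_probability(toy_replicates, 1.0))
print(expected_extent(toy_replicates, 1.0))
```

With equally weighted sites, the expected extent equals the average of the per-site exceedance probabilities. A region with cells of unequal area would need area weights in both the per-replicate share and that average.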




Bayesian joint quantile autoregression

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Castillo-Mateo, Jorge
  • Gelfand, Alan E.
  • Asín, Jesús
  • Cebrián, Ana C.
  • Abaurrea, Jesús
Quantile regression continues to increase in usage, providing a useful alternative to customary mean regression. Primary implementation takes the form of so-called multiple quantile regression, creating a separate regression for each quantile of interest. However, recently, advances have been made in joint quantile regression, supplying a quantile function which avoids crossing of the regression across quantiles. Here, we turn to quantile autoregression (QAR), offering a fully Bayesian version. We extend the initial quantile regression work of Koenker and Xiao (J Am Stat Assoc 101(475):980–990, 2006. https://doi.org/10.1198/016214506000000672) in the spirit of Tokdar and Kadane (Bayesian Anal 7(1):51–72, 2012. https://doi.org/10.1214/12-BA702). We offer a directly interpretable parametric model specification for QAR. Further, we offer a pth-order QAR(p) version, a multivariate QAR(1) version, and a spatial QAR(1) version. We illustrate with simulation as well as a temperature dataset collected in Aragón, Spain.
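The first-order quantile autoregression that this abstract builds on can be illustrated with a small simulation. It follows the random-coefficient form of Koenker and Xiao (2006), Y_t = theta0(U_t) + theta1(U_t) Y_{t-1} with U_t uniform on (0, 1), so that the conditional tau-quantile of Y_t given Y_{t-1} is theta0(tau) + theta1(tau) Y_{t-1}. It is not the Bayesian joint model proposed in the paper, and the coefficient functions below are hypothetical choices made only for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def theta0(u):
    """Hypothetical intercept function: the standard normal quantile function."""
    return STD_NORMAL.inv_cdf(u)


def theta1(u):
    """Hypothetical autoregressive coefficient, increasing from 0.3 to 0.7."""
    return 0.3 + 0.4 * u


def conditional_quantile(tau, y_prev):
    """Conditional tau-quantile of Y_t given Y_{t-1} = y_prev."""
    return theta0(tau) + theta1(tau) * y_prev


def simulate_qar1(n, seed=0):
    """Simulate n steps and return (y_prev, y) pairs."""
    rng = random.Random(seed)
    y_prev = 0.0
    pairs = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # inv_cdf needs a probability strictly inside (0, 1)
            u = rng.random()
        y = theta0(u) + theta1(u) * y_prev
        pairs.append((y_prev, y))
        y_prev = y
    return pairs


def coverage(pairs, tau):
    """Share of steps where Y_t falls at or below its conditional tau-quantile."""
    return sum(y <= conditional_quantile(tau, yp) for yp, y in pairs) / len(pairs)


pairs = simulate_qar1(20000)
print(coverage(pairs, 0.9))
```

The quantile interpretation requires the right-hand side to be increasing in u for the values of Y_{t-1} that occur. With these hypothetical functions, a sufficient condition is Y_{t-1} > -6.27 approximately, because the derivative of theta0 is at least sqrt(2 pi) and the derivative of theta1 is 0.4. When this kind of monotonicity fails, the fitted quantiles cross, which is the problem that joint specifications such as the one in this paper are designed to avoid.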




Routine results of an algorithm for managing the production of blood components

Zaguán. Repositorio Digital de la Universidad de Zaragoza
  • Pérez-Aliaga, Ana Isabel
  • Ayerra, Irene
  • Sánchez-Guillén, Javier
  • López, F. Javier
  • Puente, Fernando
  • Aranda, Alfonso
  • Domingo, José María
  • Garcés, Carmen
Background and Objectives
The variability in the number of donations together with a growing demand for platelet concentrates and plasma‐derived medicines make us seek solutions aimed at optimizing the processing of blood. Some mathematical models to improve efficiencies in blood banking have been published. The goal of this work is to validate and evaluate an algorithm's impact in the production of blood components in the Blood and Tissues Bank of Aragon (BTBA).

Materials and Methods
A mathematical algorithm was designed, implemented and validated through simulations with real data. It was incorporated into the fractionation area, which uses the Reveos® fractionation system (Terumo BCT) to split blood into its components. After 9 months of daily routine validation, retrospective activity data from the Blood Bank and Transfusion Services before and during the use of the algorithm were compared.

Results
Using the algorithm, the outdating rate of platelet concentrates (PC) decreased by 87.8% in the blood bank. The average remaining shelf life of PC supplied to Transfusion Services increased by almost 1 day. As a consequence, the outdating rate in the Aragon Transfusion Network decreased by 33%. In addition, an extra 100 litres of plasma were obtained in 9 months.

Conclusions
The algorithm improves the blood establishment's workflow and facilitates the decision‐making process in whole blood processing. It resulted in a decrease in PC outdating rate, increase in PC shelf life and finally an increase in the volume of recovered plasma, leading to significant cost savings.