HUELLA DIGITAL, COMPETITIVIDAD Y DEMOGRAFIA EMPRESARIAL (Digital footprint, competitiveness and business demography)

PID2019-107765RB-I00

Funding agency name: Agencia Estatal de Investigación
Funding agency acronym: AEI
Program: Programa Estatal de Generación de Conocimiento y Fortalecimiento Científico y Tecnológico del Sistema de I+D+i
Subprogram: Subprograma Estatal de Generación de Conocimiento
Call: Proyectos I+D
Call year: 2019
Managing unit: Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020
Beneficiary institution: UNIVERSITAT POLITÈCNICA DE VALÈNCIA
Persistent identifier: http://dx.doi.org/10.13039/501100011033

Publications

Total results (including duplicates): 13

Digital footprint for tourism research

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Cebrián Cerdá, Eduardo
  • Doménech i de Soria, Josep (ORCID 0000-0002-7302-5810)
[EN] Tourists leave digital footprints spread across a wide variety of repositories that can be studied to observe and analyze their behavior. This paper exhaustively analyzes the use of Big Data sources for understanding and predicting the main variables affecting tourist behavior. Analyzed sources include those derived from tourist activity on the Internet as well as other digital footprint data not related to Internet activity. The classification of sources is grounded in a model of the purchase-consumption system applied to leisure travel behavior. This model defines potential predictors of travelers' choices and classifies them in three stages: pre-trip, during-trip and post-trip. Our work classifies the digital footprints left by tourists according to the stage in the model and the variable they help predict or understand. As a result, a complete map of Big Data sources for tourism research is presented. This map reveals not only complementarities among sources, but also potential applications of digital footprint analysis that have not yet been studied. This work has been partially supported by the Spanish National Research Agency (AEI) with the grants PID2019-107765RB-I00 and PEJ2018-003267-A-AR.




Predicting SME's default: Are their websites informative?

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Crosato, Lisa
  • Domenech, Josep (ORCID 0000-0002-7302-5810)
  • Liberati, Caterina
[EN] We propose the use of online indicators, scraped from the firms' websites, to predict default risk for a sample of Spanish firms via nonlinear discriminant analysis and the logistic regression model. This work was partially supported by the Ca' Foscari University of Venice, Italy and by Agencia Estatal de Investigación, Spain under grant PID2019-107765RB-I00. We also acknowledge helpful comments by an anonymous referee.




Changes in corporate websites and business activity: automatic classification of corporate webpages

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Valenzuela Rubilar, Joan Manuel
  • Domenech, Josep
  • Pont, Ana
[EN] Every time a firm or institution performs an activity on the Web, this is registered, leaving a "digital footprint". Part of this digital footprint is reflected on their websites, as these officially represent them on the Web. We plan to automatically monitor the changes that periodically occur in a website to relate them to business activity. The aim of this paper is to propose a theoretical classification of corporate webpages to associate changes that occur on them with the regular activity of the firms, and to evaluate the possibility of an automatic categorization using classification models. To generate the classification, a significant number of present-day corporate webpages were analyzed, distinguishing four theoretical types of corporate webpages. To evaluate the automatic categorization, a dataset of 1005 present-day corporate webpages was manually labeled and then categorized automatically using classification models. This work was partially supported by grant PID2019-107765RB-I00, funded by MCIN/AEI/10.13039/501100011033.




Non-conventional data and default prediction: the challenge of companies’ websites

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Crosato, Lisa
  • Domenech, Josep (ORCID 0000-0002-7302-5810)
  • Liberati, Caterina
[EN] The contribution of Small and Medium Enterprises (SMEs) to the European Union economy has always been relevant, for both value added and the creation of jobs. That is why the prediction of their survival is considered one of the economic pillars the EU keeps under observation. Default prediction models, accounting for SMEs' idiosyncratic traits, are based on several types of data, mainly accounting indicators. Balance sheet data, indeed, are considered the standard predictors for classification models in this field, although they do not completely overcome the information opacity that is one of the main barriers preventing these firms from accessing credit. In our work, we explore the possibility of complementing accounting information with data scraped from the firms' websites. We modeled the data using a nonlinear discriminant analysis and benchmarked the results against Logistic Regression. The evidence of our study is promising, although the combination of online and offline data shows better results for surviving firms than for defaulted companies. This work was partially supported by grant PID2019-107765RB-I00, funded by MCIN/AEI/10.13039/501100011033.




Influence of popularity on the transfer fees of football players

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Malagón Selma, María del Pilar (ORCID 0000-0003-3776-9334)
  • Debón Aucejo, Ana María (ORCID 0000-0002-5116-289X)
  • Doménech i de Soria, Josep (ORCID 0000-0002-7302-5810)
[EN] Search popularity, as reported by Google Trends, has previously been demonstrated to be useful when studying many time series. However, its use in cross-section studies is not straightforward because search popularity is not provided in absolute terms but as a normalized index that impedes comparisons. This paper proposes a novel methodology for calculating popularity indicators obtained from Google Trends to improve the prediction of football players' transfer fees. The database comprises 1428 players who competed in LaLiga, Premier League, Bundesliga, Serie A, and Ligue 1 in the 2018-2019 season. The random forest algorithm and multiple linear regression are used to study the importance and statistical significance of the popularity indicators, respectively. Results showed that the proposed popularity indicators provide significant information to predict players' transfer fees, as models including such popularity indicators had lower prediction error than those without them. The method developed in this study could be used not only by analysts specialized in sports data analysis but also by researchers in other fields. This work was partially supported by grant PID2019-107765RB-I00, funded by MCIN/AEI/10.13039/501100011033.




Simulating the inconsistencies of Google Trends data

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Cebrián, Eduardo (ORCID 0000-0001-7244-424X)
  • Doménech i de Soria, Josep (ORCID 0000-0002-7302-5810)
[EN] Google Trends (GT) allows users to obtain reports of the evolution of the popularity of searches made through the Google Search engine. Its main output is the Search Volume Index (SVI), a relative measure of the popularity of a term, which is computed using a sample of the searches. Due to the sampling error, the reports are not completely consistent, as the same query produces different time series that can change widely from day to day. This paper simulates the process of generating the SVI time series in the same way as GT does. The simulation shows that the sampling error could be an important issue if the popularity of the term under study is relatively low. Averaging multiple extractions from GT can only partially alleviate this. This work was partially supported by grant PID2019-107765RB-I00, funded by MCIN/AEI/10.13039/501100011033.




Is Google Trends a quality data source?

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Cebrián, Eduardo
  • Domenech, Josep (ORCID 0000-0002-7302-5810)
[EN] Google Trends (GT) has become a popular data source among researchers in a wide variety of fields. In economics, its main use has been to forecast economic variables such as tourism demand, unemployment or sales. This paper questions the quality of these data by discussing the main data quality aspects according to the literature. Our analysis reveals some non-negligible issues related to the measurement accuracy of GT, which potentially affect the results obtained with GT data and therefore the decisions made with this information. These issues are illustrated with an example in which some queries to GT are repeated on six different days. This work was supported by the Ministry of Finance, Industry and Competitiveness and the European Social Fund [PEJ2018-003267-A-AR]; Agencia Estatal de Investigación [PID2019-107765RB-I00].




Website indicators on textile companies in the Comunitat Valenciana region

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Doménech i de Soria, Josep (ORCID 0000-0002-7302-5810)
  • García-Bernabeu, Ana (ORCID 0000-0003-3181-7745)
  • Díaz García, Pablo (ORCID 0000-0002-7093-6061)
Sustainability indicators obtained by scraping the websites of textile companies in the Comunitat Valenciana region.




Going online: Forecasting the impact of websites on productivity and market structure

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Rizov, Marian
  • Vecchi, Michela
  • Domenech, Josep (ORCID 0000-0002-7302-5810)
[EN] We develop a unifying framework to investigate the effects of firms' internet presence on productivity and market structure. Using information on website adoption as an indicator of online trading, we treat the decision to enter an e-commerce market as equivalent to the decision to enter a foreign market. Our theoretical framework draws from a dynamic model of international trade, which accounts for firms' heterogeneity in productivity levels and in the returns to productivity-enhancing investments. We test the predictions of our model using UK and Spanish company account data over the 1995-2010 period, merged with information on companies' online status. The period analysed is associated with the early stage of internet diffusion, and our sample countries represent fast (the UK) and slow (Spain) diffusion. Our results show that website adoption is associated with higher productivity growth and with a reduction in market concentration in both countries. The increase in competition operates via a negative selection mechanism, whereby productivity growth is inversely related to the pre-entry productivity levels. We also find that productivity gains decline over time. Josep Domènech acknowledges that this research was partially funded by MCIN/AEI/10.13039/501100011033 under grant PID2019-107765RB-I00.




Productivity, Digital Footprint and Sustainability in the Textile and Clothing Industry

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Domenech, Josep (ORCID 0000-0002-7302-5810)
  • Garcia-Bernabeu, Ana (ORCID 0000-0003-3181-7745)
  • Diaz-Garcia, Pablo (ORCID 0000-0002-7093-6061)
[EN] In recent years, there has been a shift from the linear economic model on which the textile and clothing industry is based to a more sustainable model. However, to date, limited research on the relationship between sustainability commitment and firm productivity has focused on the textile and clothing industry. This study addresses this gap and aims to explore whether the digital footprint of small and medium-sized textile companies in terms of their sustainable performance is related to their productivity. To this end, the paper proposes an innovative model to monitor the companies' commitment to sustainability issues by analyzing online data retrieved from their corporate websites. This information is merged with balance sheet data to examine the impact of sustainability practices, capital and human capital on productivity. Each firm's estimated total factor productivity is explained as a function of the sustainability digital footprint measures and additional control variables for a sample of 315 textile firms located in the region of Comunidad Valenciana, Spain. This work was partially funded by MCIN/AEI/10.13039/501100011033 under grant PID2019-107765RB-I00.




Can websites reveal a firm’s innovativeness? Empirical evidence on Italian manufacturing SMEs

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Bottai, Carlo (ORCID 0000-0002-2878-9360)
  • Crosato, Lisa (ORCID 0000-0002-3415-656X)
  • Domenech, Josep (ORCID 0000-0002-7302-5810)
  • Guerzoni, Marco (ORCID 0000-0001-7415-0771)
  • Liberati, Caterina (ORCID 0000-0001-9910-4018)
[EN] Research in innovation usually builds on conventional data such as balance sheets, surveys, patents, or product catalogs. This paper explores unconventional data, specifically web-scraped data, as an information source for innovation studies, proposing a careful procedure to establish the veracity of the linkage between web-based data and firm-level information retrieved from conventional sources. The study considers a sample of Italian manufacturing small and medium enterprises active in 2016, comprising both innovative and non-innovative firms. It is based on HTML tags, whilst most of the previous literature worked on web page text and related semantics. Our paper provides evidence that the way the HTML language is applied to build a corporate website unveils the capabilities of the owner firm, helping to distinguish innovative from non-innovative SMEs. We thank the Italian Ministry of University and Research (MUR) for sponsoring this work under the 'Departments of Excellence 2018-2022' funding scheme, and the DEMS Data Science Lab of the University of Milano-Bicocca for computational resources. Josep Domenech acknowledges that this research was partially funded by MCIN/AEI/10.13039/501100011033 under grant PID2019-107765RB-I00.




An estimate of the Italian Consumer Confidence Index at regional level using Google Trends data

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Doménech i de Soria, Josep (ORCID 0000-0002-7302-5810)
  • Marletta, Andrea
[EN] Data about consumer confidence indices are often used as a gauge of the entire economy of a country. In Italy, this information is collected by Istat and is available at the national level and at the first sub-level, the geographic area, but not at the regional level. Previous research has demonstrated that the volume of some Google searches is correlated with consumer confidence. Since Google Trends data are available at both the national and the regional level, the aim of this paper is to explore whether they can be combined with the data offered by Istat to obtain an estimate of consumer confidence indices at the second sub-level, i.e., the regional area. To this end, a set of search topics and words has been selected as potential predictors to acquire more information about consumer confidence indices for 20 Italian regions from 2007 to 2022. The regional estimates obtained are in line with those for the geographic areas and successfully identify the periods of economic crisis due to the 2008 financial crisis and the 2020 Covid-19 pandemic. This work was partially funded by MCIN/AEI/10.13039/501100011033 under grant PID2019-107765RB-I00.




Websites data: a new asset for enhancing credit risk modeling

RiuNet. Repositorio Institucional de la Universitat Politècnica de València
  • Crosato, Lisa
  • Domenech, Josep (ORCID 0000-0002-7302-5810)
  • Liberati, Caterina
[EN] Recent literature shows an increasing interest in considering alternative sources of information for predicting the default of Small and Medium Enterprises. Accounting indicators alone do not completely overcome the information opacity that is one of the main barriers preventing these firms from accessing credit. This complicates matters both for private lenders and for public institutions supporting policies. In this paper we propose corporate websites as an additional source of information, ready to be exploited in real time. We also explore the joint use of online and offline data for enhancing correct prediction of default through a Kernel Discriminant Analysis, keeping Logistic Regression and Random Forests as benchmarks. The results shed light on the potential of these new data when accounting indicators lead to a wrong prediction. This work was supported by grant PID2019-107765RB-I00 (funded by MCIN/AEI/10.13039/501100011033). The Open Access publishing option was funded by the Ca' Foscari University of Venice, Italy.