
Infodemiological data concerning silicosis in the USA in the period 2004-2010 correlating with real-world statistical data.

Nicola Luigi Bragazzi, Guglielmo Dini, Alessandra Toletone, Francesco Brigo, Paolo Durando.

Abstract

This article reports data on silicosis-related web activity in the USA for the period 2004-2010, captured with Google Trends (GT). GT-generated data were compared with the most recent available epidemiological data on silicosis mortality, obtained from the Centers for Disease Control and Prevention for the same study period. Statistically significant correlations were found with epidemiological data on silicosis (r=0.805, p-value <0.05) and with other related web searches. The temporal trend correlated well with the epidemiological data, as did the geospatial distribution of web activity with the geographic epidemiology of silicosis.


Keywords:  Infodemiology and infoveillance; Internet; Occupational medicine and hygiene; Web 2.0; Work-related diseases

Year:  2016        PMID: 28054008      PMCID: PMC5198853          DOI: 10.1016/j.dib.2016.11.021

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

Google Trends (GT)-based data (infodemiological data) could be useful for the scientific community, researchers and occupational physicians, since they correlate well with “real-world” data obtained from the Centers for Disease Control and Prevention site, which supports their reliability.

These data could be further statistically processed, analyzed, refined and validated so as to complement traditional surveillance of silicosis, providing data more quickly and in real time.

These data could be used to understand web activities related to occupational diseases.

To our knowledge, this is the first analysis of web search behavior related to an occupational disease, namely silicosis, carried out in both quantitative and qualitative terms.

Data

This article contains infodemiological data on silicosis-related searches in the USA in the study period 2004–2010, obtained from Google Trends (GT) (Fig. 1). These data correlated well with “real-world” data obtained from the Centers for Disease Control and Prevention (CDC) site for the same study period (Table 1, Table 2, Table 3).
Fig. 1

Google Trends-generated heat-map showing the regional interest for silicosis in the USA. In particular, silicosis-related web searches are concentrated in some states (namely, California, Texas, New York, Pennsylvania, and Virginia).

Table 1

Pearson's correlation between Google Trends-based data and epidemiological data in the study period 2004–2010.

| Variable | GT-based silicosis (Disease): correlation coefficient | GT-based silicosis (Disease): significance level P | GT-based silicosis (search term): correlation coefficient | GT-based silicosis (search term): significance level P |
|---|---|---|---|---|
| Gender | | | | |
| Female | −0.145 | 0.7562 | −0.144 | 0.7588 |
| Male | 0.778 | 0.0394 | 0.765 | 0.0453 |
| Ethnicities | | | | |
| White | 0.713 | 0.0720 | 0.696 | 0.0825 |
| White, female | 0.010 | 0.9832 | −0.001 | 0.9980 |
| White, male | 0.767 | 0.0441 | 0.755 | 0.0498 |
| Black | 0.841 | 0.0177 | 0.847 | 0.0162 |
| Black, female | −0.176 | 0.7066 | −0.162 | 0.7281 |
| Black, male | 0.855 | 0.0143 | 0.859 | 0.0132 |
| Other | −0.135 | 0.7731 | −0.162 | 0.7286 |
| Other, female | −0.292 | 0.5249 | −0.254 | 0.5833 |
| Other, male | −0.019 | 0.9676 | −0.055 | 0.9074 |
| Adjusted white, female | −0.015 | 0.9751 | −0.007 | 0.9876 |
| Adjusted white, male | 0.787 | 0.0357 | 0.778 | 0.0396 |
| Adjusted black, female | −0.155 | 0.7396 | −0.149 | 0.7507 |
| Adjusted black, male | 0.864 | 0.0122 | 0.867 | 0.0116 |
| Adjusted other, female | −0.292 | 0.5249 | −0.254 | 0.5833 |
| Adjusted other, male | 0.030 | 0.9490 | 0.004 | 0.9939 |
| Adjusted overall | 0.823 | 0.0231 | 0.816 | 0.0253 |
| Age | | | | |
| Age 15–24 | −0.070 | 0.8813 | −0.018 | 0.9695 |
| Age 25–34 | −0.657 | 0.1091 | −0.656 | 0.1092 |
| Age 35–44 | 0.501 | 0.2520 | 0.533 | 0.2179 |
| Age 45–54 | 0.308 | 0.5017 | 0.278 | 0.5466 |
| Age 55–64 | 0.457 | 0.3031 | 0.468 | 0.2898 |
| Age 65–74 | 0.619 | 0.1379 | 0.622 | 0.1357 |
| Age 75–84 | 0.701 | 0.0792 | 0.677 | 0.0949 |
| Age 85-on | 0.462 | 0.2966 | 0.442 | 0.3208 |
| No. >45 | 0.747 | 0.0535 | 0.730 | 0.0623 |
| No. 15–44 | 0.291 | 0.5262 | 0.334 | 0.4636 |
| Underlying | −0.850 | 0.0154 | −0.832 | 0.0203 |
| Number of deaths | 0.759 | 0.0477 | 0.746 | 0.0542 |
| Death rate | 0.805 | 0.0291 | 0.794 | 0.0329 |

Coefficients with P <0.05 are statistically significant.
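The significance levels in Table 1 can be cross-checked from the coefficients alone. The sketch below is not the authors' SPSS/STATISTICA procedure; it assumes that each correlation was computed on seven annual observations (2004–2010), an assumption not stated in the source, and uses the standard t-test for a Pearson coefficient with n − 2 degrees of freedom. The function name `pearson_p_value` is hypothetical.

```python
import math

def pearson_p_value(r, n, steps=2000):
    """Two-sided P for H0: rho = 0, using t = |r| * sqrt((n-2)/(1-r^2)) with n-2 df."""
    df = n - 2
    if abs(r) >= 1.0:
        return 0.0
    t = abs(r) * math.sqrt(df / (1.0 - r * r))
    # Density of Student's t distribution with df degrees of freedom
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1.0 + x * x / df) ** (-(df + 1) / 2)

    # Simpson's rule on [0, t]; steps must be even
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += pdf(i * h) * (4 if i % 2 else 2)
    central_area = 2.0 * total * h / 3.0  # P(-t < T < t)
    return max(0.0, 1.0 - central_area)

# ASSUMPTION: n = 7 yearly data points (2004-2010)
p_death_rate = pearson_p_value(0.805, 7)   # Table 1 reports P = 0.0291
p_underlying = pearson_p_value(-0.850, 7)  # Table 1 reports P = 0.0154
print(round(p_death_rate, 4), round(p_underlying, 4))
```

Under this assumption the computed values agree with the tabulated P values to about three decimal places, which is consistent with annual aggregation of the GT series.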

Table 2

Pearson's correlation between GT-based data and clinical symptoms/diseases associated with silicosis.

| Variable | GT-based silicosis (Disease): correlation coefficient | GT-based silicosis (Disease): significance level P | GT-based silicosis (search term): correlation coefficient | GT-based silicosis (search term): significance level P |
|---|---|---|---|---|
| Associated diseases | | | | |
| Lung cancer | 0.714 | 0.0712 | 0.740 | 0.0574 |
| Laryngeal cancer | −0.749 | 0.0526 | −0.786 | 0.0360 |
| Rheumatoid arthritis | 0.793 | 0.0333 | 0.767 | 0.0443 |
| Systemic Lupus Erythematosus | 0.869 | 0.0112 | 0.865 | 0.0120 |
| Scleroderma | 0.918 | 0.0035 | 0.934 | 0.0021 |
| Tuberculosis | 0.083 | 0.8588 | 0.106 | 0.8217 |
| Symptoms | | | | |
| Anorexia | 0.220 | 0.6348 | 0.184 | 0.6931 |
| Cough | −0.740 | 0.0571 | −0.770 | 0.0429 |
| Dyspnea | −0.725 | 0.0654 | −0.757 | 0.0490 |
| Fatigue | −0.576 | 0.1756 | −0.612 | 0.1438 |
| Fever | −0.848 | 0.0158 | −0.869 | 0.0110 |
| Respiratory failure | −0.939⁎⁎ | 0.0017 | −0.939⁎⁎ | 0.0017 |
| Tachypnea | −0.937⁎⁎ | 0.0018 | −0.941⁎⁎ | 0.0016 |

Coefficients with P <0.05 are statistically significant.

⁎⁎ Statistically significant, with p-value <0.01.

Table 3

Pearson's correlation between GT-based data concerning clinical symptoms/diseases associated with silicosis and silicosis epidemiological data (namely, death rate and number of deaths) in the study period 2004–2010.

| Variable | GT-based silicosis (Disease): correlation coefficient | GT-based silicosis (Disease): significance level P | GT-based silicosis (search term): correlation coefficient | GT-based silicosis (search term): significance level P |
|---|---|---|---|---|
| Associated diseases | | | | |
| Lung cancer | 0.736 | 0.0595 | 0.697 | 0.0818 |
| Laryngeal cancer | −0.680 | 0.0929 | −0.628 | 0.1308 |
| Rheumatoid arthritis | 0.476 | 0.2797 | 0.445 | 0.3165 |
| Systemic Lupus Erythematosus | 0.455 | 0.3051 | 0.399 | 0.3755 |
| Scleroderma | 0.861 | 0.0129 | 0.823 | 0.0230 |
| Tuberculosis | −0.007 | 0.9879 | −0.030 | 0.9484 |
| Symptoms | | | | |
| Anorexia | −0.161 | 0.7299 | −0.175 | 0.7080 |
| Cough | −0.817 | 0.0247 | −0.784 | 0.0370 |
| Dyspnea | −0.790 | 0.0347 | −0.754 | 0.0503 |
| Fatigue | −0.753 | 0.0505 | −0.729 | 0.0632 |
| Fever | −0.820 | 0.0240 | −0.776 | 0.0401 |
| Respiratory failure | −0.864 | 0.0121 | −0.825 | 0.0225 |
| Tachypnea | −0.902⁎⁎ | 0.0054 | −0.867 | 0.0115 |

Coefficients with P <0.05 are statistically significant.

⁎⁎ Statistically significant, with p-value <0.01.

Experimental design, materials and methods

GT (available at https://www.google.com/trends) was used to capture Internet activity and interest related to silicosis. GT was queried for the USA from 2004 to 2010 with “silicosis” as the keyword, using both the “search term” option (data directly available at https://www.google.com/trends/explore?date=2004-01-01%202010-12-31&geo=US&q=Silicosis) and the “search topic” [Disease] option (data directly available at https://www.google.com/trends/explore?date=2004-01-01%202010-12-31&geo=US&q=%2Fm%2F02yw8n). Data downloadable from GT are available as monthly data, in comma-separated values (CSV) format. “Real-world” statistical data, both raw and adjusted, were collected from the CDC site for the same study period 2004–2010 [1], [2], [3], [4], [5]. Correlational analysis was carried out between the GT-based search volumes and the “real-world” statistical data about silicosis. A list of silicosis-related terms (clinical symptoms and other associated diseases) was also searched, and their search volumes were correlated with the silicosis search volumes and with the epidemiological data (namely, death rate and number of deaths). All statistical analyses were carried out using the Statistical Package for the Social Sciences version 23.0 (SPSS, IBM, IL, USA) and STATISTICA version 12 (StatSoft Inc., Tulsa, OK, USA). Results with a p-value <0.05 were considered significant. For further details, the reader is referred to [6].
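The workflow described above (monthly GT values paired with yearly epidemiological figures, then Pearson's correlation) can be sketched in a few lines of Python. This is an illustration only: the source does not say how monthly values were aligned with annual data, so averaging per calendar year is an assumption, as is the header-free two-column CSV layout. All numbers below are made-up placeholders, not the study data, and the helper names `yearly_means` and `pearson_r` are hypothetical.

```python
import csv
import io
import math
from collections import defaultdict

def yearly_means(csv_text):
    """Average monthly values (rows like '2004-01,80') into one value per year."""
    buckets = defaultdict(list)
    for month, value in csv.reader(io.StringIO(csv_text)):
        buckets[int(month[:4])].append(float(value))
    return {year: sum(vals) / len(vals) for year, vals in sorted(buckets.items())}

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# PLACEHOLDER data (not from the article): two months per year, three years
gt_csv = "2004-01,80\n2004-02,60\n2005-01,50\n2005-02,50\n2006-01,40\n2006-02,20\n"
deaths = {2004: 30, 2005: 20, 2006: 10}  # hypothetical yearly counts

gt = yearly_means(gt_csv)
years = sorted(set(gt) & set(deaths))  # keep only years present in both series
r = pearson_r([gt[y] for y in years], [deaths[y] for y in years])
print(gt, r)
```

A real GT export typically carries header lines before the data rows, which would need to be skipped before parsing.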

Specifications Table

Subject area:  Medicine
More specific subject area:  Occupational medicine
Type of data:  Figure, tables
How data was acquired:  Retrieved from the Google Trends site and the Centers for Disease Control and Prevention (CDC) site
Data format:  Raw, analyzed
Experimental factors:  Google Trends search volumes were obtained through graphs and heat-maps
Experimental features:  Validation of Google Trends-based data with “real-world” data taken from the CDC site was performed by means of correlational analysis
Data source location:  USA
Data accessibility:  Data are within this article

References

1.  Silicosis mortality trends and new exposures to respirable crystalline silica - United States, 2001-2010.

Authors:  Ki Moon Bang; Jacek M Mazurek; John M Wood; Gretchen E White; Scott A Hendricks; Ainsley Weston
Journal:  MMWR Morb Mortal Wkly Rep       Date:  2015-02-13

2.  Leveraging Big Data for Exploring Occupational Diseases-Related Interest at the Level of Scientific Community, Media Coverage and Novel Data Streams: The Example of Silicosis as a Pilot Study.

Authors:  Nicola Luigi Bragazzi; Guglielmo Dini; Alessandra Toletone; Francesco Brigo; Paolo Durando
Journal:  PLoS One       Date:  2016-11-02
