Qingpeng Zhang, Jianxi Gao, Joseph T Wu, Zhidong Cao, Daniel Dajun Zeng.
Abstract
During the COVID-19 pandemic, more than ever, data science has become a powerful weapon in combating an infectious disease epidemic and arguably any future infectious disease epidemic. Computer scientists, data scientists, physicists and mathematicians have joined public health professionals and virologists to confront the largest pandemic in the century by capitalizing on the large-scale 'big data' generated and harnessed for combating the COVID-19 pandemic. In this paper, we review the newly born data science approaches to confronting COVID-19, including the estimation of epidemiological parameters, digital contact tracing, diagnosis, policy-making, resource allocation, risk assessment, mental health surveillance, social media analytics, drug repurposing and drug development. We compare the new approaches with conventional epidemiological studies, discuss lessons we learned from the COVID-19 pandemic, and highlight opportunities and challenges of data science approaches to confronting future infectious disease epidemics. This article is part of the theme issue 'Data science approaches to infectious disease surveillance'.
Keywords: COVID-19; big data; data science; infectious disease; mathematical modelling
Year: 2021 PMID: 34802267 PMCID: PMC8607150 DOI: 10.1098/rsta.2021.0127
Source DB: PubMed Journal: Philos Trans A Math Phys Eng Sci ISSN: 1364-503X Impact factor: 4.226
Data-driven COVID-19 publications that we reviewed.
| section | data | publication |
|---|---|---|
| modelling human mobility | human movement data | [ |
| | migration data | [ |
| | nationwide census mobility fluxes | [ |
| | open source anonymized human movement data (Baidu migration data) | [ |
| | aggregated mobile phone users data (provided by SafeGraph) | [ |
| | anonymized daily mobile phone location data | [ |
| | Teralytics | [ |
| | national census data | [ |
| | mobile phone, census and demographic data | [ |
| | open government data and Google’s Community Mobility Report | [ |
| | digital transactions for transport | [ |
| | Google’s Community Mobility Reports | [ |
| | near-real-time Italian mobility dataset provided by Facebook | [ |
| | air transportation and ground mobility | [ |
| | global air travel data | [ |
| | mobile phone data | [ |
| manual and digital contact tracing | manual contact tracing data in Shenzhen City, China | [ |
| | survey data for Wuhan City and Shanghai City and manual contact tracing data in Hunan Province | [ |
| | digital contact tracing techniques | [ |
| | manual and digital contact tracing data | [ |
| | online panel survey with mobile tracking data | [ |
| empirical evaluation of government responses | Governments’ response | [ |
| | local/regional/national NPIs data | [ |
| | Governments’ response data in Germany | [ |
| assessing the economic, trade and supply chain impact | Global Trade Analysis Project (GTAP) dataset | [ |
| | UN Comtrade dataset | [ |
| | World input–output database | [ |
| | data provided by the Central Bank of the Republic of Turkey | [ |
| | data provided by a major bank in Denmark | [ |
| mining patient data | individual-level patient data from official reports in China | [ |
| | testing data provided by the Israeli Ministry of Health | [ |
| | screening data | [ |
| | EHR data | [ |
| | clinical and laboratory variables | [ |
| | chest X-ray images and routine clinical variables | [ |
| | computed tomography images | [ |
| | potential imaging biomarkers of the CXR radiographs | [ |
| | surveys and suicide records | [ |
| drug repurposing and development | protein interaction map | [ |
| | protein–protein interactions (PPI) dataset | [ |
| | experimentally derived PPI data | [ |
| | databases of drugs, genes, proteins, viruses, diseases, symptoms and their linkages | [ |
| | substructure-gene and gene-gene associations | [ |
| mining scientific literature | COVID-19 scientific literature dataset | [ |
| | information retrieval test collections TREC-COVID | [ |
| | question answering dataset | [ |
| | questions from FAQ sections of the Center for Disease Control | [ |
| | PubMed citation database | [ |
| social media analytics and Web mining | information-seeking behaviours | [ |
| | Internet searches (Google Trends) | [ |
| | Internet searches and social media data | [ |
| | social media discussions | [ |
| | Google search | [ |
| | international survey of risk perception of COVID-19 | [ |
| | COVID-19 misinformation | [ |
Figure 1. Geographical distribution of the 7.55 million agents and facilities in Hong Kong. Layer 1 represents the distribution of schools. Layer 2 represents the population distribution. Layer 3 represents the locations of entertainment sites. Credit: Zhou et al. [23]. (Online version in colour.)
Figure 2. Four typical digital contact tracing apps: (a) Apple’s Exposure Notification function (Bluetooth-based); (b) TraceTogether system in Singapore (Bluetooth-based); (c) Health Code system in Mainland China (mandatory manual input); (d) LeaveHomeSafe system in Hong Kong (voluntary manual input). (Online version in colour.)
Figure 3. Motifs-of-interest for drug repurposing in a knowledge graph. A knowledge graph is a multi-relational graph composed of entities and relations. Each entity represents a specific protein, gene, drug, virus, disease or symptom, and each relation represents a known linkage between two entities. A motif is a connected subgraph that serves as a fundamental building block of the knowledge graph. Motifs-of-interest are defined based on their importance to the drug repurposing task, and motif-clique discovery algorithms are used to extract them. Credit: Yan et al./Wiley [62]. (Online version in colour.)
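The knowledge graph and motif idea described in the Figure 3 caption can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the entity names, the relation labels (`targets`, `interacts_with`, `causes`, `treats`), the function name and the chosen motif (a drug that targets a protein which a virus interacts with) are hypothetical placeholders, not data or definitions from the reviewed work. The sketch uses plain pattern matching over triples; it is not the motif-clique discovery algorithm mentioned in the caption.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph stored as (head, relation, tail) triples.
# All names are illustrative placeholders, not real biomedical data.
TRIPLES = [
    ("drug_A", "targets", "protein_X"),
    ("drug_B", "targets", "protein_Y"),
    ("virus_V", "interacts_with", "protein_X"),
    ("virus_V", "causes", "disease_D"),
    ("drug_B", "treats", "disease_E"),
]

# Each entity carries a type, which makes the graph multi-relational and typed.
ENTITY_TYPE = {
    "drug_A": "drug", "drug_B": "drug",
    "protein_X": "protein", "protein_Y": "protein",
    "virus_V": "virus",
    "disease_D": "disease", "disease_E": "disease",
}


def find_drug_protein_virus_motifs(triples, entity_type):
    """Return sorted (drug, protein, virus) instances of one assumed motif:
    a drug targets a protein that a virus interacts with."""
    drugs_by_protein = defaultdict(set)    # protein -> drugs targeting it
    viruses_by_protein = defaultdict(set)  # protein -> viruses interacting with it
    for head, relation, tail in triples:
        if entity_type.get(tail) != "protein":
            continue
        if relation == "targets" and entity_type.get(head) == "drug":
            drugs_by_protein[tail].add(head)
        elif relation == "interacts_with" and entity_type.get(head) == "virus":
            viruses_by_protein[tail].add(head)

    motifs = []
    # A motif instance exists wherever both edge types meet at the same protein.
    for protein in sorted(drugs_by_protein.keys() & viruses_by_protein.keys()):
        for drug in sorted(drugs_by_protein[protein]):
            for virus in sorted(viruses_by_protein[protein]):
                motifs.append((drug, protein, virus))
    return motifs


motifs = find_drug_protein_virus_motifs(TRIPLES, ENTITY_TYPE)
print(motifs)  # [('drug_A', 'protein_X', 'virus_V')]
```

In this toy graph only `drug_A` shares a protein with `virus_V`, so it is the single candidate the assumed motif surfaces; `drug_B` targets a protein with no viral interaction and is not returned.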
Figure 4. An example of the answers and summary provided by CAiRE-COVID. Screenshot taken by searching ‘What do we know about asymptomatic transmission of COVID-19?’ on CAiRE-COVID [72]. (Online version in colour.)
Figure 5. Knowledge transfer from the disciplines of the papers cited by the papers we reviewed (bottom) to the disciplines of the papers citing the papers we reviewed (top). Arrow size represents frequency. (Online version in colour.)
Figure 6. Counts for the top 20 disciplines (excluding Multidisciplinary Sciences) of (a) the papers cited by the papers we reviewed, and (b) the papers citing the papers we reviewed. The orange bars represent disciplines other than medicine, biology and public health. (Online version in colour.)