Ania Syrowatka, Masha Kuznetsova, Ava Alsubai, Adam L Beckman, Paul A Bain, Kelly Jean Thomas Craig, Jianying Hu, Gretchen Purcell Jackson, Kyu Rhee, David W Bates.
Abstract
Artificial intelligence (AI) represents a valuable tool that could be widely used to inform clinical and public health decision-making to effectively manage the impacts of a pandemic. The objective of this scoping review was to identify the key use cases for AI in pandemic preparedness and response from the peer-reviewed, preprint, and grey literature. The data synthesis had two parts: an in-depth review of studies that leveraged machine learning (ML) techniques and a limited review of studies that applied traditional modeling approaches. ML applications from the in-depth review were categorized into use cases related to public health and clinical practice, and narratively synthesized. One hundred eighty-three articles met the inclusion criteria for the in-depth review. Six key use cases were identified: forecasting infectious disease dynamics and effects of interventions; surveillance and outbreak detection; real-time monitoring of adherence to public health recommendations; real-time detection of influenza-like illness; triage and timely diagnosis of infections; and prognosis of illness and response to treatment. Data sources and types of ML that were useful varied by use case. The search identified 1167 articles that reported on traditional modeling approaches, which highlighted additional areas where ML could be leveraged for improving the accuracy of estimations or projections. Important ML-based solutions have been developed in response to pandemics, particularly COVID-19, but few were optimized for practical application early in the pandemic. These findings can support policymakers, clinicians, and other stakeholders in prioritizing research and development to support operationalization of AI for future pandemics.
Year: 2021 PMID: 34112939 PMCID: PMC8192906 DOI: 10.1038/s41746-021-00459-8
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
Fig. 1 Study selection flow diagram.
Modified Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram showing disposition of articles.
Number of manuscripts included in the in-depth review by use case and respiratory pandemic or SARS global outbreak<sup>a</sup>.
| Use case | Hypothetical pandemic | SARS (2003) | H1N1 (2009) | COVID-19 peer-reviewed<sup>b</sup> | COVID-19 preprint<sup>b</sup> | Total |
|---|---|---|---|---|---|---|
| (1) Forecasting infectious disease dynamics and effects of interventions | 1 | 2 | 4 | 4 | 29 | 40 |
| (2) Surveillance and outbreak detection | 1 | 1 | 11 | – | 3 | 16 |
| (3) Real-time monitoring of adherence to public health recommendations | – | – | – | – | 1 | 1 |
| (4) Real-time detection of influenza-like illness | – | 6 | – | – | 2 | 8 |
| (5) Triage and timely diagnosis of infections | – | 4 | 4 | 8 | 71 | 87 |
| (6) Prognosis of illness and response to treatment | – | – | – | 4 | 27 | 31 |
| Structured reviews covering multiple use cases | – | – | – | 2 | 1 | 3 |
| Emerging areas beyond management of infection | – | – | – | 2 | 4 | 6 |
COVID-19 coronavirus disease 2019, H1N1 pandemic influenza A subtype H1N1, SARS severe acute respiratory syndrome.
<sup>a</sup> A manuscript may cover multiple use cases.
<sup>b</sup> At the time of database searches (May 4, 2020).
Areas where machine learning could be leveraged for improving the accuracy of estimations or projections and potential data sources identified through the review of traditional approaches.
| Use cases | Data sources |
|---|---|
| Infectious disease dynamics: R<sub>0</sub>, reproduction number<sup>1</sup>; peak time, intensity, and duration<sup>2</sup>; proportion of asymptomatic infections<sup>3</sup>; transmission rate<sup>4</sup>; household transmission<sup>5</sup>; spatiotemporal dynamics with GIS<sup>6</sup>; population immunity<sup>7</sup> | Published literature: seroconversion<sup>40</sup>; vaccine effectiveness<sup>42</sup>; estimates from other diseases (e.g., SARS)<sup>46</sup> |
| | Publicly available data: COVID-19 global cases from the Center for Systems Science and Engineering at Johns Hopkins University<sup>47</sup>; Worldometer COVID-19 pandemic updates<sup>48</sup> |
| Health outcomes: infections<sup>8</sup>; severe cases<sup>9</sup>; susceptibility<sup>10</sup>; deaths<sup>11</sup>; latency and infectious period<sup>12</sup>; undetected cases<sup>13</sup>; infections among healthcare workers<sup>14</sup>; infections among incarcerated populations<sup>15</sup>; infections among homeless populations<sup>16</sup> | |
| | Government data: government reported case data<sup>1</sup>; census data<sup>49</sup>; World Health Organization case reports and data on respiratory risk factors<sup>50</sup>; Ministry of Civil Aviation (i.e., airline passenger data)<sup>6</sup>; airport transportation data<sup>51</sup>; Armed Forces health surveillance data<sup>52</sup>; board of education school data (e.g., absentee reports that document H1N1)<sup>53</sup>; hospital/outpatient sentinel surveillance<sup>21</sup> |
| Impact on healthcare systems and demand for: inpatient beds<sup>17</sup>; intensive care unit beds<sup>18</sup>; N95 respirators and surgical masks<sup>19</sup>; ventilators<sup>18</sup>; medical supplies<sup>20</sup>; staffing<sup>21</sup> | |
| Effects of NPIs: social distancing<sup>22</sup>; self-isolation/quarantining<sup>2</sup>; workplace closures<sup>23</sup>; contact tracing<sup>24</sup>; wearing masks by general population<sup>25</sup>; handwashing<sup>26</sup>; optimal assay test pooling strategies for efficient testing<sup>27</sup>; frequency of routine testing for COVID-19 in high-risk environments to reduce workplace outbreaks<sup>28</sup>; mass testing/using drones to deliver tests<sup>29</sup>; periodic testing of health workforce<sup>30</sup>; impact of NPIs in residential care facilities<sup>31</sup>; closing borders<sup>32</sup>; effectiveness of airport thermal screening<sup>33</sup>; restrictions to sea, land and air travel<sup>34</sup>; school closures<sup>35</sup>; control measures for children if schools open (e.g., better ventilation, mask wearing)<sup>36</sup>; public risk communication<sup>37</sup>; media/news reports<sup>37</sup>; impact of available information on behavior and vaccination uptake<sup>38</sup>; ventilation of indoor spaces<sup>39</sup> | |
| | Mobile phone mobility data<sup>54</sup> |
| | Daily news reports<sup>37</sup> |
| | Purchasing information<sup>55</sup> |
| | Google mobility reports using five different categories: retail and recreation, grocery and pharmacy, transit stations, workplace, and residential<sup>56</sup> |
| | Google query and news data<sup>57</sup> |
| | Google Flu Trends data<sup>58</sup> |
| | Apple Maps COVID-19 mobility trends<sup>59</sup> |
| | Publicly available data on infection rates from cruise companies<sup>60,61</sup> |
| Impact of lifting NPI-related restrictions<sup>40</sup> | |
| Effects of pharmaceutical interventions: use of antiviral medications<sup>41</sup>; vaccination strategies<sup>42</sup>; disease spread given vaccination availability constraints<sup>43</sup>; transmission risk at immunization clinics<sup>44</sup> | National Oceanic and Atmospheric Administration: temperature, humidity, wind speed<sup>62,63</sup> |
| | Home television watching (i.e., proxy for time spent at home)<sup>64</sup> |
| Humanitarian assistance such as food distribution planning<sup>45</sup> | |
<sup>1–64</sup> Provided in supplementary references.
COVID-19 coronavirus disease 2019, GIS geographic information system, H1N1 pandemic influenza A subtype H1N1, NPI non-pharmaceutical intervention, SARS severe acute respiratory syndrome.
Fig. 2 Example of projected hospital resource use for COVID-19 patients in the USA using traditional modeling approaches.
The image shows forecasts for use of hospital beds, intensive care beds and ventilators over the next four months from the Institute for Health Metrics and Evaluation[124]. Image courtesy of the University of Washington, available under Public License and used with permission.
Machine learning approaches explored in response to past or hypothetical pandemics and the SARS global outbreak by use case.
| ➢ In 2005, following the SARS global outbreak, two studies used neural networks to project the number of infections, deaths, hospital admissions, and/or effects of non-pharmaceutical interventions[ |
| ➢ Between 2014 and 2018, three studies applied fuzzy logic to forecast effects of non-pharmaceutical interventions or optimal vaccination strategies using historical H1N1 data[ |
| ➢ In 2018, various types of neural networks were used to estimate the reproductive number for a compartmental model using H1N1 data[ |
| ➢ In 2012, a support vector machine was used to analyze Tweets posted during the H1N1 pandemic to inform forecasts of influenza-like illness[ |
| ➢ In response to SARS, a 2004 conference proceeding described a prototype of a system to mine global news, weblogs, and other websites, then analyze and summarize the content to provide timely outbreak alerts[ |
| ➢ Two studies applied various ML algorithms to mine emergency department electronic health record data for surveillance of influenza-like illness during the H1N1 pandemic using historical data[ |
| ➢ Various studies used ML to mine and analyze Tweets from 2009–2010[ |
| ➢ A similar approach was applied in 2015 to monitor the re-emergence of the H1N1 pandemic strain based on Tweets and web searches[ |
| ➢ A systematic review summarized the literature on the use of social media to track pandemics and highlighted that support vector machines were commonly used to classify Tweets for syndromic surveillance[ |
| ➢ In 2016, an article described a smartphone application that used infrared thermal images for fever detection and interpreted audio using a k-nearest neighbors approach to identify coughing in public spaces[ |
| ➢ No relevant studies were identified. |
| ➢ In 2005, neural network-based algorithms were developed in response to SARS to interpret thermal imaging for mass temperature screening[ |
| ➢ In 2017, a study used radar to measure heart and respiratory rates coupled with thermal imaging interpreted using neural networks and k-means clustering to identify febrile air passengers with 98% sensitivity[ |
| ➢ In 2005, a study applied various ML algorithms to distinguish between SARS and typical pneumonia using chest X-rays. The best performing model was able to appropriately classify 71% of SARS patients[ |
| ➢ In 2011, a National Institutes of Health study used a support vector machine to differentiate H1N1 pneumonia from healthy lungs as well as other infectious and chronic lung conditions based on CT images with AUCs >0.99; however, the study only included images from four patients diagnosed with H1N1[ |
| ➢ In 2006, a hierarchical fuzzy signature was used to accurately distinguish SARS patients from healthy individuals as well as those with pneumonia or hypertension based on vital signs and clinical symptoms[ |
| ➢ In 2014, two studies applied case-based reasoning using a neural network and/or nearest neighbors approach to identify patients with H1N1 based on symptoms with accuracies of 95% and 86%[ |
| ➢ In 2014, a study used random forests and boosted regression trees to identify risk factors associated with contracting the H1N1 pandemic strain in the following 2010–2011 influenza season[ |
| ➢ In 2019, a recurrent neural network and decision tree-based model was proposed for detection of SARS. The system was able to identify cases using user-generated text and geospatial information with 91% accuracy[ |
| ➢ No relevant studies were identified. |
AUC area under the curve, COVID-19 coronavirus disease 2019, CT computed tomography, H1N1 pandemic influenza A subtype H1N1, SARS severe acute respiratory syndrome.
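The sensitivity and accuracy figures quoted in the table above are standard confusion-matrix ratios. The following is a minimal sketch of how such figures are computed; the `ConfusionCounts` type and the example counts are hypothetical and are not taken from any study in the review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a tool's output against a reference standard."""
    true_pos: int
    false_pos: int
    true_neg: int
    false_neg: int


def sensitivity(c: ConfusionCounts) -> float:
    """Share of truly positive cases that the tool flags as positive."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Share of truly negative cases that the tool flags as negative."""
    return c.true_neg / (c.true_neg + c.false_pos)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all cases that the tool classifies correctly."""
    total = c.true_pos + c.false_pos + c.true_neg + c.false_neg
    return (c.true_pos + c.true_neg) / total


# Hypothetical counts for illustration only (not from the reviewed studies).
example = ConfusionCounts(true_pos=45, false_pos=10, true_neg=90, false_neg=5)
print(f"sensitivity={sensitivity(example):.2f}")
print(f"specificity={specificity(example):.2f}")
print(f"accuracy={accuracy(example):.2f}")
```

Reporting sensitivity and specificity together matters here because a screening tool evaluated on very few positive cases can show a high headline figure while remaining poorly characterized.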
Fig. 3 Example of a computer vision solution for real-time monitoring of adherence to social distancing[26].
The image shows the movement of people through a public space and estimates compliance with distancing by at least 6 feet. Image courtesy of Aura Vision, used with permission.
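The compliance estimate described in the caption can be illustrated with a simple pairwise distance check. This is only a sketch under stated assumptions: it presumes that an upstream computer vision step has already mapped each detected person to ground-plane coordinates in feet, and the function names and example positions are hypothetical. It does not represent the vendor's proprietary method.

```python
from itertools import combinations
from math import dist

MIN_DISTANCE_FT = 6.0  # distancing threshold mentioned in the caption


def close_pairs(positions, min_distance=MIN_DISTANCE_FT):
    """Return index pairs of people standing closer than min_distance."""
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(positions), 2)
        if dist(p, q) < min_distance
    ]


def compliance_rate(positions, min_distance=MIN_DISTANCE_FT):
    """Fraction of people at least min_distance away from everyone else."""
    if not positions:
        return 1.0
    violators = {idx for pair in close_pairs(positions, min_distance) for idx in pair}
    return 1 - len(violators) / len(positions)


# Hypothetical ground-plane positions in feet (assumed output of a detector).
people = [(0.0, 0.0), (3.0, 4.0), (20.0, 0.0), (20.0, 10.0)]
print(close_pairs(people))
print(compliance_rate(people))
```

The pairwise comparison is quadratic in the number of people, which is acceptable for a single camera frame; a spatial index would be a reasonable substitute for crowded scenes.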
Characteristics of studies that developed machine learning-based algorithms and tools for COVID-19 diagnosis or estimation of disease severity based solely on chest imaging (n = 65).
| Characteristic | n | % |
|---|---|---|
| Purpose<sup>a</sup> | | |
| Image segmentation only | 2 | 3 |
| Detection of COVID-19 | 56 | 86 |
| Estimation of COVID-19 severity | 9 | 14 |
| Type of chest imaging<sup>a</sup> | | |
| CT | 35 | 54 |
| X-ray | 32 | 49 |
| Data sources | | |
| Private medical imaging data | 23 | 35 |
| Publicly available data | 42 | 65 |
| Machine learning models<sup>a,b</sup> | | |
| Convolutional neural network | 58 | 89 |
| Support vector machine | 9 | 14 |
| Other<sup>c</sup> | 11 | 17 |
| Proprietary | 2 | 3 |
| Publication status<sup>d</sup> | | |
| Peer-reviewed | 7 | 11 |
| Preprint | 58 | 89 |
COVID-19 coronavirus disease 2019, CT computed tomography.
<sup>a</sup> Totals add to more than 65 and percentages add to more than 100; some studies cover multiple categories.
<sup>b</sup> Six studies also covered the Prognosis of Illness and Response to Treatment use case; in these cases, some of the models may have been applied for prediction of disease severity, response to treatment, or death.
<sup>c</sup> Adaptive boosting (3); autoencoder (2); decision tree (6); explainable deep neural network (1); fuzzy neural network (1); gradient boosted trees (2); k-nearest neighbors (3); multi-layer perceptron (4); random forest (5); unspecified neural network (1).
<sup>d</sup> At the time of database searches (May 4, 2020).
Fig. 4 Example of an ML solution for detecting and estimating the extent of COVID-19 infection based on CT images.
The image on the left shows a patient infected with COVID-19. The image on the right shows a patient negative for COVID-19. Images courtesy of RADLogics, Inc., used with permission.
Characteristics of studies that developed machine learning-based algorithms and tools to predict COVID-19-related deterioration (n = 30).
| Characteristic | n | % |
|---|---|---|
| Prediction<sup>a</sup> | | |
| Severity of COVID-19 | 8 | 27 |
| Hospitalization | 1 | 3 |
| Length of hospital stay | 2 | 7 |
| Intensive care unit admission | 7 | 23 |
| Ventilator use | 7 | 23 |
| Discharge to hospice | 1 | 3 |
| Death | 18 | 60 |
| Response to treatment | 1 | 3 |
| Patients at higher risk of severe complications if infected | 1 | 3 |
| Type of chest imaging | | |
| CT | 8 | 27 |
| X-ray | 6 | 20 |
| None | 16 | 53 |
| Data sources<sup>a</sup> | | |
| Health records | | |
| Electronic | 8 | 27 |
| Unclear if electronic | 12 | 40 |
| Publicly available data | 13 | 43 |
| Machine learning models<sup>a,b</sup> | | |
| Random forest | 14 | 47 |
| Convolutional neural network | 12 | 40 |
| Gradient boosted trees | 9 | 30 |
| Support vector machine | 8 | 27 |
| Other<sup>c</sup> | 9 | 30 |
| Proprietary | 2 | 7 |
| Publication status<sup>d</sup> | | |
| Peer-reviewed | 3 | 10 |
| Preprint | 27 | 90 |
COVID-19 coronavirus disease 2019, CT computed tomography.
<sup>a</sup> Totals add to more than 30 and percentages add to more than 100; some studies cover multiple categories.
<sup>b</sup> Six studies also covered the Triage and Timely Diagnosis of Infections use case; in these cases, some of the models may have been applied for detection of COVID-19 or estimation of disease severity.
<sup>c</sup> Adaptive boosting (1); decision tree (5); k-nearest neighbors (4); multi-layer perceptron (2); natural language processing (1); recurrent neural network (2); unspecified neural network (3).
<sup>d</sup> At the time of database searches (May 4, 2020).
Commonly used data sources and types of machine learning suitable for each use case based on studies from the in-depth review about past pandemics, SARS, and COVID-19.
| Key use case | Commonly used data sources | Suitable types of ML |
|---|---|---|
| Forecasting infectious disease dynamics and effects of interventions | Publicly available counts (e.g., Johns Hopkins COVID-19 map, Worldometer, World Health Organization), media reports, commercial publications, web searches (e.g., Google Trends), social media (e.g., Twitter), census data, population-level comorbidity statistics, data on outbreaks of similar pathogens | Augmenting traditional models: neural networks |
| | | Data-driven ML: recurrent neural networks |
| Surveillance and outbreak detection | Social media (e.g., Twitter), web searches, news reports, medical record data (structured and unstructured fields) | Text mining: natural language processing |
| | | Classification: support vector machines, transformer neural networks |
| Real-time monitoring of adherence to public health recommendations | Cameras in public spaces | Compliance with mandated quarantine: proprietary facial recognition |
| | | Adherence to mask wearing, social distancing, and sanitation: proprietary computer vision |
| Real-time detection of influenza-like illness | Cameras and sensors in public spaces | Interpretation of thermal imaging: neural networks, computer vision |
| Triage and timely diagnosis of infections | Exposure history, medical record data (structured and unstructured fields, laboratory results, chest imaging) | Interpretation of chest imaging: convolutional neural networks |
| | | Triage based on routinely collected medical record data: no standout ML |
| | | Interpretation of unstructured clinical notes: transformer neural networks |
| Prognosis of illness and response to treatment | Medical record data (structured and unstructured fields, laboratory results, chest imaging) | Based on chest imaging: combination of convolutional and recurrent neural networks |
| | | Based on routinely collected medical record data: no standout ML |
| | | Interpretation of unstructured clinical notes: natural language processing, neural networks |
ML machine learning, SARS severe acute respiratory syndrome.
| Complex machine learning techniques<sup>b</sup> (in-depth review) | Traditional artificial intelligence and disease transmission models (limited review) |
|---|---|
| Adaptive boosting, decision trees, fuzzy logic, gradient boosting, k-means clustering, natural language processing, nearest neighbors, neural networks, random forests, support vector machines | Compartmental (e.g., Susceptible-Infected-Recovered), simulation (e.g., agent-based, Markov), statistical (e.g., Bayesian, exponential or logistic growth, linear or logistic regression), time series (e.g., autoregressive integrated moving average) |
<sup>a</sup> Categorization of models as complex or traditional was somewhat arbitrary. Models were categorized as complex if they were generally less explainable, required increased computing power, or could more effectively manage irregularly sampled or high-dimensional data. This distinction was necessary given the large body of literature identified through the scoping review.
<sup>b</sup> Various publications have summarized these methods and offer insights about strengths and weaknesses[4–6].
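For readers unfamiliar with the compartmental models listed in the table, the Susceptible-Infected-Recovered (SIR) model can be sketched in a few lines. This is an illustrative forward Euler integration, not the method of any reviewed study; the function name and all parameter values (`beta`, `gamma`, population size, time step) are assumptions chosen for demonstration.

```python
def simulate_sir(beta, gamma, susceptible0, infected0, recovered0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    beta is the transmission rate per day and gamma the recovery rate per day,
    so beta / gamma is the basic reproduction number of this toy model.
    Returns a list of (S, I, R) tuples, one per day, starting with day 0.
    """
    population = susceptible0 + infected0 + recovered0
    s, i, r = float(susceptible0), float(infected0), float(recovered0)
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Hypothetical parameters for illustration only: beta / gamma = 3.
trajectory = simulate_sir(
    beta=0.3, gamma=0.1, susceptible0=999, infected0=1, recovered0=0, days=160
)
peak_day = max(range(len(trajectory)), key=lambda day: trajectory[day][1])
print(f"peak day: {peak_day}, infected at peak: {trajectory[peak_day][1]:.0f}")
```

Quantities such as peak time and intensity, which the review lists among forecasting targets, fall directly out of a trajectory like this one. The ML approaches in the left-hand column are typically used either to estimate parameters such as `beta` from data or to replace the fixed equations altogether.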