
A machine learning algorithm to increase COVID-19 inpatient diagnostic capacity.

David Goodman-Meza¹, Akos Rudas^2,3, Jeffrey N Chiang², Paul C Adamson¹, Joseph Ebinger⁴, Nancy Sun⁴, Patrick Botting⁴, Jennifer A Fulcher¹, Faysal G Saab⁵, Rachel Brook⁵, Eleazar Eskin^2,6,7, Ulzee An⁶, Misagh Kordi², Brandon Jew², Brunilda Balliu², Zeyuan Chen⁶, Brian L Hill⁶, Elior Rahmani⁶, Eran Halperin^2,6,7,8, Vladimir Manuel^9,10.

Abstract

Worldwide, testing capacity for SARS-CoV-2 is limited and bottlenecks in the scale up of polymerase chain reaction (PCR)-based testing exist. Our aim was to develop and evaluate a machine learning algorithm to diagnose COVID-19 in the inpatient setting. The algorithm was based on basic demographic and laboratory features to serve as a screening tool at hospitals where testing is scarce or unavailable. We used retrospectively collected data from the UCLA Health System in Los Angeles, California. We included all emergency room or inpatient cases receiving SARS-CoV-2 PCR testing who also had a set of ancillary laboratory features (n = 1,455) between 1 March 2020 and 24 May 2020. We tested seven machine learning models and used a combination of those models for the final diagnostic classification. In the test set (n = 392), our combined model had an area under the receiver operator curve of 0.91 (95% confidence interval 0.87-0.96). The model achieved a sensitivity of 0.93 (95% CI 0.85-0.98), specificity of 0.64 (95% CI 0.58-0.69). We found that our machine learning algorithm had excellent diagnostic metrics compared to SARS-CoV-2 PCR. This ensemble machine learning algorithm to diagnose COVID-19 has the potential to be used as a screening tool in hospital settings where PCR testing is scarce or unavailable.


Year: 2020 PMID: 32960917 PMCID: PMC7508387 DOI: 10.1371/journal.pone.0239474

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) is a worldwide public health emergency [1, 2]. Polymerase chain reaction (PCR) testing for SARS-CoV-2 is critical to the public health response to coronavirus disease 2019 (COVID-19). PCR testing capacity is especially important in the hospital setting for clinical decision making and infection control procedures [3]. Yet, the inability to scale up testing has been one of the most discussed topics in both the scientific and popular literature [3, 4]. In many hospital settings, PCR testing capacity remains limited. Many PCR assays have short analysis times; however, many hospitals lack on-site PCR capabilities and must send samples to centralized laboratories. Transport times and queues lengthen the turnaround time, and results can be delayed up to 48 to 96 hours [5-7]. This wait slows the clinical decision-making process and wastes scarce personal protective equipment.

Machine learning could help fill this gap. Ancillary laboratory values in blood samples of patients with COVID-19 show a pattern distinct from that of other diseases [3, 8–11]. These changes include elevations in inflammatory markers (ferritin, lactate dehydrogenase [LDH], and C-reactive protein, among others), decreases in certain blood cell counts (absolute lymphocyte count), and an increase in the neutrophil to lymphocyte ratio. Since the SARS-CoV-2 epidemic reached pandemic status, research groups have developed prediction algorithms applicable to their particular context [12-16]. One of the major limitations of these previous approaches is that the datasets used to train and test them were small. Our aim was to develop a machine learning algorithm using the largest dataset to date, to serve as a COVID-19 diagnostic proxy in hospitals where SARS-CoV-2 specific PCR testing is unavailable or scarce.

We hypothesized that a machine learning algorithm based on a parsimonious set of blood markers, including inflammatory markers, could predict the presence or absence of COVID-19 with high sensitivity and potentially be used as a screening tool in clinical practice.

Methods

Study design

We used electronic health data from the UCLA Health System (Los Angeles, California, USA) to develop a machine learning algorithm to serve as a proxy to diagnose COVID-19 in the hospital setting. Our set of features was selected based on prior studies reporting a difference in these features between patients with and without COVID-19, and higher values in those with severe COVID-19 compared to mild COVID-19 [3, 8–11]. This study was deemed non-human-subjects research by the institutional review board (IRB) at UCLA as all analyses used de-identified data. We report our findings based on STARD-2015 guidelines [17].

Data sources

We retrospectively considered all cases that were tested for SARS-CoV-2 in the emergency room or inpatient setting within the UCLA Health System between 1 March 2020 and 24 May 2020. After constructing our initial pool of cases, we included only cases with complete blood counts and at least one inflammatory marker (C-reactive protein, ferritin, or LDH) within 48 hours of the sample collection for SARS-CoV-2 PCR testing. All data were extracted from the electronic medical record. Features included in the models were age, gender, hemoglobin, red blood cell count, absolute neutrophil, absolute lymphocyte, absolute eosinophil and absolute basophil counts, the neutrophil to lymphocyte ratio, C-reactive protein, ferritin, and LDH. Prior to entering the model, all features were normalized to have zero mean and unit standard deviation. The normalization parameters (mean and standard deviation) were computed using the training set, and the features in the test set were scaled using these values. After scaling, missing lab values were imputed with zero, effectively inserting the mean feature value from the training set. Mean imputation was judged appropriate after evaluating other imputation methods (K-nearest neighbor and iterative imputation), which did not result in significant improvements.
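The scaling and imputation steps described above can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names and the toy ferritin values are hypothetical, `None` is assumed to mark a missing lab value, and the population standard deviation is an assumption (the paper does not say which estimator was used).

```python
from statistics import mean, pstdev

def fit_scaler(train_column):
    """Compute mean and standard deviation from observed training values only."""
    observed = [v for v in train_column if v is not None]
    # Assumption: population standard deviation; the source does not specify.
    return mean(observed), pstdev(observed)

def scale_and_impute(column, mu, sigma):
    """Standardize with training-set parameters, then fill missing values with 0.

    After standardization the training mean maps to 0, so imputing 0 is
    equivalent to inserting the training-set mean of the raw feature.
    """
    return [0.0 if v is None else (v - mu) / sigma for v in column]

# Hypothetical ferritin values (ng/ml); None marks a missing lab result.
train_ferritin = [100.0, 200.0, None, 300.0]
test_ferritin = [200.0, None, 400.0]

mu, sigma = fit_scaler(train_ferritin)          # parameters from training data only
train_scaled = scale_and_impute(train_ferritin, mu, sigma)
test_scaled = scale_and_impute(test_ferritin, mu, sigma)  # reuse training parameters
```

The point of the design is that the test set never contributes to the scaling parameters, so no information leaks from the test set into training.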

Gold standard

Diagnosis of SARS-CoV-2 was confirmed by PCR testing assays performed at the UCLA Microbiology Laboratory. These assays included the 2019-nCoV Real-Time (RT)-PCR Diagnostic Panel (CDC, Atlanta, GA), the Diasorin Simplexa COVID-19 Direct RT-PCR (Diasorin Molecular LLC, Cypress, CA), and the TaqPath COVID-19 Combo Kit (Thermo Fisher Scientific Inc., Waltham, MA).

Machine learning analysis

We compared seven machine learning models: random forest, logistic regression, support vector machine, multilayer perceptron (neural network), stochastic gradient descent, XGBoost, and AdaBoost. An ensemble (combined) model was then created based on those seven individually trained machine learning models. The final classification as positive or negative was decided using the majority vote of the classifiers, calculated by averaging their respective predicted probabilities. The dataset was split 60% for training, 10% for validation, and 30% for testing. The discriminatory operating threshold was determined using a validation set held out from the training set and selected such that the sensitivity on the validation set would be above a predefined threshold of 0.95 by configuring the beta parameter of the F-score. The resulting model was then evaluated on the held-out test set using the following diagnostic metrics: area under the receiver operator curve (AUROC), area under the precision recall curve (AUPRC), sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). Confidence intervals were constructed for each metric using a bootstrapping procedure in which the test set was repeatedly resampled with replacement 1000 times. Feature importance was assessed using a permutation test. To test the contribution of each feature to model performance, the feature values were randomly shuffled, thereby disrupting their correlations with the outcome, and the decrease in model performance (F1-score) was recorded. All machine learning analyses were performed using Python, making extensive use of the Scikit-learn package.
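The probability averaging and the sensitivity-driven choice of operating threshold can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: the paper reaches the target sensitivity by tuning the beta parameter of the F-score, whereas this sketch searches candidate thresholds directly. All function names and all probabilities and labels are hypothetical, and the validation set is assumed to contain at least one positive case.

```python
def average_probabilities(per_model_probs):
    """Soft vote: average the positive-class probability across models, per case."""
    n_models = len(per_model_probs)
    return [sum(case) / n_models for case in zip(*per_model_probs)]

def sensitivity(y_true, y_pred):
    """Fraction of true positives among all PCR-positive cases."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

def pick_threshold(y_true, probs, target_sensitivity):
    """Return the highest candidate threshold whose sensitivity meets the target."""
    for threshold in sorted(set(probs), reverse=True):
        y_pred = [1 if p >= threshold else 0 for p in probs]
        if sensitivity(y_true, y_pred) >= target_sensitivity:
            return threshold
    raise ValueError("no threshold reaches the target sensitivity")

# Hypothetical positive-class probabilities from three models on five validation cases.
model_probs = [
    [0.9, 0.2, 0.6, 0.1, 0.4],
    [0.8, 0.3, 0.5, 0.2, 0.6],
    [0.7, 0.1, 0.7, 0.3, 0.5],
]
y_val = [1, 0, 1, 0, 1]  # hypothetical PCR results (1 = positive)

ensemble_probs = average_probabilities(model_probs)
threshold = pick_threshold(y_val, ensemble_probs, target_sensitivity=0.95)
y_pred = [1 if p >= threshold else 0 for p in ensemble_probs]
```

Choosing the highest threshold that still meets the sensitivity target keeps specificity as high as the constraint allows. In scikit-learn, `VotingClassifier(voting="soft")` implements a comparable averaging of predicted probabilities; whether the authors used it is not stated.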

Results

Descriptive

In total, 3,444 cases were tested for SARS-CoV-2 and considered in our analysis. After exclusion of patients who did not have the minimal necessary features to make predictions (a complete blood cell count and at least one inflammatory marker), 1,455 cases remained (1,273 negative and 182 positive) (see Fig 1). All cases were from either the emergency room or inpatient settings. Mean age was 58.1 years (SD 22.3), 53% were men, 49% white, 24% Latino, and 29% immunosuppressed. See Table 1 for descriptive characteristics for included features by SARS-CoV-2 status.

Fig 1

Diagram of eligible, included and excluded cases, and diagnostic cross tabulation.

Table 1

Characteristics of cases by SARS-CoV-2 status.

	SARS-CoV-2 status
	Negative n (%)	Positive n (%)	Total n	p-value
Total	1273 (87.5)	182 (12.5)	1455
Age, years, mean (SD)	57.2 (22.6)	64.2 (19.1)	58.1 (22.3)	<0.001
Gender				0.030
  Female	610 (47.9)	71 (39.0)	681 (46.8)
  Male	663 (52.1)	111 (61.0)	774 (53.2)
Race/ethnicity				0.006
  Asian	91 (7.1)	16 (8.8)	107 (7.4)
  Black	156 (12.3)	18 (9.9)	174 (12.0)
  Latino	281 (22.1)	61 (33.5)	342 (23.5)
  Other	110 (8.6)	17 (9.3)	127 (8.7)
  White	635 (49.9)	70 (38.5)	705 (48.5)
Immunosuppressed ⁺	385 (30.2)	35 (19.2)	420 (28.9)	0.003
  HIV	17 (1.3)	1 (0.5)	18 (1.2)	0.590
  Transplant	180 (14.1)	19 (10.4)	199 (13.7)	0.214
  Immunosuppressive medications	312 (24.5)	29 (15.9)	341 (23.4)	0.014
Not immunosuppressed	888 (69.8)	147 (80.8)	1035 (71.1)
Hemoglobin, g/dl, median (IQR) ^a	11.80 (9.90–13.5)	12.60 (11.0–14.2)	11.90 (10.0–13.6)	<0.001
Absolute neutrophil count x 10^3/uL, median (IQR)	6.02 (3.93–9.39)	5.19 (3.47–7.46)	5.92 (3.88–9.12)	0.001
Absolute lymphocyte count x 10^3/uL, median (IQR) ^e	1.22 (0.74–1.90)	0.96 (0.63–1.38)	1.18 (0.72–1.86)	<0.001
Neutrophil:lymphocyte ratio, median (IQR)	4.81 (2.47–9.77)	5.21 (2.91–10.3)	4.88 (2.56–9.81)	0.112
Absolute basophil count x 10^3/uL, median (IQR)	0.03 (0.02–0.05)	0.01 (0.01–0.03)	0.03 (0.02–0.05)	<0.001
Absolute eosinophil count x 10^3/uL, median (IQR)	0.08 (0.02–0.18)	0.01 (0.00–0.04)	0.07 (0.01–0.16)	<0.001
Absolute monocyte count x 10^3/uL, median (IQR)	0.65 (0.47–0.95)	0.48 (0.33–0.70)	0.64 (0.44–0.92)	<0.001
Platelet count x 10^3/uL, median (IQR) ^b	231 (168–298)	188 (149–252)	227 (164–291)	<0.001
C-reactive protein, mg/dl, median (IQR) ^c	1.90 (0.30–7.80)	6.60 (2.10–12.2)	2.80 (0.50–8.90)	<0.001
Ferritin, ng/ml, median (IQR) ^d	216 (93.0–522)	439 (261–770)	261 (110–585)	<0.001
Lactate dehydrogenase, U/L, median (IQR) ^e	245 (192–342)	306 (231–412)	261 (198–357)	<0.001

Abbreviations: IQR, interquartile range; SD, standard deviation.

Missing values (n, % of total): a hemoglobin 3 (0.2%); b platelets 6 (0.4%); c C-reactive protein 517 (35.5%); d ferritin 737 (50.6%); e lactate dehydrogenase 693 (47.6%).

+ We defined immunosuppressed status as a case with an HIV diagnosis, record of receipt of an organ transplant, or had taken an oral immunosuppressive medication prior to their SARS-CoV-2 test (e.g., prednisone, tacrolimus, mycophenolate, azathioprine, methotrexate).


Machine learning model: Diagnostic metrics

The AUROC of the model in the held-out test set (n = 392) was 0.91 (95% confidence interval [CI] 0.87–0.96) and the AUPRC was 0.76 (95% CI 0.66–0.83). The model achieved a sensitivity of 0.93 (95% CI 0.84–0.98), specificity of 0.64 (95% CI 0.59–0.69), NPV of 0.98 (95% CI 0.96–1.00), and PPV of 0.29 (95% CI 0.23–0.36). Receiver operator curves and precision-recall curves are presented in Fig 2. A feature importance analysis showed that C-reactive protein and LDH provided most of the information to the model (see Fig 3).
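The permutation-based feature importance analysis referred to above (shuffle one feature, record the drop in F1-score) can be sketched in pure Python. This is an illustrative sketch, not the authors' implementation: `toy_predict`, the two-feature rows, the labels, and the number of repeats are all hypothetical.

```python
import random

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def permutation_importance(predict, rows, y_true, feature_index, n_repeats=20, seed=0):
    """Mean drop in F1 after shuffling one feature column across cases."""
    rng = random.Random(seed)
    baseline = f1_score(y_true, predict(rows))
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[feature_index] for row in rows]
        rng.shuffle(shuffled)  # breaks the link between this feature and the outcome
        permuted_rows = [
            row[:feature_index] + [value] + row[feature_index + 1:]
            for row, value in zip(rows, shuffled)
        ]
        drops.append(baseline - f1_score(y_true, predict(permuted_rows)))
    return sum(drops) / n_repeats

def toy_predict(rows):
    """Hypothetical model: positive when feature 0 is above zero; ignores feature 1."""
    return [1 if row[0] > 0 else 0 for row in rows]

# Hypothetical data: feature 0 drives the prediction, feature 1 is unused.
rows = [[1.0, 5.0], [-1.0, 3.0], [2.0, 1.0], [-2.0, 4.0], [0.5, 2.0], [-0.5, 6.0]]
y = [1, 0, 1, 0, 1, 0]

importance_used = permutation_importance(toy_predict, rows, y, feature_index=0)
importance_unused = permutation_importance(toy_predict, rows, y, feature_index=1)
```

A feature the model does not rely on yields a drop of zero, while an informative feature yields a positive drop. scikit-learn provides `sklearn.inspection.permutation_importance` for fitted estimators; whether the authors used it is not stated.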

Fig 2

Performance of the model on the held-out test set (N = 392).

A) Receiver operator curve. B) Precision-recall curve. At a sensitivity-optimized operating threshold, sensitivity and specificity were 0.93 (95% CI 0.85–0.98) and 0.64 (95% CI 0.59–0.69), respectively. Red solid lines are the mean receiver operator curve and mean precision-recall curve, respectively; the purple shaded lines are the curves obtained from the bootstrapping procedure used to calculate the 95% confidence intervals.
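The operating-point metrics and their bootstrap confidence intervals can be computed along the following lines. This is a sketch under stated assumptions rather than the authors' code: a percentile bootstrap is assumed (the paper says only that the test set was resampled with replacement 1000 times), the helper names are hypothetical, and the labels below are invented counts, not the study's confusion matrix.

```python
import random

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

METRICS = {
    "sensitivity": lambda tp, fp, tn, fn: tp / (tp + fn),
    "specificity": lambda tp, fp, tn, fn: tn / (tn + fp),
    "ppv": lambda tp, fp, tn, fn: tp / (tp + fp),
    "npv": lambda tp, fp, tn, fn: tn / (tn + fn),
}

def metric_value(name, y_true, y_pred):
    return METRICS[name](*confusion_counts(y_true, y_pred))

def bootstrap_ci(name, y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            values.append(
                metric_value(name, [y_true[i] for i in idx], [y_pred[i] for i in idx])
            )
        except ZeroDivisionError:
            continue  # skip degenerate resamples where the metric is undefined
    values.sort()
    lower = values[int(alpha / 2 * len(values))]
    upper = values[int((1 - alpha / 2) * len(values)) - 1]
    return lower, upper

# Hypothetical test set: 10 positives (9 detected) and 90 negatives (60 correctly ruled out).
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 9 + [0] * 1 + [0] * 60 + [1] * 30

sens = metric_value("sensitivity", y_true, y_pred)
spec = metric_value("specificity", y_true, y_pred)
sens_lower, sens_upper = bootstrap_ci("sensitivity", y_true, y_pred)
```

With few positive cases, as in this toy example, the interval for sensitivity is wide, which is one reason larger test sets give tighter estimates.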

Fig 3

Combined model feature importance.


Decrease in model performance (F1-score) after randomly shuffling the respective feature values. Higher values represent important features for classification. Abbreviations: LDH, lactate dehydrogenase; NLR, neutrophil to lymphocyte ratio; RBC, red blood cells.

In sensitivity analyses, we calculated AUROC and AUPRC when adding the inflammatory features relative to the baseline model of only demographic characteristics and features of the complete blood cell count (see Fig 4). The AUROC of the baseline model was 0.79 (95% CI 0.71–0.85). Then, we added the inflammatory markers to the model one at a time. With ferritin, the AUROC was 0.83 (95% CI 0.78–0.88); with C-reactive protein, 0.86 (95% CI 0.79–0.92); with LDH, 0.87 (95% CI 0.82–0.92). The AUPRC of the baseline model was 0.50 (95% CI 0.36–0.65); with ferritin, 0.56 (95% CI 0.45–0.68); with LDH, 0.66 (95% CI 0.55–0.77); with C-reactive protein, 0.66 (95% CI 0.50–0.80). Through these analyses we observed that adding inflammatory markers, especially LDH, CRP, and the combination of all three, resulted in statistically significant improvements relative to the baseline model.
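The AUROC values compared in these sensitivity analyses have a simple interpretation: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. A minimal sketch of that pairwise definition follows; the scores and labels are hypothetical and only illustrate how adding an informative feature can raise the AUROC.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

y = [1, 1, 1, 0, 0, 0]  # hypothetical PCR results
baseline_scores = [0.6, 0.4, 0.7, 0.5, 0.3, 0.2]   # hypothetical baseline model
augmented_scores = [0.8, 0.6, 0.9, 0.5, 0.3, 0.2]  # hypothetical model with an added marker

baseline_auroc = auroc(y, baseline_scores)    # 8 of 9 pairs ranked correctly
augmented_auroc = auroc(y, augmented_scores)  # all 9 pairs ranked correctly
```

The quadratic pairwise loop is chosen for clarity; it is adequate for a test set of a few hundred cases, while rank-based formulations scale better.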

Fig 4

Performance of models while removing one of the features.

All analyses were performed on the held-out test set (N = 392). A) Receiver operating curve. B) Precision-recall curve. Base model includes only demographic features and complete blood cell count. Abbreviations: CRP, C-reactive protein; LDH, lactate dehydrogenase.


Discussion

This is the largest study to date using a machine learning algorithm as a proxy to diagnose COVID-19. We built the algorithm based on a set of basic demographic characteristics and frequently obtained blood biomarkers that could be easily obtained in many hospital settings. Thus, the most likely application of the approach presented in this work is the use of these biomarkers as a proxy for testing in locations where COVID-19 testing is scarce. We showed a high sensitivity for COVID-19 diagnosis when compared to SARS-CoV-2 RT-PCR testing as the gold standard. The blood biomarkers included in the model can be obtained with a single blood draw, and turnaround time is typically within 24 hours at most hospital centers with laboratory capabilities. Due to the model’s high sensitivity and rapid turnaround time, the proposed algorithm lends itself to practical use in hospital facilities as a screening tool. At the time of submission, this model was being actively developed into a web or mobile application, whereby a clinician inputs the obtained values and receives an immediate prediction of the probability of a particular patient having COVID-19. Further validation will be required to ascertain its performance in other medical centers.

Our set of features performed as well as, or better than, the three diagnostic algorithms with the largest number of cases known to us at this time [12, 13, 16]. A report by Sun et al. used epidemiologic, clinical, laboratory and imaging features in their algorithms and reported AUROCs of 0.91 (full model), 0.88 (without epidemiologic features), 0.88 (without imaging features), and 0.65 (with clinical features alone) [12]. They used features from a complete blood cell count and from a basic chemistry panel (sodium and creatinine), whereas we used inflammatory markers (ferritin, C-reactive protein, LDH) instead of sodium, potassium, and creatinine, as we did not suspect significant differences a priori in sodium, potassium, or creatinine. Meng et al. reported an AUROC of 0.89 using a different set of features that included activated partial thromboplastin time, triglycerides, uric acid, albumin/globulin, sodium, and calcium [16]. Batista et al. developed an algorithm aimed for use in lower resource settings and reported an AUROC of 0.87 in a sparser dataset that only included basic demographics and complete blood cell counts [13]. In fact, our model, which incorporated inflammatory markers, significantly improved upon this set of features in terms of both AUROC and AUPRC. For a full comparison of diagnostic algorithms related to COVID-19, we refer the reader to [15], a living systematic review.

Our findings should be considered in light of the following limitations. First, we included data from one medical center in Los Angeles. Incorporating data from other medical centers in other geographic areas would provide a higher likelihood of generalizability. Second, although many of our patients either had immunosuppressive conditions (e.g., solid organ transplants) or were taking immunosuppressive medications (e.g., steroids), immunosuppressed hosts are a heterogeneous group and their immunosuppression may impact the laboratory values we used in our models. We would need more cases with those conditions to understand how the algorithm would perform in these populations. It is likely that specific models tailored to the immunocompromised host should be developed to improve accuracy in this population. Third, it is also possible that other community respiratory viral infections (e.g., influenza, RSV) could cause a similar laboratory profile; however, incidence of these other community respiratory viruses was low during the case inclusion period. Further validation comparing COVID-19 cases to cases of other community respiratory viruses is needed. Finally, as all of our patients’ blood was tested in the emergency department or as an inpatient, the applicability of this model in the outpatient setting or milder cases of COVID-19 is unclear.

Our report, in combination with others [12, 13, 15, 16], demonstrates the high diagnostic accuracy of machine learning models based on early available data. Other models have also been developed based on characteristic imaging changes [15]. We and others were able to demonstrate impressive results in our data silos [12-15]. Yet, to realize the full potential of machine learning and its applicability to clinical medicine, collaborations from the international community are crucial, both for the sharing of data and for the development and validation of advanced algorithms.

It is unclear if testing capacity for active disease using PCR-based methods will ever meet the expanding need globally. In fact, countries in low-resource settings, such as in Sub-Saharan Africa or Latin America, face bottlenecks in the testing supply chain, and are unable to compete with affluent nations for prohibitively expensive PCR test kits. Even in developed nations, scale up of PCR-based testing has many bottlenecks that include purchase of new testing platforms, sample acquisition, availability of reagents, swabs and transport media, and the technical human expertise in performing PCR tests.

In summary, by using readily available laboratory tests combined with machine learning we achieved a high sensitivity comparable to that of PCR. This machine learning modality may be especially useful as a screening test in smaller medical centers or those in resource-poor regions that may have limited capacity for COVID-19 PCR-based diagnosis, or in instances where testing capacity is in danger due to low supplies. Further validation is necessary in diverse geographic settings and in a prospective manner for the model to be used as a reliable tool to support clinical decision making.

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

8 Sep 2020

A machine learning algorithm to increase COVID-19 inpatient diagnostic capacity

PONE-D-20-20126

Dear Dr. Goodman-Meza,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Kind regards,
Ryan J. Urbanowicz, Ph.D.
Academic Editor
PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

1. Is the manuscript technically sound, and do the data support the conclusions?
Reviewer #1: Partly
Reviewer #2: Yes

2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available?
Reviewer #1: No
Reviewer #2: No

4. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #1: Yes
Reviewer #2: Yes

5. Review Comments to the Author

Reviewer #1:

* Yes, the manuscript is technically sound. The only reason I said "Partly" to question 1 above is that for reasons out of the authors' control, the sample size is somewhat small. However, as they say, there is the (very exciting) prospect that such work can lead to data-sharing, particularly among hospitals in diverse regions for data-diversity, leading to much larger training data-sets and hence, more-accurate models for realistic data.
* The statistical analysis is rigorous.
* The answer to question 3 is "No" since the data is anonymized medical-center data: the authors say "The datasets generated during and/or analyzed during the current study are not publicly available due to institutional restrictions on data sharing and privacy concerns. However, the data can be available from the corresponding author on reasonable request. All code necessary to perform the analyses will be available on a public repository by the time of publication." I think this is reasonable, and really the best that can be expected under the circumstances.
* The paper is well-written in general, but I have one technical question (see (e) below) and a few minor comments:
(a) Page 9: "pandemic status research" -> "pandemic status, research"
(b) Hyphenate multi-word adjectives throughout: e.g., as in "machine-learning algorithm"
(c) Page 10: "from UCLA Health System" -> "from the UCLA Health System"
(d) Page 10: "non-human subjects" -> "non-human-subjects"
(e) Page 10, on normalizing all features: How do you do such normalization for gender -- a discrete feature with very small support?

Reviewer #2:

The paper by Goodman-Meza et al. describes an ensemble machine learning algorithm for the diagnosis of COVID-19. Specifically, using the largest available dataset of patients testing for COVID-19 in the hospital setting, the authors make use of demographic and laboratory features to obtain highly accurate predictions. Their results are comparable to the gold-standard PCR test. This work is particularly valuable for COVID-19 diagnosis at hospitals with limited resources or where standard testing is too slow. Despite the limitation of generalizability, overall this research provides a useful model for the diagnosis of COVID-19 in the hospital setting.

Patients were excluded from the analysis if they did not have a CBC and at least one inflammatory lab value. Patients who tested positive for COVID-19 were more likely to be older, male, and not immunosuppressed. Drawing upon recent literature, the authors chose age, gender, seven features from blood cell counts, and three inflammatory markers as features for their model. Missing data was imputed post-normalization using the mean values. Using seven machine learning models, the authors created an ensemble machine learning algorithm which classified patients as positive or negative using a majority vote. The data was split 60/10/30 for training, validation, and testing, as is standard for many machine learning analyses. The AUROC, AUPRC, NPV, PPV, sensitivity and specificity were reported on the testing set, in addition to confidence intervals generated by bootstrapping. Two of the inflammatory markers, LDH and C-reactive protein, exhibited the highest feature importance using a permutation test. Sensitivity analyses further demonstrated the utility of these inflammatory markers as additions to the baseline model. The authors make note of the limitations surrounding the generalizability to outpatient settings and the fairly high number of immunocompromised patients among the cases.

Key strengths:

1. This paper drew on previous literature to choose the most informative laboratory features to include in the model. All of the features in the model are commonly and easily obtained in the hospital setting, even in resource-poor areas.
2. The authors reported a variety of related performance metrics that all demonstrated the algorithm’s ability to capture true positives and limit false negatives. This model performed very well, notably demonstrated by the AUROC, sensitivity, and NPV. Prioritizing sensitivity and NPV are of keen clinical importance in this context and especially during a pandemic.
3. Finally, the authors used Python’s Scikit-learn package to perform the machine learning analyses, a highly accessible and open source software. They have agreed to make all of their code public and are in the process of creating a web/mobile application for expanded use.

Suggestions for improvement:

1. Nearly 2000 patients who were tested for COVID-19 were excluded from this analysis due to incomplete laboratory measures. It would be helpful to provide some information or comments on why these patients did not have these particular lab measures, and if this may result in selection bias (for more severe cases). For example, were these excluded patients more likely to be negative or have mild cases of COVID-19? At a minimum, this limitation should be acknowledged.
2. Relatedly, under study design in the Methods section, the authors note that their features were selected in part based on higher values in those with severe COVID-19. This should also be noted in the limitations, given that generalizability to patients with milder cases of COVID-19 is unclear.
3. The authors provided no explanation for their choice of machine learning (ML) algorithms. Granted, the seven listed methods are all classification methods within Scikit-learn’s supervised machine learning models. It would nevertheless be useful to include a rationale or references for why these seven were specifically chosen.
4. Lastly, the authors stated that “We compared seven machine learning models…” (Methods, Machine learning analysis) but provided no data, figures, or discussion on this ‘comparison’. An explanation or figure summarizing individual model performance would provide additional clarity to this statement.

6. Do you want your identity to be public for this peer review?
Reviewer #1: No
Reviewer #2: No

14 Sep 2020

PONE-D-20-20126

A machine learning algorithm to increase COVID-19 inpatient diagnostic capacity

Dear Dr. Goodman-Meza:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Ryan J. Urbanowicz
Academic Editor
PLOS ONE
