Caitlin K Monaghan (1), John W Larkin (1), Sheetal Chaudhuri (1,2), Hao Han (1), Yue Jiao (1), Kristine M Bermudez (3), Eric D Weinhandl (3), Ines A Dahne-Steuber (3), Kathleen Belmonte (4), Luca Neri (5), Peter Kotanko (6,7), Jeroen P Kooman (2), Jeffrey L Hymes (3), Robert J Kossmann (3), Len A Usvyat (1), Franklin W Maddux (8).
1. Fresenius Medical Care, Global Medical Office, Waltham, Massachusetts.
2. Division of Nephrology, Maastricht University Medical Center, Maastricht, The Netherlands.
3. Fresenius Medical Care North America, Medical Office, Waltham, Massachusetts.
4. Nursing & Clinical Services, Fresenius Kidney Care, Waltham, Massachusetts.
5. Fresenius Medical Care Deutschland GmbH, EMEA Medical Office, Bad Homburg, Germany.
6. Research Division, Renal Research Institute, New York, New York.
7. Division of Nephrology, Icahn School of Medicine at Mount Sinai, New York, New York.
8. Fresenius Medical Care AG & Co. KGaA, Global Medical Office, Bad Homburg, Germany.
Abstract
Background: We developed a machine learning (ML) model that predicts the risk that a patient on hemodialysis (HD) has an undetected SARS-CoV-2 infection that is identified ≥3 days later.
Methods: As part of a healthcare operations effort, we used patient data from a national network of dialysis clinics (February-September 2020) to develop an ML model (XGBoost) that uses 81 variables to predict the likelihood that an adult patient on HD has an undetected SARS-CoV-2 infection that is identified ≥3 days later. We used a 60%:20%:20% randomized split of COVID-19-positive samples for the training, validation, and testing datasets.
Results: We used a select cohort of 40,490 patients on HD to build the ML model (11,166 patients who were COVID-19 positive and 29,324 patients who were unaffected controls). The prevalence of COVID-19 in the cohort (28% COVID-19 positive) was by design higher than that in the HD population. The prevalence of COVID-19 was set to 10% in the testing dataset to approximate the prevalence observed in the national HD population. The threshold for classifying observations as positive or negative was set at 0.80 to minimize false positives. In the testing dataset, the precision of the model was 0.52, the recall was 0.07, and the lift was 5.3. The area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) were 0.68 and 0.24, respectively, in the testing dataset. The top predictors of a patient on HD having a SARS-CoV-2 infection were the change in interdialytic weight gain from the previous month, the mean pre-HD body temperature in the prior week, and the change in post-HD heart rate from the previous month.
Conclusions: The developed ML model appears suitable for predicting patients on HD at risk of having COVID-19 at least 3 days before there would be a clinical suspicion of the disease.
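The Results report precision, recall, and lift at a 0.80 threshold on a testing dataset whose prevalence was set to 10%. The sketch below shows one way such figures could be computed. It is illustrative only and is not the authors' code: the function names `resample_to_prevalence` and `threshold_metrics` are hypothetical, downsampling positives is an assumed way of reaching the target prevalence (the abstract does not say how this was done), and lift is taken to be precision divided by prevalence, which is the common definition.

```python
import random


def resample_to_prevalence(labels, scores, target_prevalence, seed=0):
    """Downsample positives so they make up `target_prevalence` of the sample.

    Assumption for illustration: every negative is kept and positives are
    drawn at random without replacement.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # n_pos / (n_pos + n_neg) = p  implies  n_pos = p * n_neg / (1 - p)
    n_pos = round(target_prevalence * len(neg) / (1 - target_prevalence))
    n_pos = min(n_pos, len(pos))
    keep = sorted(rng.sample(pos, n_pos) + neg)
    return [labels[i] for i in keep], [scores[i] for i in keep]


def threshold_metrics(labels, scores, threshold):
    """Precision, recall, and lift when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    prevalence = sum(labels) / len(labels)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Lift: how much more often a flagged patient is positive than a random one.
    lift = precision / prevalence if prevalence else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "lift": lift,
        "prevalence": prevalence,
    }


if __name__ == "__main__":
    # Synthetic toy data, not study data: 30 positives and 90 negatives
    # with random scores that run higher for positives.
    rng = random.Random(42)
    labels = [1] * 30 + [0] * 90
    scores = [rng.uniform(0.3, 1.0) if y else rng.uniform(0.0, 0.9) for y in labels]
    test_labels, test_scores = resample_to_prevalence(labels, scores, 0.10)
    print(threshold_metrics(test_labels, test_scores, threshold=0.80))
```

Under this definition, the rounded figures in the abstract give a lift of about 0.52 / 0.10 = 5.2, close to the reported 5.3; the small gap is presumably due to rounding of the published values. A high threshold such as 0.80 trades recall for precision, which is consistent with the low recall (0.07) reported alongside the comparatively high precision.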