Auriel A Willette, Sara A Willette, Qian Wang, Colleen Pappas, Brandon S Klinedinst, Scott Le, Brittany Larsen, Amy Pollpeter, Tianqi Li, Jonathan P Mochel, Karin Allenspach, Nicole Brenner, Tim Waterboer.
Abstract
Many risk factors have emerged for novel 2019 coronavirus disease (COVID-19). It is relatively unknown how these factors collectively predict COVID-19 infection risk, as well as risk for a severe infection (i.e., hospitalization). Among aged adults (69.3 ± 8.6 years) in UK Biobank, COVID-19 data were downloaded for 4510 participants with 7539 test cases. We downloaded baseline data from 10 to 14 years ago, including demographics, biochemistry, body mass, and other factors, as well as antibody titers for 20 common to rare infectious diseases in a subset of 80 participants with 124 test cases. Permutation-based linear discriminant analysis was used to predict COVID-19 risk and hospitalization risk. Probability and threshold metrics included receiver operating characteristic curves to derive area under the curve (AUC), specificity, sensitivity, and quadratic mean. Model predictions using the full cohort were marginal. The "best-fit" model for predicting COVID-19 risk was found in the subset of participants with antibody titers, which achieved excellent discrimination (AUC 0.969, 95% CI 0.934-1.000). Factors included age, immune markers, lipids, and serology titers to common pathogens like human cytomegalovirus. The hospitalization "best-fit" model was more modest (AUC 0.803, 95% CI 0.663-0.943) and included only serology titers, again in the subset group. Accurate risk profiles can be created using standard self-report and biomedical data collected in public health and medical settings. It is also worthwhile to further investigate whether prior host immunity predicts current host immunity to COVID-19.
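As a rough illustration of the metrics named in the abstract, the sketch below computes AUC, sensitivity, and specificity from classifier scores. It is not the authors' pipeline: the function names, the toy scores, and the 0.5 threshold are all assumptions made for the example, and the scores stand in for whatever a discriminant model would output.

```python
def auc_rank(scores, labels):
    """AUC as the chance that a random positive case outscores a random
    negative case, counting ties as half (the Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores for six test cases (1 = positive test, 0 = negative).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]

auc = auc_rank(toy_scores, toy_labels)
sensitivity, specificity = sens_spec(toy_scores, toy_labels, threshold=0.5)
print(f"AUC={auc:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

AUC summarizes discrimination across all thresholds, whereas sensitivity and specificity depend on the single cut-off chosen, which is why the abstract reports both kinds of metric.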
Year: 2022 PMID: 35545624 PMCID: PMC9092926 DOI: 10.1038/s41598-022-07307-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Baseline Demographics and Data Characteristics. Blood pressure (BP); high-density lipoprotein (HDL); low-density lipoprotein (LDL). Data are summarized and compared for all participant test cases and for the sub-group of test cases that also had non-COVID-19 serology. All retrospective baseline data are in italics. Values are mean ± SD, percentages, or frequencies. P values less than 0.05 were considered significant; applicable predictors and indices are in bold.
Baseline characteristics of infectious disease serology from 2006 to 2010. Antibody levels are specific to each antigen and expressed in Median Fluorescence Intensity (MFI) units. Seroprevalence of at least the main UK Biobank cohort was estimated on samples from 9695 randomized participants, as described in white papers (see “Methods”). “Bold” and “italics” shading are used to distinguish between pathogens and their respective antigens. (a) CagA levels are based on roughly half of the original sample due to a technical lab error.
Sets of predictors used to predict classification of COVID-19 test cases as negative or positive. Area Under the Curve (AUC); Confidence Interval (CI); Geometric Mean (G-Mean). Non-parametric bootstrapping (1000 iterations, 95% CI) was used for robust estimation. P values less than 0.05 were considered significant. “Bold” and “italics” shading are used to distinguish between predictors that loaded for a given model. (a) Due to several variables representing the same construct (i.e., being multicollinear), body composition consisted of: whole-body water mass; whole-body fat mass; whole-body non-fat mass (i.e., muscle, bone).
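The bootstrapped confidence interval and G-Mean mentioned in this legend can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses a simple percentile bootstrap (the legend does not say which bootstrap variant was used), it assumes the common definition of G-Mean as the square root of sensitivity times specificity, and all function names and toy data are hypothetical.

```python
import math
import random


def auc_rank(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def g_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    return math.sqrt(sensitivity * specificity)


def bootstrap_auc_ci(scores, labels, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC: resample cases with replacement,
    recompute AUC each time, and take the alpha/2 and 1 - alpha/2 quantiles."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined for a one-class resample; draw again
        stats.append(auc_rank([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_iter)]
    upper = stats[int((1 - alpha / 2) * n_iter) - 1]
    return lower, upper


# Hypothetical scores for ten test cases (1 = positive test, 0 = negative).
toy_scores = [0.95, 0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.25, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

low, high = bootstrap_auc_ci(toy_scores, toy_labels, n_iter=1000)
print(f"AUC={auc_rank(toy_scores, toy_labels):.3f}, 95% CI {low:.3f}-{high:.3f}")
```

With small samples such as the serology sub-group, a percentile interval like this can be wide or hit the upper bound of 1.000, which is one reason intervals are reported alongside the point estimate.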
Figure 1. Receiver Operating Characteristics (ROC) curves illustrating the relative classifier performance of various sets of predictors. Outcomes of interest were COVID-19 infection risk and whether an infection was mild or severe. Two separate sets of analyses were done for the full tested sample and a sub-group of participants with serology data. Test statistics for predictors are provided in Tables 3 and 4.
Sets of predictors used to predict classification of COVID-19 positive cases as mild or severe. Area Under the Curve (AUC); Confidence Interval (CI); Geometric Mean (G-Mean). Non-parametric bootstrapping (1000 iterations, 95% CI) was used for robust estimation. P values less than 0.05 were considered significant. “Bold” and “italics” shading are used to distinguish between predictors that loaded for a given model. (a) Due to several variables representing the same construct (i.e., being multicollinear), body composition consisted of: whole-body water mass; whole-body fat mass; whole-body non-fat mass (i.e., muscle, bone). (b) Due to the full serology panel of 44 antibody titers exceeding degrees of freedom, titers for 6 antigens were excluded for pathogens with the lowest estimated prevalence in the cohort (HIV, HCV, HTLV-1).