| Literature DB >> 35795812 |
Rayanne A Luke, Anthony J Kearsley, Nora Pisanic, Yukari C Manabe, David L Thomas, Christopher D Heaney, Paul N Patrone.
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has emphasized the importance and challenges of correctly interpreting antibody test results. Identification of positive and negative samples requires a classification strategy with low error rates, which is hard to achieve when the corresponding measurement values overlap. Additional uncertainty arises when classification schemes fail to account for complicated structure in data. We address these problems through a mathematical framework that combines high dimensional data modeling and optimal decision theory. Specifically, we show that appropriately increasing the dimension of data better separates positive and negative populations and reveals nuanced structure that can be described in terms of mathematical models. We combine these models with optimal decision theory to yield a classification scheme that better separates positive and negative samples relative to traditional methods such as confidence intervals (CIs) and receiver operating characteristics. We validate the usefulness of this approach in the context of a multiplex salivary SARS-CoV-2 immunoglobulin G assay dataset. This example illustrates how our analysis: (i) improves the assay accuracy (e.g. lowers classification errors by up to 35 % compared to CI methods); (ii) reduces the number of indeterminate samples when an inconclusive class is permissible (e.g. by 40 % compared to the original analysis of the example multiplex dataset); and (iii) decreases the number of antigens needed to classify samples. Our work showcases the power of mathematical modeling in diagnostic classification and highlights a method that can be adopted broadly in public health and clinical settings.
Entities:
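The optimal decision theory step the abstract describes reduces, in its simplest form, to assigning each sample to whichever class-conditional probability model (weighted by prevalence) gives it the higher density; this minimizes the expected classification error. Below is a minimal sketch of that idea using Gaussian class-conditional models and synthetic data; the paper's actual models are more nuanced, and all function names and parameters here are illustrative, not taken from the paper.

```python
# Sketch: Bayes-optimal binary classification with Gaussian
# class-conditional models (illustrative; not the paper's exact models).
import numpy as np


def fit_gaussian(samples):
    """Return (mean, covariance) estimated from d-dimensional samples."""
    return samples.mean(axis=0), np.cov(samples, rowvar=False)


def log_density(x, mu, cov):
    """Log of the multivariate normal density at point x."""
    d = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.inv(cov) @ diff)


def classify(x, pos_model, neg_model, prevalence=0.5):
    """Label x positive iff its prevalence-weighted positive log-density
    exceeds the negative one; this minimizes expected error."""
    lp = np.log(prevalence) + log_density(x, *pos_model)
    ln = np.log(1 - prevalence) + log_density(x, *neg_model)
    return "positive" if lp > ln else "negative"


# Synthetic 2D antibody-like data: well-separated populations.
rng = np.random.default_rng(0)
neg = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(283, 2))
pos = rng.normal(loc=[2.0, 2.0], scale=0.8, size=(147, 2))
pos_model, neg_model = fit_gaussian(pos), fit_gaussian(neg)
print(classify(np.array([2.1, 1.9]), pos_model, neg_model))
print(classify(np.array([0.1, -0.2]), pos_model, neg_model))
```

Increasing the dimension of the data, as the abstract advocates, corresponds here to adding measurement channels (columns), which can separate populations that overlap in any single channel.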
Year: 2022 PMID: 35795812 PMCID: PMC9258291
Source DB: PubMed Journal: ArXiv ISSN: 2331-8422
Fig. 1: Positive (red X) and negative (blue ♦) training antibody data. (a) N plotted against RBD; (b) the same data with the ELISA-based total IgG added as the vertical axis. The green boxes in (a) and (b) are the negative sample mean plus 3σ confidence intervals. Supplemental Figure 1 shows an animation of (b).
Fig. 2: 3D probability model plotted along with the training data. Positive samples are indicated by red Xs and negatives by blue ♦s. Supplemental Figure 2 shows an animation.
Fig. 3: Optimal classification domains for the training (a) and test (b) data. Positive samples are indicated by red Xs and negatives by blue ♦s. The green boxes drawn in (a) and (b) are the negative sample mean plus 3σ confidence intervals. Supplemental Figures 3 and 4 show animations.
Fig. 4: Optimal classification domains for the training (a) and test (b) data with a holdout region (white; magenta and cyan markers are indeterminate samples). Positive samples are indicated by red Xs and negatives by blue ♦s. The green boxes drawn in (a) and (b) are the negative sample mean plus 3σ confidence intervals. Supplemental Figures 5 and 6 show animations.
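The 3σ confidence-interval rule drawn as green boxes in Figs. 1, 3, and 4 can be sketched as a simple box classifier: a sample is called negative when every measurement channel falls at or below the negative-training mean plus three standard deviations, and positive otherwise. The sketch below uses synthetic data and illustrative names; the paper's exact CI construction may differ in detail.

```python
# Sketch of a 3-sigma confidence-interval ("green box") classifier.
import numpy as np


def ci_classifier(neg_training):
    """Build per-channel upper bounds: negative mean + 3 * std dev."""
    upper = neg_training.mean(axis=0) + 3 * neg_training.std(axis=0, ddof=1)

    def classify(x):
        # Inside the box on every channel -> negative; otherwise positive.
        return "negative" if np.all(x <= upper) else "positive"

    return classify


# Synthetic negative training data: 283 samples, 3 antigen channels.
rng = np.random.default_rng(1)
neg_training = rng.normal(loc=1.0, scale=0.1, size=(283, 3))
classify = ci_classifier(neg_training)
print(classify(np.array([1.0, 1.05, 0.95])))  # well inside the box
print(classify(np.array([1.0, 1.05, 5.0])))   # one channel far outside
```

A limitation visible in the figures is that such axis-aligned boxes cannot follow the curved boundary between overlapping populations, which is what the paper's model-based optimal domains address.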
Summary information about the SARS-CoV-2 datasets, with sensitivities, specificities, and classification accuracies for training and test data, with and without allowing an indeterminate class. Results are shown for all samples for the model and for the 3σ confidence intervals (relative to the negative sample mean); the original analysis [9] was conducted on all seven antibody targets and the ELISA-based total IgG without indeterminate samples.
| Data and method | Positive | Negative | Total |
|---|---|---|---|
| Training samples | 147 | 283 | 430 |

No indeterminate class permitted:

| Data and method | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|
| Model | 131/147, 89.1 | 279/283, 98.6 | 410/430, 95.4 |
| Confidence interval | 121/147, 82.3 | 277/283, 97.9 | 398/430, 92.6 |

Indeterminate class permitted (indeterminate samples excluded from the counts):

| Data and method | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|
| Model | 111/115, 96.5 | 256/256, 100 | 367/371, 98.9 |
| Confidence interval | 111/115, 96.5 | 255/256, 99.6 | 366/371, 98.7 |
| Pisanic et al. (2020) | 111/115, 96.5 | 219/219, 100 | 330/334, 98.8 |
| Patrone et al. (2022) | 115/119, 96.6 | 227/227, 100 | 342/346, 98.8 |

| Data and method | Positive | Negative | Total |
|---|---|---|---|
| Test samples | 87 | 192 | 279 |

No indeterminate class permitted:

| Data and method | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|
| Model | 83/87, 95.4 | 187/192, 97.4 | 270/279, 96.8 |
| Confidence interval | 81/87, 93.1 | 188/192, 97.9 | 269/279, 96.4 |

Indeterminate class permitted (indeterminate samples excluded from the counts):

| Data and method | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|
| Model | 80/81, 98.8 | 163/163, 100 | 243/244, 99.6 |
| Confidence interval | 80/81, 98.8 | 158/163, 96.9 | 238/244, 97.5 |
| Pisanic et al. (2020) | 81/81, 100 | 125/126, 99.2 | 206/207, 99.5 |
| Patrone et al. (2022) | 81/82, 98.8 | 157/158, 99.4 | 238/240, 99.2 |