Dari F Da, Ruth McCabe, Bernard M Somé, Pedro M Esperança, Katarzyna A Sala, Josua Blight, Andrew M Blagborough, Floyd Dowell, Serge R Yerbanga, Thierry Lefèvre, Karine Mouline, Roch K Dabiré, Thomas S Churcher.
Abstract
There is an urgent need for high-throughput, affordable methods of detecting pathogens inside insect vectors to facilitate surveillance. Near-infrared spectroscopy (NIRS) has shown promise for detecting arboviruses and malaria in the laboratory but has not been evaluated in field conditions. Here we investigate the ability of NIRS to identify Plasmodium falciparum in Anopheles coluzzii mosquitoes. NIRS models trained on laboratory-reared mosquitoes infected with wild malaria parasites can detect the parasite in comparable mosquitoes with moderate accuracy, but fail to detect oocysts or sporozoites in naturally infected field-caught mosquitoes. Models trained on field mosquitoes were unable to predict the infection status of other field mosquitoes. Restricting analyses to mosquitoes of uninfectious and highly infectious status did improve predictions, suggesting that sensitivity and specificity may be better in mosquitoes with higher numbers of parasites. Detection of infection appears restricted to homogeneous groups of mosquitoes, diminishing the utility of NIRS for detecting malaria within mosquitoes.
Year: 2021 PMID: 33986416 PMCID: PMC8119679 DOI: 10.1038/s41598-021-89715-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Table 1. The number of laboratory and field mosquitoes analyzed. Day columns give the days since feeding for laboratory-reared mosquitoes; Longo and Klesso are the collection villages for wild-caught mosquitoes.
| Mosquito status | Day 3 | Day 5 | Day 7 | Day 9 | Day 11 | Day 13 | Day 15 | Day 17 | Day 19 | Day 21 | Laboratory total | Longo | Klesso | Wild-caught total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Inactivated blood | 100 | 100 | 140 | 100 | 100 | 100 | 109 | 90 | 45 | 20 | 904 | NA | NA | NA |
| Uninfected | 147 | 110 | 106 | 75 | 73 | 70 | 58 | 40 | 39 | 1 | 719 | 2445 (a) | 80 | 2525 |
| Infected (oocysts) | 45 | 88 | 91 | 112 | 104 | 102 | 101 | 90 | 66 | 30 | 829 | 387 | 25 | 412 |
| Infectious (b) (sporozoites) | 0 | 0 | 0 | 51 | 92 | 102 | 101 | 90 | 66 | 30 | 532 | 302 | 21 | 323 |
| Total | 292 | 298 | 337 | 287 | 277 | 272 | 268 | 220 | 150 | 51 | 2452 | 2832 | 105 | 2937 |
All mosquitoes were Anopheles coluzzii infected with wild strains of Plasmodium falciparum.
(a) Blood source unknown, as these wild-collected mosquitoes had potentially been exposed to infection.
(b) All infectious mosquitoes were also classified as infected (whether or not oocysts were visible).
Figure 1. The ability of NIRS to predict which laboratory-reared mosquitoes are infectious with wild parasites. All models were trained on sporozoite-positive and sporozoite-negative laboratory-reared mosquitoes using all the data presented in Table 1. (A) Receiver operating characteristic (ROC) curve illustrating the diagnostic ability of the best-fit model. Overall performance is given by the average area under the ROC curve (AUC). The figure illustrates the false positive and true positive rates achievable for different classification probability thresholds. A theoretical perfect diagnostic would be in the top left corner. The average ROC curve is shown by the solid line, with boxplots showing the variability across 50 randomizations of the training, validation and testing datasets (the horizontal black line shows the median, whilst the 25th/75th, 15th/85th and 5th/95th percentiles are shown by box edges, inner and outer whiskers, respectively). (B) Coefficient functions for the best-fit model for each of the 50 dataset randomizations (grey lines) and the overall average (black line). (C) Histogram showing the predicted status of tested mosquitoes that were infectious (light blue bars) or uninfectious (green bars). The vertical solid black line indicates the best threshold for differentiating between infectious and uninfectious mosquitoes. Darker blue bars indicate where the two distributions overlap and show the mosquitoes that were misclassified: false negatives are shown to the left of the optimal classification threshold line and false positives to the right. The inset shows the confusion matrix illustrating the different error rates: true negative rate (tnr, specificity); false negative rate (fnr); false positive rate (fpr); and true positive rate (tpr, sensitivity).
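The quantities named in this caption (ROC curve, AUC, classification threshold and the four confusion-matrix rates) can be illustrated with a minimal Python sketch. The scores and labels below are invented toy values, not study data, and picking the "best" threshold by maximizing TPR minus FPR (Youden's J) is an assumption: the caption does not state how the optimal threshold was chosen.

```python
def roc_points(scores, labels):
    """Return the thresholds (strictest first) and the matching (fpr, tpr) points.

    A mosquito is predicted infectious when its score is >= the threshold,
    so false negatives fall to the left of the threshold, as in panel (C).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    thresholds = sorted(set(scores), reverse=True)
    points = [(0.0, 0.0)]  # origin: nothing is predicted infectious
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return thresholds, points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def best_threshold(scores, labels):
    """ASSUMPTION: 'best' means the threshold maximizing tpr - fpr (Youden's J)."""
    thresholds, points = roc_points(scores, labels)
    # points[0] is the origin and has no threshold, so skip it.
    pairs = zip(thresholds, points[1:])
    return max(pairs, key=lambda pair: pair[1][1] - pair[1][0])[0]


def confusion_rates(scores, labels, threshold):
    """The four rates shown in the inset confusion matrix."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return {"tpr": tp / (tp + fn), "fnr": fn / (tp + fn),
            "fpr": fp / (fp + tn), "tnr": tn / (fp + tn)}


# Hypothetical predicted scores and true status (1 = infectious).
toy_scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 1, 0, 1, 1]

_, toy_points = roc_points(toy_scores, toy_labels)
toy_auc = auc(toy_points)                              # 0.875
toy_cut = best_threshold(toy_scores, toy_labels)       # 0.4
toy_rates = confusion_rates(toy_scores, toy_labels, toy_cut)
```

On the toy data, one uninfectious mosquito scores above the chosen threshold, so the sketch reports a sensitivity of 1.0 and a specificity of 0.75.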
Table 2. Summary of overall accuracy of the different NIRS models for predicting presence of sporozoites.
| Model trained on | Within-sample: best model (Q) | Within-sample: accuracy (std error) | Within-sample: TPR | Within-sample: TNR | Model predicting | Out-of-sample, best within-sample model: accuracy (std error) | Out-of-sample, best within-sample model: TPR | Out-of-sample, best within-sample model: TNR | Out-of-sample, best out-of-sample model: best model (Q) | Out-of-sample, best out-of-sample model: accuracy (std error) | Out-of-sample, best out-of-sample model: TPR | Out-of-sample, best out-of-sample model: FPR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Laboratory mosquitoes | GLM (11) | 73% (0.02) | 74% | 72% | Field mosquitoes (all) | 50% (0.01) | 56% | 44% | fsGLM (4) | 52% (0.007) | 52% | 51% |
| Field mosquitoes (all) | fsGLM (2) | 51% (0.04) | 65% | 37% | NA | NA | NA | NA | NA | NA | NA | NA |
| Field mosquitoes (V1) | pGLM (2) | 51% (0.04) | 57% | 46% | Field mosquitoes (V2) | 51% (0.05) | 58% | 45% | fpGLM (2) | 52% (0.05) | 60% | 43% |
| Field mosquitoes (V2) | fpGLM (5) | 47% (0.1) | 28% | 64% | Field mosquitoes (V1) | 51% (0.02) | 30% | 71% | fspGLM (5) | 51% (0.02) | 39% | 63% |
Models were trained on either laboratory or field mosquitoes, with field mosquitoes either grouped together (all) or treated separately for the villages of Longo (V1) and Klesso (V2). The number of PLS components (Q) is presented alongside overall model accuracy (the percentage of mosquitoes correctly classified), the true positive rate (TPR), the true negative rate (TNR) and the false positive rate (FPR). These are shown for within-sample accuracy (where the same group of mosquitoes was used to train, validate and test the model) and for out-of-sample accuracy (where a different group of wild-caught mosquitoes was used for testing). For within-sample accuracy, different individual mosquitoes were used to train, validate and test the model. Out-of-sample evaluation provides a more robust test because different groups (i.e. laboratory vs field, or different field locations) were used to assess accuracy. Two different models are presented for out-of-sample accuracy: the model that was most accurate within-sample, and the model that was most accurate out-of-sample (the latter tend to be more generalizable and have lower numbers of components, denoted Q).
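The difference between within-sample and out-of-sample accuracy described above can be sketched in Python. Everything here is a hypothetical stand-in: the one-feature "midpoint" classifier replaces the PLS-based GLMs, the 20% test fraction is an assumption (the split proportions are not given), the validation split is omitted because the toy model has nothing to tune, and the data are invented. Only the 50 randomizations follow the figure captions.

```python
import random
import statistics


def fit_midpoint(xs, ys):
    """Toy model (NOT the study's GLM): threshold halfway between class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2


def predict(threshold, xs):
    return [1 if x >= threshold else 0 for x in xs]


def accuracy(predicted, actual):
    """Fraction of mosquitoes correctly classified."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def within_sample_accuracy(xs, ys, n_repeats=50, test_fraction=0.2, seed=0):
    """Hold out different individuals from the SAME group, many times over."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        n_test = max(1, int(test_fraction * len(idx)))
        test, train = idx[:n_test], idx[n_test:]
        model = fit_midpoint([xs[i] for i in train], [ys[i] for i in train])
        guesses = predict(model, [xs[i] for i in test])
        results.append(accuracy(guesses, [ys[i] for i in test]))
    return statistics.mean(results)


def out_of_sample_accuracy(train_xs, train_ys, other_xs, other_ys):
    """Train on one group and test on a DIFFERENT group."""
    model = fit_midpoint(train_xs, train_ys)
    return accuracy(predict(model, other_xs), other_ys)


# Invented "laboratory" group: the two classes are well separated.
lab_xs = [i / 10 for i in range(10)] + [5 + i / 10 for i in range(10)]
lab_ys = [0] * 10 + [1] * 10
# Invented "field" group: the feature is shifted, so the lab threshold
# no longer separates the classes.
field_xs = [4.0, 4.2, 4.4, 4.6, 5.0, 5.2, 5.4, 5.6]
field_ys = [0, 0, 0, 0, 1, 1, 1, 1]

within = within_sample_accuracy(lab_xs, lab_ys)                     # 1.0
outside = out_of_sample_accuracy(lab_xs, lab_ys, field_xs, field_ys)  # 0.5
```

In this toy setting the model is perfect on held-out individuals from its own group but at chance on the other group, which is the kind of gap the within-sample and out-of-sample columns are designed to expose.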
Figure 2. The ability of NIRS to predict field-caught mosquitoes with high numbers of sporozoites. All models were trained using mosquitoes infected in the wild that were either sporozoite-positive with > 20 sporozoites per Anopheles (gene copy number of 20, as defined by qPCR) or sporozoite-negative (Table 1). (A) The receiver operating characteristic (ROC) curve for the best-fit model, demonstrating how the false positive and true positive rates vary for different classification probability thresholds. Overall performance is given by the average area under the ROC curve (AUC). A perfect model with 100% sensitivity and specificity would be in the top left corner. The solid line shows the average ROC curve, with boxplots showing the variability across 50 randomizations of the training, validation and testing datasets (box edges, inner whiskers and outer whiskers show the 25th/75th, 15th/85th and 5th/95th percentiles, respectively, and the black line inside the box shows the median). (B) Coefficient functions for the best-fit model for each of the 50 dataset randomizations (grey lines) and the corresponding average (black line). (C) Histogram of the estimated linear predictor for the test mosquitoes. The green and light blue bars indicate the true class, showing the model's ability to separate the two groups of mosquitoes. The vertical black line indicates the best threshold for differentiating infectious and uninfectious mosquitoes. The darker blue shaded area where the two distributions overlap corresponds to mosquitoes that were misclassified: false negatives lie to the left and false positives to the right of the optimal classification threshold. The inset shows the confusion matrix reporting the different error rates: tnr, true negative rate (specificity); fnr, false negative rate; fpr, false positive rate; and tpr, true positive rate (sensitivity).
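The subset used for these models keeps only sporozoite-negative mosquitoes and those above the 20-copy cut-off, dropping low-level infections. A minimal Python sketch of that filter follows. The record layout, the `copies` field name and the treatment of a count of 0 as "sporozoite negative" are assumptions made for illustration; only the cut-off of 20 comes from the caption.

```python
def select_extremes(mosquitoes, min_copies=20):
    """Keep sporozoite-negative mosquitoes and those above `min_copies`.

    ASSUMPTION: each record is a dict with a hypothetical 'copies' field
    holding the qPCR gene copy number, and 0 means sporozoite negative.
    """
    kept = []
    for m in mosquitoes:
        if m["copies"] == 0:
            kept.append({**m, "label": 0})   # uninfectious
        elif m["copies"] > min_copies:
            kept.append({**m, "label": 1})   # highly infectious
        # 0 < copies <= min_copies: low-level infection, excluded
    return kept


# Invented example records, not study data.
sample = [
    {"id": "a", "copies": 0},
    {"id": "b", "copies": 5},
    {"id": "c", "copies": 20},
    {"id": "d", "copies": 21},
    {"id": "e", "copies": 300},
]
subset = select_extremes(sample)
```

Here mosquitoes `b` and `c` are dropped, leaving one negative and two highly infectious records for training and testing.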