Supriya D Mehta1, Dan Zhao1, Stefan J Green2, Walter Agingu3, Fredrick Otieno3, Runa Bhaumik1, Dulal Bhaumik1, Robert C Bailey1.
Abstract
Background: We determined the predictive accuracy of penile bacteria for incident bacterial vaginosis (BV) in female sex partners. In this prospective cohort, we enrolled Kenyan men aged 18-35 and their female sex partners aged 16 and older. We assessed BV at baseline and at 1, 6, and 12 months. Incident BV was defined as a Nugent score of 7-10 at a follow-up visit, following a Nugent score of 0-6 at baseline. Amplification of the V3-V4 region of the bacterial 16S rRNA gene was performed on meatal and glans/coronal sulcus swab samples. A majority vote classifier combined the decisions of three machine learning classification algorithms (Random Forest, Support Vector Machine, K Nearest Neighbor). We report the estimated cross-validation predictive accuracy for incident BV based on baseline penile taxa.
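As an illustration of the majority vote step, the following minimal sketch combines the labels predicted by three classifiers for each sample. It assumes hard (label) voting; the abstract does not state whether labels or probabilities were combined. The function name `majority_vote` and the prediction vectors are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(*label_vectors):
    """Combine per-sample labels from several classifiers by majority vote."""
    if not label_vectors:
        raise ValueError("need predictions from at least one classifier")
    n_samples = len(label_vectors[0])
    if any(len(v) != n_samples for v in label_vectors):
        raise ValueError("all prediction vectors must have the same length")

    combined = []
    for votes in zip(*label_vectors):
        # most_common(1) returns the label with the highest count. With three
        # classifiers and two classes, a tie cannot occur.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions (1 = incident BV, 0 = persistently negative).
rf_pred = [1, 0, 1, 0, 1]
svm_pred = [1, 1, 0, 0, 1]
knn_pred = [0, 1, 1, 0, 1]

voted = majority_vote(rf_pred, svm_pred, knn_pred)
print(voted)
```

With an odd number of binary classifiers, every sample receives an unambiguous label, which may be one reason to combine exactly three.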
Keywords: Kenya; bacterial vaginosis; circumcision; ensemble voting; machine learning; penile microbiome; penile microbiota; synthetic minority oversampling technique
Year: 2020 PMID: 32903746 PMCID: PMC7438843 DOI: 10.3389/fcimb.2020.00433
Source DB: PubMed Journal: Front Cell Infect Microbiol ISSN: 2235-2988 Impact factor: 5.293
Baseline characteristics of couples included in analyses.
| Bacterial vaginosis (BV) status at follow-up | ||||
| Persistent negative | 116 (69.0) | 54 (69.2) | ||
| Incident | 52 (31.0) | 24 (30.8) | ||
| Nugent score at baseline | ||||
| 0–3 | 142 (84.5) | 69 (88.5) | ||
| 4–6 | 26 (15.5) | 9 (11.5) | ||
| Time in months to first incident BV, among women with Nugent score 0–3 at baseline | ||||
| 1 month | 13 (35.1) | 4 (20.0) | ||
| 6 months | 16 (43.2) | 10 (50.0) | ||
| 12 months | 8 (21.6) | 6 (30.0) | ||
| Time in months to first incident BV, among women with Nugent score 4–6 at baseline | ||||
| 1 month | 10 (66.7) | 4 (100) | ||
| 6 months | 4 (26.7) | |||
| 12 months | 1 (6.7) | |||
| Male partner circumcision status | ||||
| Circumcised / Incident BV in female partner | 99 (58.9) | 26 (26.3) | 52 (66.7) | 14 (26.9) |
| Uncircumcised / Incident BV in female partner | 69 (41.1) | 26 (37.3) | 26 (33.3) | 10 (38.5) |
| Median age, years (IQR) | 27 (24–30) | 23 (20–25) | 27.5 (25–31) | 24 (21–26) |
| Number of sex partners past 6 months | ||||
| 1 | 132 (79.0) | 163 (98.8) | 56 (71.8) | 76 (98.7) |
| 2 or more | 35 (21.0) | 2 (1.2) | 22 (28.2) | 1 (1.3) |
| Missing | 1 | 3 | 1 | |
| Condom used at last sex | 28 (16.7) | 28 (16.7) | 14 (18.0) | 14 (18.0) |
The 78 couples in which glans/coronal sulcus sample is available from men are a subset of the 168 couples.
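The incident BV outcome tabulated above follows the definition in the abstract: a Nugent score of 7-10 at a follow-up visit after a score of 0-6 at baseline. The sketch below shows one way that rule, and the time to first incident BV, could be coded. The function name, the dictionary layout, and the choice to raise an error for women with BV at baseline are assumptions made for illustration.

```python
FOLLOW_UP_MONTHS = (1, 6, 12)  # follow-up visit schedule from the abstract


def first_incident_month(baseline_nugent, followup_nugents):
    """Return the month of first incident BV, or None if persistently negative.

    followup_nugents maps follow-up month -> Nugent score (0-10).
    """
    if not 0 <= baseline_nugent <= 10:
        raise ValueError("Nugent scores range from 0 to 10")
    if baseline_nugent >= 7:
        # Assumption: a woman with BV at baseline is not at risk of incident BV.
        raise ValueError("baseline Nugent score must be 0-6")

    for month in sorted(followup_nugents):
        score = followup_nugents[month]
        if not 0 <= score <= 10:
            raise ValueError("Nugent scores range from 0 to 10")
        if score >= 7:
            return month  # first visit meeting the BV threshold
    return None


# Hypothetical participants.
print(first_incident_month(2, {1: 3, 6: 8, 12: 4}))   # BV first seen at month 6
print(first_incident_month(5, {1: 6, 6: 6, 12: 6}))   # never reaches 7
```

The sketch does not handle missed visits; how those were treated is not described in this excerpt.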
Figure 1. Bacterial relative abundance heatmap for the 20 most abundant meatal taxa by incident bacterial vaginosis (BV) status. Observations from 168 samples are sorted by female partner BV status. The top bar reflects observations where the female partner is persistently BV negative [gray] vs. those with incident BV [black].
Presence and mean relative abundance of 20 most abundant meatal taxa by incident Bacterial vaginosis (BV) status.
| Present, n (%): persistent negative (N = 116) | Present, n (%): incident BV (N = 52) | Mean relative abundance (SD): persistent negative | Mean relative abundance (SD): incident BV | |
| 115 (99) | 50 (96) | 18.3 (21.3) | 15.7 (17.9) | |
| 94 (81) | 38 (73) | 8.89 (16.7) | 11.0 (22.2) | |
| 114 (98) | 51 (98) | 9.69 (9.82) | 7.97 (7.86) | |
| 115 (99) | 51 (98) | 8.27 (11.9) | 7.06 (9.84) | |
| 67 (58) | 34 (65) | 6.28 (15.2) | 8.62 (19.6) | |
| 107 (92) | 47 (90) | 5.40 (7.05) | 5.86 (8.10) | |
| 105 (91) | 44 (85) | 5.96 (10.3) | 4.66 (8.54) | |
| 44 (38) | 28 (54) | 4.04 (11.1) | 6.41 (12.4) | |
| 59 (51) | 25 (48) | 4.62 (12.2) | 2.60 (8.45) | |
| 80 (69) | 43 (83) | 3.13 (9.20) | 3.02 (7.01) | |
| 64 (55) | 33 (63) | 2.54 (5.94) | 4.04 (8.35) | |
| 56 (48) | 34 (65) | 1.86 (4.64) | 3.34 (5.65) | |
| 62 (53) | 33 (63) | 1.60 (3.89) | 2.01 (5.35) | |
| 78 (67) | 33 (63) | 1.95 (4.20) | 1.21 (3.07) | |
| 43 (37) | 19 (37) | 1.49 (6.09) | 0.56 (1.66) | |
| 20 (17) | 16 (31) | 1.05 (4.04) | 1.17 (2.89) | |
| 75 (64) | 29 (56) | 1.15 (2.86) | 0.93 (3.42) | |
| 56 (48) | 22 (42) | 1.30 (4.40) | 0.52 (1.22) | |
| 36 (31) | 15 (29) | 0.79 (4.17) | 0.47 (2.19) | |
| 24 (21) | 12 (23) | 0.51 (3.47) | 0.99 (6.45) | |
SD, Standard Deviation; BV, Bacterial vaginosis.
Classification performance for prediction of incident Bacterial vaginosis in women by male partner's meatal microbiome.
| Accuracy | 0.733 | 0.740 | 0.771 | 0.775 |
| Specificity | 0.772 | 0.724 | 0.609 | 0.746 |
| Sensitivity | 0.690 | 0.757 | 0.952 | 0.807 |
| Area under the curve (AUC) | 0.790 | 0.827 | 0.889 | 0.888 |
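For readers who want to see how the four reported metrics are defined, the sketch below computes accuracy, sensitivity, and specificity from confusion-matrix counts, and AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. All counts and scores are hypothetical and are not the study's data; the function names are illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def auc(positive_scores, negative_scores):
    """AUC via pairwise comparison; ties count as half a win."""
    if not positive_scores or not negative_scores:
        raise ValueError("need scores for both classes")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical counts and classifier scores.
metrics = classification_metrics(tp=8, fp=5, tn=15, fn=2)
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
print(metrics, example_auc)
```

The pairwise AUC is quadratic in the number of samples, which is acceptable at the cohort sizes reported here but would be replaced by a rank-based formula on larger datasets.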
Figure 2. AUC distribution generated from voting classification of incident bacterial vaginosis for penile microbiomes. The x-axis represents the area under the curve (AUC) of the predictive accuracy. The y-axis indicates the bacterial dataset, meatal (orange) or glans/coronal sulcus (blue). The box plots show the median (centerline), upper and lower quartiles (box shoulders), and outliers (black dots). The results are based on 1,000 simulations.
Variable importance ranking by classifier method and for voting from meatal samples: top 20 Taxa by voting.
| 3 | 4 | 4 | 1 | |
| 6 | 1 | 13 | 2 | |
| 5 | 12 | 11 | 3 | |
| 1 | 2 | 27 | 4 | |
| 10 | 20 | 7 | 5 | |
| 16 | 5 | 19 | 6 | |
| 11 | 7 | 22 | 7 | |
| 8 | 8 | 26 | 8 | |
| 18 | 15 | 9 | 9 | |
| 17 | 16 | 16 | 10 | |
| 4 | 9 | 42 | 11 | |
| 33 | 10 | 14 | 12 | |
| 7 | 6 | 45 | 13 | |
| 15 | 45 | 2 | 14 | |
| 40 | 3 | 20 | 15 | |
| 30 | 28 | 5 | 16 | |
| Circumcised (vs. uncircumcised) | 9 | 27 | 32 | 17 |
| 2 | 18 | 49 | 18 | |
| 21 | 11 | 43 | 19 | |
| 19 | 23 | 36 | 20 |
This table shows the 20 top-ranked taxa by voting, and their variable importance ranking according to each of the other classifiers. The voting importance ranking is determined by averaging the rank across the three classifiers. We use conditional formatting (Excel) to facilitate reading ranks across the three classifiers, whereby red represents taxa ranked with higher importance and blue represents taxa that are ranked with lower importance.
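The rank-averaging rule can be checked against the table itself. The sketch below takes the three per-classifier ranks from each of the 20 rows, in table order, averages them, and ranks the rows by that average. The column headers and taxon names were not preserved in this copy, so rows are identified by position only. Breaking ties by table order is an assumption; the source does not say how ties were resolved.

```python
# Per-classifier ranks for the 20 table rows, in the order they appear.
classifier_ranks = [
    (3, 4, 4), (6, 1, 13), (5, 12, 11), (1, 2, 27), (10, 20, 7),
    (16, 5, 19), (11, 7, 22), (8, 8, 26), (18, 15, 9), (17, 16, 16),
    (4, 9, 42), (33, 10, 14), (7, 6, 45), (15, 45, 2), (40, 3, 20),
    (30, 28, 5), (9, 27, 32), (2, 18, 49), (21, 11, 43), (19, 23, 36),
]

# Average rank across the three classifiers for each row.
mean_ranks = [sum(r) / len(r) for r in classifier_ranks]

# sorted() is stable, so rows with equal mean rank keep their table order.
order = sorted(range(len(mean_ranks)), key=lambda i: mean_ranks[i])

voting_rank = [0] * len(mean_ranks)
for position, row in enumerate(order, start=1):
    voting_rank[row] = position

print(voting_rank)
```

Under these assumptions the computed ranks match the fourth column of the table. Three pairs of rows have equal mean ranks (rows 6 and 7, 8 and 9, and 15 and 16).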
Figure 3. Venn diagram of the 20 top-ranked meatal taxa predicting bacterial vaginosis, by machine learning classifier. This Venn diagram shows the overlap of the top 20 important variables from each classifier: KNN, K Nearest Neighbor; RF, Random Forest; SVM, Support Vector Machine. The unique variables are listed for each classifier and the common taxa across all three are listed as indicated.