Jody Phelan, Taane G Clark, Wouter Deelder, Gary Napier, Susana Campino, Luigi Palla.
Abstract
BACKGROUND: Drug-resistant Mycobacterium tuberculosis is complicating the effective treatment and control of tuberculosis disease (TB). With the adoption of whole genome sequencing as a diagnostic tool, machine learning approaches are being employed to predict M. tuberculosis resistance and identify the underlying genetic mutations. However, machine learning approaches can overfit and fail to identify causal mutations if they are applied out of the box and not adapted to the disease-specific context. We introduce a machine learning approach customized to the TB setting, which extracts a library of genomic variants that recur across individual studies to improve genotypic profiling.
Keywords: Cycloserine; Drug resistance; Ethionamide; Machine learning; Mycobacterium tuberculosis; PAS
Year: 2022 PMID: 35016609 PMCID: PMC8753810 DOI: 10.1186/s12864-022-08291-4
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
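The abstract describes building a library from variants that recur across individual studies. The sketch below is only one simple way to picture that idea, not the authors' implementation: the function name `recurring_variants`, the `min_studies` threshold, and the variant labels are all hypothetical placeholders.

```python
from collections import Counter

def recurring_variants(per_study_variants, min_studies=2):
    """Return variants selected in at least `min_studies` studies.

    `per_study_variants` maps a study name to the variants flagged in that
    study. Both the structure and the threshold are illustrative assumptions.
    """
    counts = Counter()
    for variants in per_study_variants.values():
        # Use a set so a variant counts at most once per study.
        counts.update(set(variants))
    return sorted(v for v, n in counts.items() if n >= min_studies)

# Hypothetical placeholder data, not taken from the paper.
example = {
    "study_1": ["geneA_X1Y", "geneB_Z2W"],
    "study_2": ["geneA_X1Y"],
    "study_3": ["geneA_X1Y", "geneB_Z2W", "geneC_Q3R"],
}
library = recurring_variants(example, min_studies=2)
```

Raising `min_studies` makes the library stricter: variants seen in only one study, which are more likely to reflect overfitting to that study, are dropped first.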
Predictive performance across algorithms
| Drug | Total tests | % resistance | TB-Profiler Sens | TB-Profiler Spec | TB-Profiler Acc | TB-Profiler AUC | Treesist-TB Sens | Treesist-TB Spec | Treesist-TB Acc | Treesist-TB AUC |
|---|---|---|---|---|---|---|---|---|---|---|
| INH | 1835 | 16.2 | 86.2 | 98.4 | 96.5 | 92.3 | 84.2 | 99.2 | 96.8 | 91.7 |
| RIF | 2045 | 8.1 | 90.3 | 98.2 | 97.6 | 94.2 | 86.1 | 98.5 | 97.5 | 92.3 |
| EMB | 1999 | 3.5 | 71.4 | 96.7 | 95.8 | 84.1 | 57.1 | 98.2 | 96.8 | 77.7 |
| PAS | 1114 | 8.8 | 38.8 | 95.7 | 90.7 | 67.2 | 64.3 | 90.6 | 88.2 | 77.4 |
| CYS | 833 | 18.0 | 30.7 | 95.2 | 83.6 | 62.9 | 45.3 | 93.7 | 85.0 | 69.5 |
| ETH | 2118 | 32.2 | 71.1 | 78.6 | 76.2 | 74.8 | 72.1 | 75.8 | 74.6 | 73.9 |
| INH | 1835 | 16.2 | 85.6 | 100 | 97.7 | 92.9 | 80.2 | 99.2 | 96.1 | 89.8 |
| RIF | 2045 | 8.1 | 81.2 | 100 | 98.5 | 91.5 | 87.3 | 99.8 | 98.8 | 93.6 |
| EMB | 1999 | 3.5 | 32.9 | 99.7 | 97.3 | 82.9 | 34.3 | 99.5 | 97.2 | 83 |
| PAS | 1114 | 8.8 | 64.3 | 100 | 96.9 | 85.5 | 50 | 97.8 | 93.6 | 74.1 |
| CYS | 833 | 18.0 | 33.3 | 99.4 | 87.5 | 67.3 | 35.3 | 98 | 86.7 | 66.7 |
| ETH | 2118 | 32.2 | 48.8 | 94.3 | 79.7 | 77.5 | 49.6 | 92.5 | 78.7 | 76.2 |
INH Isoniazid, RIF Rifampicin, PAS Para-aminosalicylic acid, CYS Cycloserine, ETH Ethionamide, EMB Ethambutol, Sens Sensitivity, Spec Specificity, Acc Accuracy, AUC Area under the ROC curve
a Default application of Treesist-TB
b Application of Treesist-TB with a single combined study dataset
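For readers less familiar with the Sens, Spec and Acc columns above, the following minimal sketch shows how they are conventionally computed from a confusion matrix, and how accuracy follows from sensitivity, specificity and the resistance prevalence. The function names and the example counts are illustrative assumptions, not taken from the paper. AUC is omitted because it needs per-isolate scores rather than counts.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    Here a 'positive' is a phenotypically resistant isolate.
    """
    return {
        "sensitivity": tp / (tp + fn),   # resistant isolates correctly called
        "specificity": tn / (tn + fp),   # susceptible isolates correctly called
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)

# Hypothetical counts, for illustration only.
m = confusion_metrics(tp=80, fp=20, tn=180, fn=20)

# First INH row of the table: Sens 86.2, Spec 98.4, 16.2% resistance.
inh_acc = accuracy_from_rates(0.862, 0.984, 0.162)
```

With the INH figures, `inh_acc` comes out at roughly 0.964, close to the 96.5 reported in the same row; the small gap is expected because the inputs are rounded. The same identity explains why accuracy stays high for EMB despite modest sensitivity: with only 3.5% resistance, specificity dominates the weighted mean.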
The Treesist-TB inferred variants
| Drug | Gene | # variants in the 32 k dataset* | Treesist-TB Mutations** |
|---|---|---|---|
| RIF | | 757 | |
| RIF | | 700 | |
| INH | | 31 | |
| INH | | 26 | |
| INH | | 648 | |
| EMB | | 743 | |
| EMB | | 762 | |
| PAS | | 191 | |
| PAS | | 262 | Q153G, Q153A, S150G, |
| PAS | | 148 | |
| CYS | | 239 | |
| CYS | | 700 | |
| ETN | | 494 | |
| ETN | | 26 | |
| ETN | | 764 | |
| ETN | | 108 | I21T, |
| ETN | | 250 | |
* 32 k M. tuberculosis isolates [18]
** Bolded if not in TB-Profiler in https://github.com/jodyphelan/tbdb/blob/master/tbdb.csv; * stop codon
INH Isoniazid, RIF Rifampicin, PAS Para-aminosalicylic acid, CYS Cycloserine, ETN Ethionamide, EMB Ethambutol
** Mutations underlined if they are in > 5% of MDR-TB or XDR-TB strains in the 32 k M. tuberculosis isolates