Tingting Yang, Mingyu Gan, Qingyun Liu, Wenying Liang, Qiqin Tang, Geyang Luo, Tianyu Zuo, Yongchao Guo, Chuangyue Hong, Qibing Li, Weiguo Tan, Qian Gao.
Abstract
Whole genome sequencing (WGS) can provide insight into drug resistance, transmission chains and the identification of outbreaks, but data analysis remains an obstacle to its routine clinical use. Although several drug-resistance prediction tools have appeared, until now no website has integrated drug-resistance prediction with strain genetic relationships and species identification of nontuberculous mycobacteria (NTM). We have established a free, function-rich, user-friendly online platform for Mycobacterium tuberculosis (MTB) WGS data analysis (SAM-TB, http://samtb.szmbzx.com) that integrates drug-resistance prediction for 17 antituberculosis drugs, detection of variants, analysis of genetic relationships and NTM species identification. The accuracy of SAM-TB in predicting drug resistance was assessed using 3177 sequenced clinical isolates with results of phenotypic drug-susceptibility tests (pDST). Compared to pDST, the sensitivity of SAM-TB for detecting multidrug-resistant tuberculosis was 93.9% [95% confidence interval (CI) 92.6-95.1%] with specificity of 96.2% (95% CI 95.2-97.1%). SAM-TB also analyzes the genetic relationships between multiple strains by reconstructing phylogenetic trees and calculating pairwise single nucleotide polymorphism (SNP) distances to identify genomic clusters. The incorporated mlstverse software identifies NTM species with an accuracy of 98.2% and Kraken2 software can detect mixed MTB and NTM samples. SAM-TB also has the capacity to share both sequence data and analysis between users. SAM-TB is a multifunctional integrated website that uses WGS raw data to accurately predict antituberculosis drug-resistance profiles, analyze genetic relationships between multiple strains and identify NTM species and mixed samples containing both NTM and MTB. SAM-TB is a useful tool for guiding both treatment and epidemiological investigation.
Keywords: drug-resistant tuberculosis; drug-susceptibility testing; nontuberculous mycobacteria; transmission; whole genome sequencing
Year: 2022 PMID: 35211720 PMCID: PMC8921607 DOI: 10.1093/bib/bbac030
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1. SAM-TB analysis pipelines. The SAM-TB platform includes three analysis pipelines: single sample variant analysis (light brown background), phylogenetic analysis (green background), and pairwise SNP distance (blue background). The single sample variant analysis is composed of four modules: (1) read quality analysis; (2) MTB/NTM species identification; (3) variant detection and annotation; (4) molecular drug-susceptibility test (mDST).
The functions of the SAM-TB platform
| Function category | Function | Detail |
|---|---|---|
| Data analysis | Detect genome-wide variants | SAM-TB performs variant detection by mapping the sequencing reads to the reference genome H37Rv (NC_000962.3) and provides annotation for all identified variants. |
| Data analysis | Predict drug-resistance | SAM-TB performs mDST analysis for 17 anti-tuberculosis drugs and provides the confidence level of the mutations for predicting resistance. It can also predict susceptibility to the four first-line drugs. |
| Data analysis | Analyze the genetic relationship between strains | The integration of the phylogenetic tree and the pairwise SNP distances can be used to analyze the genetic relationship between strains, providing the basis for identifying clusters and inferring recent transmission. |
| Data analysis | Identify NTM species/complex | SAM-TB integrates mlstverse software, which is able to identify 175 NTM species. SAM-TB will also identify samples containing both NTM and MTB. |
| Data sharing | Share sequencing data; share analysis results | SAM-TB users can share sequencing data and analysis results with each other. The shared data and analysis can be viewed and used for further analysis. |
The accuracy of SAM-TB for predicting resistance or susceptibility to the four first-line drugs
| Drug | Resistant phenotype: R | Resistant phenotype: S | Resistant phenotype: N.D. | Resistant phenotype: Total | Resistant phenotype: NGP (%) | Susceptible phenotype: R | Susceptible phenotype: S | Susceptible phenotype: N.D. | Susceptible phenotype: Total | Susceptible phenotype: NGP (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | Sensitivity all† (%) | Specificity all† (%) | NGP (%) | RP (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Isoniazid | 1536 | 44 | 38 | 1618 | 2.3 | 21 | 1367 | 66 | 1454 | 4.5 | 97.2 | 98.5 | 98.7 | 96.9 | 94.9 | 94.0 | 3.4 | 52.7 |
| Rifampin | 1508 | 37 | 14 | 1559 | 0.9 | 46 | 1448 | 66 | 1560 | 4.2 | 97.6 | 96.9 | 97.0 | 97.5 | 96.7 | 92.8 | 2.6 | 50.0 |
| Ethambutol | 907 | 47 | 27 | 981 | 2.8 | 206 | 1546 | 216 | 1968 | 11.0 | 95.1 | 88.2 | 81.5 | 97.0 | 92.5 | 78.6 | 8.2 | 33.3 |
| Pyrazinamide | 612 | 34 | 39 | 685 | 5.7 | 95 | 1719 | 49 | 1863 | 2.6 | 94.7 | 94.8 | 86.6 | 98.1 | 89.3 | 92.3 | 3.5 | 26.9 |
Note: NGP, no genotypic prediction; NPV, negative predictive value; PPV, positive predictive value; and RP, resistance prevalence. Unless otherwise indicated, percentages are based on genotypic predictions of resistant (R) or susceptible (S) only (i.e. excluding isolates with mutations of unknown resistance association and genotypic predictions that failed because of missing data around a genomic resistance locus [N.D.]).
†Percentages were calculated with the total number of isolates (R, S and N.D.) as the denominator.
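As a worked check of how the definitions in the note map onto the counts, the following minimal sketch recomputes the metrics for the isoniazid row. The function and argument names are illustrative assumptions and are not part of SAM-TB; the formulas are the standard ones implied by the note (N.D. calls excluded from sensitivity, specificity, PPV and NPV, and included in the "all" variants).

```python
def first_line_metrics(res_r, res_s, res_nd, sus_r, sus_s, sus_nd):
    """Return accuracy metrics in percent, rounded to one decimal place.

    res_* : genotypic calls (R, S, N.D.) among phenotypically resistant isolates
    sus_* : genotypic calls (R, S, N.D.) among phenotypically susceptible isolates
    """
    res_total = res_r + res_s + res_nd
    sus_total = sus_r + sus_s + sus_nd

    def pct(numerator, denominator):
        return round(100 * numerator / denominator, 1)

    return {
        # N.D. calls are excluded from the denominators of these four metrics.
        "sensitivity": pct(res_r, res_r + res_s),
        "specificity": pct(sus_s, sus_r + sus_s),
        "ppv": pct(res_r, res_r + sus_r),
        "npv": pct(sus_s, sus_s + res_s),
        # The "all" variants use every isolate (R, S and N.D.) as denominator.
        "sensitivity_all": pct(res_r, res_total),
        "specificity_all": pct(sus_s, sus_total),
        # Share of isolates with no genotypic prediction, and resistance prevalence.
        "ngp": pct(res_nd + sus_nd, res_total + sus_total),
        "rp": pct(res_total, res_total + sus_total),
    }


# Counts taken from the isoniazid row of the table above.
isoniazid = first_line_metrics(1536, 44, 38, 21, 1367, 66)
print(isoniazid)
```

With the isoniazid counts, the computed values agree with the row as published (for example, sensitivity 97.2% and specificity 98.5%).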
Accuracy of SAM-TB for predicting resistance to other drugs
| Drug | Resistant phenotype: R | Resistant phenotype: N.D. | Resistant phenotype: Total | Susceptible phenotype: R | Susceptible phenotype: N.D. | Susceptible phenotype: Total | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | RP (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Streptomycin | 514 | 62 | 576 | 43 | 118 | 161 | 89.2 | 73.3 | 92.3 | 65.6 | 78.2 |
| Ethionamide | 297 | 44 | 341 | 153 | 162 | 315 | 87.1 | 51.4 | 66.0 | 78.6 | 52.0 |
| Amikacin | 216 | 48 | 264 | 5 | 358 | 363 | 81.8 | 98.6 | 97.7 | 88.2 | 42.1 |
| Capreomycin | 264 | 69 | 333 | 20 | 350 | 370 | 79.3 | 94.6 | 93.0 | 83.5 | 47.4 |
| Kanamycin | 300 | 50 | 350 | 11 | 309 | 320 | 85.7 | 96.6 | 96.5 | 86.1 | 52.2 |
| Moxifloxacin | 101 | 19 | 120 | 71 | 184 | 255 | 84.2 | 72.2 | 58.7 | 90.6 | 32.0 |
| Ofloxacin | 393 | 65 | 458 | 24 | 284 | 308 | 85.8 | 92.2 | 94.2 | 81.4 | 59.8 |
| Para-aminosalicylic acid | 26 | 57 | 83 | 19 | 395 | 414 | 31.3 | 95.4 | 57.8 | 87.4 | 16.7 |
| Cycloserine | 43 | 108 | 151 | 16 | 317 | 333 | 28.5 | 95.2 | 72.9 | 74.6 | 31.2 |
Note: NPV, negative predictive value; PPV, positive predictive value; RP, resistance prevalence. If drug-resistance mutation is detected in a sample, the sample is designated as resistant (R) to the drug; otherwise, the prediction result is ‘No Resistance Mutations Detected (N.D.)’.
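For these drugs there are only two genotypic outcomes (R or N.D.), so the metrics appear to follow from treating N.D. as the negative call. The sketch below works through the amikacin row under that assumption; the function and argument names are illustrative and not taken from SAM-TB.

```python
def other_drug_metrics(res_r, res_nd, sus_r, sus_nd):
    """Return accuracy metrics in percent when N.D. is counted as the negative call.

    res_* : genotypic calls (R, N.D.) among phenotypically resistant isolates
    sus_* : genotypic calls (R, N.D.) among phenotypically susceptible isolates
    """
    res_total = res_r + res_nd
    sus_total = sus_r + sus_nd

    def pct(numerator, denominator):
        return round(100 * numerator / denominator, 1)

    return {
        "sensitivity": pct(res_r, res_total),   # resistant isolates called R
        "specificity": pct(sus_nd, sus_total),  # susceptible isolates called N.D.
        "ppv": pct(res_r, res_r + sus_r),       # R calls that are truly resistant
        "npv": pct(sus_nd, sus_nd + res_nd),    # N.D. calls that are truly susceptible
        "rp": pct(res_total, res_total + sus_total),
    }


# Counts taken from the amikacin row of the table above.
amikacin = other_drug_metrics(216, 48, 5, 358)
print(amikacin)
```

Under this reading, the amikacin counts reproduce the published row (sensitivity 81.8%, specificity 98.6%).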
Figure 2. Inference of transmission clusters and annotation of drug-resistance mutations acquired during transmission. (A) The schematic diagram shows the inference of recent transmission clusters based on the results of pairwise SNP distance and phylogenetic analysis. The upper left is a schematic diagram of the phylogenetic tree, with different colors indicating different lineages. To the right of this is a schematic diagram of the SNP distance between strain pairs whose distance is less than a given threshold. On the lower tree the red branches indicate genomic clusters and the red stars indicate clustered strains (SNP distance threshold ≤12). (B) The diagram shows the evolution of drug-resistance during transmission by annotating the resistance mutations on the phylogenetic tree. The colors indicate mutations conferring resistance to different drugs. INH, isoniazid; RIF, rifampicin; EMB, ethambutol; PZA, pyrazinamide; SM, streptomycin; FQ, fluoroquinolone.
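The idea of grouping strains by a pairwise SNP distance threshold can be illustrated with a short sketch. This is not SAM-TB's implementation: the source does not state the grouping rule, so the code assumes single-linkage clustering (any two strains within the threshold are joined, and clusters are the resulting connected components). The strain names and distances are hypothetical toy data; only the threshold of 12 SNPs comes from the caption.

```python
def genomic_clusters(strains, pair_distances, threshold=12):
    """Group strains into clusters linked by SNP distance <= threshold.

    strains        : iterable of strain names
    pair_distances : dict mapping (strain_a, strain_b) -> SNP distance
    Returns a sorted list of clusters (each a sorted list), singletons omitted.
    """
    # Union-find: every strain starts as its own cluster representative.
    parent = {s: s for s in strains}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    # Join the two clusters whenever a pair is within the threshold.
    for (a, b), dist in pair_distances.items():
        if dist <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in strains:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical toy data: A-B-C chain together, D-E sit exactly at the threshold.
toy_strains = ["A", "B", "C", "D", "E", "F"]
toy_distances = {
    ("A", "B"): 3, ("B", "C"): 10, ("A", "C"): 13,
    ("D", "E"): 12, ("A", "D"): 40, ("C", "F"): 57, ("E", "F"): 31,
}
clusters = genomic_clusters(toy_strains, toy_distances)
print(clusters)
```

Note that under single linkage A and C fall in the same cluster even though their direct distance (13) exceeds the threshold, because both are within 12 SNPs of B. A stricter rule (for example, requiring every pair in a cluster to be within the threshold) would give different groupings.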
Figure 3. Prediction of mycobacterial species with SAM-TB. Rows represent the species/subspecies reported in NCBI and columns represent the species/subspecies identified by the mlstverse software in SAM-TB. The black boxes indicate the different species in the Mycobacterium avium or Mycobacterium tuberculosis complexes and the blue boxes indicate the different subspecies of Mycobacterium avium, Mycobacterium fortuitum and Mycobacterium abscessus. The red font indicates two strains whose species were unspecified by NCBI but identified with mlstverse.