Camilla Hundahl Johnsen, Philip T L C Clausen, Frank M Aarestrup, Ole Lund.
Abstract
Resistance in Mycobacterium tuberculosis is a major obstacle to effective treatment of tuberculosis. Multiple studies have shown promising results for predicting drug resistance in M. tuberculosis from whole genome sequencing (WGS) data; however, these tools are often limited to this single species. We have previously developed a common platform for resistance prediction in multiple species. This platform detects acquired resistance genes (ResFinder) and species-specific chromosomal mutations (PointFinder) associated with resistance, all based on WGS data. In this study, we present a new version of PointFinder together with an updated M. tuberculosis database. PointFinder now includes predictions based on insertions and deletions, and it explicitly reports frameshift mutations and premature stop codons. We found that premature stop codons in four resistance-associated genes (katG, ethA, pncA, and gidB) were over-represented in resistant strains, and prediction performance increased when premature stop codons in these genes were included as resistance markers. Different M. tuberculosis resistance prediction tools vary in performance mostly because of the mutation library used. We found that a well-established mutation library included non-predictive lineage markers, and through forward feature selection we eliminated those from the mutation library. Compared to other similar web-based tools, PointFinder performs equally well. The advantage of PointFinder is that, together with ResFinder, it serves as a common web-based and downloadable platform for resistance detection in multiple species. It is easy to use for clinicians and already widely used in the research community.
Keywords: antimicrobial resistance (AMR); bioinformatics; resistance prediction; tuberculosis; whole genome sequencing
Year: 2019 PMID: 31736907 PMCID: PMC6834686 DOI: 10.3389/fmicb.2019.02464
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Genes and genomic regions of interest for drug resistance in M. tuberculosis.
| Rifampicin | rpoB, rpoC | ||
| Isoniazid | katG, inhA, kasA, ahpC | katG promoter, ahpC promoter, fabG1 promoter | |
| Streptomycin | rpsL, | rrs (16S rRNA) | |
| Ethambutol | embA, embR, | embA promoter | |
| Amikacin | rrs (16S rRNA) | ||
| Capreomycin | tlyA | rrs (16S rRNA) | |
| Ethionamide | ethR, ethA, inhA | fabG1 promoter, | |
| Kanamycin | rrs (16S rRNA) | eis promoter | |
| Pyrazinamide | pncA, panD, rpsA | pncA promoter | |
| Fluoroquinolone | gyrA, gyrB | ||
| Para-aminosalicylic acid | ribD, folC, thyA | ||
| Linezolid | rplC | rrl (23S rRNA) | |
| Bedaquiline | Rv0678 | ||
| Clofazimine | Rv0678 | ||
| d-Cycloserine | |||
| XDR-TB |
FIGURE 1. Flow chart describing the PointFinder workflow. The input sequences are aligned to a database of reference genes. The genetic differences observed in the alignments are compared to a mutation library with annotated phenotypes. Based on this comparison, a resistance phenotype prediction is made.
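The workflow in Figure 1 can be illustrated with a toy example. The sketch below is not PointFinder's code: it assumes the sample gene has already been extracted and lines up with the reference, and the sequences, the mutation name `S2L`, and the `drugX` phenotype are made up for illustration. It translates both sequences with the standard genetic code, reports amino-acid substitutions, premature stop codons and frameshifts, and looks the findings up in a small mutation library.

```python
# Toy sketch of the Figure 1 workflow; all names and sequences are hypothetical.

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def describe_differences(ref_cds, sample_cds):
    """List the differences between a sample gene and its reference."""
    length_diff = len(sample_cds) - len(ref_cds)
    if length_diff % 3 != 0:
        return ["frameshift"]        # indel that breaks the reading frame
    if length_diff != 0:
        return ["in-frame indel"]    # not resolved further in this sketch
    findings = []
    pairs = zip(translate(ref_cds), translate(sample_cds))
    for pos, (ref_aa, aa) in enumerate(pairs, start=1):
        if aa == ref_aa:
            continue
        if aa == "*":
            findings.append(f"premature stop at codon {pos}")
            break                    # everything downstream is truncated
        findings.append(f"{ref_aa}{pos}{aa}")
    return findings


def predict_resistance(findings, library):
    """Map observed differences to phenotypes annotated in the library."""
    return sorted({library[f] for f in findings if f in library})


REFERENCE = "ATGTCGGAATAA"           # translates to M, S, E, stop
LIBRARY = {"S2L": "drugX", "premature stop at codon 3": "drugX"}

for sample in ("ATGTTGGAATAA", "ATGTCGTAATAA", "ATGTCGAATAA"):
    found = describe_differences(REFERENCE, sample)
    print(found, predict_resistance(found, LIBRARY))
```

A real pipeline would take the differences from an alignment rather than from a position-by-position comparison, which is why in-frame indels are only flagged here and not resolved.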
PointFinder predicted resistance compared with phenotypic drug susceptibility testing on the ReSeq data set.
| Drug | Resistant | Susceptible | Specificity | Sensitivity | MCC |
| RMP | 771 | 2710 | 0.965 | 0.887 | 0.848 |
| INH | 1093 | 2420 | 0.929 | 0.903 | 0.819 |
| STM | 728 | 1239 | 0.874 | 0.798 | 0.670 |
| EMB | 466 | 3040 | 0.796 | 0.850 | 0.484 |
| PZA | 325 | 2993 | 0.935 | 0.575 | 0.475 |
| KAN | 76 | 617 | 0.989 | 0.776 | 0.814 |
| FLQ | 240 | 1175 | 0.956 | 0.679 | 0.664 |
| AMK | 107 | 866 | 0.785 | 0.766 | 0.386 |
| ETH | 49 | 175 | 0.943 | 0.469 | 0.481 |
| CAP | 116 | 1024 | 0.971 | 0.474 | 0.512 |
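Performance figures of this kind are derived from a confusion matrix of predicted against phenotypic resistance. The sketch below assumes the standard definitions of sensitivity, specificity and the Matthews correlation coefficient (MCC); the function name is hypothetical and the counts in the usage line are made up, not taken from the study.

```python
from math import sqrt


def performance(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts.

    tp: resistant isolates predicted resistant
    fn: resistant isolates predicted susceptible
    tn: susceptible isolates predicted susceptible
    fp: susceptible isolates predicted resistant
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the matrix is empty;
    # returning 0.0 in that case is a common convention, assumed here.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return sensitivity, specificity, mcc


# Made-up counts: 100 resistant and 200 susceptible isolates.
sens, spec, mcc = performance(tp=90, fp=20, tn=180, fn=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

MCC uses all four cells of the matrix, so it stays informative when resistant isolates are much rarer than susceptible ones, as they are for several drugs here.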
Occurrence of resistance-associated genes with premature stop codons in resistant or susceptible strains in the ReSeq data set.
| RMP | 771 | 2710 | rpoC | 3 | 1 | 0.011 |
| 2 | 0 | 0.008∗ | ||||
| inhA | 1 | 0 | 0.137 | |||
| INH | 1093 | 2420 | kasA | 3 | 0 | 0.01∗ |
| ahpC | 1 | 2 | 0.934 | |||
| 13 | 0 | < 1.0e−5∗ | ||||
| STM | 728 | 1239 | rpsL | 1 | 0 | 0.192 |
| gidB | 42 | 40 | 0.006∗ | |||
| embA | 1 | 4 | 0.658 | |||
| embR | 3 | 4 | 0.021 | |||
| EMB | 466 | 3040 | embC | 1 | 4 | 0.658 |
| embB | 1 | 2 | 0.306 | |||
| ubiA | 1 | 0 | 0.011 | |||
| PZA | 325 | 2993 | 372 | 7 | < 1.0e−5∗ | |
| 6 | 0.148 | |||||
| ETH | 49 | 175 | inhA | 1 | 0 | 0.058 |
| 21 | 24 | < 1.0e−5∗ | ||||
| CAP | 116 | 1024 | 2 | 2 | 0.008∗ |
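The p-values above indicate whether premature stop codons in a gene are over-represented among resistant strains. This extract does not say which test produced them, so the sketch below uses a one-sided Fisher's exact test as an assumed stand-in; the function name is hypothetical. The usage line takes the isoniazid row with 13 resistant and 0 susceptible isolates carrying a stop codon, out of 1093 and 2420 isolates.

```python
from math import comb


def fisher_one_sided(hits_res, n_res, hits_sus, n_sus):
    """P(at least hits_res carriers among resistant isolates) under the null.

    The null hypothesis is that carriers of the premature stop codon are
    spread over resistant and susceptible isolates at random, which gives
    a hypergeometric distribution for the count in the resistant group.
    """
    total_hits = hits_res + hits_sus
    total = n_res + n_sus
    upper = min(total_hits, n_res)
    favourable = sum(
        comb(n_res, k) * comb(n_sus, total_hits - k)
        for k in range(hits_res, upper + 1)
    )
    return favourable / comb(total, total_hits)


p_value = fisher_one_sided(hits_res=13, n_res=1093, hits_sus=0, n_sus=2420)
print(f"p = {p_value:.2e}")
```

Because this is an assumed test, its output need not match the tabulated p-values exactly; it only illustrates how such an over-representation can be quantified.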
PointFinder predicted resistance compared with phenotypic drug susceptibility testing on the ReSeq data set when considering premature stop codons in katG, pncA, ethA, and gidB as resistance markers.
| Drug | Resistant | Susceptible | Specificity | Sensitivity | MCC |
| RMP | 771 | 2710 | 0.965 | 0.887 | 0.848 |
| INH | 1093 | 2420 | 0.929 | 0.909 | 0.823 |
| STM | 728 | 1239 | 0.847 | 0.839 | 0.674 |
| EMB | 466 | 3040 | 0.796 | 0.850 | 0.484 |
| PZA | 325 | 2993 | 0.935 | 0.612 | 0.502 |
| KAN | 76 | 617 | 0.989 | 0.776 | 0.814 |
| FLQ | 240 | 1175 | 0.956 | 0.679 | 0.664 |
| AMK | 107 | 866 | 0.785 | 0.766 | 0.386 |
| ETH | 50 | 175 | 0.829 | 0.800 | 0.564 |
| CAP | 116 | 1024 | 0.971 | 0.474 | 0.512 |
Forward feature selection of resistance mutations on the ReSeq data set.
| 1 | 0.878 | 0.860 | ||
| RMP | 2 | 0.868 | 0.878 | |
| 3 | 0.869 | 0.874 | ||
| 1 | 0.880 | 0.884 | ||
| INH | 2 | 0.881 | 0.882 | |
| 3 | 0.883 | 0.877 | ||
| 1 | 0.743 | 0.744 | ||
| STM | 2 | 0.733 | 0.762 | |
| 3 | 0.757 | 0.715 | ||
| 1 | 0.648 | 0.590 | ||
| embA promoter −12C > T, embA promoter −16C > T | ||||
| EMB | 2 | 0.643 | 0.616 | |
| 3 | 0.623 | 0.660 | ||
| 1 | 0.589 | 0.625 | ||
| PZA | 2 | 0.590 | 0.636 | |
| 3 | 0.638 | 0.537 | ||
| 1 | 0.789 | 0.864 | ||
| KAN | 2 | 0.829 | 0.785 | |
| 3 | 0.826 | 0.795 | ||
| 1 | 0.699 | 0.691 | ||
| FLQ | 2 | 0.686 | 0.713 | |
| 3 | 0.702 | 0.687 | ||
| 1 | 0.702 | 0.744 | ||
| AMK | 2 | 0.722 | 0.705 | |
| 3 | 0.725 | 0.699 | ||
| 1 | 0.652 | 0.393 | ||
| ETH | 2 | ethA prem. stop codon | 0.514 | 0.407 |
| 3 | 0.543 | 0.592 | ||
| 1 | 0.511 | 0.514 | ||
| CAP | 2 | 0.511 | 0.514 | |
| 3 | 0.514 | 0.509 |
PointFinder predicted resistance compared with phenotypic drug susceptibility testing on the ReSeq data set after including premature stop codons and excluding non-predictive mutations.
| Drug | Resistant | Susceptible | Specificity | Sensitivity | MCC |
| RMP | 771 | 2710 | 0.978 | 0.878 | 0.871 |
| INH | 1093 | 2420 | 0.974 | 0.895 | 0.881 |
| STM | 728 | 1239 | 0.907 | 0.835 | 0.743 |
| EMB | 466 | 3040 | 0.898 | 0.848 | 0.631 |
| PZA | 325 | 2993 | 0.974 | 0.575 | 0.604 |
| KAN | 76 | 617 | 0.992 | 0.776 | 0.814 |
| FLQ | 240 | 1175 | 0.968 | 0.679 | 0.695 |
| AMK | 107 | 866 | 0.992 | 0.607 | 0.716 |
| ETH | 50 | 175 | 0.829 | 0.800 | 0.564 |
| CAP | 116 | 1024 | 0.971 | 0.474 | 0.512 |
Occurrence of resistance-associated genes with premature stop codons found in resistant or susceptible strains in the validation data set.
| RMP | 596 | 1814 | rpoC | 1 | 1 | 0.407 |
| INH | 768 | 1641 | ahpC | 0 | 1 | 0.566 |
| katG | 8 | 2 | 0.001∗ | |||
| STM | 379 | 687 | gidB | 28 | 27 | 0.015 |
| PZA | 248 | 420 | pncA | 25 | 0 | <1.0e−5∗ |
| ETH | 186 | 245 | ethR | 0 | 1 | 0.383 |
| ethA | 24 | 28 | 0.642 | |||
| CAP | 191 | 261 | tlyA | 0 | 1 | 0.392 |
Validating the effect of including premature stop codons and excluding non-predictive mutations from the mutation library.
| RMP | 596 | 1814 | 0.978 | 0.896 | 0.885 | 0.978 | 0.896 | 0.885 | 0.986 | 0.886 | 0.895 |
| INH | 768 | 1641 | 0.946 | 0.879 | 0.826 | 0.945 | 0.880 | 0.826 | 0.969 | 0.870 | 0.854 |
| STM | 379 | 687 | 0.85 | 0.744 | 0.592 | 0.817 | 0.805 | 0.606 | 0.902 | 0.778 | 0.688 |
| EMB | 304 | 880 | 0.739 | 0.914 | 0.576 | 0.739 | 0.914 | 0.576 | 0.825 | 0.908 | 0.666 |
| PZA | 239 | 420 | 0.971 | 0.799 | 0.802 | 0.971 | 0.808 | 0.809 | 0.971 | 0.808 | 0.809 |
| KAN | 211 | 176 | 0.920 | 0.896 | 0.814 | 0.920 | 0.896 | 0.814 | 0.920 | 0.896 | 0.814 |
| FLQ | 271 | 296 | 0.905 | 0.878 | 0.784 | 0.905 | 0.878 | 0.784 | 0.909 | 0.878 | 0.788 |
| AMK | 213 | 278 | 0.953 | 0.850 | 0.814 | 0.953 | 0.850 | 0.814 | 0.964 | 0.826 | 0.807 |
| ETH | 184 | 245 | 0.624 | 0.853 | 0.479 | 0.539 | 0.919 | 0.479 | 0.539 | 0.919 | 0.479 |
| CAP | 191 | 261 | 0.877 | 0.864 | 0.738 | 0.877 | 0.864 | 0.738 | 0.877 | 0.864 | 0.738 |
Comparing PointFinder and PhyResSE prediction performance.
| RMP | 14 | 77 | 0.961 | 1.000 | 0.890 | 0.935 | 1.000 | 0.830 | 0.751 |
| INH | 29 | 62 | 0.952 | 0.897 | 0.848 | 0.968 | 0.931 | 0.899 | 0.262 |
| STM | 37 | 54 | 0.981 | 0.649 | 0.693 | 0.981 | 0.838 | 0.843 | 0.047∗ |
| EMB | 14 | 77 | 0.961 | 0.857 | 0.796 | 0.974 | 0.857 | 0.831 | 0.395 |
| PZA | 8 | 83 | 0.964 | 0.750 | 0.677 | 0.964 | 0.625 | 0.589 | 0.666 |