Erol S Kavvas, Edward Catoiu, Nathan Mih, James T Yurkovich, Yara Seif, Nicholas Dillon, David Heckmann, Amitesh Anand, Laurence Yang, Victor Nizet, Jonathan M Monk, Bernhard O Palsson.
Abstract
Mycobacterium tuberculosis is a serious human pathogen threat exhibiting complex evolution of antimicrobial resistance (AMR). Accordingly, the many publicly available datasets describing its AMR characteristics demand disparate data-type analyses. Here, we develop a reference strain-agnostic computational platform that uses machine learning approaches, complemented by both genetic interaction analysis and 3D structural mutation-mapping, to identify signatures of AMR evolution to 13 antibiotics. This platform is applied to 1595 sequenced strains to yield four key results. First, a pan-genome analysis shows that M. tuberculosis is highly conserved with sequenced variation concentrated in PE/PPE/PGRS genes. Second, the platform corroborates 33 genes known to confer resistance and identifies 24 new genetic signatures of AMR. Third, 97 epistatic interactions across 10 resistance classes are revealed. Fourth, detailed structural analysis of these genes yields mechanistic bases for their selection. The platform can be used to study other human pathogens.
Year: 2018 PMID: 30333483 PMCID: PMC6193043 DOI: 10.1038/s41467-018-06634-y
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Identification of key resistance-conferring genes using mutual information. The pairwise mutual information (vertical axis) between the pan-genome alleles and antibiotic resistance was calculated across all possible pairs. The listed genes correspond to the pan-genome alleles that hold the most information about the listed drug’s AMR phenotype
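The ranking described in this caption can be illustrated with a minimal sketch. It assumes each allele is encoded as a binary presence/absence vector across strains and each drug phenotype as a binary resistant/susceptible vector. That encoding, the function names `mutual_information` and `rank_alleles`, and the toy allele labels are assumptions made for illustration; they are not taken from the authors' platform.

```python
from collections import Counter
from math import log2


def mutual_information(allele_present, resistant):
    """Mutual information (in bits) between two equal-length discrete vectors."""
    n = len(allele_present)
    if n == 0 or n != len(resistant):
        raise ValueError("inputs must be non-empty and of equal length")
    joint = Counter(zip(allele_present, resistant))
    allele_counts = Counter(allele_present)
    phenotype_counts = Counter(resistant)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (count_x * count_y)
        ratio = count * n / (allele_counts[x] * phenotype_counts[y])
        mi += (count / n) * log2(ratio)
    return mi


def rank_alleles(allele_matrix, resistant):
    """Return (allele, score) pairs sorted from most to least informative."""
    scores = {
        name: mutual_information(column, resistant)
        for name, column in allele_matrix.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: six strains and three hypothetical alleles (1 = present / resistant).
phenotype = [1, 1, 1, 0, 0, 0]
alleles = {
    "allele_A": [1, 1, 1, 0, 0, 0],  # tracks resistance exactly
    "allele_B": [1, 0, 1, 0, 1, 0],  # only weakly related to resistance
    "allele_C": [1, 1, 1, 1, 1, 1],  # present in every strain, uninformative
}
ranking = rank_alleles(alleles, phenotype)
```

In this toy example the allele that tracks the phenotype exactly ranks first and the allele present in every strain scores zero, which is the behavior that makes mutual information usable as a ranking criterion.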
Known AMR genes uncovered by machine learning
| Antibiotics | Known AMR genes |
|---|---|
| Isoniazid | |
| Rifampicin | |
| Ethambutol | |
| Pyrazinamide | |
| Streptomycin | |
| Ofloxacin | |
| 4-Aminosalicylic acid | |
| Ethionamide | |
| Known AMR genes associated with other antibiotics | |
The eight antibiotics shown each have an AUC greater than 0.80 (Supplementary Fig. 5)
a Not found in the top 40 ranked alleles determined by mutual information, chi-squared, and ANOVA F-test
Fig. 2 Allele co-occurrence tables of correlated AMR genes. Co-occurrence of epistatic genes identified in (a) ethambutol and (b) isoniazid. For the rows on the bottom and on the far right, #R refers to the total number of strains that have the allele and are resistant to the specific drug. Total refers to the total number of strains with that allele that were tested on that specific drug. Each cell is colored by the log odds ratio (LOR) with respect to the AMR phenotype. The number in the bottom right of each allele co-occurrence box gives the number of unique sublineages represented by the strains with both alleles (Methods). The alleles enclosed by a purple box represent those chosen as features by the support vector machine (SVM). Note that in some cases the rows and columns do not sum to the total number of strains because, in rare cases, strains lack those alleles (Methods)
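As a rough illustration of the cell coloring, the sketch below computes a log odds ratio for strains carrying a given pair of alleles. The Methods are not part of this excerpt, so the choice of comparison group (all strains lacking the pair), the 0.5 pseudocount used to avoid division by zero, and the function names `log_odds_ratio` and `cooccurrence_lor` are all assumptions rather than the authors' definitions.

```python
from math import log


def log_odds_ratio(r_with, s_with, r_without, s_without, pseudocount=0.5):
    """Natural-log odds ratio of resistance for strains with vs. without a feature.

    The pseudocount is an assumed continuity correction so that empty
    cells do not produce a division by zero or log(0).
    """
    a = r_with + pseudocount
    b = s_with + pseudocount
    c = r_without + pseudocount
    d = s_without + pseudocount
    return log((a * d) / (b * c))


def cooccurrence_lor(allele_x, allele_y, resistant):
    """LOR for strains carrying both alleles versus all remaining strains."""
    counts = [0, 0, 0, 0]  # r_with, s_with, r_without, s_without
    for x, y, r in zip(allele_x, allele_y, resistant):
        both = bool(x and y)
        index = (0 if both else 2) + (0 if r else 1)
        counts[index] += 1
    return log_odds_ratio(*counts)


# Toy data: eight strains, two hypothetical alleles (1 = present / resistant).
allele_x = [1, 1, 1, 0, 0, 0, 1, 0]
allele_y = [1, 1, 0, 1, 0, 0, 1, 0]
resistant = [1, 1, 0, 0, 0, 0, 1, 1]
lor = cooccurrence_lor(allele_x, allele_y, resistant)
```

A positive value means the allele pair is enriched among resistant strains, a negative value means it is enriched among susceptible strains, and a value near zero means no association under these assumptions.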
Newly proposed AMR genes
| Gene | Drug | Dominant allele | Mutation | Structural domain feature |
|---|---|---|---|---|
| | EMB, XDR | R: (25/26) | SNP | Outside transmembrane helical domain |
| | EMB | S: (2/37, 9/129) | SNP | Proximal to DNA-binding domain |
| | EMB | R: (8/11) | SNP | – |
| | EMB | S: (1/27, 11/127) | SNP | – |
| | EMB | R: (80/91) | SNP 11 | Inside transmembrane helical domain |
| | INH | R: (66/66, 26/26) | SNP 253 | TPP enzyme M-terminal domain |
| | ETA | R: (29/37, 34/60) | SNP 296 | DELs in mutagen and helical domain |
| | ETA | R: (48/58, 8/12) | SNP 105 | Inside beta-lactamase domain |
| | ETA, XDR, SM | R: (48/50) | SNP 64 | Inside Cupin 1 domain |
| | PAS | R: (35/48) | SNP 520 | – |
| | PAS | R: (13/13) | DEL 137–264 | BAC Luciferase |
| | PAS | R: (34/46, 4/6) | SNP 223 | Different mutational backgrounds |
| | PZA | S: (6/41) | DEL 1–80 | Compositional bias Proline-rich domain |
| | RIF, INH | S: (9/67, 6/46, 5/51) | SNP 295 | – |
| | RIF | S: (10/91, 12/79) | SNP 119 | Within opuAC signaling domain |
| | RIF, MDR, INH | R: (18/19) | SNP 196 | No mutation in methyltransferase domain |
| | RIF, MDR | S: (10/84, 12/80) | SNP 128 | Proximal to binding domain |
| | MDR, PAS | R: (17/17) | SNP 503 | Outside transmembrane helical domain |
| | SM | R: (22/22) | SNP 233 | Proximal to nucleotide binding domain 213 |
| | SM | R: (30/30) | SNP 87 | Within transmembrane helical domain |
| | OFX, MDR | R: (16/16) | SNP 127 | Within CoA carboxyltransferase domain |
| | RIF, OFX, SM, MDR | R: (20/28, 25/44) | SNP 140 | SNP in ATP binding domain |
| | XDR | R: (14/23, 14/20) | DEL 88–138 | Within second magnesium binding domain |
The mutation column represents the distinguishing mutation for the resistant or susceptible-dominant allele(s). Abbreviations: R, resistant; S, susceptible; EMB, ethambutol; PAS, para-aminosalicylic acid; INH, isoniazid; PZA, pyrazinamide; RIF, rifampicin; SM, streptomycin; OFX, ofloxacin; ETA, ethionamide; MDR, multidrug resistant; XDR, extensively drug-resistant
Fig. 3 3D and annotated protein structure mutation maps for identified AMR genes. (a) 3D protein structures with mapped mutations are shown for inhA, embR, and oxcA. The colors adjacent to and within the structural mutation table correspond to domains and mutations displayed on the protein structure, respectively. (b) Mutation tables for seven new AMR genes. The colors in the mutation table correspond to the incidence of an annotated structural feature located below the table. The two rows directly below the mutation table are colored according to the log odds ratio between the allele frequency and AMR phenotype. Two AMR classes are shown for Rv3471c and Rv3041c