Literature DB >> 23995135

Genomic analysis identifies targets of convergent positive selection in drug-resistant Mycobacterium tuberculosis.

Maha R Farhat1, B Jesse Shapiro, Karen J Kieser, Razvan Sultana, Karen R Jacobson, Thomas C Victor, Robin M Warren, Elizabeth M Streicher, Alistair Calver, Alex Sloutsky, Devinder Kaur, Jamie E Posey, Bonnie Plikaytis, Marco R Oggioni, Jennifer L Gardy, James C Johnston, Mabel Rodrigues, Patrick K C Tang, Midori Kato-Maeda, Mark L Borowsky, Bhavana Muddukrishna, Barry N Kreiswirth, Natalia Kurepina, James Galagan, Sebastien Gagneux, Bruce Birren, Eric J Rubin, Eric S Lander, Pardis C Sabeti, Megan Murray.   

Abstract

M. tuberculosis is evolving antibiotic resistance, threatening attempts at tuberculosis epidemic control. Mechanisms of resistance, including genetic changes favored by selection in resistant isolates, are incompletely understood. Using 116 newly sequenced and 7 previously sequenced M. tuberculosis whole genomes, we identified genome-wide signatures of positive selection specific to the 47 drug-resistant strains. By searching for convergent evolution--the independent fixation of mutations in the same nucleotide position or gene--we recovered 100% of a set of known resistance markers. We also found evidence of positive selection in an additional 39 genomic regions in resistant isolates. These regions encode components in cell wall biosynthesis, transcriptional regulation and DNA repair pathways. Mutations in these regions could directly confer resistance or compensate for fitness costs associated with resistance. Functional genetic analysis of mutations in one gene, ponA1, demonstrated an in vitro growth advantage in the presence of the drug rifampicin.

Entities:  

Mesh:

Year:  2013        PMID: 23995135      PMCID: PMC3887553          DOI: 10.1038/ng.2747

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


The evolution of antibiotic drug resistant bacteria is a major public health concern. To combat antibiotic resistant infections, we must not only develop new drugs, but also learn to use existing drugs more effectively. With some exceptions (e.g. in the case of phenotypic drug tolerance), resistance is encoded in the bacterial genome; therefore, resistance-associated mutations, whether they are directly causal of resistance or not, can serve as biomarkers which can be rapidly identified in the clinic by PCR or sequencing-based assays. These molecular biomarkers allow the determination of a bacterial infection's drug resistance profile in a matter of hours, instead of the days or weeks required for culture-based diagnostics. In some cases, this time lag can make the difference between a successful or unsuccessful treatment. Here, we describe a method to identify biomarkers of drug resistance in a rapid and unbiased manner. It consists of sequencing the genomes of bacteria with different resistance phenotypes, and applying phylogenetic methods and statistical tests for positive selection to identify variants in the genome that are consistently associated with resistance. The method is amenable to different microbes with different phenotypes of interest. Here, we apply it to identify biomarkers of drug resistance in Mycobacterium tuberculosis (MTB), the bacterium responsible for tuberculosis (TB). The evolution and spread of drug resistant TB threaten to undermine the success of TB treatment and control programs worldwide. Multi-drug resistant (MDR) TB is defined as TB that is resistant to isoniazid and rifampicin, the two most effective anti-tubercular drugs. With a global estimate of 650,000 MDR cases in 2010[1] and a rising number of cases that are extensively drug-resistant (XDR; defined as MDR cases that are also resistant to fluoroquinolones and injectable agents), drug-resistant TB poses a major challenge, requiring advances in diagnostics, methods of surveillance, and therapeutics. Resistance in MTB is thought to arise through the serial acquisition of point mutations in genes encoding drug activating enzymes or targets. Current molecular diagnostics amplify and detect known drug resistance mutations, and their performance depends on the inclusion of a comprehensive catalog of these mutations. Although known mutations explain much resistance in MTB, causative mutations have not been identified in 10-40% of clinically resistant isolates[2] and, even where causative mutations have been identified, there may be additional variants that contribute to drug resistance. In addition to classical drug-resistance genes (encoding the protein target of the drug or a drug-metabolizing enzyme), mutations in three other classes of genes may confer a selective advantage in the presence of drugs. First, mutations that reduce cell wall permeability or increase the activity of drug efflux pumps are expected to increase the mean inhibitory concentrations of drugs, potentially providing an early step toward full-blown drug resistance[3]. Second, compensatory mutations that ameliorate the fitness costs of other resistance mutations can occur and be selected, as seen in both clinical and experimental evolution studies[4]. Third, mutator phenotypes can increase the rate at which rare beneficial mutations occur (though, at the expense of also accumulating deleterious mutations) and therefore provide a selective advantage in the presence of drug treatment[5]. To identify novel genomic regions associated with drug resistance, we performed next-generation whole genome sequencing of 116 MTB isolates from four categories: (i) eight epidemiologically-linked clusters of cases (epiclusters) with emergent drug resistance, (ii) two uniformly drug-sensitive epiclusters, (iii) 35 non epi-linked isolates sampled to represent the 6 major global lineages of MTB and (iv) 8 isolates from a single patient displaying emergent resistance (Figure 1). We combined these data with publicly available genomes of seven isolates. The full 123 MTB sample set include 47 isolates resistant to at least one TB drug, including nine isolates that are XDR-TB. The resulting dataset captured substantial genetic diversity, with 24,711 polymorphic sites relative to the H37Rv reference genome. A genome-wide phylogeny revealed significant population differentiation between epiclusters, confirmed by high fixation indices (FST > 0.36) between all epicluster pairs (Figure 1B,C).
Figure 1

Characteristics of sequenced TB isolates: (A) Geographic distribution of sampled isolates (circle size is proportional to the number of isolates sampled; circle color refers to TB lineage. (B) Parsimony phylogenetic tree with node bootstrap support. Root length not to scale. Epiclusters are merged into triangles for clarity, with the exception of two paraphyletic epiclusters: Peru2 and Russia1. (C) Genetic differentiation between the 14 epiclusters (higher FST reflects higher differentiation). FST values provided in Supplementary Table 19.

We first determined whether drug resistance could be explained by previously known resistance mutations. To ensure that no resistance mutations were missed, we performed additional deep targeted sequencing of the known resistance genes in 35 resistant isolates (15 isolates with no apparent resistance mutations and 20 with at least one known resistance mutation). We detected mutations in known resistance determinants that had been missed by the initial whole genome sequencing in two isolates; the remaining 13 had confirmed resistance to at least one drug that could not be explained by known mutations (Supplementary Table 1). We did not miss any mutations in the remaining 20 isolates. The isolates with ‘unexplained’ resistance likely harbor novel mutations that encode resistance and perhaps contribute more generally to the acquisition and maintenance of resistance. Next, we reasoned that genomic variants (mutations or alleles) under selection in resistant strains could provide clues about the cellular mechanisms conferring resistance, while also serving as biomarkers of resistance. We therefore sought to identify genes harboring mutations that confer a selective advantage to drug resistant strains. Unfortunately, many commonly used tests to identify genes under positive selection are not well suited to bacteria such as MTB. Haplotype-based tests for positive selection, often used in humans and other eukaryotes, cannot be used as genetic diversity in MTB arises primarily by clonal expansion rather than by mating and homologous recombination among isolates[6,7]. The widely-used dN/dS is also not suited to MTB, as the method has low sensitivity in detecting positive selection in recently diverged sequences from a single species[8], and as strong purifying selection on synonymous mutations in MTB[6] can spuriously give rise to high dN/dS scores, resulting in low specificity. Indeed, dN/dS lacked power and likely specificity when applied to our MTB dataset, recovering only five of 11 known resistance determinants, while detecting 143 additional genes (Supplementary Table 2). Instead, we sought to leverage evolutionary convergence – the repeated and independent emergence of resistance-associated mutations at specific loci or genes – to develop a test for selection in clonal bacterial species like MTB[7]. To identify independently arising mutations, we reconstructed a phylogenetic tree for the 123 isolates using M. canetti as an outgroup. Based on this tree, we inferred nonsynonymous and noncoding ancestral sequence changes and internal resistance states using parsimony. We focused on nonsynonymous mutations as these were more likely to encode functional protein changes than their synonymous counterparts. Nevertheless, due to emerging evidence that synonymous sites may also be under selection for adaptive changes in gene expression or mRNA stability[6,9] we also inferred synonymous mutations (in a secondary analysis reported only in the Supplementary Note). We took several precautions to ensure that the reconstructed changes and resistance states in our analysis were not influenced by possible errors in the tree topology. First, we reconstructed the phylogeny in triplicate using different methodologies, and removed all mutations not inferred in all three trees. We also ignored ambiguous mutations from the ancestral reconstruction, and mutations occurring at branches with lower than 70% bootstrap support. Second, to remove local uncertainty in the tree topology, we counted ‘close’ changes only once. ‘Close’ changes were any changes that occurred in two isolates separated by less than the 98th percentile for within-epicluster genetic distance. Third, we implemented a simplified ‘pairwise’ convergence test in which we compared the most sensitive to the most resistant isolate in each of 8 epiclusters, ignoring the rest of the phylogeny entirely (Supplementary Figure 1). Using the high confidence ancestral reconstruction, we designed a phylogenetic convergence test (phyC). We first looked for specific mutations with a higher frequency in the resistant branches compared with the sensitive branches as candidate targets of independent mutation (TIMs). To distinguish convergence due to positive selection in resistant branches from patterns expected by chance under neutral evolution[10], we assessed the significance of each candidate TIM relative to the expectation from the observed mutations across the phylogeny. Briefly, for each TIM with mutations in x resistant branches and y sensitive branches, we redistributed the x + y hits onto the phylogeny, with branches receiving hits at random, in proportion to their length. We repeated this permutation 10,000 times to obtain an empirical P-value assessing the significance of the association of each TIM with resistance. This procedure controls for the tree topology, including the distribution of resistance phenotypes across the tree, and for local mutation rate within each TIM. We then expanded the PhyC test from individual nucleotide sites to encompass whole genes and intergenic regions as targets. Here, we looked for genes or intergenic regions with a higher frequency of independently arising mutations anywhere along their length, using the same framework as described above for the site-specific test. We applied the PhyC test, for mutations, genes and intergenic regions, to the genomewide phylogeny of 123 TB isolates. For our analysis, we defined the resistance phenotype as resistance to any anti-TB drug by conventional drug susceptibility testing so that we would be able to identify mutations associated with multiple resistances as well as those that confer resistance to a single drug. We repeated these analyses to identify selection in isolates resistant to each of the five first-line anti-TB drugs (isoniazid, rifampin, pyrazinamide, ethambutol, and streptomycin). As a proof of concept, we assessed the functional impact of the observed mutations within one of the TIMs identified by the PhyC test. We constructed two ponA1 mutants (carrying 2 of the 3 SNPs (C123G and G1095T) that were most enriched in resistant strains) in an H37Rv MTB laboratory strain using recombineering and site directed mutagenesis. We then compared the survival of the two mutant strains to wild type cells and those that lacked the A1 gene in increasing concentrations of the drugs rifampicin, isoniazid, streptomycinand ofloxacin.

Results

Targets of independent mutation

PhyC detected all eleven known resistance determinants as significant TIMs. Nine of these were also identified by a weaker but conservative phylogeny-independent ‘pairwise’ convergence test (Supplementary Tables 3 & 4) developed here. We further identified 39 novel TIMs not previously associated with resistance, consisting of 7 nonsynonymous coding sites, 2 non coding sites, 28 genes and 2 intergenic regions (p < 0.05) (Figure 2, Supplementary Table 5). All 9 individual nucleotide site TIMs fell within genes or intergenic regions also identified as TIMs. We observed that mutations in resistant branches cluster more closely in the genome than those in sensitive branches and that many of the TIMs fall in these regions of dense resistant-specific mutations (Supplementary Tables 6 & 7). Mutation 946T in the conserved membrane protein Rv0218 had the highest number of independent hits in resistant branches of any candidate mutation occurring in eight resistant branches and on no sensitive branch (P <0.00001). However, because there were more sensitive than resistant isolates in our dataset, mutations in sensitive branches are far more prevalent than in resistant branches (compare red and blue histograms of Figure 2). Several of the TIMs significantly associated with resistance also occur in sensitive branches (Supplementary Table 5). This suggests that several TIMs may not cause resistance directly, but rather provide an incremental fitness advantage to resistant strains.
Figure 2

Candidate genes under selection in resistant MTB. Circular plot of gene locations. Outer black lines represent: the 11 benchmark drug resistance genes in the H37Rv reference genome (red text). Inner red lines represent locations of targets of independent mutation (TIMs). Four novel TIMs of interest are named in black text. The innermost barplot shows the number of mutations per gene or intergenic region, in resistant (red) or sensitive (blue) isolates. Plotted using circos[29].

Functions of candidate selected loci

Among the 39 novel genomic regions identified by PhyC, 11 have annotated function (Fig. 4A), 16 belong to a family of genes (PE/PPE) of principally unknown function that is unique to mycobacteria and constitutes about 10% of the MTB genome (Supplementary Table 5), and the remaining 12 are of unknown function. We systematically mined the literature on these genes not previously associated with resistance, noting evidence that many closely interact with known DR genes (physically or genetically) or drug efflux pumps, alter intrinsic drug resistance in MTB or non-tuberculous mycobacteria, are involved in DNA repair, replication or recombination or affect cell wall biogenesis.
Figure 4

MTB ponA1 mutant survival in the presence of the drug rifampicin. (A) Bacterial survival (percent of untreated control OD600 absorbance) under increasing concentrations of rifampicin for ΔponA1 (ponA1 deletion mutant), Δ∷wildtype (ponA1 deletion mutant complemented with the wildtype ponA1 gene), Δ∷G1095T (mutant complemented with ponA1 carrying the G1095T allele), and Δ∷C123G (mutant complemented with ponA1 carrying the C123G allele) (B) Bacterial survival (as in A) of strains cultured in the presence of 0.00125μg/ml of rifampicin. The two-sided t-test comparing survival between wildtype and Δ∷G1095T was significant with a P-value of 0.006. Error bars represent the standard deviation. Four replicate experiments were performed.

Previously known resistance loci

Two novel TIMs were located nearby known resistance genes in the genome, suggesting that they modify or compensate for the phenotypes of the known genes. The first occurs in the promoter of the known resistance gene rrs, which encodes the 16S RNA component of the ribosome, a target of aminoglycoside drugs. The second, rpoC (Figure 3), is in the same operon as rpoB, which codes for the β subunit of RNA polymerase, the main target of the drug rifampicin. RpoC encodes the interacting β′ subunit, and has been identified as a target of compensatory mutations modifying the fitness of rifampicin-resistant isolates of both MTB and S. typhimurium[11-14]. Prior studies have shown that substitutions in rpoC are frequent in clinical rifampicin resistant isolates with known rpoB mutations and that the relative fitness of rpoC/rpoB mutants in vitro is higher than those with rpoB mutations alone[12-13]. Some rpoC mutations in S. typhimurium also confer low-level rifampicin resistance, suggesting that rifampin-resistance phenotypes may be the result of additive substitutions in different genes[13]. Among the 43 rifampicin resistant isolates in our collection, 13 (30%) harbored rpoC substitutions that did not appear in any rifampicin sensitive isolates.
Figure 3

Evolutionary convergence at the gene level in rpoC. Resistant branches (inferred by parsimony, and usually involving progressive resistance to several drugs) and strain names are colored red; sensitive branches in black. Stars on the phylogeny designate inferred sequence changes in rpoC: Blue stars denote changes in resistant branches (10 in total), black in sensitive branches (6 in total). Nucleotides in the multiple sequence alignment are also colored blue or black accordingly. Sites shown in the multiple alignment are (left to right) 763884, 764181, 764580, 764819, 765150, 765171, 765230, 765462, 765463, 765482, 765619, 766467, 766488, 766645, 767060 (H37Rv coordinates).

Unexplained drug resistance

We determined whether nonsynonymous mutations in the TIMs could explain resistance in the 13 isolates without known drug resistance mutations. For each drug and isolate we identified all mutations in the candidate genes excluding mutations occurring in any isolate sensitive to that drug. Although no single candidate mutation or gene was found to account for all the unexplained drug-specific resistance, two of six isolates with unexplained kanamycin resistance harbored mutations in PPE60 (Supplementary Table 8). Eight of the 13 (62%) isolates exhibited changes in at least one TIM, with four of the isolates exhibiting changes in two or more.

Drug-efflux pumps

Although no efflux pumps were identified among genes that met our statistical criteria for TIMs, we found that several efflux pumps, including the ABC transporters Rv0194 and Rv1463, have a larger number of independent mutations in resistant strains relative to sensitive strain.

DNA repair

Another of the novel TIMs, dnaQ, encodes a component of DNA polymerase III that provides “proof-reading” activity during DNA replication. Several dnaQ mutations in E. coli yield strong mutator phenotypes[15]. To date, no similar phenotype has been described in MTB, although dnaQ variants are not uncommon among clinical isolates[16].

The PE/PPE gene family as a target of independent mutation

Sixteen novel TIMs are members of the PE/PPE family, the majority of which are within the PGRS sub-family; this was the only gene group significantly enriched in the convergence analysis. Multiple members of this family are surface-exposed cell wall proteins; some affect cell wall structure and permeability and some have been shown to be antigens[17]. PE/PPE genes contain an extremely high density of substitutions, and were therefore excluded from the genomewide phylogeny. As a result, it is expected (although not guaranteed) that these genes would be enriched for conflicting phylogenetic signal (homoplasy), but not necessarily that homoplasic mutations be associated with resistance. The association of PE/PPE genes with drug resistance is therefore noteworthy. Due to the high genetic diversity in these genes, the association may reflect random fixation of this diversity during population bottlenecks that occur during antibiotic treatment, rather than genuine positive selection on resistant isolates. In other words, resistant isolates may be descended from the survivors of severe bottlenecks, during which neutral mutations repeatedly become fixed, and these mutations are most readily observed in diverse loci like PE/PPE genes. However, we cannot rule out a possible functional role of PE/PPE genes in the evolution of resistance – for example, PPE60 is one of the best candidates to account for some of the unexplained kanamycin resistant isolates (see below, and Supplementary Table 8).

Cell wall homeostasis

Five novel TIMs contribute to MTB cell wall biogenesis or remodeling. The structure of the mycobacterial cell wall is unique among prokaryotes in that in addition to the peptidoglycan layer typical of most bacteria, it contains several outer layers characterized by unusual complex lipids (Table 1 and Supplementary Figure 2). These layers contribute to the permeability barrier that underlies the intrinsic antibiotic resistance of most mycobacteria[18]. Multiple TB drugs target structures in the cell wall, and many of the known resistance genes code for enzymes in cell wall lipid pathways. Three of the five genes (ppsA, pks12 and pks3) participate in the biosynthesis and translocation of the surface exposed lipids including phthiocerol dimycocerosate (PDIM) [19-21] while the remaining two (murD and ponA1) contribute to the biosynthesis and homeostasis of the cell wall component, peptidoglycan[22]. Deletion of ppsA, depletion of pks12 and depletion of ponA1 each affect susceptibility to antibiotics in non-tuberculous mycobacteria and/or MTB[23-25]. Deletion of pks12 has been recently shown to increase drug resistance in M. avium through a cell wall remodeling pathway[26]. In addition, pks12 had a synonymous site that was a significant TIM (Supplementary Table 9)
Table 1

Targets of independent mutation of annotated function. Numbers in bold are literature references. Genes involved in cell wall biosynthesis are ppsA, pks3, pks12, ponA1, murD. Refer to Supplementary Table 5 for a complete list of TIMs.

Cellular functionResistance association
Synthesis or regulation of surface exposed lipidsPG homeo-stasisTranscription-regulationDNA replication and repairGlucose metabolism & anti-oxidationGene or pathway is resistance assoc. in MTBGene or pathway is resistance assoc. in NTMResistance assoc. in other bacteria
ppsARv2931192725
pks3Rv118021
pks12Rv2048c202424,26
ponA1Rv0050222330
murDRv2155c22
mtrBRv3245c3132,3334
rpoCRv06681211,1213
dnaQRv3711c3515
opcARv1446c3536
rbsKRv24363735
rrs promoter (pre-Rvnr01)3839

Functional Analysis of PonA1 Mutants

In the presence of the drug rifampicin, the strain carrying the ponA1 G1095T mutation had a 4-6 fold survival advantage at a concentration of 0.00125μg/ml of drug (Figure 4). The MIC to rifampicin for this mutant is estimated at 0.0025 μg/ml or 2-fold higher than wildtype MIC (0.00125 μg/ml). In contrast, the ponA1 C123G mutant strain showed no growth advantage in the presence of rifampicin, and neither mutant demonstrated a growth advantage when grown in the presence of isoniazid, streptomycin or ofloxacin. The 1095 site maps close to the ponA1 transpeptidase domain catalytic site, raising the possibility that the SNP inactivates this enzymatic activity; this is supported by the finding that the ponA1 deletion mutant demonstrates a similar rifampicin resistance phenotype (Figure 4).

Discussion

This work describes a comprehensive genomewide search for genes under selection in clinically resistant MTB isolates. Convergent evolution provides the basis for a highly sensitive test for selection that recovered all of a set of 11 known drug resistance determinants, and is easily generalizable to the study of other types of bacteria. Our method involves sequencing the genomes of related bacteria with different phenotypes of interest (in this case, antibiotic resistance), reconstructing a phylogeny, and identifying targets of convergent evolution using a simple statistical test. While the method relies on a genomewide phylogeny, it can accommodate recombination, provided that it is not so rampant as to completely obscure the phylogeny. Recombinant regions often conflict with the genomewide phylogeny, allowing them to be identified by our method as targets of convergent evolution if they are consistently associated with the phenotype of interest, but not if they are randomly distributed among genomes with different phenotypes. The method is therefore amenable to bacteria with a range of recombination rates. It provides a rapid and unbiased means of identifying molecular biomarkers that are predictive of phenotypes of interest. Applying this method to MTB, we identified 39 genes and intergenic regions newly associated with resistance. While several of these selected mutations occur in genes that are either nearby known DR loci in the genome or are mutator genes in other organisms, a preponderance were associated with cell wall permeability phenotypes. This suggests that stable drug resistance phenotypes may evolve through a complex step-wise process involving cell wall remodeling. Our finding that a ponA1 mutation (G1095T) identified as a TIM conferred a fitness advantage in the presence of rifampicin is consistent with this model. The relevance of cell wall remodeling pathways to drug resistance is also highlighted by two recent studies that compared resistant isolates to their drug sensitive precursors. In one, the investigators noted increased levels of PDIM and peptidoglycan precursors and up-regulation of the PDIM biosynthetic operon (including ppsA) in rifampicin resistant strains. Interestingly we identified ppsA as a TIM in strains resistant to rifampicin (95% of which had nonsynonymous mutations in rpoB, Supplementary Table 10) in addition to other drugs, raising the possibility that rifampin resistance-causing rpoB mutations may lead to alterations in cell wall metabolism possibly a result of altered transcription[27]. The second study documented the occurrence of eleven new non-synonymous mutations in serial MTB isolates from three patients who developed increasing levels of resistance during anti-TB therapy. Seven of the mutations occurred in genes involved in cell wall biosynthesis or transport including fadD32 and Rv1739c, an ABC transporter. Although none of these overlapped with the TIMS identified in this study, the enrichment of mutations in genes associated with cell wall biosynthetic pathways among progressively drug resistant strains is consistent with our findings and further supports the hypothesis that these changes reflect accommodation to drug exposure[28]. The genomic TIMs we have identified here are associated with drug resistance and may represent changes that confer a selective advantage in the presence of drug. In aggregate, they provide promising new targets for molecular diagnostics and development of therapeutics against drug resistance in MTB.

Online Methods

Institutional review board

The study was evaluated by Institutional Review Boards at the Harvard School of Public Health, the Broad Institute of MIT and Harvard, and the Centers for Disease Control and Prevention where it was determined that it met criteria for exemption from human subject review.

Isolate selection

We assembled an archive of sensitive and resistant isolates that capture a range of Mycobacterium tuberculosis (MTB) lineages, geographic sources and resistance profiles. Building on our previous molecular epidemiological studies (Supplementary Table 11 and references within) we aimed to include sets of progressively resistant isolates, sampled either from community transmission chains or from individuals with chronic disease. These sets included isolates from 11 such micro-epidemics (‘epiclustered’ isolates), as defined by molecular fingerprinting, chosen to include the most sensitive and most resistant members of the cluster as well as 8 progressively resistant isolates from a single patient over time. To obtain a measure of background evolution restricted to sensitive isolates and avoid mis-identifying highly variable loci as associated with drug resistance, we included 23 geographically diverse drug sensitive isolates and additional isolates from two drug sensitive epiclusters. To increase the diversity of the sample, we included 11 additional resistant isolates even when a less resistant progenitor was not available. We also included 3 isolates that had been spontaneously evolved in vitro and manifested an aminoglycoside resistance phenotype that was unexplained by targeted sequencing of all previously known resistance genes. In all, 116 MTB isolates were selected for sequencing described in detail in Supplementary Table 11A. We added 7 publicly available MTB genomes to our alignment, including two drug sensitive Beijing lineage isolates, the latter of which were missing from our sampled isolates thus far. Isolates obtained from public sources are detailed in Supplementary Table 11B. There is no established method to determine the power of genomic analyses performed with this sample size, but the strain set studied here is among the largest set of drug resistant MTB strains sequenced to date. Of the 123 total isolates, 47 were resistant to one or more drugs (Supplementary Table 11). Eighty three isolates belonged to 14 distinct epiclusters, and the rest were isolates with a unique molecular fingerprint. Twelve clusters had one or more resistant isolates. One other publicly available genome from the species Mycobacterium canetti was included to serve as a phylogenetic outgroup.

Resistance phenotype

We defined a “broad resistance” phenotype as resistance to any anti-TB drug by conventional drug susceptibility testing. We used this “broad resistance” as our primary phenotype of interest as our goal was to identify genes and mutations associated with resistance to at least one drug, but potentially many. As a secondary, more specific phenotype of interest, we used resistance to each of the 5 first line anti-TB drugs (isoniazid, rifampin, pyrazinamide, ethambutol, and streptomycin). Resistance status to first line anti-TB drugs had much fewer missing data points than resistance to the other drugs (Supplementary Table 11).

Sequencing, Alignment, and SNP calling

DNA was extracted from all isolates using standard methods and sequenced on an Illumina GAIIx instrument using reads of 36bp length or more. Sequence reads were aligned to the reference genome sequence for H37Rv using MAQ[40]. Reads that aligned with more than 3 mismatches in the first 24bp or that aligned to multiple locations were discarded. SNPs were called with a minimum depth of 20X, and consensus quality score of 20. The required maximum mapping quality of reads covering the SNP was set at 50. SNPs within 5bp of an indel (insertion or deletion) or did not have an adjacent consensus quality of 20 were also discarded. Refer to the supplementary note for further details.

FST and dN/dS analyses

Fixation index (FST) and dN/dS rates were computed using standard methods detailed in the supplementary note.

Phylogeny construction

The phylogeny was constructed based on the whole MTB genome multiple alignment, as MTB populations are thought to be predominantly clonal, with most of the genome supporting a single consensus phylogeny not impacted significantly by recombination[6]. A superset of SNPs relative to reference strain H37Rv[35] was created across all clinical isolates from the MAQ SNP reports. SNPs occurring in repetitive elements including transposases, PE/PPE/PGRS genes, and phiRV1 members (273 genes, 10% of genome) (genes listed in reference[41]) were excluded to avoid any concern about inaccuracies in the read alignment in those portions of the genome. Furthermore, SNPs in an additional 39 genes previously associated with drug resistance[38] were also removed to exclude the possibility that homoplasy of drug resistance mutations would significantly alter the phylogeny. After applying these filters to the initial set of 24,711 SNPs, the 23,393 remaining SNPs were concatenated and used to construct phylogenetic trees using three methods. Using the PHYLIP dnapars algorithm v3.68[42] we constructed a parsimony phylogenetic tree using default parameters with M. canettii as an outgroup root. We constructed a second phylogeny with Bayesian Markov chain Monte Carlo (MCMC) methods as implemented in the package MrBayes v3.2[42] using the GTR model and a stop criterion of standard deviation of split frequencies of <0.05. We constructed a maximum likelihood tree using PhyML v3.0[44] using the GTR model with 8 categories for the gamma model with and without M. canetti to determine the location of the root. One hundred bootstrap resamplings were performed for each tree, except for the Bayesian tree where posterior probabilities on the branches was used as a measure of confidence. A phylogeny was also constructed using the full SNP set (without excluding SNPs in repetitive elements or known drug resistance genes), and only minor differences in the terminal branches of the tree were found. We used the trees constructed with exclusion of SNPs in repetitive elements and known drug resistant genes for all subsequent analyses.

Phylogenetic convergence test for selection (PhyC)

The phylogenetic convergence test used sequences from all branches of the phylogeny. Ancestral nucleotide non-synonymous (or intergenic) substitutions were reconstructed along each branch using both parsimony and maximum likelihood criteria using the R v2.14.1 package ape v3.0.1[45]. Each branch was assigned a resistant or sensitive label using parsimony. The reconstruction was performed in triplicate for the three phylogenies (Bayesian, parsimony and maximum likelihood). We excluded all ambiguously reconstructed states (<90% probability for maximum likelihood reconstruction). We considered changes occurring along the terminal and deep branches of the phylogenetic tree, but excluded changes occurring at branches with bootstrap support or posterior probability of <70%. For each nucleotide site in the genome, we counted the number of convergent SNPs (to the same base) in resistant and sensitive branches. Given that some background convergence is expected due to neutral mutation and sequence error even without positive selection[10], we assessed the significance of each convergent SNP compared to the empirical background distribution (Supplementary Figure 3). For a SNP that converges in x resistant and y sensitive branches, we sampled x+y branches from the distribution of all SNPs in all branches genomewide, repeated this 10,000 times, and recorded the proportion of times substitutions were observed ≥x resistant and ≤y sensitive branches. This proportion serves as an empirical p-value for an unexpectedly high level of convergence among resistant branches, suggesting the action of selection. To be considered a candidate for positive selection, we required a SNP to have p < 0.05 across all phylogenetic and ancestral reconstruction methods. As multiple different SNPs within the same gene might nevertheless code for similar resistance phenotypes, we expanded the convergence test beyond individual SNPs to include whole genes and intergenic regions. In this method, branches were defined as convergent if they contained a SNP in the same gene or region, even if the SNPs occurred at different nucleotide sites. For each gene and intergenic region in the genome we counted the number of SNPs occurring within the region boundaries, counting at most one SNP for each branch and counting SNPs in order of their frequency in the phylogeny. Using the same empirical resampling strategy as for SNP-based convergence, we generated a list of significant region-based convergence among resistant branches. Supplementary Table 5 details the genes found to be under selection using the phylogeny based convergence method. See the supplementary note for details on the pairwise convergence test and the analysis of the density of resistance-specific mutations.

Selection testing by first line drug phenotype

PhyC, and other supplementary tests for selection were performed similarly for resistance to each of the five first line TB drugs: isoniazid, rifampicin, ethambutol, pyrazinamide, and streptomycin (Supplementary Tables 10,12-15). The genes significant by the “broad resistance” phenotype and resistance to isoniazid, rifampicin, and streptomycin are highly similar with few exceptions. This is likely a result of how closely associated resistance to isoniazid, rifampicin, ethambutol, and streptomycin are (Supplementary Table 11, for example 82% of isolates resistant to either isoniazid or rifampin were resistant to both). The number of significant genes for pyrazinamide was significantly lower than the other drugs likely resulting from the larger number of isolates with a missing resistance phenotype to this drug, and a resultant low statistical power.

SNPs detected in genes under selection

Supplementary Table 16 lists all the SNPs seen in the TIMs in resistant isolates. SNPs were called relative to the preceding (ancestral) node for each isolate in the phylogenetic tree. Supplementary Table 17 provides a multiple alignment of the genetic sequence for all SNP sites in the TIMs (these include sites occurring in resistant strains only, sensitive strains only and SNP sites occurring in both types of strains).

Candidate genes variants in isolates with unexplained resistance

We identified nonsynonymous SNPs in the genes under positive selection in isolates with unexplained resistance. We filtered out SNPs in these genes that occurred in any isolates sensitive to each drug. The results are detailed in Supplementary Table 8.

M. tuberculosis mutant generation

Rv0050, encoding ponA1, was replaced with a hygromycin resistance cassette using mycobacterial recombineering in the H37Rv host strain. The ponA1∷hyg replacement was confirmed by PCR and whole genome sequencing. As we sought to identify mutants more likely to be independently causative of resistance we focused on ponA1 SNP alleles C123G and G1095T as these occurred in mostly drug resistant clinical strains, and the third site (1891) alleles (C/T) were more prevalent in susceptible strains. SNP alleles in ponA1 were generated by site directed mutagenesis and Sanger sequence confirmed. Wildtype or SNP alleles in ponA1 were cloned under the control of a constitutive promoter and integrated in the M. tuberculosis genome as single copies at the L5 phage integration site.

MIC assays

All strains were grown in Middlebrook 7H9 medium supplemented with 0.25% glycerol, 10% oleic acid-albumin-dextrose-catalase, and 0.05% Tween-80. For MIC calculations, the strains ΔponA1, ΔponA1 L5∷ponA1wildtype, ΔponA1 L5∷ponA1C123G, and Δ ponA1 L5∷ ponA1G1095T were grown until mid-log phase (0.5 – 0.8 spectroscopic optic density at 600nm (OD600)) and then diluted to a calculated starting OD600 of 0.006 and grown ± drug for 6 days at 37°C with shaking. All conditions were done in duplicate. Two sets of duplicate experiments were performed at slightly different drug concentrations. Percent survival is calculated by normalizing the OD600 measurements of each strain to its respective untreated control. The MIC was defined as the drug concentration that inhibits growth to 1% or less of the untreated control.
  43 in total

1.  Presence of a mutation in ponA1 of Neisseria gonorrhoeae in numerous clinical samples resistant to various beta-lactams and other, structurally unrelated, antimicrobials.

Authors:  Katsumi Shigemura; Toshiro Shirakawa; Nasrum Massi; Kazushi Tanaka; Soichi Arakawa; Akinobu Gotoh; Masato Fujisawa
Journal:  J Infect Chemother       Date:  2005-10       Impact factor: 2.211

2.  Characterization of a Mycobacterium smegmatis mutant lacking penicillin binding protein 1.

Authors:  H Billman-Jacobe; R E Haites; R L Coppel
Journal:  Antimicrob Agents Chemother       Date:  1999-12       Impact factor: 5.191

Review 3.  Prevention of drug access to bacterial targets: permeability barriers and active efflux.

Authors:  H Nikaido
Journal:  Science       Date:  1994-04-15       Impact factor: 47.728

4.  An essential two-component signal transduction system in Mycobacterium tuberculosis.

Authors:  T C Zahrt; V Deretic
Journal:  J Bacteriol       Date:  2000-07       Impact factor: 3.490

5.  A conspicuous adaptability to antibiotics in the Escherichia coli mutator strain, dnaQ49.

Authors:  K Tanabe; T Kondo; Y Onodera; M Furusawa
Journal:  FEMS Microbiol Lett       Date:  1999-07-01       Impact factor: 2.742

6.  Adaptation to the fitness costs of antibiotic resistance in Escherichia coli.

Authors:  S J Schrag; V Perrot; B R Levin
Journal:  Proc Biol Sci       Date:  1997-09-22       Impact factor: 5.349

7.  Genes required for intrinsic multidrug resistance in Mycobacterium avium.

Authors:  Julie S Philalay; Christine O Palermo; Kirsten A Hauge; Tige R Rustad; Gerard A Cangelosi
Journal:  Antimicrob Agents Chemother       Date:  2004-09       Impact factor: 5.191

8.  Disruption of msl3 abolishes the synthesis of mycolipanoic and mycolipenic acids required for polyacyltrehalose synthesis in Mycobacterium tuberculosis H37Rv and causes cell aggregation.

Authors:  Vinod S Dubey; Tatiana D Sirakova; P E Kolattukudy
Journal:  Mol Microbiol       Date:  2002-09       Impact factor: 3.501

9.  Deletion of the genes encoding the MtrA-MtrB two-component system of Corynebacterium glutamicum has a strong influence on cell morphology, antibiotics susceptibility and expression of genes involved in osmoprotection.

Authors:  Nina Möker; Melanie Brocker; Steffen Schaffer; Reinhard Krämer; Susanne Morbach; Michael Bott
Journal:  Mol Microbiol       Date:  2004-10       Impact factor: 3.501

10.  Mycobacterium tuberculosis pks12 produces a novel polyketide presented by CD1c to T cells.

Authors:  Isamu Matsunaga; Apoorva Bhatt; David C Young; Tan-Yun Cheng; Stephen J Eyles; Gurdyal S Besra; Volker Briken; Steven A Porcelli; Catherine E Costello; William R Jacobs; D Branch Moody
Journal:  J Exp Med       Date:  2004-12-20       Impact factor: 14.307

View more
  182 in total

Review 1.  Microbial Speciation.

Authors:  B Jesse Shapiro; Martin F Polz
Journal:  Cold Spring Harb Perspect Biol       Date:  2015-09-09       Impact factor: 10.005

2.  Evolution of Rifampin Resistance in Escherichia coli and Mycobacterium smegmatis Due to Substandard Drugs.

Authors:  Zohar B Weinstein; Muhammad H Zaman
Journal:  Antimicrob Agents Chemother       Date:  2018-12-21       Impact factor: 5.191

3.  Mycobacterial mistranslation is necessary and sufficient for rifampicin phenotypic resistance.

Authors:  Babak Javid; Flavia Sorrentino; Melody Toosky; Wen Zheng; Jessica T Pinkham; Nina Jain; Miaomiao Pan; Padraig Deighan; Eric J Rubin
Journal:  Proc Natl Acad Sci U S A       Date:  2014-01-06       Impact factor: 11.205

4.  Bacterial and host determinants of cough aerosol culture positivity in patients with drug-resistant versus drug-susceptible tuberculosis.

Authors:  Grant Theron; Jason Limberis; Rouxjeane Venter; Liezel Smith; Elize Pietersen; Aliasgar Esmail; Greg Calligaro; Julian Te Riele; Marianna de Kock; Paul van Helden; Tawanda Gumbo; Taane G Clark; Kevin Fennelly; Robin Warren; Keertan Dheda
Journal:  Nat Med       Date:  2020-06-29       Impact factor: 53.440

5.  Convergent evolution in the genomics era: new insights and directions.

Authors:  Timothy B Sackton; Nathan Clark
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2019-06-03       Impact factor: 6.237

6.  Phylogenetic Methods for Genome-Wide Association Studies in Bacteria.

Authors:  Xavier Didelot
Journal:  Methods Mol Biol       Date:  2021

Review 7.  Clinical implication of novel drug resistance-conferring mutations in resistant tuberculosis.

Authors:  N P Mnyambwa; D-J Kim; E S Ngadaya; R Kazwala; P Petrucka; S G Mfinanga
Journal:  Eur J Clin Microbiol Infect Dis       Date:  2017-06-07       Impact factor: 3.267

8.  Cryptic Microheteroresistance Explains Mycobacterium tuberculosis Phenotypic Resistance.

Authors:  John Z Metcalfe; Elizabeth Streicher; Grant Theron; Rebecca E Colman; Christopher Allender; Darrin Lemmer; Rob Warren; David M Engelthaler
Journal:  Am J Respir Crit Care Med       Date:  2017-11-01       Impact factor: 21.405

Review 9.  Microbial genome-wide association studies: lessons from human GWAS.

Authors:  Robert A Power; Julian Parkhill; Tulio de Oliveira
Journal:  Nat Rev Genet       Date:  2016-11-14       Impact factor: 53.242

Review 10.  Evolution and population genomics of the Lyme borreliosis pathogen, Borrelia burgdorferi.

Authors:  Stephanie N Seifert; Camilo E Khatchikian; Wei Zhou; Dustin Brisson
Journal:  Trends Genet       Date:  2015-03-09       Impact factor: 11.639

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.