Literature DB >> 29785015

Frequent transmission of the Mycobacterium tuberculosis Beijing lineage and positive selection for the EsxW Beijing variant in Vietnam.

Kathryn E Holt1, Paul McAdam2, Phan Vuong Khac Thai3, Nguyen Thuy Thuong Thuong4, Dang Thi Minh Ha3, Nguyen Ngoc Lan3, Nguyen Huu Lan3, Nguyen Thi Quynh Nhu4, Hoang Thanh Hai4, Vu Thi Ngoc Ha4, Guy Thwaites4,5, David J Edwards2, Artika P Nath6,7, Kym Pham8, David B Ascher2, Jeremy Farrar4,5, Chiea Chuen Khor9,10, Yik Ying Teo11,12, Michael Inouye7,8,13, Maxine Caws14,15, Sarah J Dunstan16.   

Abstract

To examine the transmission dynamics of Mycobacterium tuberculosis (Mtb) isolated from tuberculosis patients in Ho Chi Minh City, Vietnam, we sequenced the whole genomes of 1,635 isolates and compared these with 3,144 isolates from elsewhere. The data identify an underlying burden of disease caused by the endemic Mtb lineage 1 associated with the activation of long-term latent infection, and a threefold higher burden associated with the more recently introduced Beijing lineage and lineage 4 Mtb strains. We find that Beijing lineage Mtb is frequently transferred between Vietnam and other countries, and detect higher levels of transmission of Beijing lineage strains within this host population than the endemic lineage 1 Mtb. Screening for parallel evolution of Beijing lineage-associated SNPs in other Mtb lineages as a signal of positive selection, we identify an alteration in the ESX-5 type VII-secreted protein EsxW, which could potentially contribute to the enhanced transmission of Beijing lineage Mtb in Vietnamese and other host populations.

Entities:  

Mesh:

Substances:

Year:  2018        PMID: 29785015      PMCID: PMC6143168          DOI: 10.1038/s41588-018-0117-9

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


Introduction

Tuberculosis (TB) is a leading cause of death from infectious disease and the global burden is now higher than at any point in history1,2. Despite coordinated efforts to control TB transmission, the factors contributing to its successful spread remain poorly understood. Vietnam is identified as one of 30 high burden countries for TB and MDR-TB with an incidence of 137 TB cases per 100,000 individuals in 20152. Recent phylogenomic analyses of the causative agent Mycobacterium tuberculosis (Mtb) in other high-prevalence regions have provided insights into the complex processes underlying TB transmission3–5.

Results

Genetic diversity and drug resistance

To characterize the diversity of Mtb circulating in Ho Chi Minh City (HCMC), we sequenced the genomes of 1,635 isolates (Supplementary Table 1) obtained from 2,091 HIV uninfected, smear positive adults (≥18 years) commencing anti-TB therapy at district TB units (DTUs) in eight districts of HCMC between December 2008 and July 2011 (see Methods). This identified 73,718 SNPs, which we used to reconstruct a maximum likelihood phylogeny (Fig. 1a) and to assign lineages6. The majority of isolates (n=957, 59%) belonged to lineage 2.2.1, a subgroup of the Beijing lineage (2.2). Lineage 1 (Indo-Oceanic lineage; n=388, 23.7%) and Lineage 4 (Euro-American lineage; n=192, 11.7%) were also common. A single isolate belonged to Lineage 3 (East African-Indian lineage) and was excluded from further analysis. The distribution of lineages did not change during the 2.5-year period of study (Fig. 1b), and was in agreement with previous genotyping studies in urban areas of Vietnam (≥50% 2.2/Beijing lineage and ~20% Lineage 1.1/EIA in Hanoi and HCMC, 1998-2009)7–11. Known antimicrobial resistance mutations were detected in all lineages but were more frequent in Beijing sublineage 2.2.1 (Table 1), consistent with earlier reports from Vietnam7–9,11. In particular, Beijing sublineage 2.2.1 was enriched for mutations associated with resistance to streptomycin (OR 4.6 [95% CI 3.6-6.0], p=1×10-15), isoniazid (OR 1.7 [1.3-2.1], p=3×10-5), rifampicin (OR 5.4 [2.5-13.2], p=1×10-7) and ethambutol (OR 5.6 [2.6-13.7], p=1×10-7), using Fisher’s exact test to compare to all non-2.2.1 Mtb isolates.
Figure 1

Circulating M. tuberculosis strains in HCMC are divided into multiple distinct lineages.

(a) Maximum-likelihood phylogeny of 1635 Mtb isolates collected from TB patients in HCMC, with backgrounds shaded by lineage. Exterior rings indicate presence of known antimicrobial resistance-associated mutations (coloured by drug, according to legend in top right). (b) Frequency distribution of lineages by month. (c) Frequency distribution of lineages by patient age group.

Table 1

Lineage characteristics for HCMC M. tuberculosis isolates, including known antimicrobial resistance mutations identified using Mykrobe Predictor.

Lineage
12.12.2.12.2.24
N%N%N%N%N%
Gender
Female8221.1915.326527.81025.65629.2
Male30678.95084.769272.32974.413670.8
Antimicrobial
Streptomycin4812.41017.042644.51230.83015.6
Isoniazid5714.71220.326928.1923.15227.1
Rifampicin30.823.4586.125.110.5
Quinolones10.335.1181.925.121.0
Ethambutol10.323.4606.337.721.0

Mtb lineage and host demographics

Whilst the majority of TB patients were male (74%, typical for TB studies in Vietnam and elsewhere8–10,12), the Beijing sublineage was significantly associated with TB in females (OR 1.28 [95% CI 1.01-1.62], p=0.043 using Fisher’s exact test; Table 1), consistent with prior observations in Vietnam8 and Nepal13. Beijing sublineage 2.2.1 was also significantly associated with younger people: its frequency declined with age, from 74% of cases in <20 year olds to 50% in ≥60 year olds (p=0.0023 Fisher’s exact test, p=0.0024 linear trend test; Fig. 1c). In contrast, Lineage 1 was significantly associated with males (25% of male cases vs 19% of females, p=0.017) and increased with age regardless of gender, from 12% in <20 year olds to 35% in ≥60 year olds (p=0.0007 Fisher’s exact test, p=0.0014 linear trend test; Fig. 1c). These data confirm that Beijing sublineage 2.2.1 is capable of causing active disease in a wider demographic range of the Vietnamese host population, particularly among females and younger people, than the endemic Lineage 17–10, which is associated with the more typical profile of TB susceptibility that is skewed towards males [M:F prevalence ratio for smear-positive TB was recently estimated at 2.5 (95% CI 2.07–3.04), based on 40 surveys in 22 countries12] and older people2,14.

Mtb lineages display distinct local transmission dynamics

We hypothesised that Beijing lineage or sublineage 2.2.1 was more transmissible than Lineage 1, and/or more capable of causing active disease in infected hosts, in the local Vietnamese Kinh host population. To investigate this, we used the whole genome phylogeny to compare diversity metrics for each lineage (Fig. 2). Terminal branch lengths, which represent an upper bound of evolutionary time since transmission for each Mtb case, were significantly shorter for Beijing sublineage 2.2.1 Mtb isolates (median 8 SNPs) than for non-Beijing lineage isolates (Lineage 1: median 53 SNPs, p<1×10-15 using Kolmogorov-Smirnov test; Lineage 2.1: 30 SNPs, p<1×10-6; Lineage 4: 17 SNPs, p<1×10-9), and slightly shorter than Beijing sublineage 2.2.2 isolates (9 SNPs, p=0.02) (Fig. 2a). The distribution of mean node-to-tip distances for all internal nodes was skewed significantly lower within the Beijing sublineage 2.2.1 compared to the rest of the tree (median 16 SNPs compared to 62, 57, 39 and 60 SNPs for Lineages 1, 2.1, 2.2.2 and 4, respectively; p<0.0015 in all cases).
Figure 2

Properties of lineage subtrees for HCMC M. tuberculosis genomes.

(a) Distributions of terminal branch lengths for the 1635-strain phylogeny. (b) Mean subtree heights (y-axis; measured as mean node-to-tip distances for each subtree) vs subtree size (x-axis; number of descendant tips). Shaded region indicates standard error of the mean across subtrees of a given size; labels indicate lineage. (c) Stacked area plot showing number of clusters (y-axis) within each lineage (coloured as in panel a) identified using different maximum patristic distance thresholds to define clusters (x-axis).

To better understand the differences in transmission dynamics, we explored the distribution of potential transmission clusters using a range of maximum pairwise SNP distance thresholds to define a cluster15 (Fig. 2b-c). Using the smallest cut-off of five SNPs (transmission age of <5 years) there were n=109 clusters, of which 76 (70%) belonged to Beijing sublineage 2.2.1; these had a mean size of 2.4 strains per cluster, compared to mean 2.1 for other clusters (Fig. 2b-c). Using cut-offs of 10 and 20 SNPs there were n=164 and n=220 clusters, respectively; of these, 118 (72%) and 156 (71%), respectively, were Beijing 2.2.1 and these showed significantly larger cluster sizes (means, 2.5 and 3.2 strains per cluster) than those of other lineages (means, 2.2 and 2.4). Notably, the proportion of cases that belonged to transmission clusters was significantly lower amongst Lineage 1 cases (7.7% at the 10 SNP threshold) compared to Lineage 4 cases (20.3%, p=2×10-5), which in turn was significantly lower than among Beijing lineage cases (31.5%, p=3×10-3; see Fig. 2b). Therefore sublineage 2.2.1 infections were more likely to result from recent detectable transmission within the local population, despite our study having low power to detect recent transmission due to sampling only a small fraction of all incident cases in HCMC (~30% of those in the study districts, <10% across HCMC, and restricted to new cases of smear-positive, culture-positive HIV-negative TB only; see Methods). Household data was not available; however pairs of TB cases whose infecting Mtb strains were separated by ≤10 or ≤20 SNPs were significantly more likely to be diagnosed in the same DTU than more distantly related pairs of the same lineage (27% or 21%, vs 15% amongst strains separated by >20 SNPs; p<1×10-12, see Supplementary Figure 1). This phenomenon was observed for both Beijing 2.2.1 and non-Beijing clusters, but the effect was significantly less for the Beijing lineage (of strain pairs separated by <20 SNPs, 21% of Beijing pairs vs 29% of non-Beijing pairs were diagnosed in the same DTUs; p=0.036, see Supplementary Figure 1), suggesting they may be more readily transmitted across greater geographical distances within HCMC than other Mtb. This could potentially be associated with the higher frequency of Beijing lineage TB amongst younger adults (Fig. 1c), who may be more likely to travel regularly between districts, promoting onward transmission; however the age distribution amongst transmission clusters did not differ by lineage. Taken together, the phylogenomic data reveal significant differences in transmission dynamics between the various Mtb lineages circulating in HCMC, and suggest that newly diagnosed cases of Lineage 1 Mtb in this population often result from activation of longer-term latent infections with many private SNPs and no evidence of recent transmission, whilst new cases of Beijing sublineage 2.2.1 Mtb often result from more recent transmission and shorter time to develop active disease.

Geographical relationships of Vietnamese Mtb strains

It has been suggested that the Beijing lineage is slowly displacing the resident Lineage 1 strains in Vietnam, following the introduction of the Beijing strain into urban areas and subsequent spread to rural areas where Lineage 1 still dominates8,10. Our data are consistent with this, showing a higher frequency of Beijing lineage (65%) amongst Mtb infections in HCMC in 2008-2011 compared to the frequencies reported in the city a decade earlier7 (53% in 1998), or in rural areas outside the city8 (32-37% in 2003-2005). We therefore hypothesized that Beijing 2.2.1 isolates from HCMC may represent a locally established epidemic subclade of the Beijing lineage, similar to that previously described in Russia3. To investigate this we combined our HCMC Mtb genome data with 3,146 publicly available Mtb whole genome sequences from Russia3, Malawi4,5, Argentina16, and China17; and globally dispersed Lineage 1 and 2 genomes18,19,20 (Supplementary Table 2); then inferred phylogenies for each lineage (Fig. 3). HCMC Lineage 1 strains were quite distinct from those in other locations (Fig. 3a), with little evidence of transfer between Vietnam and other regions. The vast majority (n=319, 82%) belonged to a localised subclade (1.1.1.16) that included only seven previously sequenced strains, all of which were from Vietnam. A further 46 (12%) of the HCMC strains belonged to a related sister clade (1.1.1) which also included strains from neighbouring Cambodia (n=1) and Thailand (n=2); the remainder (n=23, 6%) belonged to subclade 1.2.1, in which the HCMC strains were intermingled with others from the Philippines (n=9) and China (n=1). These data suggest that Lineage 1 associated TB in HCMC results mainly from a local endemic Mtb population. In contrast, Beijing 2.2.1 isolates from HCMC formed several distinct clusters that each shared a recent common ancestor with isolates from outside Vietnam (Fig. 3b). Notably, isolates from Russia, Malawi, China and numerous other countries were interspersed throughout the HCMC Beijing 2.2.1 population (Fig 3b), suggesting multiple, frequent transfers between host populations in HCMC and other geographic regions. HCMC Lineage 4 isolates were drawn from eight of the ten recognised sublineages6 (including those identified as specialist, generalist and intermediate in their geographic range21) and were interspersed with isolates from other geographical locations, consistent with multiple imports into HCMC (Fig. 3c). In further support of these observations, stochastic mapping of locations onto the phylogenies predicted dozens of strain transfer events between Vietnam and other locations for Lineages 2 and 4, but not for Lineage 1 (Fig. 3d), strongly supporting that Mtb sublineage 1.1.1.1 is endemic in Vietnam.
Figure 3

Phylogenies of M. tuberculosis showing relationships between isolates from HCMC and other locations.

HCMC isolates are coloured grey, isolates from four other localised studies are coloured as in panel (d), other locations are shown in black. (a) Lineage 1 (n=675 genomes). (b) Lineage 2 (n=1871 genomes). (c) Lineage 4 (n=2066 genomes). (d) Number of transfers between Vietnam and other locations predicted by stochastic mapping of locations onto the Lineage 2 and 4 trees.

Beijing lineage-defining SNPs under positive selection

The population structure (Figs 1-2) provides evidence that Beijing lineage strains are more transmissible within this HIV-negative HCMC population than are other Mtb lineages. Genomic evidence for enhanced transmission of the Beijing lineage has been documented in Russia (associated with antimicrobial resistance)3 and Malawi (independent of antimicrobial resistance)4. While antimicrobial resistance was common amongst HCMC Beijing lineage isolates, the majority of transmission clusters (defined by ≤10 SNPs) comprised groups of isolates that did not share any known resistance mutations that could account for their transmission success (Supplementary Figure 2). This is consistent with previous reports that the Beijing lineage is highly transmissible and more likely to progress to active disease in various host populations and is also more virulent and less pro-inflammatory in various cellular assays, independent of antimicrobial resistance22–25. We therefore aimed to interrogate the Mtb genome data to identify mutations that may contribute to the success of the Beijing lineage (2.2). Evolutionary convergence has previously been used as a signal of positive selection to identify mutations associated with antimicrobial resistance in Mtb26,27. We reasoned that advantageous polymorphisms contributing to the enhanced transmissibility of Lineage 2.2 should be fixed in this lineage, and should also be under positive selection that is detectable as convergent or parallel evolution at the same variant sites in other lineages. We identified a total of 424 homoplasic nonsynonymous SNPs (nsSNPs) across the HCMC phylogeny. The most frequent of these occurred in genes in which convergent evolution has previously been associated with antimicrobial resistance including gidB, embB, gyrA, rpoB, and inhA26, which together accounted for 12.4% of all homoplasic nsSNPs. The distribution of common homoplasic nsSNPs in these genes is shown in Supplementary Figure 3; in particular, rifampicin resistance-associated mutations in rpoB and ethambutol-associated mutations in embB arose independently many times in Beijing sublineage 2.2.1 (n=33/35 rpoB-450 mutations; n=16/17 embB-306 mutations; n=8/10 embB-406 mutations). The homoplasic nsSNPs included three that arose on the branch defining Lineage 2.2 and also elsewhere in the HCMC tree (Table 2, Supplementary Figure 4). One was a mutation in esxW (Rv3620c) codon 2 (EsxW-Thr2Ala), which arose on nine other branches (six times in Lineage 4, three times in Lineage 1; see Supplementary Figure 4) and showed evidence of onward transmission on 4/9 occasions. Comparison to the global tree detected the same esxW mutation on a further ten Lineage 4 branches in Malawi and Russia, with onward transmission detected on 6/10 occasions. The other two mutations were in Rv3081 (conserved hypothetical protein) and gidB (mutations in which are often associated with streptomycin resistance) and arose less frequently (Table 2). In contrast, homoplasic nsSNPs on the branches defining Lineages 1 or 4 were each detected on only 1-2 other branches in the HCMC tree and no additional branches of the global tree (Supplementary Table 3). No homoplasic SNPs were associated with sublineage 2.2.1, and although synonymous or intergenic SNPs can have functional consequences, we found no such homoplasies associated with Beijing or other lineages.
Table 2

Homoplasic non-synonymous SNPs identified as occurring on the Beijing lineage-defining branch and also arising independently within other lineages.

The number of branches on which each SNP was identified outside the Beijing lineage-defining branch is shown. The number of such branches that have multiple descendant tips (indicating onward transmission of the SNP) is shown in no. transmitted column. HCMC refers to the 1,635 isolates from HCMC, Vietnam; Elsewhere refers to the 3,146 additional isolates from published studies3–5, 17–19; trees are shown in Figure 3.

MutationHCMCElsewhereFunction
no. branches outside Beijing lineageno. transmittedno. branches outside Beijing lineageno. transmitted
EsxW-T2A94106ESX-5 secreted protein (CFP10 homolog)
Rv3081-F220L2173hypothetical protein
GidB-E92D1100streptomycin resistance

EsxW mutation

EsxW is included in multiple Mtb vaccines currently under development (including H65, ID83, ID93)28–31 due to its demonstrated immunogenicity in mice, safety and immunogenicity in non-human primates31, demonstrated T-cell targeting in humans32, and epitopes predicted to bind a wide range of human HLA-DRB1 alleles28,33. Hence we considered whether the EsxW-Thr2Ala mutation could affect epitope binding. However, residue two lies in the N-terminal loop of EsxW (see Fig. 4e), outside the experimentally demonstrated epitope region (residues 24-34)34. In silico epitope binding analysis on the Beijing and non-Beijing EsxW protein sequences using the Immune Epitope Database Analysis Resource35 predicted HLA binding to the first 9-10 residues of EsxW, but no differences in binding affinities for the wildtype and Thr2Ala mutant alleles. This is consistent with experimental data showing that immunization with ID93 (a recombinant fusion protein containing H37Rv (Lineage 4) wildtype alleles of EsxW/V) is protective against Beijing (2Ala) as well as non-Beijing (2Thr) strains30,31.
Figure 4

EsxW mutation at gene, mRNA, protein and heterodimer level.

(a) Variation in promoter region of esxW and other CFP10 paralogs extracted from H37Rv (Lineage 4). Sites are coloured by conservation, coordinates are relative to the start codon. (b) Variation in protein coding region. Tree shows maximum likelihood phylogeny inferred from amino acid sequences; box indicates QILSS proteins. Alignment of protein-coding DNA sequence is shown, coloured by conservation; arrow indicates site of homoplasic SNP in esxW, (A to G), resulting in Thr to Ala substitution at EsxW protein residue 2. (c-d) RNAseq results for esxW and QILSS/ESX-5 paralogs measured in 4 Mtb isolate pairs, each including a Lineage 1 or 4 EsxW-2Ala mutant and its genetically closest EsxW-2Thr relative, following 24h macrophage infection. mRNA levels were estimated from read counts uniquely mapping to the region -21 to +9 for each gene; normalized to total reads uniquely mapping to the locus encoding the ESX-5 machinery (and esxM) in each isolate. Boxes indicate interquartile range. (d) Difference between 2Ala mutant and 2Thr wildtype for each pair, relative to wildtype expression level. (e) Structural model of EsxW-EsxV heterodimer. EsxV is shown as a surface (grey) and EsxW as a ribbon (red) with key residues shown as labeled sticks. (f) Comparison of biophysical measurements of heterodimer binding affinity between wildtype and mutant EsxW. All binding curves were determined across four replicates by microscale thermophoresis (MST), and are represented as the mean ± standard deviation.

Next we considered whether the EsxW-Thr2Ala mutation could impact on gene expression or protein structure and function. esxW is one of 23 esx genes in the Mtb genome, including 11 clustered pairs of esx genes whose products form heterodimers that are each secreted by one of five type VII secretion systems (T7SS; ESX-1 to ESX-5). The most-studied of these pairs is esxB (CFP10)/esxA (ESAT-6), secreted by ESX-1 and encoded in the RD1 locus which also encodes the ESX-1 system. EsxW and its heterodimerization partner EsxV are encoded by adjacent genes in the RD8 locus and secreted by ESX-5. The ESX-5 system is the most recently evolved T7SS in Mtb and is present only in the slow-growing Mycobacteria36,37 (including the Mtb complex, M. leprae, M. ulcerans and M. marinum). ESX-5 is unique amongst the Mtb T7SS in that (a) it secretes most of the PE/PPE proteins38, which are also unique to slow-growing Mycobacteria, comprise a substantial amount of protein coding capacity (~10%) in the Mtb genome and play various roles in virulence39,40; and (b) it is associated with five pairs of Esx proteins, resulting from duplication and expansion of esxM and esxN (which are encoded within the locus encoding the ESX-5 T7SS machinery) to create four paralogous copies elsewhere in the Mtb genome41 (Fig. 4). Each pair includes a member of the QILSS family (EsxM paralog, including EsxW) and a member of the Mtb9.9 family (EsxN paralog, including EsxV), which partner to form heterodimers. The paralogous proteins differ from one another by just a few amino acids (Fig. 4b), and the reason for this apparent redundancy is not clear. One pair, EsxJ/EsxI, has been shown to play a role in substrate selection for ESX-5 secretion42, and it is hypothesized that the other paralogs including EsxW/V play a similar role41. Notably, two of the loci encoding QILSS paralogs (EsxP/O, RD5; and EsxW/V, RD8) are missing from the M. bovis BCG vaccine strain. The upstream sequences of esxW and its homologs differ substantially (Fig. 4a). This suggests their expression is subject to different regulatory controls, which could provide a reason for their expansion in the Mtb genome despite the lack of differentiation at the protein level43 (Fig. 4b); indeed, it has been shown that the different QILSS family members are all expressed at different levels by Mtb strain H37Rv during growth in broth, sputum and macrophages44. This uniqueness also allows unambiguous read mapping and confident SNP calling at the N-terminal region of EsxW; however we also used PCR and capillary sequencing to confirm the EsxW-2Ala allele in all non-Lineage 2 strains in which the SNP was identified from Illumina reads (see Methods). We sought to investigate the expression of QILSS family proteins, and the potential impact of the EsxW-Thr2Ala mutation on gene expression, during growth in macrophages. For these experiments we selected four non-Beijing Mtb isolates harbouring the EsxW-2Ala allele, and the closest genetic relative of each with the wildtype allele EsxW-2Thr, and used RNAseq to measure genome-wide Mtb expression levels (see Methods). The results showed that the ESX-5 system was highly expressed in all isolates following 24h growth in macrophages (mean 0.25% of total Mtb RNAseq reads); however the QILSS paralogs were expressed at different levels relative to ESX-5 (Fig. 4c). Compared to esxM, which is encoded within the ESX-5 locus itself and was expressed at the highest levels in all isolates, esxK and esxP were expressed at intermediate levels (mean 42% of esxM level) and esxJ and esxW were expressed at low levels (mean 2.2% of esxM level). There were no significant differences between the EsxW-2Ala mutant vs wildtype in terms of Mtb growth in macrophages (measured at 4, 7 or 11 days post-infection), or in esxW (Fig. 4d), ESX-5 or global gene expression measured at 24h post-infection. It has been reported that esxW was significantly expressed in the lungs of Mtb aerosol-infected mice45; hence future experiments with wildtype and mutant EsxW in this animal system could potentially help to unravel its functional effects. Given the proposed role for the EsxW/V heterodimer in substrate selection for ESX-5 secretion41, we investigated whether the EsxW-Thr2Ala mutation could affect protein structure and function. In M. canetti, the reference Mtb genome H37Rv, and the majority of non-Beijing Mtb isolates, EsxW carries the polar threonine (codon ACC) at residue 2, while the other QILSS proteins in the Mtb complex and other slow growing Mycobacteria carry the hydrophobic alanine (GCC) at this position (Fig. 4b). In the Mtb Beijing lineage, EsxW residue 2 is converted to the more typical residue alanine (GCC), making it identical at the protein level to EsxJ, which has been shown to be involved in substrate selection for ESX-5 secretion42. We analysed the quantitative effects of the Thr2Ala mutation on the stability and affinity of the EsxW protomer and the EsxW/V heterodimer, using computational modeling and direct biophysical experiments (see Methods and Fig. 4e). This indicated that the 2Ala mutation was likely to lead to a mild increase in the affinity and stability of the heterodimer complex (mean ΔΔG 0.28±0.06 Kcal/mol), and this was supported by biophysical measurements that showed the mutant bound slightly stronger to EsxV than the wildtype (KDwt=0.6 µM; KDT2A=0.4 µM; p=0.05 using two-tailed t-test; Fig. 4f). We hypothesise that, if EsxW/V does indeed play a role in substrate selection for ESX-5 secretion, then increased stability of the heterodimer and/or the hydrophobicity of EsxW-2Ala could potentially affect the efficiency of secretion of certain PE/PPE proteins. This could thereby have downstream impacts on one or more of the known functions of ESX-5 including inflammasome activation, IL-1β secretion or escape from macrophages46, any of which could potentially work to promote transmission between hosts.

Discussion

The shorter terminal branch lengths and node-to-tip distances for local Beijing lineage Mtb (Fig 2) could be explained by (i) a slower mutation rate in the Beijing lineage, resulting in slower accumulation of SNPs over time; (ii) sampling bias, whereby new cases of active TB arising in the study population were more likely to be detected and included in the study if they were caused by Beijing lineage strains; or (iii) strain-specific transmission dynamics in the study population, whereby the average time taken to progress to active disease is shorter for Beijing lineage than for other Mtb strains. The mutation rate for Mtb has been estimated at ~0.5 SNPs per year for Lineage 4, and ~2x faster for Beijing strains47 (Lineage 1 has not been specifically measured but is assumed to be similar to Lineage 4, which is considered typical). Hence if all Mtb lineages were subject to the same transmission dynamics within the study area, we would expect to see longer terminal branch lengths for Beijing lineage isolates, whereas here we observe the opposite (significantly shorter branch lengths). Patients were recruited into the study following positive diagnosis at eight central DTUs in HCMC (map in Supplementary Figure 5). The identity of the infecting Mtb lineages was not known at the time of recruitment, and we are unaware of any factors that would bias the inclusion of Beijing lineage cases over others presenting to the clinics for treatment; hence we conclude the frequency of infections with Beijing strains, and their shorter branch lengths, reflect differences in the transmission dynamics of these strains within the study population. The lack of transfer of Mtb sublineage 1.1.1.1 between Vietnam and other geographical locations (Fig 3) may be associated with adaptation to the local host population with which it has co-evolved for centuries, similar to the ‘host-specialist’ clades recently identified within Lineage 421. In contrast, whilst the direction of transfer of Lineage 2.2.1 strains between HCMC and other geographical regions cannot be determined from our data, the frequency of transfer events and the scale of diversity amongst the HCMC strains (Fig 3) makes it unlikely that the rise of Beijing sublineage 2.2.1 in HCMC represents clonal spread of a locally established subclade. Regardless of direction, the frequency of transfer between Vietnam and diverse geographically dispersed populations supports previous contentions that the Beijing lineage is a host-generalist, capable of moving between ethnically diverse host populations24. While the mechanism remains to be elucidated, our results provide evidence that the Beijing lineage carries a variant of esxW that is under positive selection in natural Mtb populations. This is consistent with the idea that the protein is important for host interactions, potentially through substrate selection for ESX-5 secretion under certain conditions. Immunizing against EsxW has already been shown to be protective against infection29–32, and positive selection for this protein suggests vaccines including it are likely to remain effective in the long term. Taken together, our data show that the burden of TB in HCMC comprises (i) an underlying burden of disease caused by endemic Lineage 1 Mtb strains (24% of all TB cases), which disproportionately affects men and older people and is associated more with activation of long-term latent infection than short-term transmission clusters; and (ii) an additional disease burden caused by more recently introduced Lineage 2 and 4 Mtb strains (76% of all TB cases). In particular, Beijing sublineage 2.2.1 was associated with a wider demographic host range, infecting women and young people significantly more frequently than other lineages, and was associated with shorter time to active disease and frequent onward local transmission. One third of all Beijing strains were involved in transmission clusters (≤10 SNPs), and these were associated with wider geographic dispersal within HCMC. Notably, 75% of TB cases associated with transmission clusters involved the Beijing lineage, accounting for 20% of all cases included in the genomic study. Importantly, these data show that not all Mtb are equal: genetically diverse strains display distinct transmission dynamics even within a single localized host population, suggesting that more detailed understanding of lineage-specific variation in Mtb could be informative to tailor local TB control in HCMC and other settings. For example, TB contact tracing is commonly used in low-incidence high-income countries, but results vary in high-incidence low- and middle-income countries and there is a need to prioritize resources towards cases where contact tracing is most likely to yield results, which includes considering the likelihood of transmission and progression to active disease within the time frame of a contact tracing program (1-2 years)48,49. In HCMC it may thus be advantageous to direct contact tracing resources towards Beijing lineage cases, as they pose the greatest risk of onward transmission resulting in new active TB cases.

Online Methods

Bacterial isolates used in this study

Between December 2008 and July 2011, 2,091 individuals of the Vietnamese Kinh ethnic group attending the outpatient department of Pham Ngoc Thach Hospital or from 8 District Tuberculosis Units (District 1, 4, 5, 6, 8, Tan Binh, Binh Thanh and Phu Nhuan) in HCMC were recruited into a clinical study investigating predictors of failure and relapse in isoniazid resistant TB 50. The 8 TB units were chosen for inclusion from amongst the 24 servicing HCMC as they are centrally located and close to the TB reference laboratory, which was essential for the logistics of sample collection and processing (see map in Supplementary Figure 5). Inclusion criteria were: (1) 18 years or older, (2) negative HIV test, (3) provision of written informed consent, (4) smear positive pulmonary TB. Exclusion criteria were: (1) under 18 years of age, (2) HIV infected, (3) unable or unwilling to provide consent, (5) pregnancy, (6) prior history of TB antibiotic therapy, (7) will receive TB-DOTS (directly observed treatment, short-course) outside the study centres. Over the 2.5 year study period there were N=5036 new smear positive cases of TB (HIV positive and negative) at the district tuberculosis units, of which N=2091 were eligible for recruitment. Of these, N=1822 Mtb strains were isolated from the study participants. The annual incidence of pulmonary TB in HCMC is ~82,000, of which ~11,000 are in HCMC; therefore our sample represents ~6.6% of all cases in HCMC during the 2.5-year study period, and >30% of cases in the study districts.

Ethics

The study protocol was approved by the Institutional Research Board of Pham Ngoc Thach Hospital (the supervisory institution of the District TB Units in southern Vietnam), Ho Chi Minh City Health Services and the Oxford University Tropical Research Ethics Committee, UK (Oxtrec 030-07). Written informed consent was obtained from all patients.

DNA extraction and sequencing

Mtb isolates were subcultured on Lowenstein Jensen media and DNA extracted at the Oxford University Clinical Research Unit in HCMC using the cetyl trimethylammonium bromide (CTAB) extraction protocol as described previously 51. DNA was successfully obtained from N=1,728 isolates and shipped to the University of Melbourne for whole genome sequencing. DNA extracts were purified using AxyPrep™ Mag PCR Normalizer Protocol prior to library preparation. A total of N=1,655 DNA samples passed QC, were included for sequencing and subjected to library preparation using the Nextera XT protocol. Libraries were quantified using Quant-iT PicoGreen (dsDNA kit, Invitrogen), then normalised and pooled to 4 nM concentration. DNA underwent 150 bp paired end sequencing (Rapid mode v2) on the Illumina HiSeq 2500 platform (Illumina, San Diego). Sequence data was excluded for N=19 Mtb isolates that yielded less than the pre-established criteria of 10x mean read depth across the Mtb genome, as SNPs can not be reliably called below this depth. Sequence data was successfully generated for N=1,635 Mtb isolates from HCMC (representing 90% of those isolated from eligible patients in the cohort) with median three million reads per sample, providing median 99.2% coverage and 86x depth for each Mtb genome (Supplementary Table 1). To confirm the esxW codon 2 SNP, we performed PCR and capillary sequencing of the region flanking the SNP in all non-Beijing lineage strains in which the SNP was detected from Illumina data. The primers used are listed in Supplementary Table 4.

Data Availability

Mtb genome data generated from 1,635 isolates (Figure 1) has been deposited in NCBI BioProject [accession ID: PRJNA355614; see URLs section]. A total of 3,144 Mtb genomes were included in the analysis in Figure 3, comprising data from localized studies: 1,032 from Russia 3, 1,621 from Malawi 4,5, 248 Argentina 16, and 78 from China 17; as well as 106 globally dispersed Lineage 2 genomes 18 and 59 globally dispersed Lineage 1 genomes 19,20. Illumina Mtb genome sequences from various previously published studies were downloaded from the European Nucleotide Archive (see URLs section, individual accession numbers are given in (Supplementary Table 2). The H37Rv reference genome sequence (see URLs section) was used for all reference-driven analyses.

SNP analysis

Sequence reads were mapped to the H37Rv reference genome using the RedDog pipeline v0.5 (see URLs section). Briefly, Bowtie2 v2.2.3 was used for read alignment with the sensitive-local algorithm and the maximum insert length set to 2000 (via the -x parameter) 52 and variant sites (i.e. SNPs) were called using SAMTools v0.1.19 53. SNPs located in previously reported repetitive regions of the genome were excluded prior to phylogenetic analysis 54,55 (Supplementary Table 5); sites for which a definitive allele call could not be made in at least 99.5% of all isolate sequences were also excluded from the set of SNPs used for phylogenetic analysis. Two SNP alignments were compiled for analysis: one comprising the 1,635 HCMC isolates (total 73,718 SNPs), and one comprising all 4,779 isolates (including the HCMC isolates and the global collections downloaded from public data; total 133,492 SNPs).

In silico lineage and antimicrobial resistance typing

Mykrobe Predictor v0.3.6 was used to analyse raw Illumina reads generated from HCMC Mtb isolates and (a) assign each isolate to one of the seven Mtb lineages, and (b) detect known resistance associated polymorphisms 56 (summarized in Table 1, individual mutation calls are provided in Supplementary Table 1). All Mtb isolates were further assigned to sublineages by comparing SNPs identified using RedDog with those used in the haplotyping scheme defined by Coll et al. 6 (lineage assignments are in Supplementary Tables 1-2).

Phylogenomic analyses

ML phylogenetic trees were inferred using RAxML v7.7.2 57 for (a) all HCMC isolates (presented in Fig. 1); and (b) each of lineages 1, 2 and 4 using combined data from the HCMC isolates and available public data (presented in Fig. 3; see isolates list in Supplementary Tables 1-2). The trees presented are those with the highest likelihood from 5 replicate runs, constructed using the GTR model of nucleotide substitution and a Gamma model of rate heterogeneity to analyse a concatenated alignment of SNP alleles. An approximate ML tree containing all data (HCMC isolates and available public data) was inferred using FastTree v2.1.8 58. Ancestral sequence reconstruction was performed for the HCMC tree and combined tree using FastML v3.1 to infer the sequence alignment at each internal node of the ML phylogeny 59. Substitution events occurring on each branch of the tree were extracted by comparing the joint reconstruction sequences for the parent and child nodes; these data were used to identify homoplasic SNPs, and to identify lineage-specific polymorphisms as well as independent occurrences of those polymorphisms outside of the lineage of interest (data in Table 2). Terminal branch lengths reported are the number of substitutions (SNPs) mapped to each terminal branch (data in Fig. 2a) and were compared to Beijing 2.2.1 sublineage using 2-sample Kolmogorov-Smirnov tests. Metrics for genetic diversity and tree topology were calculated from the phylogenies using R. Node-to-tip distances showed similar variances within groups (standard deviations of 27-75 SNPs) and all groups were compared to Beijing 2.2.1 sublineage using 2-sample Kolmogorov-Smirnov tests. Clusters were defined as subtrees for whom the maximum patristic distance between descendant tips fell below a specified threshold (data in Fig. 2b-c). Each cluster was checked to determine whether all members of the cluster shared any of the antimicrobial resistance mutations identified by Mykrobe Predictor; clusters in which no known antimicrobial resistance mutation was conserved in all members of the cluster are reported as not explained by antimicrobial resistance (data in Supplementary Figure 2).

Phylogeography analysis

Transmission between geographical regions was assessed separately for Lineage 1, 2 and 4 trees using an implementation of stochastic mapping on phylogenies (SIMMAP) implemented in the phytools v0.5 package for R 60,61. Region of origin was treated as a discrete trait and mapped to each tree using the ARD model (which allows each region-to-region transfer rate to vary independently) with 100 replicates. The results reported (Fig. 3d) are the median values for the number of transitions to Vietnam from any other region, summarized from 100 replicate mappings for each tree.

Esx sequence analysis

Esx protein sequences were extracted from the H37Rv reference genome using Artemis, aligned using Muscle, and subjected to phylogenetic inference using PhyML v3.0 (tree in Fig. 4). DNA sequences flanking the start codon of each esx gene were extracted from the H37Rv reference genome using Artemis and aligned and visualised using JalView v2.6.1 (Fig. 4).

Macrophage infections

Mtb infection of macrophages (THP1 human cell line, 88081201, Sigma-Aldrich) has been previously described 62. To study Mtb growth in macrophages, 2.5 x 105 THP1 cells were seeded in 24-well cell culture plates and infected with Mtb isolates in triplicate at a multiplicity of infection (MOI) of 1. At 4h post infection extracellular bacteria were removed by washing. At 0, 4, 7 and 11 days post-infection, intracellular bacteria were harvested and plated on Middlebrook 7H10 agar plates. Colony forming unit counts resulted after 3 to 4 weeks incubation at 37°C. To measure genome-wide Mtb expression levels, confluent monolayers of 1.5 x 107 THP1 in 75cm2 cell culture flasks were infected with Mtb isolates at MOI 4. At 4h post infection extracellular bacteria were removed by washing. At 24h post-infection, intracellular bacteria were stabilized and collected in guanidine thiocyanate-based lysis buffer (4 M guanidine thiocyanate, 0.5% Na N-lauryl sarcosine, 25 mM sodium citrate, and 0.1 M β-mercaptoethanol). Total RNA from was then extracted using a previously published method 63.

RNAseq analysis

RNA samples were subjected to reverse transcription and the resulting cDNA was sequenced via Illumina HiSeq (100 bp paired end) at Macrogen. The resulting reads were first mapped to the H37Rv Mtb reference genome to retrieve bacterial sequences, resulting in 6-19 million bacterial sequence reads per sample. A strain-specific reference genome was prepared for each of the eight Mtb isolates, by polishing the H37Rv genome sequence with the Illumina reads previously obtained by sequencing whole genomic DNA extracted from that isolate (i.e. the data used for phylogenomic analyses) using Pilon (v1.22) with default settings. For each isolate, the RNAseq reads were mapped to the isolate’s own reference genome sequence using Bowtie2 (v2.2.3) 52. Read alignments were visualised using the BamView function in v14.0.0 64, which was also used to filter reads to include only those with mapping score ≥30 for both forward and reverse reads, and to extract read counts for regions of interest. Expression values for the ESX-5 locus were calculated as the total reads mapping to the region spanning from pe19 to eccA5 (coordinates 2029880 to 2038532 in H37Rv). Due to the very close sequence homology in the protein-coding regions of the QILSS genes, it was not possible to assess their expression by counting total reads mapping to the length of the gene. Instead, we assessed relative expression of these genes by extracting the number of reads mapping to a unique 30 bp marker region spanning the start codon of each paralog (coordinates -21 to +9 relative to the first base pair in each protein-coding sequence; see Fig. 4 for uniqueness of these regions). To facilitate comparison of QILSS gene expression across strains with different RNAseq library sizes, we normalised the read counts for each isolate by the total number of reads mapping to the ESX-5 locus from pe19 to eccA5 (coordinates 2029880 to 2038532 in H37Rv). The ESX-5 locus was chosen for this purpose as all QILSS proteins are secreted by ESX-5 and thus depend on its expression (note the locus contains the QILSS gene esxM, which was the highest expressed of all QILSS genes in all isolates). The ratios of reads mapping to each QILSS gene vs ESX-5 locus in each strain i (x/ESX-5) is shown in Figure 4. To investigate differences in QILSS gene expression between wildtype (i) and mutant (j) isolate pairs (Fig. 4), we calculated the difference in these ratios relative to the wildtype level, as follows: (x/ESX-5 – x/ESX-5)/(x/ESX-5).

Computational protein structure analysis

Structure guided approaches can provide valuable insight into the molecular mechanism of mutations and their role in diseases 20,65–74, 75. To evaluate the structural effects of the Thr2Ala mutation in EsxW, models of EsxW and EsxV were generated using Modeller v9.1976 and MacroModel (vSchrodinger 2017, New York, NY), based on the experimental structures of homologous ESAT-6-like complexes (PDB IDs: 1WA8, 2KG7, 2VS0, 3OGI, 3ZBH, 4GZR, 4IOG and 4LWS; sequence identities ranging from 20-91%). The models were then minimized using the MMF94s forcefield in Sybyl-X 2.1.1 (Certara L.P., St Louis, MO), with the final structure having more than 95% of residues in the allowed region of a Ramachandran plot. The quality of the models was confirmed with Verify3D v1 (data not shown). Model structures were examined using Pymol v1.9. The model of the EsxW-EsxV complex was built using the homologous complexes to guide protein docking. The EsxW protomer and EsxW-EsxV heterodimer model were subjected to molecular dynamics to generate a series of representative conformations using Desmond (vSchrodinger 2017, New York, NY). Sodium and chloride ions were added to reach a final concentration of 150 mM Na+ and the system was solvated. The Amber and Charmm36 force fields were applied to the system and the default Desmond minimization and equilibration procedure was followed. Simulations were kept at constant pressure (1 atom) and temperature (335 K). The structural consequences of the EsxW Thr2Ala variant were analysed to account for all the potential effects of the mutations 77. The effects of the variant upon the stability of EsxW was predicted using SDM v2 78, mCSM-Stability v1 79 and DUET v1 80. The effect of the mutations upon the affinity of EsxW to bind to EsxV, and the stability of the complex, were predicted using mCSM-PPI v1 79. The predictions were analysed and averaged across the conformational landscape. These computational approaches represent the wild-type structural and chemical environment of a residue as a graph based signature in order to determine the change upon mutation in Gibb’s Free Energy of stability or binding. Reported results are based on analysis of EsxW (wildtype and Thr2Ala mutant) and EsxV amino acid sequences encoded in the H37Rv (lineage 4) reference genome; the only difference between these dimers and those encoded in lineage 1 or 2 is at EsxV residues 20 and 23. Structural modelling with the lineage 1/2 EsxV sequence yielded the same results as the analyses conducted with the lineage 4 background (i.e. stronger binding affinity observed with the EsxW Thr2Ala mutant).

Biophysical Measurements

EsxV and wildtype (2Thr) and mutant (2Ala) EsxW were cloned into a pET18 vector with a C-terminal hexahistidine tag (again using the sequences encoded in H37Rv (lineage 4) reference genome). A di-lysine motif was added prior to the hexahistidine tag of EsxW to facilitate labelling on the extended C-terminus. Protein was expressed, purified and refolded following the method of Mahmood and colleagues81. Solution microscale thermophoresis (MST) binding studies were performed to measure the binding affinity of EsxV for both the wildtype and mutant EsxW, using standard protocols on a Monolith NT.115 (Nanotemper Technologies)82–84. The EsxW was chosen to be labelled due to the presence of a lysine in EsxV at the heterodimer binding interface. As there were no lysines present in EsxW, the dilysine motif was added to the C-terminal projection to facilitate labelling using a RED-NHS (Amine Reactive) Protein Labeling Kit (Nanotemper Technologies), which contains an NT-647 dye, as per the manufacturer’s instructions. Labelled either wildtype or 2Ala mutant EsxW was mixed with EsxV in PBS with 0.05% Tween-20. Each replicate was undertaken using a 16-step 2-fold serial dilution series. The EsxW protein concentration was chosen such that the observed fluorescence was approximately 400 units at 70% LED power. The samples were loaded into standard capillaries and heated at 40% laser power (48 mW) for 30 s, followed by 5 s cooling. The data were normalized against the baseline obtained in the absence of any EsxV, and the maximal response obtained at the highest concentration of inhibitor. The dissociation constant KD was obtained by plotting the normalized fluorescence (Fnorm) against the logarithm of the concentrations of the dilution series and resulted in a sigmoidal binding curve that could be directly fitted with a nonlinear solution of the law of mass action (Figure 4). All experiments were performed with four replicates, and the normalized fluorescence thermophoresis curves were analyzed using GraphPad v6 (GraphPad, San Diego, CA, USA).

Statistical analyses

All statistical analyses were performed in R version 3.3.3 unless otherwise stated. Associations between lineages and patient demographics (age group, sex) were assessed using Fisher’s exact test (two-sided in all cases) to assess n=1,634 HCMC Mtb genomes (i.e. excluding the Lineage 3 isolate). A linear test for trend in proportions was also used to test for an increase in frequency of Lineage 1 within increasing age groups (also n=1,634). Tests for difference in the distributions of terminal branch lengths or node-to-tip distances between lineages (Fig 2) were conducted using the Kolmogorov-Smirnov test (two-sided in all cases, n=1,634 Mtb genomes). Differences in the proportion of pairs involved in transmission clusters between lineages, or between case pairs isolated at the same DTU within ≤10 SNP or ≤20 SNP clusters vs pairs separated by >20 SNPs (Supplementary Figure 1), were tested using a two-sided test for equal proportions (n=1,634 Mtb genomes). A two-tailed t-test was used to compare KD values for mutant vs wildtype (Fig 4f), analysed using GraphPad Prism (n=8, i.e. 4 replicates each for wildtype and mutant).

URLs

NCBI BioProject accession, ID PRJNA355614, http://www.ncbi.nlm.nih.gov/bioproject/355614; The H37Rv reference genome sequence, accession NC_000962.3, https://www.ncbi.nlm.nih.gov/genbank/; RedDog pipeline v0.5, https://github.com/katholt/RedDog; European Nucleotide Archive, http://www.ebi.ac.uk/ena;

Data Availability

Mtb genome data was deposited in NCBI BioProject [ID: PRJNA355614; http://www.ncbi.nlm.nih.gov/bioproject/355614]; individual accession numbers for Mtb genomes analysed in this study are given in Supplementary Tables 1-2 (including data from previous studies).
  82 in total

1.  Molecular characterization of secretory proteins Rv3619c and Rv3620c from Mycobacterium tuberculosis H37Rv.

Authors:  Anjum Mahmood; Shubhra Srivastava; Sarita Tripathi; Mairaj Ahmed Ansari; Mohammad Owais; Ashish Arora
Journal:  FEBS J       Date:  2010-12-06       Impact factor: 5.542

Review 2.  Phylogeny to function: PE/PPE protein evolution and impact on Mycobacterium tuberculosis pathogenicity.

Authors:  S Fishbein; N van Wyk; R M Warren; S L Sampson
Journal:  Mol Microbiol       Date:  2015-03-30       Impact factor: 3.501

3.  Analysis of HGD Gene Mutations in Patients with Alkaptonuria from the United Kingdom: Identification of Novel Mutations.

Authors:  Jeannette L Usher; David B Ascher; Douglas E V Pires; Anna M Milan; Tom L Blundell; Lakshminarayan R Ranganath
Journal:  JIMD Rep       Date:  2015-02-15

4.  Comparison of mammalian cell entry operons of mycobacteria: in silico analysis and expression profiling.

Authors:  Ashwani Kumar; Amita Chandolia; Uma Chaudhry; Vani Brahmachari; Mridula Bose
Journal:  FEMS Immunol Med Microbiol       Date:  2005-02-01

5.  Mycobacterium tuberculosis invasion of macrophages: linking bacterial gene expression to environmental cues.

Authors:  Kyle H Rohde; Robert B Abramovitch; David G Russell
Journal:  Cell Host Microbe       Date:  2007-11-15       Impact factor: 21.023

6.  Intradermal immunization improves protective efficacy of a novel TB vaccine candidate.

Authors:  Susan L Baldwin; Sylvie Bertholet; Maria Kahn; Irina Zharkikh; Gregory C Ireton; Thomas S Vedvick; Steven G Reed; Rhea N Coler
Journal:  Vaccine       Date:  2009-03-26       Impact factor: 3.641

7.  The Beijing genotype is associated with young age and multidrug-resistant tuberculosis in rural Vietnam.

Authors:  T N Buu; M N Huyen; N T N Lan; H T Quy; N V Hen; M Zignol; M W Borgdorff; Frank G J Cobelens; D van Soolingen
Journal:  Int J Tuberc Lung Dis       Date:  2009-07       Impact factor: 2.373

8.  Mycobacterium tuberculosis Beijing genotype emerging in Vietnam.

Authors:  D D Anh; M W Borgdorff; L N Van; N T Lan; T van Gorkom; K Kremer; D van Soolingen
Journal:  Emerg Infect Dis       Date:  2000 May-Jun       Impact factor: 6.883

9.  Out-of-Africa migration and Neolithic coexpansion of Mycobacterium tuberculosis with modern humans.

Authors:  Iñaki Comas; Mireia Coscolla; Tao Luo; Sonia Borrell; Kathryn E Holt; Midori Kato-Maeda; Julian Parkhill; Bijaya Malla; Stefan Berg; Guy Thwaites; Dorothy Yeboah-Manu; Graham Bothamley; Jian Mei; Lanhai Wei; Stephen Bentley; Simon R Harris; Stefan Niemann; Roland Diel; Abraham Aseffa; Qian Gao; Douglas Young; Sebastien Gagneux
Journal:  Nat Genet       Date:  2013-09-01       Impact factor: 38.330

10.  The influence of host and bacterial genotype on the development of disseminated disease with Mycobacterium tuberculosis.

Authors:  Maxine Caws; Guy Thwaites; Sarah Dunstan; Thomas R Hawn; Nguyen Thi Ngoc Lan; Nguyen Thuy Thuong Thuong; Kasia Stepniewska; Mai Nguyet Thu Huyen; Nguyen Duc Bang; Tran Huu Loc; Sebastien Gagneux; Dick van Soolingen; Kristin Kremer; Marianne van der Sande; Peter Small; Phan Thi Hoang Anh; Nguyen Tran Chinh; Hoang Thi Quy; Nguyen Thi Hong Duyen; Dau Quang Tho; Nguyen T Hieu; Estee Torok; Tran Tinh Hien; Nguyen Huy Dung; Nguyen Thi Quynh Nhu; Phan Minh Duy; Nguyen van Vinh Chau; Jeremy Farrar
Journal:  PLoS Pathog       Date:  2008-03-28       Impact factor: 6.823

View more
  70 in total

1.  Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning.

Authors:  Arun Prasad Pandurangan; Tom L Blundell
Journal:  Protein Sci       Date:  2019-11-25       Impact factor: 6.725

2.  mmCSM-AB: guiding rational antibody engineering through multiple point mutations.

Authors:  Yoochan Myung; Douglas E V Pires; David B Ascher
Journal:  Nucleic Acids Res       Date:  2020-07-02       Impact factor: 16.971

3.  ThermoMutDB: a thermodynamic database for missense mutations.

Authors:  Joicymara S Xavier; Thanh-Binh Nguyen; Malancha Karmarkar; Stephanie Portelli; Pâmela M Rezende; João P L Velloso; David B Ascher; Douglas E V Pires
Journal:  Nucleic Acids Res       Date:  2021-01-08       Impact factor: 16.971

4.  Model-based integration of genomics and metabolomics reveals SNP functionality in Mycobacterium tuberculosis.

Authors:  Ove Øyås; Sonia Borrell; Andrej Trauner; Michael Zimmermann; Julia Feldmann; Thomas Liphardt; Sebastien Gagneux; Jörg Stelling; Uwe Sauer; Mattia Zampieri
Journal:  Proc Natl Acad Sci U S A       Date:  2020-03-30       Impact factor: 11.205

5.  Local adaptation of Mycobacterium tuberculosis on the Tibetan Plateau.

Authors:  Qingyun Liu; Haican Liu; Li Shi; Mingyu Gan; Xiuqin Zhao; Liang-Dong Lyu; Howard E Takiff; Kanglin Wan; Qian Gao
Journal:  Proc Natl Acad Sci U S A       Date:  2021-04-27       Impact factor: 11.205

6.  Host-pathogen genetic interactions underlie tuberculosis susceptibility in genetically diverse mice.

Authors:  Clare M Smith; Richard E Baker; Megan K Proulx; Bibhuti B Mishra; Jarukit E Long; Sae Woong Park; Ha-Na Lee; Michael C Kiritsy; Michelle M Bellerose; Andrew J Olive; Kenan C Murphy; Kadamba Papavinasasundaram; Frederick J Boehm; Charlotte J Reames; Rachel K Meade; Brea K Hampton; Colton L Linnertz; Ginger D Shaw; Pablo Hock; Timothy A Bell; Sabine Ehrt; Dirk Schnappinger; Fernando Pardo-Manuel de Villena; Martin T Ferris; Thomas R Ioerger; Christopher M Sassetti
Journal:  Elife       Date:  2022-02-03       Impact factor: 8.140

7.  Tuberculosis endotypes to guide stratified host-directed therapy.

Authors:  Andrew R DiNardo; Tomoki Nishiguchi; Sandra L Grimm; Larry S Schlesinger; Edward A Graviss; Jeffrey D Cirillo; Cristian Coarfa; Anna M Mandalakas; Jan Heyckendorf; Stefan H E Kaufmann; Christoph Lange; Mihai G Netea; Reinout Van Crevel
Journal:  Med (N Y)       Date:  2021-01-16

8.  Toward a Phage Cocktail for Tuberculosis: Susceptibility and Tuberculocidal Action of Mycobacteriophages against Diverse Mycobacterium tuberculosis Strains.

Authors:  Carlos A Guerrero-Bustamante; Rebekah M Dedrick; Rebecca A Garlena; Daniel A Russell; Graham F Hatfull
Journal:  mBio       Date:  2021-05-20       Impact factor: 7.867

9.  China's tuberculosis epidemic stems from historical expansion of four strains of Mycobacterium tuberculosis.

Authors:  Qingyun Liu; Aijing Ma; Lanhai Wei; Yu Pang; Beibei Wu; Tao Luo; Yang Zhou; Hong-Xiang Zheng; Qi Jiang; Mingyu Gan; Tianyu Zuo; Mei Liu; Chongguang Yang; Li Jin; Iñaki Comas; Sebastien Gagneux; Yanlin Zhao; Caitlin S Pepperell; Qian Gao
Journal:  Nat Ecol Evol       Date:  2018-11-05       Impact factor: 15.460

10.  Phenotypic and genotypic features of the Mycobacterium tuberculosis lineage 1 subgroup in central Vietnam.

Authors:  Nguyen Thi Le Hang; Minako Hijikata; Shinji Maeda; Akiko Miyabayashi; Keiko Wakabayashi; Shintaro Seto; Nguyen Thi Kieu Diem; Nguyen Thi Thanh Yen; Le Van Duc; Pham Huu Thuong; Hoang Van Huan; Nguyen Phuong Hoang; Satoshi Mitarai; Naoto Keicho; Seiya Kato
Journal:  Sci Rep       Date:  2021-06-30       Impact factor: 4.379

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.