
Mycobacterium tuberculosis lineage 4 comprises globally distributed and geographically restricted sublineages.

David Stucki1,2, Daniela Brites1,2, Leïla Jeljeli3,4, Mireia Coscolla1,2, Qingyun Liu5, Andrej Trauner1,2, Lukas Fenner1,2,6, Liliana Rutaihwa1,2, Sonia Borrell1,2, Tao Luo7, Qian Gao5, Midori Kato-Maeda8, Marie Ballif1,2,6, Matthias Egger6, Rita Macedo9, Helmi Mardassi4, Milagros Moreno10, Griselda Tudo Vilanova11, Janet Fyfe12, Maria Globan12, Jackson Thomas13, Frances Jamieson14, Jennifer L Guthrie14, Adwoa Asante-Poku15, Dorothy Yeboah-Manu15, Eddie Wampande16, Willy Ssengooba16,17, Moses Joloba16, W Henry Boom18, Indira Basu19, James Bower19, Margarida Saraiva20,21, Sidra E G Vaconcellos22, Philip Suffys22, Anastasia Koch23, Robert Wilkinson23,24,25, Linda Gail-Bekker23, Bijaya Malla1,2, Serej D Ley1,2,26, Hans-Peter Beck1,2, Bouke C de Jong27, Kadri Toit28, Elisabeth Sanchez-Padilla29, Maryline Bonnet29, Ana Gil-Brusola30, Matthias Frank31, Veronique N Penlap Beng32, Kathleen Eisenach33, Issam Alani34, Perpetual Wangui Ndung'u35, Gunturu Revathi36, Florian Gehre27,37, Suriya Akter27, Francine Ntoumi31,38, Lynsey Stewart-Isherwood39, Nyanda E Ntinginya40, Andrea Rachow41, Michael Hoelscher41, Daniela Maria Cirillo42, Girts Skenders43, Sven Hoffner44, Daiva Bakonyte45, Petras Stakenas45, Roland Diel46, Valeriu Crudu47, Olga Moldovan48, Sahal Al-Hajoj49, Larissa Otero50, Francesca Barletta50, E Jane Carter51,52, Lameck Diero52, Philip Supply53, Iñaki Comas54,55, Stefan Niemann3,56, Sebastien Gagneux1,2.   

Abstract

Generalist and specialist species differ in the breadth of their ecological niches. Little is known about the niche width of obligate human pathogens. Here we analyzed a global collection of Mycobacterium tuberculosis lineage 4 clinical isolates, the most geographically widespread cause of human tuberculosis. We show that lineage 4 comprises globally distributed and geographically restricted sublineages, suggesting a distinction between generalists and specialists. Population genomic analyses showed that, whereas the majority of human T cell epitopes were conserved in all sublineages, the proportion of variable epitopes was higher in generalists. Our data further support a European origin for the most common generalist sublineage. Hence, the global success of lineage 4 reflects distinct strategies adopted by different sublineages and the influence of human migration.

Year:  2016        PMID: 27798628      PMCID: PMC5238942          DOI: 10.1038/ng.3704

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


Introduction

Ecologists distinguish between generalists and specialists depending on the width of an organism’s ecological niche1. In infectious diseases, the niche of a given pathogen is determined by host range and the agent’s capacity to survive in the environment2. Some microbes are obligate pathogens restricted to one or several host species3,4, others are mainly free-living and only occasionally pathogenic5. Little is known on the niche width of obligate human pathogens3. The causative agent of tuberculosis, known as the Mycobacterium tuberculosis complex (MTBC), is an obligate pathogen that comprises seven phylogenetic lineages adapted to humans and two lineages adapted to various wild and domestic animal species6. Some human-adapted MTBC lineages have received particular attention. For example, Lineage 2, which includes the Beijing family of strains, has repeatedly been associated with drug resistance7. Lineage 2 likely originated in East Asia8,9, and has recently been expanding in some parts of the world10. By contrast, Lineages 5 and 6 (also known as Mycobacterium africanum West Africa I and II), and Lineage 7 are largely restricted to West- and East Africa, respectively11,12. The observation that the human-adapted MTBC population is phylogeographically structured has led to the hypothesis that the different lineages might be adapted to particular human populations13. Support for this notion comes from the observation that sympatric host-pathogen associations in human tuberculosis remain stable over time, even in metropolitan settings where host and pathogen populations intermix14–17. Moreover, sympatric host-pathogen associations are perturbed in HIV coinfected patients14, indicating that in the context of reduced host immune-competence, the different lineages can successfully infect and cause disease irrespective of the host genetic background. 
Contrary to the other main human-adapted MTBC lineages, Lineage 4 occurs at significant frequencies on all inhabited continents18. It is hence geographically the most widespread cause of human tuberculosis19. Yet, the reasons for this global success are unknown. Lineage 4 has been shown to exhibit enhanced virulence in macrophage and animal models of infection, albeit with much variation between different Lineage 4 strains19,20. Moreover, molecular epidemiological studies have reported considerable variation in the transmission success of different Lineage 4 strains in clinical settings19. These observations suggest that Lineage 4 is genetically and phenotypically diverse, and this diversity might determine the epidemiology of different Lineage 4 subtypes in different parts of the world. The purpose of this study is to get a better understanding of the global population structure of Lineage 4 and the evolutionary forces that have contributed to the success of Lineage 4 across the world. For this we combined large-scale single nucleotide polymorphism (SNP)-typing with targeted whole-genome sequencing of a global collection of Lineage 4 clinical isolates.

Results

MTBC Lineage 4 comprises 10 separate sublineages

We first analyzed 72 published genome sequences of Lineage 4 clinical strains from global sources21,22. These strains harbored 9,455 variable single nucleotide positions which divided Lineage 4 into 10 sublineages (L4.1.1 to L4.10 in Fig. 1a and Supplementary Fig. 1). We used four complementary approaches to validate these sublineages. First, we performed a principal component analysis, which showed a clear separation of seven sublineages (L4.1.1, L4.1.3, L4.1.2, L4.2, L4.3, L4.4, L4.10; Supplementary Fig. 2). Sublineages L4.5, L4.6.1/Uganda and L4.6.2/Cameroon were less clearly separated. Second, we found that the mean pairwise genetic distance between pairs within the sublineages was significantly lower than between sublineages (276 SNPs versus 602 SNPs, Wilcoxon rank sum test, p < 0.0001, Supplementary Fig. 3). Overall, the mean pairwise SNP distance between any two strain pairs was 565 SNPs. Third, we calculated pairwise fixation indexes (FST) to evaluate the degree of population differentiation. All FST values between the sublineages were larger than 0.33 (Supplementary Table 1), indicating that these populations are separated. Fourth, we mapped previously reported phylogenetic markers onto our genome-based phylogenetic tree15,23–28. Most of these markers were congruent with our sublineage definition (Supplementary Fig. 1).
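The within- versus between-sublineage comparison described above can be illustrated with a minimal sketch. The strain names, sublineage labels and allele strings below are invented toy data, not values from this study, and the representation of each genome as a string of alleles at variable positions is an assumption made for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy data: strain -> (sublineage label, alleles at variable positions).
STRAINS = {
    "s1": ("L4.3", "ACGTACGT"),
    "s2": ("L4.3", "ACGTACGA"),
    "s3": ("L4.10", "TCGAACCT"),
    "s4": ("L4.10", "TCGAACCA"),
}


def snp_distance(a, b):
    """Number of positions at which two equal-length allele strings differ."""
    if len(a) != len(b):
        raise ValueError("allele strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def within_between(strains):
    """Return (mean within-sublineage, mean between-sublineage) SNP distance."""
    within, between = [], []
    for (_, (lin1, seq1)), (_, (lin2, seq2)) in combinations(strains.items(), 2):
        d = snp_distance(seq1, seq2)
        # Pairs from the same sublineage go to 'within', all others to 'between'.
        (within if lin1 == lin2 else between).append(d)
    return mean(within), mean(between)


mean_within, mean_between = within_between(STRAINS)
print(mean_within, mean_between)
```

In a real analysis the two distance lists would then be compared with a rank-based test, as reported above; the sketch only shows how the pairs are partitioned.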
Figure 1

Definition and global frequency of Lineage 4 sublineages.

(a) We defined 10 sublineages based on the analysis of 72 MTBC Lineage 4 genome sequences published previously21,22. Sublineages were labeled according to Coll et al.27 (whenever possible) and previous designations based on spoligotyping (see Supplementary Fig. 1). Black triangles indicate sublineages identified as specialists, black circles indicate generalists. Filled shapes indicate sublineages, for which we performed deep genomic analyses. (b) Global proportion of each sublineage. A total of 3,366 MTBC Lineage 4 isolates were screened for sublineage-specific SNPs. L4.3/LAM was the most frequent sublineage globally.

Sublineages differ in their phylogeographic distribution

Because the MTBC exhibits limited sequence variation and no significant ongoing horizontal gene exchange, SNP homoplasies are extremely rare, making SNPs ideal phylogenetic markers29. We further scrutinized the 9,455 variable positions among the 72 MTBC Lineage 4 genomes, and found between 51 and 277 positions specific to each of the 10 sublineages. All of these sublineage-specific positions were mutually exclusive, i.e. they showed no homoplasy. We selected a subset of these sublineage-specific SNPs and used these to screen a global collection of 3,366 Lineage 4 clinical isolates from 100 countries using various genotyping platforms30–35. First, we developed a novel sublineage-specific multiplexed SNP-typing assay using the Luminex platform as previously reported36, and used that method to screen 2,001 isolates (Supplementary Table 2). In addition, we screened 741 isolates using the Sequenom MassARRAY platform (Supplementary Table 3)37, and 624 isolates by PCR and Sanger sequencing (Supplementary Table 4). Overall, 3,181/3,366 (94.5%) Lineage 4 isolates were successfully assigned to a sublineage (Supplementary Table 5). An additional 92/3,366 (2.7%) isolates harbored the reference allele for all sublineages, indicating that they belonged to one or several additional and unknown sublineages. For the remaining 93/3,366 (2.8%) isolates, no classification could be obtained for various technical reasons. Among the 3,181 Lineage 4 isolates assigned to one of the 10 sublineages, L4.3/LAM was the most frequent, accounting for 20.3%, followed by L4.6.1/Uganda (14.2%), L4.10/PGG3 (11.9%), L4.4 (10.1%), and L4.1.2/Haarlem (9.9%) (Fig. 1b). Mapping the proportion of each sublineage by country showed that the sublineages differed in their geographical distribution (Fig. 2). Specifically, L4.1.2/Haarlem, L4.3/LAM and L4.10/PGG3 occurred globally (Fig. 3a, Supplementary Fig. 4). 
By contrast, L4.1.3/Ghana, L4.5, L4.6.1/Uganda and L4.6.2/Cameroon occurred at high frequencies in specific regions of Africa or Asia, and were almost completely absent from Europe and the Americas (Fig. 3b). The geographical spread of the three remaining sublineages was intermediate (Fig. 2, Supplementary Figs. 4 and 5). L4.1.1/X mainly occurred in the Americas and in lower proportions in few countries of Southern Africa, Asia and Europe. L4.2 and L4.4 occurred in high proportions among isolates from particular countries in Asia and Africa, but were largely absent from the Americas (Fig. 2, Supplementary Figs. 4 and 5). A similar pattern of sublineage distribution was observed when normalizing by TB prevalence38 and country surface area (Supplementary Fig. 6).
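The classification logic described in this section (one derived sublineage-specific allele assigns an isolate, the reference allele at every marker points to an unsampled sublineage, and failed calls leave the isolate unclassified) can be sketched as follows. The marker positions and alleles are invented placeholders, not the SNPs used in the assays, and the handling of conflicting markers is an assumption.

```python
from collections import Counter

# Hypothetical marker panel: sublineage -> (genome position, derived allele).
# These positions and alleles are invented for illustration only.
MARKERS = {
    "L4.3/LAM": (1000, "T"),
    "L4.6.1/Uganda": (2000, "A"),
    "L4.10/PGG3": (3000, "G"),
}


def assign_sublineage(calls):
    """Classify one isolate from its allele calls (position -> base, None if failed)."""
    derived = []
    for sublineage, (position, allele) in MARKERS.items():
        call = calls.get(position)
        if call is None:
            return "unclassified"  # missing call: technical failure
        if call == allele:
            derived.append(sublineage)
    if len(derived) == 1:
        return derived[0]
    if not derived:
        return "unknown sublineage"  # reference allele at every marker
    return "unclassified"  # assumption: conflicting markers are not resolved


def proportions(labels):
    """Fraction of isolates carrying each label."""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


# Hypothetical isolates.
ISOLATES = [
    {1000: "T", 2000: "G", 3000: "A"},
    {1000: "C", 2000: "G", 3000: "A"},
    {1000: "C", 2000: None, 3000: "A"},
    {1000: "T", 2000: "G", 3000: "A"},
]
labels = [assign_sublineage(calls) for calls in ISOLATES]
print(proportions(labels))
```

Per-country proportions, as mapped in Fig. 2, would follow from applying `proportions` to the labels grouped by country of birth.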
Figure 2

Global distribution of Lineage 4 sublineages.

Pie charts showing proportions of the 10 Lineage 4 sublineages among all MTBC Lineage 4 isolates in each country. Circle sizes correspond to the number of isolates analyzed per country. A total of 3,366 MTBC Lineage 4 isolates were included. Color codes are as in Fig. 1.

Figure 3

Country-specific proportions of sublineages reveal generalists and specialists.

(a) The generalist sublineages L4.1.2/Haarlem, L4.3/LAM and L4.10/PGG3 were found globally at high proportions. (b) The locally restricted specialist sublineages L4.1.3/Ghana, L4.5, L4.6.1/Uganda and L4.6.2/Cameroon occurred at high frequencies in only a few countries and were restricted to certain geographical regions. Intensity of red indicates proportion of the sublineage among all Lineage 4 isolates in each country. Countries with fewer than three isolates in total are shown as “no data” and are filled white. A total of 3,366 Lineage 4 isolates were included in this analysis. The color scale for all sublineages is as indicated in Panel a, except for sublineage L4.1.3/Ghana (separate scale shown).

Populations that occupy a broader variety of environments may exhibit a wider geographic distribution. Humans differ in their susceptibility to TB39, and human genetic diversity may thus determine the width of the ecological niche accessible to different MTBC genotypes40,41. The geographical restriction of particular MTBC genotypes might reflect local adaptation of these pathogen variants to the corresponding human host populations13,15. Such a sympatric host-pathogen association in human TB is compatible with the “local” sublineages observed here, and supports the notion that these sublineages represent ecological specialists. By contrast, the three “global” sublineages could represent generalists capable of infecting and causing disease in many different human populations. This notion was supported by the fact that the three generalist sublineages L4.1.2/Haarlem, L4.3/LAM and L4.10/PGG3 were observed in 49, 47 and 47 countries, respectively, whereas the specialist sublineages L4.1.3/Ghana, L4.5, L4.6.1/Uganda and L4.6.2/Cameroon were found in only a few countries each (3, 7, 9 and 10 countries, respectively). The country frequencies of the remaining three sublineages L4.1.1/X, L4.2/Ural and L4.4 were intermediate (27, 14 and 26 countries, respectively) (Supplementary Fig. 4). The different geographical distribution of generalist and specialist sublineages could be due to intrinsic biological factors, extrinsic factors such as human migration, or both. Hence we next performed various population genomic analyses to explore the genomic characteristics of these Lineage 4 generalists and specialists, as well as the role of human migration in the global spread of the most successful generalist sublineage.

Genomic features of generalist and specialist sublineages

The geographic and niche distribution of populations can be correlated with their genetic variability or with that of their ancestors. One possible reason for the restricted host range of the specialist sublineages might be historical, i.e. the ancestor populations of the extant specialist populations may have harbored more deleterious mutations, restricting their host range. To assess this possibility, we characterized the mutations which contributed to the divergence of the different sublineages; these mutations are variants that have become fixed during the evolution of these sublineages. We focused on the substitutions that occurred in all isolates of any of the generalist sublineages (L4.1.2/Haarlem, L4.3/LAM and L4.10/PGG3) and compared them to the substitutions that occurred in all isolates of any of the three specialist sublineages (L4.6.1/Uganda, L4.5 and L4.6.2/Cameroon). We identified nonsynonymous SNPs predicted to have a functional effect using SIFT42. We found that overall, the specialist and generalist sublineages showed a similar proportion of fixed substitutions (among all substitutions) predicted to impact gene function (23.0% versus 20.6%, χ2 test p=0.62; Supplementary Table 6), suggesting that the mutational load of the ancestor populations did not differ significantly between generalists and specialists. Small populations with restricted geographic ranges are expected to have reduced levels of genetic diversity43. Thus, one possible restriction to niche expansion by specialist sublineages could be that these sublineages have low genetic diversity precluding adaptation to new hosts. We characterized the genetic diversity associated with the process of diversification in Lineage 4 generalists and specialists, focusing on L4.3/LAM and L4.6.1/Uganda, globally the most frequent generalist and specialist sublineages of Lineage 4 in our dataset, respectively (Fig. 1b). 
We analyzed the whole-genome sequences of 293 L4.3/LAM clinical strains representing the global diversity of this sublineage. These were selected from a global collection of 2,132 L4.3/LAM isolates based on standard genotyping data (Supplementary Table 7, Supplementary Figs. 7-9). For L4.6.1/Uganda, we analyzed whole-genome sequences of 203 clinical strains from Uganda and several neighboring countries (Supplementary Table 7, Supplementary Figs. 10 and 11). This sample included 28 L4.6.1/Uganda strains identified through screening of 13,067 publicly available MTBC whole genome sequences (see Online Methods)44–56. Comparing the genetic diversity between these two bacterial populations showed that L4.3/LAM was significantly more diverse than L4.6.1/Uganda (mean pairwise distance of 395 SNPs versus 215 SNPs, respectively; Wilcoxon rank sum test p<0.0001), consistent with the expected difference between generalists and specialists43.
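For readers unfamiliar with the rank-based comparison used above, the following is a minimal sketch of a Wilcoxon rank sum (Mann-Whitney) test using a normal approximation with tie correction and no continuity correction. It is not the software used in this study, and the distance values are invented; exact p-values from a statistics package may differ for small samples.

```python
import math


def rank_sum_test(x, y):
    """Wilcoxon rank sum test with a normal approximation.

    Returns (W, two-sided p), where W is the rank sum of x minus n_x(n_x+1)/2.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return w, 1.0
    z = (w - mu) / math.sqrt(var)
    return w, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical pairwise SNP distances for two sublineages.
generalist = [395, 410, 380, 402, 390]
specialist = [215, 220, 205, 230, 210]
w_stat, p_value = rank_sum_test(generalist, specialist)
print(w_stat, p_value)
```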

Antigenic diversity in Lineage 4 sublineages

We previously reported that in the human-adapted MTBC, experimentally confirmed human T cell epitopes were conserved57,58. This is unlike many other pathogens where genomic regions encoding antigens tend to be diverse as a result of antigenic variation linked to immune escape59. When we assessed the evolutionary conservation of 1,226 experimentally confirmed human T cell epitopes60 in L4.6.1/Uganda by calculating their dN/dS, we found that these epitopes were significantly more conserved than the non-epitope regions of the corresponding T cell antigens (Wilcoxon rank sum test, p<0.0001, Fig. 4). This result was consistent with our previous findings for the MTBC overall57,58. However, for L4.3/LAM, we saw the opposite, i.e. the T cell epitopes showed a significantly higher dN/dS than the non-epitope regions (Wilcoxon rank sum test, p<0.0001, Fig. 4). To test whether a high dN/dS in T cell epitopes is characteristic of the generalist sublineages, we analyzed the genomes of 228 L4.1.2/Haarlem strains and 301 L4.10/PGG3 strains identified by screening of 13,067 publicly available genomes (Supplementary Table 7, Supplementary Figs. 12 and 13). We found that in contrast to L4.3/LAM, the epitope regions in these two generalist sublineages were more conserved than the corresponding non-epitope regions, i.e. similar to L4.6.1/Uganda and the MTBC overall57,58 (Fig. 4). Consistent with previous reports57,58,61, essential genes62 were significantly more conserved than nonessential genes in all sublineages, except L4.3/LAM in which the dN/dS of essential and nonessential genes were not significantly different (Fig. 4).
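To make the dN/dS statistic concrete, the sketch below computes a pairwise ratio for two aligned coding sequences using a simplified counting approach in the spirit of Nei and Gojobori with a Jukes-Cantor correction. It is an illustration only and not necessarily the procedure used in this study: codons that differ at more than one position are skipped rather than averaged over mutational pathways, changes to stop codons are counted as nonsynonymous, and the input sequences are invented.

```python
import math
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Expected number of synonymous sites in one codon (between 0 and 3)."""
    sites = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                sites += 1 / 3
    return sites


def pairwise_counts(seq1, seq2):
    """Return (syn_diffs, nonsyn_diffs, syn_sites, nonsyn_sites) for two aligned,
    gap-free, uppercase coding sequences of equal length."""
    syn = nonsyn = n_codons = 0
    s_sites = 0.0
    for i in range(0, len(seq1) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        n_diff = sum(a != b for a, b in zip(c1, c2))
        if n_diff > 1:
            continue  # simplification: ignore codons with multiple changes
        s_sites += (synonymous_sites(c1) + synonymous_sites(c2)) / 2
        n_codons += 1
        if n_diff == 1:
            if CODON_TABLE[c1] == CODON_TABLE[c2]:
                syn += 1
            else:
                nonsyn += 1
    return syn, nonsyn, s_sites, 3 * n_codons - s_sites


def dn_ds(seq1, seq2):
    """Pairwise dN/dS, or None when the ratio is undefined."""
    syn, nonsyn, s_sites, n_sites = pairwise_counts(seq1, seq2)
    if syn == 0 or s_sites == 0 or n_sites == 0:
        return None
    p_s, p_n = syn / s_sites, nonsyn / n_sites
    if p_s >= 0.75 or p_n >= 0.75:
        return None  # Jukes-Cantor correction is undefined
    d_s = -0.75 * math.log(1 - 4 * p_s / 3)
    d_n = -0.75 * math.log(1 - 4 * p_n / 3)
    return d_n / d_s


# Hypothetical sequences: one synonymous (TTT>TTC) and one nonsynonymous (GGG>GAG) change.
seq_a = "TTTGGG" + "GCT" * 4
seq_b = "TTCGAG" + "GCT" * 4
print(pairwise_counts(seq_a, seq_b), dn_ds(seq_a, seq_b))
```

A ratio below 1, as in this toy pair, indicates that nonsynonymous changes are rarer per site than synonymous ones. Note that the ratio is undefined when a pair shows no synonymous differences, which happens often in short regions with little variation.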
Figure 4

Pair-wise ratios of rates of nonsynonymous to synonymous substitutions (dN/dS) in generalist and specialist sublineages for different gene categories.

Abbreviations: Epi – experimentally confirmed human T cell epitopes; nEpi – non-epitope regions of T cell antigens, both obtained from the Immune Epitope Database60; Ess – essential genes62; nEss – non-essential genes62. Wilcoxon rank sum tests: L4.6.1/Uganda (N=203) Epi vs nEpi, W=4952, p<0.001; L4.6.1/Uganda (N=203) Ess vs nEss, W=1415, p<0.001; L4.3/LAM (N=293) Epi vs nEpi, W=74540, p<0.001; L4.3/LAM (N=293) Ess vs nEss, W=45067, p=0.29; L4.1.2/Haarlem (N=228) Epi vs nEpi, W=6561, p<0.001; L4.1.2/Haarlem (N=228) Ess vs nEss, W=13369, p<0.001; L4.10/PGG3 (N=301) Epi vs nEpi, W=27335, p<0.001; L4.10/PGG3 (N=301) Ess vs nEss, W=3103, p<0.001.

One of the limitations of our dN/dS analyses was that despite a large number of genomes analyzed, within individual sublineages, the mean number of pair-wise differences in regions encoding T cell epitopes was very small (Supplementary Table 8), limiting the accuracy of dN/dS inferences for epitopes. Hence, we assessed T cell epitope diversity by comparing the number of epitopes affected by nonsynonymous variants in the different sublineages (Fig. 5). We found that in all four sublineages, the majority of epitopes were completely conserved, consistent with our previous findings for the MTBC overall57,58. However, each of the three generalist sublineages showed significantly more epitopes harboring at least one amino acid change when compared to the specialist sublineage L4.6.1/Uganda (Fig. 5, χ2 tests p<0.0001 for all comparisons). It is possible that this comparatively higher epitope diversity in generalists might reflect interactions with broader host populations.
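The comparison of epitope counts between a generalist and a specialist sublineage amounts to a test on a 2×2 table. The sketch below implements a Pearson χ2 test with one degree of freedom and no continuity correction. Only the total of 1,226 epitopes is taken from the text; the numbers of variable epitopes are invented, and the study's own test settings may differ.

```python
import math

N_EPITOPES = 1226  # total number of epitopes analyzed (from the text)


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 d.f., no continuity correction) for [[a, b], [c, d]].

    Returns (chi2, p).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 d.f.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts of epitopes with at least one amino acid change.
variable_generalist = 80
variable_specialist = 30
chi2, p = chi2_2x2(
    variable_generalist, N_EPITOPES - variable_generalist,
    variable_specialist, N_EPITOPES - variable_specialist,
)
print(chi2, p)
```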
Figure 5

Frequency distribution of the number of epitopes with nonsynonymous variants in generalist and specialist sublineages.

A total of 1,226 T cell epitopes were included in the analysis. The number above each bar corresponds to epitope counts. Generalist sublineages: L4.3/LAM, L4.1.2/Haarlem and L4.10/PGG3. Specialist sublineage: L4.6.1/Uganda. Tests: L4.6.1/Uganda vs L4.3/LAM χ2=27.04, p<0.001; L4.6.1/Uganda vs L4.1.2/Haarlem χ2=15.75, p<0.001; L4.6.1/Uganda vs L4.10/PGG3 χ2=68.24, p<0.001.

The epitopes interrogated in our analysis were encoded by a total of 304 antigens. The number of antigens containing nonsynonymous variation in epitopes was 60, 26, 46 and 48 antigens in L4.3/LAM, L4.6.1/Uganda, L4.1.2/Haarlem and L4.10/PGG3, respectively. When excluding nonsynonymous mutations present only in one strain in each sublineage (which likely represent transient mutations), the number of antigens dropped to 20, 11, 12 and 24 in L4.3/LAM, L4.6.1/Uganda, L4.1.2/Haarlem and L4.10/PGG3, respectively (Supplementary Table 9). Interestingly, 10 of those antigens exhibited independent parallel, nonsynonymous variation in epitope regions in the different sublineages (Supplementary Table 9). Of those antigens, Pst1, an adhesin promoting phagocytosis63, and FbpB, the precursor of the secreted antigen 85-B64, had already been pointed out as encoding diverse epitopes by a previous study, in which several MTBC lineages were compared57 (Supplementary Table 9). Other antigens exhibiting parallel nonsynonymous changes in different sublineages include known immunodominant, secreted antigens such as Mpb6464, MPT3265 and MPT7066 and three latency-associated antigens (Rv1733c67, Rv3034c, Rv262868, Supplementary Table 9).

Origin and global spread of the L4.3/LAM sublineage

Irrespective of the putative biological differences between the Lineage 4 sublineages, human migration could also have led to variation in the global distribution of MTBC lineages. Because the most successful sublineage of Lineage 4 was also frequently found in Europe, we hypothesized that the global success of L4.3/LAM was driven by European migration and colonization. To test this hypothesis, we first determined the most likely geographical origin of the most recent common ancestor of L4.3/LAM using two methods for reconstruction of ancestral states69. By both methods, Europe was predicted as the most likely place of origin of L4.3/LAM (100% and 99.6%, respectively) (Fig. 6a, Supplementary Fig. 14). Moreover, the ancestral geographical regions reconstructed for subsequent nodes in the phylogeny were consistent with the spread of L4.3/LAM from Europe to other parts of the world (Fig. 6a). Finally, we found that L4.3/LAM strains from Europe were genetically more diverse than L4.3/LAM strains from other continents, which further supports a European origin for this sublineage (Fig. 6b, Kruskal-Wallis test p<0.0001; Fig. 6c).
Figure 6

Genome-based phylogeny and diversity by continent of 293 strains of the L4.3/LAM sublineage.

(a) Bayesian phylogeny with label colors indicating continent of strain origin: blue, Europe/Mediterranean; red, Sub-Saharan Africa; yellow, America; pink, Asia. Numbers on nodes indicate posterior probabilities. Pie charts indicate reconstructed ancestral geographical regions of the internal nodes. The hypothetical L4.3/LAM-ancestor is labeled and a European origin for this ancestor was supported using a Bayesian method (shown) and a Maximum Parsimony method (Supplementary Fig. 14). The pie colors correspond to the colors of the taxa labels. (b) Boxplot of pairwise genetic distances (number of polymorphisms) of L4.3/LAM strains by continent (p-values from Wilcoxon rank sum test). (c) Nucleotide diversity per site (π), measured by continent. Error bars indicate 95% confidence intervals. MTBC isolates from countries of the continent group “Oceania” (UN category; including Australia and New Zealand, Melanesia, Micronesia and Polynesia) were excluded from the genetic diversity analysis in panels b and c due to the low number of samples.
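The idea behind parsimony-based reconstruction of ancestral geographical regions can be illustrated with the bottom-up pass of Fitch's algorithm on a small rooted binary tree. This is a generic textbook sketch, not the software or settings used for Supplementary Fig. 14; the tree topology, strain names and regions are invented.

```python
def fitch(node, leaf_states):
    """Bottom-up pass of Fitch parsimony on a rooted binary tree.

    A node is either a leaf name (str) or a pair (left, right).
    Returns (candidate state set at the node, minimum number of state changes).
    """
    if isinstance(node, str):
        return {leaf_states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change is needed.
        return shared, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical strains and regions of origin.
LEAF_REGION = {
    "e1": "Europe", "e2": "Europe", "e3": "Europe", "e4": "Europe",
    "am1": "America", "af1": "Africa",
}
# Hypothetical topology written as nested pairs.
TREE = ((("e1", "e2"), ("e3", "am1")), ("af1", "e4"))

root_states, n_changes = fitch(TREE, LEAF_REGION)
print(root_states, n_changes)
```

In this toy tree the root is reconstructed as European, with two inferred dispersal events to other regions. A full reconstruction would add a top-down pass to resolve ambiguous internal nodes.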

Discussion

Our findings show that the global success of Lineage 4 is a consequence of both biological and social phenomena. Specifically, we found that Lineage 4 is genetically diverse, and that this diversity is phylogeographically structured. The phylogeography of Lineage 4 supports an ecological distinction between globally represented generalists and geographically restricted specialists. Our in-depth population genomic analyses of one specialist and three generalist sublineages showed that even though the majority of human T cell epitopes were completely conserved in all four sublineages, the proportion of epitopes with amino acid substitutions was significantly higher in generalists. Finally, we demonstrate a likely European origin for L4.3/LAM, the most frequent and globally widespread generalist sublineage of Lineage 4. Our observation that Lineage 4 is phylogenetically diverse is in line with previous findings27,70, and highlights the importance of large and globally representative samples when studying the population structure of human pathogens. We found that Lineage 4 comprises at least 10 sublineages, which differ in their geographical distribution. The phylogeography of these sublineages is consistent with an ecological separation into specialists and generalists, with some sublineages showing an intermediate geographical distribution. Our phylogenetic analyses also showed that the three generalist sublineages identified within Lineage 4 were not monophyletic (Fig. 1a), indicating that generalism was acquired multiple times independently during the evolution of Lineage 4. Specialist sublineages also emerged multiple times, which is consistent with local adaptation to separate human populations13. One could argue that the specialist sublineages are geographically restricted because they diverged later than the generalist sublineages during the evolution of Lineage 4, and thus had insufficient time to spread globally. 
However, based on recent findings by Comas et al.71, the African specialist sublineages already existed at least several centuries ago, perhaps even several millennia ago, depending on the age of the most recent common ancestor of the MTBC that has been estimated between 70,000 years9,21 and 6,000 years72,73. Thus, this timespan should have offered ample opportunity for the specialist sublineages to become more geographically widespread. The genetic diversity of the specialist sublineage L4.6.1/Uganda was significantly lower than that of the generalist L4.3/LAM, as expected from populations with restricted geographical ranges43. Concomitantly, the diversity of T cell epitopes in the specialist sublineage L4.6.1/Uganda was also significantly lower than in any of the three generalist sublineages analyzed. Whether the low genetic diversity of the specialist sublineage has hindered the adaptation of these strains to other human populations or reflects a restricted niche due to the lack of opportunity for spreading will need to be explored in future studies. In all sublineages analyzed, the large majority of T cell epitopes were completely conserved, which is in agreement with previous reports for the MTBC overall57,58. This suggests that neither these generalists nor these specialists use antigenic variation as a main mechanism of immune evasion. Despite this general trend, we found that some antigens have acquired nonsynonymous mutations in parallel in the different sublineages, suggesting that variation in these particular antigens might be beneficial. For example, acquiring nonsynonymous variation may allow particular antigens to be recognized by T cell receptors of different human populations, which might be beneficial in the presence of different human HLA alleles58. 
This could also provide an explanation for the differences in the degree of variation in T cell epitopes of the generalist and specialist sublineages, as generalist sublineages are expected to interact with a broader range of HLA alleles. Alternatively, some nonsynonymous mutations in epitopes might reflect escape from human T cell recognition58. More work is needed to determine if and how the limited diversity in T cell epitopes in the MTBC is linked to adaptation to different host populations and/or immune escape. Two independent phylogeographic analyses predicted Europe as the most likely geographical origin for the most recent common ancestor of L4.3/LAM. A European origin for L4.3/LAM was further supported by our finding that strains belonging to this sublineage were more genetically diverse in Europe compared to Africa, Asia and America. Taken together, these results suggest a role for Europeans in the spread of L4.3/LAM across the globe. Given the high frequency of L4.3/LAM in Europe (Fig. 2, Fig. 3a), particularly in TB patients from the Iberian Peninsula and in Latin America74,75, Portuguese and Spanish exploration, trade and conquest over the last centuries may have contributed to the global dissemination of this sublineage76. Of note, the Americas lack specialist sublineages, including the three African specialist sublineages, despite centuries of slave trade. Importantly, this also applies to MTBC Lineages 5 and 6 (i.e. M. africanum), which today are largely limited to parts of West Africa11, the source of most of the African slaves shipped to the Americas. Even if these lineages did reach the Americas at the time, they later might have been replaced by generalist sublineages from Europe including L4.3/LAM, following the massive influx of Europeans to the Americas during the 19th and early 20th centuries77, a time when the European TB epidemic was at its peak73,78. 
Importantly, these human migrations can be viewed as natural experiments, in which diverse human populations came into contact with different MTBC genotypes. As mentioned above, there is evidence that the African specialist sublineages of Lineage 4 already existed in sub-Saharan Africa centuries ago71. Following European contact, generalist sublineages were introduced to Africa and today, a significant proportion of human tuberculosis in Africa is caused by L4.3/LAM and other generalists (Figs. 2 and 3a). By contrast, no significant spill-over of African specialist sublineages has occurred into Europe or American populations of European ancestry. Three of the 10 sublineages showed an intermediate pattern of geographical distribution. Independent of the open question as to whether they might represent generalists or specialists, it is interesting to note that none of these three sublineages were found at significant frequency and proportion in Europe. They might therefore represent generalists of a non-European origin. Deeper analyses are needed to shed more light on these sublineages. Our study is limited in that many of the MTBC isolates analyzed come from convenience samples and might therefore not be representative of a particular country. However, for the analysis of sublineage distributions by SNP-genotyping, we included more than 3,000 clinical isolates from 100 countries, which should reduce any potential selection bias. For the deep genomic analyses, we selected strains based on a large and diverse collection of classical genotyping patterns, and in addition, screened >13,000 MTBC whole genome sequences available in public repositories. As a further limitation, some isolates in our collection were obtained from patients who recently emigrated from a high tuberculosis incidence region into a low-incidence country. However, we excluded cases from ongoing transmission and focused on immigrants with reactivation disease, i.e. they were most likely infected in their country of origin before moving abroad.
Moreover, we used country of birth for all analyses as opposed to country of tuberculosis diagnosis. In conclusion, our findings indicate that the global success of Lineage 4 partly results from the different evolutionary strategies adopted by different sublineages. These strategies reflect an ecological distinction between specialists and generalists. The specialist sublineages are adapted to their sympatric host populations and geographically restricted. The generalist sublineages exhibit a broader ecological niche and are geographically widespread. Moreover, Europeans contributed to the global spread of the most successful generalist sublineage of Lineage 4. Our results highlight the ecological and epidemiological relevance of the deep phylogenetic diversity within the MTBC79. More generally, exploring potential differences between specialists and generalists in other pathogens will improve our understanding of the biology and epidemiology of infectious diseases.

Data Availability Statement

All data generated or analyzed during this study are included in this published article (and its supplementary information files). Sequencing reads have been submitted to the EMBL-EBI European Nucleotide Archive (ENA) Sequence Read Archive (SRA) under the study accession number PRJEB11460.

Online Methods

Mycobacterial isolates

For the definition of Lineage 4 sublineages, we used 72 whole genome sequences of MTBC Lineage 4 and reference sequences of the other MTBC lineages published previously21,22 (Supplementary Table 7). These represented the largest collection of Lineage 4 whole-genome sequences available at that time. For the SNP-screening of clinical isolates for sublineage-classification, we used a retrospective global collection of 3,366 MTBC Lineage 4 isolates from 100 countries (Supplementary Table 5)15,30–35. All isolates had previously been identified as MTBC Lineage 4 by SNP-typing, genomic deletion analysis or spoligotyping. Approximately one third of these isolates were from patients who migrated to another country (1,106; 32.9%), and we used country of birth of the patient as a proxy for the origin of the MTBC strains. Two thirds of the isolates (2,260; 67.1%) were from countries where both country of isolation and country of birth were identical. Isolates of L4.6.1/Uganda from Uganda were genotyped in our previous work34. For the in-depth population genomic analysis of L4.3/LAM, we included previously published genomes21,27,44, and generated whole genome sequences of additional strains selected from a large collection of 2,132 MIRU-VNTR-genotyped isolates representing the global diversity of L4.3/LAM (Supplementary Fig. 7). Starting from 500 whole genome sequences, we excluded sequences with bad quality (sequencing coverage < 15x, proportion of homozygous variant calls <85%), isolates in transmission clusters (defined as isolate pairs differing by ≤12 SNPs) and strains with unknown country of origin, resulting in whole genome sequencing (WGS) data for 293 L4.3/LAM strains, which were included in the final analysis (Supplementary Table 7). For the in-depth population genomic analyses of L4.6.1/Uganda, we generated WGS data from 175 isolates of the L4.6.1/Uganda genotype, selected for maximal geographic diversity and from previous studies34. 
Moreover, to further increase geographic coverage and genetic diversity among L4.6.1/Uganda strains, we analyzed all available WGS data from several published studies8,45–56,75 and other whole genome data available in the public domain. We used KvarQ80 to screen for the L4.6.1/Uganda-specific SNPs described below. Starting from 13,067 genome sequences and excluding all clustered isolates except for one representative of each cluster, we identified 28 additional L4.6.1/Uganda genome sequences, which we included in our analysis for a total of 203 genomes (Supplementary Table 7). For genomic analysis of L4.1.2/Haarlem and L4.10/PGG3 strains, we screened the same 13,067 isolates (plus our own collection) for clade-specific SNPs of these two sublineages. We identified 505 genome sequences of L4.1.2/Haarlem strains and 748 sequences of L4.10/PGG3 strains. After excluding problematic sequences and strains in transmission clusters (see criteria above), we used 228 strains of L4.1.2/Haarlem and 301 strains of L4.10/PGG3. H37Rv was used as outgroup for all sublineage phylogenies except L4.10/PGG3, for which an isolate of L4.1.2/Haarlem was used (H37Rv belongs to L4.10/PGG3).
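The exclusion steps described above (coverage below 15x, fewer than 85% homozygous variant calls, unknown country of origin, and transmission clusters defined by a pairwise distance of ≤12 SNPs) can be illustrated with a minimal Python sketch. It is not the code used in the study. The `Genome` record, the function names, single-linkage grouping of clustered pairs, and keeping the first listed isolate of each cluster as its representative are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass
class Genome:
    """Hypothetical per-genome summary; field names are illustrative."""
    name: str
    coverage: float            # mean sequencing depth (x)
    homozygous_frac: float     # proportion of homozygous variant calls, 0-1
    country: Optional[str]     # country of origin, None if unknown


def passes_quality(g, min_cov=15.0, min_hom=0.85):
    """Apply the thresholds stated in the text (coverage, homozygosity, known origin)."""
    return g.coverage >= min_cov and g.homozygous_frac >= min_hom and g.country is not None


def cluster_representatives(names, snp_dist, threshold=12):
    """Keep one isolate per transmission cluster.

    Assumption: clusters are built by single linkage, i.e. any pair with
    distance <= threshold joins the same cluster. The representative is the
    first member in the order of `names`.
    snp_dist maps frozenset({a, b}) -> pairwise SNP distance.
    """
    parent = {n: n for n in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if snp_dist[frozenset((a, b))] <= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[rb] = ra

    seen, reps = set(), []
    for n in names:
        root = find(n)
        if root not in seen:
            seen.add(root)
            reps.append(n)
    return reps
```

Note that for L4.3/LAM the text states that clustered isolates were excluded, whereas for the public L4.6.1/Uganda genomes one representative per cluster was retained; the sketch follows the second variant.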

Whole genome sequencing, variant calling and filtering

WGS of new MTBC isolates was performed using Illumina chemistry (MiSeq, HiSeq2000/2500, NextSeq; paired end or single end). Illumina MiSeq-generated sequencing reads were clipped for adapters with Trimmomatic81 before mapping. We used a previously described pipeline for the mapping of short sequencing reads to the reference genome (a reconstructed hypothetical MTBC ancestor) with BWA 0.6.2 (ref. 21). SNPs were called with SAMtools 0.1.19 and excluded if the coverage was less than 10% or more than 200% of the average coverage of the genome, if they were not supported by at least two reads on each strand, or if the quality was less than 30. All SNPs were then annotated using the H37Rv reference annotation (AL123456.2) with Annovar82 and customized scripts. SNPs in regions annotated as “PE/PPE/PGRS”, “maturase”, “phage” or “insertion sequence” were excluded. Additionally, we excluded SNPs in genes with previously identified repetitive regions58. Small insertions and deletions called by BWA/SAMtools as “INDEL” were not considered for the analyses. The presence of large genomic deletions reported previously15,28,74 was assessed by manually inspecting BAM alignment files from BWA mappings in Artemis for the presence of reads at the genomic regions with described deletions. Alternatively, we used a new test suite in KvarQ80 to check for reads aligning to 25 bp query sequences of the corresponding deletion.
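The SNP exclusion criteria listed above can be written as a single predicate. The following Python sketch is illustrative only; the `SnpCall` record and its field names are hypothetical and do not correspond to the output format of SAMtools or Annovar.

```python
from dataclasses import dataclass

# Annotation classes excluded in the text.
EXCLUDED_ANNOTATIONS = {"PE/PPE/PGRS", "maturase", "phage", "insertion sequence"}


@dataclass
class SnpCall:
    """Hypothetical summary of one SNP call; field names are illustrative."""
    pos: int
    depth: int          # read depth at the site
    fwd_reads: int      # reads supporting the variant on the forward strand
    rev_reads: int      # reads supporting the variant on the reverse strand
    qual: float         # call quality
    annotation: str = ""


def keep_snp(call, mean_cov, min_frac=0.10, max_frac=2.00, min_strand=2, min_qual=30):
    """Return True if the call passes all filters described in the text."""
    if call.depth < min_frac * mean_cov or call.depth > max_frac * mean_cov:
        return False  # coverage below 10% or above 200% of the genome average
    if call.fwd_reads < min_strand or call.rev_reads < min_strand:
        return False  # fewer than two supporting reads on one strand
    if call.qual < min_qual:
        return False  # quality below 30
    return call.annotation not in EXCLUDED_ANNOTATIONS
```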

Phylogenetic and population genetic analyses for the definition of sublineages

A phylogenetic tree was generated with all Lineage 4 genomes, plus several reference genomes from all other MTBC lineages. Pairwise SNP distances were calculated using MEGA5 (ref. 83) and the ape package in R84. Fixation indices (FST; estimation of population separation) were calculated using Arlequin 3.5 (ref. 85). Statistical significance was obtained by permuting haplotypes between sublineages. Principal Component Analysis (PCA) was done with Jalview86. Naming of Lineage 4 sublineages was adapted to Coll et al.27 whenever possible. However, in that publication, no criteria for the definition of sublineages were given, and not all sublineages were identified as such. We therefore added consecutive numbers for the clades that were not defined by Coll et al. The new sublineages defined in this study are L4.1.3/Ghana and L4.10/PGG3 (the latter including L4.5, L4.8 and L4.9 according to the nomenclature of Coll et al.). The full phylogenetic tree, including previous markers and spoligotyping family names, is shown in Supplementary Fig. 1.

Identification of sublineage-specific SNPs

The alignment of all SNPs from the initial 72 MTBC Lineage 4 strains was imported into Mesquite87, in parallel with the phylogenetic tree generated from the same data in MEGA5. We used the “Trace Character History” module of Mesquite to map polymorphisms to clades. The full dataset of reconstructed positions was exported, and sublineage-specific SNPs were extracted as nucleotide differences between internal nodes of the phylogeny.
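Extracting sublineage-specific SNPs as nucleotide differences between two internal nodes amounts to comparing the reconstructed state at the parent node with that at the child node, site by site. A minimal Python sketch follows, assuming the reconstructed states have been exported as equal-length strings; the function name and the example coordinates in the usage note are hypothetical, and this is not the Mesquite export format.

```python
def branch_snps(parent_seq, child_seq, positions):
    """List (position, ancestral, derived) for sites that differ between two nodes.

    parent_seq / child_seq: reconstructed states at the two internal nodes,
    one character per polymorphic site.
    positions: genome coordinate of each site, in the same order.
    Sites with an ambiguous or missing state at either node are skipped.
    """
    if not (len(parent_seq) == len(child_seq) == len(positions)):
        raise ValueError("sequences and positions must have the same length")
    return [
        (pos, anc, der)
        for pos, anc, der in zip(positions, parent_seq, child_seq)
        if anc != der and anc in "ACGT" and der in "ACGT"
    ]
```

For example, `branch_snps("ACGT", "ACTT", [10, 20, 30, 40])` returns `[(30, "G", "T")]`, i.e. one SNP on the branch leading to the child node.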

SNP-typing to screen for MTBC Lineage 4 sublineages

We developed a new SNP-typing assay to screen clinical isolates for the defined Lineage 4 sublineages. For this, we selected one “diagnostic” SNP per sublineage using previously defined methods and criteria36. Oligonucleotides were designed for a 10-plex MOL-PCR assay based on the Luminex xTag platform (Luminex, Austin, USA) (Supplementary Table 2)36. DNA extracts from clinical MTBC isolates were then screened with either i) the new MOL-PCR assay, ii) standard PCR amplification and subsequent Sanger sequencing of the region up- and downstream of the sublineage-specific SNP (see Supplementary Table 4 for PCR and sequencing primers), iii) a real-time PCR melting curve assay using the same SNPs (Supplementary Table 2), or iv) the MassARRAY platform (Sequenom, San Diego CA, USA) using phylogenetically redundant SNPs (Supplementary Table 3). The set of SNPs used in the MassARRAY typing scheme covered only six of 10 sublineages. Hence, all Lineage 4 isolates without any of the six SNPs with the mutant allele defined by MassARRAY typing (n=49) were subjected to the Luminex-assay described above. For all isolates, patient place of birth was used as country information. We obtained sublineage-classification data for 3,273 (97.2%) of a total of 3,366 isolates (Supplementary Table 5).

Spatial analysis and data presentation

For each country with Lineage 4 sublineage data available, sublineage proportions (compared to all Lineage 4 isolates from the same country) were calculated and mapped to a world map with ArcGIS ArcMap 10.0 (Esri, Redlands, USA). A shapefile with country boundaries was obtained from DIVA-GIS, which is freely available. Categories for number of countries were defined as 0, 1-3, 4-10 and >10 countries. For individual sublineage "heat maps", countries with fewer than 3 isolates were not included. For the additional maps shown in Supplementary Fig. 6, sublineage proportions were normalized by the TB prevalence in the country as estimated by WHO38, and the area of the country. Other figures were generated with the ggplot2 library in R and GraphPad Prism 6.02 (GraphPad Software, San Diego, USA). Statistical analyses were performed with R or GraphPad Prism.
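The per-country sublineage proportions described above can be computed as in the following Python sketch. The input format (a list of country and sublineage pairs) and the function name are assumptions for illustration; the minimum of 3 isolates per country follows the threshold stated for the individual heat maps.

```python
from collections import Counter, defaultdict


def sublineage_proportions(records, min_isolates=3):
    """Compute sublineage proportions per country.

    records: iterable of (country_of_birth, sublineage) pairs, one per isolate.
    Countries with fewer than `min_isolates` Lineage 4 isolates are omitted.
    Returns {country: {sublineage: proportion}}.
    """
    by_country = defaultdict(Counter)
    for country, sublineage in records:
        by_country[country][sublineage] += 1

    proportions = {}
    for country, counts in by_country.items():
        total = sum(counts.values())
        if total < min_isolates:
            continue
        proportions[country] = {s: n / total for s, n in counts.items()}
    return proportions
```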

SIFT-analyses of functional effects of fixed sublineage SNPs

SNPs fixed in each of the 10 sublineages were assessed for predicted functional consequences with the "Sorting Intolerant From Tolerant" (SIFT) algorithm implemented in the software SIFT4G (v2.1)42 and the pre-compiled Mycobacterium tuberculosis database "GCA_000195955.2.22". Conservation levels of SNPs in the pre-compiled database had been obtained by comparing Mycobacterium tuberculosis H37Rv proteins to all proteins in the UniRef90 database. We pooled SNPs fixed in the generalist sublineages and the specialist sublineages, respectively, and excluded the L4.1.3/Ghana sublineage, as whole genome sequences of only two very closely related isolates were available. Gene categories were analyzed based on the classification by Tuberculist88.

Phylogenetic reconstruction and population genetic analyses

The final alignment of polymorphic positions in all strains was used to estimate phylogenies with Bayesian methods using MrBayes 3.2.5 (ref. 89) for the L4.3/LAM and L4.6.1/Uganda sublineages (Fig. 6, Supplementary Fig. 10). For the Bayesian analysis we used a gamma rate distribution estimated from our dataset and a burn-in equal to 1/10 the number of generations; after the burn-in phase every 100th tree was saved. Two parallel Markov chains were run in each of two runs. Tree length, log-likelihood score and alpha value of the gamma distribution were inspected for stationarity before termination of MrBayes. Trees were generated with standard parameters. A consensus tree was used for further analyses. Additionally, we used MEGA5 (ref. 83) to generate Maximum Likelihood phylogenetic trees (Supplementary Figs. 9, 11, 12 and 13). We used the general time reversible (GTR) model of evolution, and 500 pseudoreplicates for bootstrapping confidence levels. Positions with gaps in more than 50% of taxa were ignored. Tree figures were generated using FigTree version 1.4.2. Pairwise SNP distances were calculated with the dist.dna function of the ape package in R version 3.2.2, using raw counts of mutations and pairwise deletions for sites with gaps. For the comparison of pairwise SNP distance distributions overall (L4.3/LAM and L4.6.1/Uganda) and between continents for L4.3/LAM, a mean SNP distance to all isolates of the same population was calculated for each isolate, and the distribution of the mean pairwise distance was plotted. Wilcoxon rank sum and Kruskal-Wallis tests were used to test for differences between continents, as data were assumed to not be normally distributed. Average pairwise nucleotide diversities per site (π) were calculated as the average number of pairwise mismatches among a set of sequences divided by the total length of the interrogated sequences in base pairs (equation 4.21 in Ref. 90). 
Confidence intervals for π were obtained by bootstrapping (1000 replicates) by re-sampling with replacement the nucleotide sites of the original alignments of polymorphic positions using the function sample in R. Lower and upper levels of confidence were obtained by calculating the 2.5th and the 97.5th quantiles of the π distribution obtained by bootstrapping. Code details are available upon request.
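The calculation of π and its bootstrap confidence interval can be sketched in Python as follows. The original analysis was performed in R; this is an illustrative re-statement under stated assumptions, not the authors' code. Assumptions: gaps are coded as "-" and handled by pairwise deletion, the quantiles are taken as simple order statistics of the sorted bootstrap values, and all function names are hypothetical.

```python
import random
from itertools import combinations


def pairwise_diffs(a, b):
    """Count mismatches between two aligned sequences, skipping sites with a gap."""
    return sum(1 for x, y in zip(a, b) if x != y and x != "-" and y != "-")


def nucleotide_diversity(seqs, total_length):
    """pi = mean number of pairwise mismatches / total interrogated length (bp)."""
    pairs = list(combinations(seqs, 2))
    mean_diff = sum(pairwise_diffs(a, b) for a, b in pairs) / len(pairs)
    return mean_diff / total_length


def bootstrap_pi(seqs, total_length, replicates=1000, seed=0):
    """Percentile interval for pi from resampling alignment columns with replacement."""
    rng = random.Random(seed)
    n_sites = len(seqs[0])
    values = []
    for _ in range(replicates):
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = ["".join(s[i] for i in cols) for s in seqs]
        values.append(nucleotide_diversity(resampled, total_length))
    values.sort()
    lower = values[int(0.025 * replicates)]
    upper = values[min(replicates - 1, int(0.975 * replicates))]
    return lower, upper
```

Here `seqs` holds the alignment of polymorphic positions only, while `total_length` is the length of the interrogated sequence, so that π is expressed per site of the genome rather than per variable site.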

Antigenic diversity in human T cell epitopes

Experimentally confirmed human MTBC T cell epitope sequences were retrieved from the Immune Epitope Database on the 24th of April 2015. Only linear epitopes from the MTBC (ID: 77643) tested in human T cell assays, with no MHC restrictions, were selected (1,730 epitopes). The sequence of each epitope was searched with blastP91 against the reference strain (H37Rv) to obtain genomic coordinates. Epitopes with no coordinates in H37Rv or for which no accurate coordinates could be determined (due to multiple hits), and epitopes in repetitive regions such as PE/PPE genes, phage-related genes and transposases, were excluded, yielding a final set of 1,226 epitopes. These epitopes are distributed across 304 antigens and have some overlapping sequences. In order to proceed with the sequence analysis, alignments were obtained by concatenating all epitope sequences after excluding sequence redundancy. Alignments of non-epitope containing antigens were obtained by excluding the regions described as epitopes from each respective antigen. To assess how other regions of the genome are evolving, alignments for essential and non-essential genes were also obtained62. Alignments of epitopes and non-epitope containing antigens, and of essential and non-essential genes, were used to calculate pairwise dN/dS ratios for the L4.3/LAM, L4.6.1/Uganda, L4.10/PGG3 and L4.1.2/Haarlem sublineages. The dN/dS measures were calculated using all polymorphic sites within each sublineage and therefore reflect both within-sublineage substitutions and transient polymorphisms. Pairwise dN and dS values within each sublineage were calculated using the kaks function of the R package seqinr. To avoid undetermined pairwise dN/dS values due to dN or dS being zero, a mean dN/dS was then calculated per sequenced isolate by dividing its mean pairwise dN by its mean pairwise dS with respect to all other sequenced isolates within each sublineage. 
The statistical differences between epitopes and non-epitope regions of antigens within each sublineage were assessed by using Wilcoxon rank sum tests with continuity correction implemented in R version 3.2.2.
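The per-isolate dN/dS described above is a ratio of means, not a mean of pairwise ratios, which is what avoids undefined values when a single pair has dS = 0. A minimal Python sketch, assuming the pairwise dN and dS values are already available as symmetric matrices (for example, exported from seqinr); the function name and input format are hypothetical.

```python
def per_isolate_dnds(dn, ds):
    """Mean dN/dS per isolate as (mean pairwise dN) / (mean pairwise dS).

    dn, ds: symmetric square matrices (lists of lists) of pairwise values.
    The diagonal (comparison of an isolate with itself) is ignored.
    Returns NaN for an isolate whose mean pairwise dS is zero.
    """
    n = len(dn)
    ratios = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        mean_dn = sum(dn[i][j] for j in others) / len(others)
        mean_ds = sum(ds[i][j] for j in others) / len(others)
        ratios.append(mean_dn / mean_ds if mean_ds > 0 else float("nan"))
    return ratios
```

In a three-isolate example where one pair has dS = 0, the pairwise ratio for that pair is undefined, but each isolate still receives a finite value as long as its mean dS over all partners is positive.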

Reconstruction of geographical origin of L4.3/LAM

The software RASP69 was used to reconstruct the hypothetical geographic origin of the MTBC L4.3/LAM ancestor genotype. The Bayesian phylogeny of 294 isolates (including H37Rv as outgroup) and the corresponding continent of birth of the patient were loaded as distribution. We used the S-DIVA (a parsimony based method) as well as the Bayesian Binary Method (BBM) implementation in RASP. A set of trees from MrBayes89 was used to correct for phylogenetic uncertainty in the S-DIVA analysis. Populations were defined according to country of birth of the patients and according to the United Nations definition. The isolates from Turkey, Libya, Algeria and Morocco were in the category “Europe and Mediterranean”. RASP reconstruction was done without the outgroup (H37Rv). As we observed a single strain (from Ukraine) with a distinct, basal position in the phylogeny, we also performed a sensitivity analysis by excluding that isolate for the RASP analysis. With both methods, BBM as well as S-DIVA, the changes in proportions of continents were minor. With BBM, the proportion of “Europe/Mediterranean” for the “L4.3/LAM ancestor” decreased to 98.8%, and with S-DIVA, the proportion of “Europe/Mediterranean” decreased to 99.0% when excluding this basal isolate.
