David Stucki, Daniela Brites, Leïla Jeljeli, Mireia Coscolla, Qingyun Liu, Andrej Trauner, Lukas Fenner, Liliana Rutaihwa, Sonia Borrell, Tao Luo, Qian Gao, Midori Kato-Maeda, Marie Ballif, Matthias Egger, Rita Macedo, Helmi Mardassi, Milagros Moreno, Griselda Tudo Vilanova, Janet Fyfe, Maria Globan, Jackson Thomas, Frances Jamieson, Jennifer L Guthrie, Adwoa Asante-Poku, Dorothy Yeboah-Manu, Eddie Wampande, Willy Ssengooba, Moses Joloba, W Henry Boom, Indira Basu, James Bower, Margarida Saraiva, Sidra E G Vaconcellos, Philip Suffys, Anastasia Koch, Robert Wilkinson, Linda Gail-Bekker, Bijaya Malla, Serej D Ley, Hans-Peter Beck, Bouke C de Jong, Kadri Toit, Elisabeth Sanchez-Padilla, Maryline Bonnet, Ana Gil-Brusola, Matthias Frank, Veronique N Penlap Beng, Kathleen Eisenach, Issam Alani, Perpetual Wangui Ndung'u, Gunturu Revathi, Florian Gehre, Suriya Akter, Francine Ntoumi, Lynsey Stewart-Isherwood, Nyanda E Ntinginya, Andrea Rachow, Michael Hoelscher, Daniela Maria Cirillo, Girts Skenders, Sven Hoffner, Daiva Bakonyte, Petras Stakenas, Roland Diel, Valeriu Crudu, Olga Moldovan, Sahal Al-Hajoj, Larissa Otero, Francesca Barletta, E Jane Carter, Lameck Diero, Philip Supply, Iñaki Comas, Stefan Niemann, Sebastien Gagneux.
Abstract
Generalist and specialist species differ in the breadth of their ecological niches. Little is known about the niche width of obligate human pathogens. Here we analyzed a global collection of Mycobacterium tuberculosis lineage 4 clinical isolates, the most geographically widespread cause of human tuberculosis. We show that lineage 4 comprises globally distributed and geographically restricted sublineages, suggesting a distinction between generalists and specialists. Population genomic analyses showed that, whereas the majority of human T cell epitopes were conserved in all sublineages, the proportion of variable epitopes was higher in generalists. Our data further support a European origin for the most common generalist sublineage. Hence, the global success of lineage 4 reflects distinct strategies adopted by different sublineages and the influence of human migration.
Year: 2016 PMID: 27798628 PMCID: PMC5238942 DOI: 10.1038/ng.3704
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Figure 1. Definition and global frequency of Lineage 4 sublineages.
(a) We defined 10 sublineages based on the analysis of 72 previously published MTBC Lineage 4 genome sequences [21,22]. Sublineages were labeled according to Coll et al. [27] (whenever possible) and previous designations based on spoligotyping (see Supplementary Fig. 1). Black triangles indicate sublineages identified as specialists; black circles indicate generalists. Filled shapes indicate sublineages for which we performed deep genomic analyses. (b) Global proportion of each sublineage. A total of 3,366 MTBC Lineage 4 isolates were screened for sublineage-specific SNPs. L4.3/LAM was the most frequent sublineage globally.
Figure 2. Global distribution of Lineage 4 sublineages.
Pie charts showing proportions of the 10 Lineage 4 sublineages among all MTBC Lineage 4 isolates in each country. Circle sizes correspond to the number of isolates analyzed per country. A total of 3,366 MTBC Lineage 4 isolates were included. Color codes are as in Fig. 1.
Figure 3. Country-specific proportions of sublineages reveal generalists and specialists.
(a) The generalist sublineages L4.1.2/Haarlem, L4.3/LAM and L4.10/PGG3 were found globally at high proportions. (b) The locally restricted specialist sublineages L4.1.3/Ghana, L4.5, L4.6.1/Uganda and L4.6.2/Cameroon occurred at high frequencies in only a few countries and were restricted to certain geographical regions. Intensity of red indicates proportion of the sublineage among all Lineage 4 isolates in each country. Countries with fewer than three isolates in total are shown as “no data” and are filled white. A total of 3,366 Lineage 4 isolates were included in this analysis. The color scale for all sublineages is as indicated in Panel a, except for sublineage L4.1.3/Ghana (separate scale shown).
Figure 4. Pair-wise ratios of rates of nonsynonymous to synonymous substitutions (dN/dS) in generalist and specialist sublineages for different gene categories.
Abbreviations: Epi, experimentally confirmed human T cell epitopes; nEpi, non-epitope regions of T cell antigens (both obtained from the Immune Epitope Database [60]); Ess, essential genes [62]; nEss, non-essential genes [62]. Wilcoxon rank sum tests: L4.6.1/Uganda (N=203), Epi vs nEpi, W=4952, p<0.001; Ess vs nEss, W=1415, p<0.001. L4.3/LAM (N=293), Epi vs nEpi, W=74540, p<0.001; Ess vs nEss, W=45067, p=0.29. L4.1.2/Haarlem (N=228), Epi vs nEpi, W=6561, p<0.001; Ess vs nEss, W=13369, p<0.001. L4.10/PGG3 (N=301), Epi vs nEpi, W=27335, p<0.001; Ess vs nEss, W=3103, p<0.001.
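For readers unfamiliar with the W statistic reported in this caption, the following is a minimal sketch of one common convention for the Wilcoxon rank sum statistic (the rank sum of the first sample minus its minimum possible value, which equals the Mann-Whitney U). The caption does not state which software or convention the authors used, so the function name `rank_sum_w`, the tie handling by midranks, and the toy dN/dS values are all assumptions made for illustration.

```python
def rank_sum_w(x, y):
    """Rank sum statistic for sample x against sample y (assumed convention:
    sum of the ranks of x in the pooled sample minus n_x * (n_x + 1) / 2)."""
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of their 1-based ranks (midrank).
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_x = len(x)
    return rank_sum_x - n_x * (n_x + 1) / 2


# Hypothetical pairwise dN/dS values, NOT taken from the study.
epitope_dnds = [0.2, 0.3, 0.3, 0.4]
non_epitope_dnds = [0.5, 0.6, 0.7, 0.9]
w = rank_sum_w(epitope_dnds, non_epitope_dnds)
```

A value of W near zero means the first sample sits almost entirely below the second; the p-values in the caption would additionally require the null distribution of W, which this sketch does not compute.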
Figure 5. Frequency distribution of the number of epitopes with nonsynonymous variants in generalist and specialist sublineages.
A total of 1,226 T cell epitopes were included in the analysis. The number above each bar is the epitope count. Generalist sublineages: L4.3/LAM, L4.1.2/Haarlem and L4.10/PGG3. Specialist sublineage: L4.6.1/Uganda. Tests: L4.6.1/Uganda vs L4.3/LAM, χ²=27.04, p<0.001; L4.6.1/Uganda vs L4.1.2/Haarlem, χ²=15.75, p<0.001; L4.6.1/Uganda vs L4.10/PGG3, χ²=68.24, p<0.001.
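As a rough illustration of the kind of comparison summarized by the χ² values above, the sketch below computes a Pearson chi-square statistic for a contingency table of epitope counts. The caption does not describe how the authors binned the counts or which test variant they applied, so the table layout, the function name `chi_square_statistic`, and the numbers are hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table
    given as a list of rows (assumes every row and column total is > 0)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts, NOT from the study:
# rows = sublineage, columns = epitopes without / with nonsynonymous variants.
toy_table = [
    [10, 20],  # a specialist sublineage (made-up numbers)
    [20, 10],  # a generalist sublineage (made-up numbers)
]
stat, dof = chi_square_statistic(toy_table)
```

Turning the statistic into a p-value requires the chi-square distribution with `dof` degrees of freedom, which is left out here to keep the sketch dependency-free.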
Figure 6. Genome-based phylogeny and diversity by continent of 293 strains of the L4.3/LAM sublineage.
(a) Bayesian phylogeny with label colors indicating continent of strain origin: blue, Europe/Mediterranean; red, Sub-Saharan Africa; yellow, America; pink, Asia. Numbers on nodes indicate posterior probabilities. Pie charts indicate reconstructed ancestral geographical regions of the internal nodes. The hypothetical L4.3/LAM ancestor is labeled, and a European origin for this ancestor was supported using a Bayesian method (shown) and a maximum parsimony method (Supplementary Fig. 14). The pie colors correspond to the colors of the taxa labels. (b) Boxplot of pairwise genetic distances (number of polymorphisms) of L4.3/LAM strains by continent (p-values from Wilcoxon rank sum test). (c) Nucleotide diversity per site (π), measured by continent. Error bars indicate 95% confidence intervals. MTBC isolates from countries of the continent group “Oceania” (UN category; including Australia and New Zealand, Melanesia, Micronesia and Polynesia) were excluded from the genetic diversity analysis in panels b and c due to the low number of samples.
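The two diversity measures in panels b and c can be sketched as follows, assuming the textbook definitions: the pairwise genetic distance is the number of differing positions between two aligned sequences, and π is the mean pairwise distance divided by the alignment length. The caption does not specify how the authors handled gaps, missing calls, or confidence intervals, so the function names and the toy alignment here are illustrative assumptions only.

```python
from itertools import combinations


def pairwise_distance(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site (pi) over all sequence pairs."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    distances = [pairwise_distance(a, b) for a, b in combinations(sequences, 2)]
    return sum(distances) / (len(distances) * length)


# Hypothetical aligned sequences, NOT real MTBC data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi = nucleotide_diversity(toy_alignment)
```

Running this per continent group on the corresponding subset of strains would give one π value per group, in the spirit of panel c; real analyses typically restrict the calculation to reliably called sites.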