Yuttapong Thawornwattana, Surakameth Mahasirimongkol, Hideki Yanai, Htet Myat Win Maung, Zhezhe Cui, Virasakdi Chongsuvivatwong, Prasit Palittapongarnpim.
Abstract
Mycobacterium tuberculosis (Mtb) lineage 2 (L2) strains are present globally, contributing to a widespread tuberculosis (TB) burden, particularly in Asia, where both the prevalence of TB and the number of drug-resistant TB cases are highest. The increasing availability of whole-genome sequencing (WGS) data worldwide provides an opportunity to improve our understanding of the global genetic diversity of Mtb L2 and its association with disease epidemiology and pathogenesis. However, existing L2 sublineage classification schemes leave >20 % of the Modern Beijing isolates unclassified. Here, we present a revised SNP-based classification scheme of L2 in a genomic framework based on phylogenetic analysis of >4000 L2 isolates from 34 countries in Asia, Eastern Europe, Oceania and Africa. Our scheme consists of over 30 genotypes, many of which have not been described before. In particular, we propose six main genotypes of Modern Beijing strains, denoted L2.2.M1-L2.2.M6. We also provide SNP markers for genotyping L2 strains from WGS data. This fine-scale genotyping scheme, which can classify >98 % of the studied isolates, serves as a basis for more effective monitoring and reporting of transmission and outbreaks, as well as for improving genotype-phenotype associations such as disease severity and drug resistance. This article contains data hosted by Microreact.
Keywords: Beijing strain; Lineage 2; Mycobacterium tuberculosis; genotyping; phylogeny; whole genome sequencing
Year: 2021 PMID: 34787541 PMCID: PMC8743535 DOI: 10.1099/mgen.0.000697
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
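The abstract mentions SNP markers for genotyping L2 strains from WGS data. As a rough illustration of how such markers could be applied, the sketch below assigns an isolate to the deepest genotype whose marker allele it carries. The marker positions and alleles in `MARKERS` are hypothetical placeholders, not the markers published with the article, and `call_genotype` is an illustrative name rather than part of any released tool.

```python
from typing import Dict, Optional, Tuple

# Hypothetical marker table: reference position -> (derived allele, genotype).
# These positions and alleles are placeholders, NOT the published markers.
MARKERS: Dict[int, Tuple[str, str]] = {
    1001: ("A", "L2"),
    2002: ("T", "L2.2"),
    3003: ("G", "L2.2.M4"),
    4004: ("C", "L2.2.M4.9"),
}


def call_genotype(calls: Dict[int, str]) -> Optional[str]:
    """Return the deepest genotype whose marker allele is observed.

    `calls` maps reference positions to the base called in the isolate.
    Depth is measured by the number of dots in the genotype name.
    """
    matched = [
        genotype
        for position, (allele, genotype) in MARKERS.items()
        if calls.get(position) == allele
    ]
    if not matched:
        return None  # isolate carries none of the marker alleles
    return max(matched, key=lambda name: name.count("."))


isolate = {1001: "A", 2002: "T", 3003: "G", 4004: "T"}
print(call_genotype(isolate))
```

This simplified version does not check that the markers of all ancestral genotypes are also present; a stricter caller might require that before reporting a deep genotype.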
Fig. 1. Phylogenetic tree of 4,425 isolates in the discovery set estimated under the maximum-likelihood framework, rooted using the H37Rv reference strain (lineage 4). The first column of labels lists sixteen level-2 and level-3 genotypes (L2.2 not shown), the second column lists twenty-two level-4 genotypes and the last column lists two level-5 genotypes. Clades with notable features such as geographic specificity are highlighted. Previous or other names are given in parentheses; see Table 1 for references. A label such as Bmyc13+ means that the clade contains Bmyc13 [19] as a major subclade, with non-Bmyc13 isolates at the base. Bmyc2+Bmyc3 refers to a clade containing both Bmyc2 and Bmyc3 strains. AA1SA refers to a South African subclade of L2.2.AA1 [32]. RD142 refers to a subclade of L2.2.M2.5 defined by the presence of a deletion in the RD142 region [14], which in turn contains Bmyc18 as a subclade. The RD142 deletion was used to define L2.2.1.2 [17]. CAO, Central Asia Outbreak. An interactive version of the phylogeny is available online at Microreact (https://microreact.org/project/4P2iPeBx1Y66TyJfojNM1o).
Table 1. Revised list of phylogenetically informative genotypes of Mtb L2. Ancestral Beijing genotypes are listed in the branching order of the phylogeny in Fig. 1. Level refers to the hierarchy level in the nomenclature; level 1 is the entire L2. Other names lists previously proposed genotypes based on phylogenetic analysis of WGS data. Genotypes with no other names are newly proposed in this study. Bmyc names are included to illustrate that most of the groups in the Bmyc scheme, an early SNP-based classification [19], do not correspond to monophyletic clades, except for those labelled in bold (also shown in Fig. S1).

| Genotype | Level | Other names | Bmyc name (Mestre) | Description |
|---|---|---|---|---|
| L2.1 | 2 | Proto-Beijing [ | | Non-Beijing L2 |
| L2.2 | 2 | Beijing | | |
| L2.2.AA1 | 3 | Asia Ancestral 1 [ | Bmyc2, | Contains clades from Japan/South Korea, Indonesia and a recent outbreak in South Africa (AA1SA clade) |
| L2.2.A | 3 | – | Bmyc4 | Associated with Japan |
| L2.2.AA2 | 3 | Asia Ancestral 2 [ | Bmyc4 | Contains large clades from Thailand and from Japan/South Korea |
| L2.2.B | 3 | – | Bmyc6 | |
| L2.2.AA3 | 3 | Asia Ancestral 3 [ | | Contains several large clades from Thailand and from Vietnam |
| L2.2.AA3.1 | 4 | – | | Vietnam-majority |
| L2.2.AA3.2 | 4 | – | | Thailand-majority |
| L2.2.C | 3 | – | Bmyc26 | Mostly from Japan and South Korea |
| L2.2.D | 3 | – | Bmyc26 | Mostly from China |
| L2.2.E | 3 | – | Bmyc26 | Mostly from China |
| L2.2.AA4 | 3 | Asia Ancestral 4 [ | Bmyc26/10 | Mostly from Thailand |
| L2.2.M1 | 3 | – | | |
| L2.2.M1.1 | 4 | Pacific RD150 [ | Bmyc10 | Contains large clades from Vietnam, Thailand, Papua New Guinea and South Africa |
| L2.2.M1.2 | 4 | – | Bmyc10 | Mostly from China and Vietnam |
| L2.2.M1.3 | 4 | – | Bmyc10 | Mostly from Vietnam |
| L2.2.M1.4 | 4 | – | Bmyc10 | Mostly from China |
| L2.2.M2 | 3 | Asian African 2 [ | | |
| L2.2.M2.1 | 4 | – | Bmyc10 | Contains a large clade mostly from Vietnam, and a large clade from multiple African countries |
| L2.2.M2.2 | 4 | – | Bmyc10 | From diverse countries |
| L2.2.M2.3 | 4 | – | Bmyc10 | Contains large clades from Vietnam and from Thailand, and a small clade from South Africa and Mozambique |
| L2.2.M2.4 | 4 | – | Bmyc10 | A small clade, mostly from China |
| L2.2.M2.5 | 4 | – | Bmyc10 | Contains large clades from Vietnam and from Thailand, and a small clade of Bmyc18 within an RD142 clade (L2.2.1.2) [ |
| L2.2.M3 | 3 | Asian African 3 [ | | Contains a large clade from Thailand that may be associated with drug resistance and recurring local outbreaks |
| L2.2.M4 | 3 | – | | |
| L2.2.M4.1 | 4 | Bmyc22 [ | | Mostly from Thailand |
| L2.2.M4.2 | 4 | – | Bmyc10 | All from Thailand |
| L2.2.M4.3 | 4 | – | Bmyc10 | |
| L2.2.M4.4 | 4 | – | Bmyc10 | Mostly from South Africa |
| L2.2.M4.5 | 4 | Europe/Russia B0/W148 [ | | Mostly from Russia, Central Asia and Eastern Europe |
| L2.2.M4.6 | 4 | – | Bmyc10 | |
| L2.2.M4.7 | 4 | – | Bmyc10 | Mostly from Nepal |
| L2.2.M4.8 | 4 | – | Bmyc10 | Contains a clade from South Africa and Malawi |
| L2.2.M4.9 | 4 | Central Asian [ | Bmyc10 | Mostly from Russia, Central Asia and Eastern Europe |
| L2.2.M4.9.1 | 5 | Central Asia Outbreak (CAO) [ | Bmyc10 | |
| L2.2.M4.9.2 | 5 | Clade A [ | Bmyc10 | |
| L2.2.M5 | 3 | – | Bmyc10 | |
| L2.2.M6 | 3 | – | | |
| L2.2.M6.1 | 4 | Asian African 1 [ | Bmyc10 | |
| L2.2.M6.2 | 4 | – | Bmyc10 | |

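
In the genotype table, the level of each genotype matches the number of dot-separated components in its name (L2 is level 1, L2.2 is level 2, L2.2.M4.9.1 is level 5). The sketch below, with the hypothetical helper names `level` and `ancestors`, shows how the level and the chain of parent genotypes could be derived from a name alone, assuming every genotype follows this naming pattern.

```python
from typing import List


def level(genotype: str) -> int:
    """Hierarchy level, taken as the number of dot-separated components."""
    return len(genotype.split("."))


def ancestors(genotype: str) -> List[str]:
    """Parent genotypes from the root (L2) down to the immediate parent."""
    parts = genotype.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


for name in ["L2.1", "L2.2.AA3.1", "L2.2.M4.9.1"]:
    print(name, level(name), ancestors(name))
```

Deriving the hierarchy from the name keeps lookups simple, but it relies on the naming convention holding for any genotype added later.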
Fig. 2. Geographical distribution of Mtb L2 by country and region. Pie charts show the proportions of isolates from each location by sublineage. Pie sizes are proportional to the total number of isolates from each location. An interactive version of this map is available online at Microreact (https://microreact.org/project/4P2iPeBx1Y66TyJfojNM1o).
Fig. 3. Comparison with three existing schemes for SNP-based genotyping of 4,425 L2 strains. Colour intensity of the arcs represents the proportion of lineage-specific SNPs in each scheme that are present in our genomic dataset. This proportion is always one in our scheme (d) since the same dataset was used to derive the scheme. It is mostly close to one in the other schemes, except in a few places, e.g. some 2.2.M1.1 in (c) and 2.2.1.2 in (a). The number of lineage-specific SNPs is given in parentheses after each genotype name. For (d), only level-3 genotypes are labelled. Note that several genotypes in the previous schemes are defined by SNPs that are not actually specific to the intended genotype because the samples used were not sufficiently representative of the actual strain diversity, for example, L2.2.1.2 in Coll et al. [22] or Pacific RD150 in Shitikov et al. [26].
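One possible reading of the colour-intensity measure in the Fig. 3 caption is a per-genotype fraction: of the SNPs a scheme lists as specific to a genotype, how many are actually observed in the dataset for that genotype. The sketch below computes this under that assumption. The function name `marker_support`, the input layout and the SNP positions are all hypothetical and are not taken from the article.

```python
from typing import Dict, Set


def marker_support(
    scheme: Dict[str, Set[int]],
    observed: Dict[str, Set[int]],
) -> Dict[str, float]:
    """Fraction of each genotype's defining SNP positions seen in the dataset.

    `scheme` maps genotype -> positions the scheme lists as lineage-specific.
    `observed` maps genotype -> positions confirmed in the dataset.
    Genotypes with no defining SNPs are skipped to avoid division by zero.
    """
    return {
        genotype: len(snps & observed.get(genotype, set())) / len(snps)
        for genotype, snps in scheme.items()
        if snps
    }


# Placeholder positions for illustration only.
scheme = {"2.2.1.2": {10, 20, 30, 40}, "2.2.1": {50, 60}}
observed = {"2.2.1.2": {10, 20, 30}, "2.2.1": {50, 60}}
print(marker_support(scheme, observed))
```

A value of one means every listed SNP is supported by the dataset; lower values flag genotypes whose defining SNPs may not be specific to the intended clade.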