Zhuochong Liu1, Zhonghua Jiang1, Wei Wu1, Xinyi Xu1, Yudong Ma1, Xiaomei Guo2, Senlin Zhang2, Qun Sun1.
Abstract
The Mycobacterium tuberculosis complex (MTBC), the main cause of tuberculosis (TB) in humans and animals, is an extreme example of genetic homogeneity, yet numerous typing methods nevertheless separate it into various lineages that differ in phenotype, virulence, geographic distribution, and host preference. The large sequence polymorphism (LSP), comprising regions of difference (RDs) and H37Rv-related deletions (RvDs), is considered a powerful means of constructing phylogenetic relationships within MTBC. Although many studies on LSPs have already focused on the distribution of RDs in MTBC and their impact on MTB phenotypes, a few new lineages or sub-lineages have been excluded, and RvDs have received less attention. We therefore sampled a dataset of 1,495 strains, covering 113 lineages, from the laboratory collection to screen for RDs and RvDs by structural variant detection and genome assembly, and examined the distribution of RvDs in MTBC, including RvD2, RvD5, and the cobF region. Consistent with the genealogical delineation by single nucleotide polymorphisms (SNPs), we identified 125 RDs and 5 RvDs at the species, lineage, or sub-lineage level. The specificities of the RDs and RvDs were further investigated in the remaining 10,218 strains, suggesting that most of them were highly specific to distinct phylogenetic groups and could be used as stable genetic markers in genotyping. More importantly, we identified 34 new lineage- or evolutionary branch-specific RDs and 2 RvDs, and also demonstrated the distribution of known RDs and RvDs in MTBC. This study provides novel details about deletion events that have occurred in distinct phylogenetic groups and may help to understand their genealogical differentiation.
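As an illustration of what "specific to a phylogenetic group" can mean in practice, the following minimal Python sketch scores one deletion against one target group. It is not the authors' pipeline: the function name `deletion_specificity`, the toy strain labels, and the choice of sensitivity and specificity as metrics are assumptions made for this example only.

```python
from typing import Dict, Set, Tuple


def deletion_specificity(
    lineage_of: Dict[str, str],  # hypothetical mapping: strain ID -> lineage label
    deleted_in: Set[str],        # hypothetical set of strains carrying the deletion
    target: str,                 # lineage the deletion is proposed to mark
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a deletion as a marker for `target`."""
    in_group = {s for s, lin in lineage_of.items() if lin == target}
    out_group = set(lineage_of) - in_group

    # Strains of the target lineage that carry the deletion (true positives).
    true_pos = len(in_group & deleted_in)
    # Strains outside the target lineage that lack the deletion (true negatives).
    true_neg = len(out_group - deleted_in)

    sensitivity = true_pos / len(in_group) if in_group else float("nan")
    specificity = true_neg / len(out_group) if out_group else float("nan")
    return sensitivity, specificity


# Toy data, invented for illustration only.
toy_lineages = {"s1": "L2", "s2": "L2", "s3": "L4", "s4": "L4", "s5": "L1"}
toy_deleted = {"s1", "s2", "s4"}
sens, spec = deletion_specificity(toy_lineages, toy_deleted, "L2")
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A marker that is deleted in every strain of the target group and in no strain outside it would score 1.0 on both measures; in the toy data, one out-of-group strain also carries the deletion, which lowers the specificity.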
Keywords: H37Rv-related deletion; Mycobacterium tuberculosis complex; large sequence polymorphism; region of difference; structural variant
Year: 2022 PMID: 36160240 PMCID: PMC9493256 DOI: 10.3389/fmicb.2022.984582
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 6.064
FIGURE 1. Maximum-likelihood phylogenetic tree of MTBC strains in the sampled dataset. Clades were collapsed by lineage or sub-lineage, and the size of the external nodes does not represent the number of strains.
FIGURE 2. Characteristics of deletions in MTBC strains. (A) Distribution of deletions per genome among MTBC strains. (B) Distribution of individual deletion length among lineages. (C) Distribution of total deletion length per genome among MTBC strains. (D) Distribution of total uncoverage length per genome among MTBC strains. (E) Distribution of deletion detection efficiency among MTBC strains.
FIGURE 3. Deletion patterns of RDs and RvDs within MTBC.
FIGURE 4. Overlapping RDs. Only the RDs mentioned in the text are shown; RD317, RD306, and the deletions in the RD5 and RD1 regions are not included. (A) RD105 and RD105ext. (B) RD505 and RD505ext. (C) RD307 and RD147c. (D) RD7 and RD713. (E) RD743 and RD174. (F) RDs in the RD516 region. (G) RD8 and RD236a. (H) RDs in the N-RD25 region.
FIGURE 5. Gene covariation of the RvD4496 region in lineage 5, lineage 8, and H37Rv. Gray blocks represent pseudogenes.
FIGURE 6. RDs in M. caprae. The blue branches represent the M. caprae strains, and the black branch is M. canettii, used as the root. RD502cap_1 and RD502cap_2 are missing in all strains, and the remaining RDs are deleted in only some strains. RD528 is the longest deletion (38,328 bp) detected in this study and is deleted in 10 strains within the sampled dataset. Moreover, deletion of RD528 is detected in 4 additional strains out of all M. caprae strains.