Tingting Yang, Jun Zhong, Ju Zhang, Cuidan Li, Xia Yu, Jingfa Xiao, Xinmiao Jia, Nan Ding, Guannan Ma, Guirong Wang, Liya Yue, Qian Liang, Yongjie Sheng, Yanhong Sun, Hairong Huang, Fei Chen.
Abstract
Tuberculosis (TB) has been the leading infectious disease killer worldwide since 2014, when it surpassed HIV. The main pathogen, Mycobacterium tuberculosis (Mtb), contains ~4,000 genes that account for ~90% of the genome. However, it is still unclear which of these genes are primary or secondary, which are responsible for generality or individuality, and which interconvert during evolution. Here we used a pan-genomic analysis of 36 Mtb genomes to address these questions. We identified 3,679 Mtb core (i.e., primary) genes, which determine the phenotypic generality of Mtb (e.g., virulence, slow growth, dormancy). We also observed 1,122 dispensable and 964 strain-specific secondary genes, reflecting partially shared and lineage-/strain-specific individualities. Among these, five L2 lineage-specific genes might be related to the increased virulence of the L2 lineage. Notably, we discovered 28 Mtb "Super Core Genes" (SCGs: more than one copy in at least 90% of strains), which might be of increased importance and reflect the "super phenotype generality." Most SCGs encode PE/PPE proteins, virulence factors, antigens, and transposases, and have been verified as playing crucial roles in Mtb pathogenicity. Further investigation of the 28 SCGs demonstrated interconversion among SCGs, single-copy core genes, dispensable genes, and strain-specific genes through copy number variations (CNVs) during evolution; different mutations on different copies highlight the delicate regulation of adaptive evolution among Mtb lineages. This indicates that the importance of genes varies through CNVs, which might be driven by selective pressure from environment/host adaptation. In addition, compared with Mycobacterium bovis (Mbo), Mtb possesses 48 specific single core genes that partially reflect the differences between Mtb and Mbo individuality.
Keywords: Mycobacterium tuberculosis (Mtb); adaptive evolution; copy number variation (CNV); core gene; host-adaptation; infectious disease; pan-genome; selection pressure
Year: 2018 PMID: 30177918 PMCID: PMC6109687 DOI: 10.3389/fmicb.2018.01886
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. Pan-genome of Mtb. (A) Gene accumulation curves of the pan-genome (blue) and core-genome (green). The blue boxes denote the Mtb pan-genome size for each genome for comparison. The green boxes show the Mtb core genome size for each genome for comparison. The curve is the least squares fit of the power law for the average values. (B) Curve (red) for the number of new genes with an increase in the number of Mtb genomes.
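The least squares power-law fit mentioned in the Figure 1 caption can be illustrated with a minimal sketch. It assumes the common form size = kappa * N^gamma and fits it by ordinary least squares on log-transformed values; the record does not say which fitting procedure the authors used, so the function name `fit_power_law`, the log-log approach, and the synthetic data are all assumptions made for illustration only.

```python
import math
from typing import List, Tuple


def fit_power_law(n_genomes: List[int], sizes: List[float]) -> Tuple[float, float]:
    """Fit size = kappa * N**gamma by least squares in log-log space.

    Hypothetical helper: a log-log linear regression is one simple way to
    fit a power law and is not necessarily the authors' procedure.
    """
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(s) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope of the regression line in log-log space is the exponent gamma.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    # Intercept in log space is log(kappa).
    kappa = math.exp(mean_y - gamma * mean_x)
    return kappa, gamma


# Synthetic, noise-free data (NOT values from the paper): 36 genomes.
genomes = list(range(1, 37))
synthetic_sizes = [4000.0 * n ** 0.05 for n in genomes]
kappa, gamma = fit_power_law(genomes, synthetic_sizes)
```

On real accumulation data the averages over many genome orderings would replace `synthetic_sizes`. An exponent close to zero would indicate a pan-genome that grows slowly as genomes are added.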
Figure 2. Flower plot showing the core, dispensable, and strain-specific genes of the 36 Mtb strains. The flower plot displays the core gene number (in the center), the dispensable gene number (in the annulus), and the strain-specific gene number (in the petals) for the 36 Mtb strains. The numbers under the strain name denote the total number of related genes. Different colors indicate different lineage strains: L1 strains, blue; L2 strains, green; L3 strains, salmon; L4 strains, gold.
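The three categories shown in the flower plot can be sketched from a gene presence table. The snippet below assumes the usual pan-genome convention (core: present in every strain; strain-specific: present in exactly one strain; dispensable: everything in between); the record does not spell out these cutoffs, and the function name `classify_pan_genome` and the toy strains are hypothetical.

```python
from typing import Dict, Set


def classify_pan_genome(presence: Dict[str, Set[str]]) -> Dict[str, str]:
    """Label each gene family as core, dispensable, or strain-specific.

    `presence` maps a strain name to the set of gene families found in it.
    Cutoffs follow the conventional definitions, which are an assumption here.
    """
    n_strains = len(presence)
    counts: Dict[str, int] = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1

    labels: Dict[str, str] = {}
    for gene, count in counts.items():
        if count == n_strains:
            labels[gene] = "core"             # found in every strain
        elif count == 1:
            labels[gene] = "strain-specific"  # found in exactly one strain
        else:
            labels[gene] = "dispensable"      # shared by some, not all
    return labels


# Toy input (not data from the paper).
toy = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2"},
    "strainC": {"g1", "g4"},
}
labels = classify_pan_genome(toy)
```

Counting the labels per category would give the center, annulus, and petal numbers of a flower plot for the toy data.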
Figure 3. The three levels of multi-copy core genes in the 36 Mtb strains. Rows represent the 36 Mtb strains and columns represent the “multi-copy core genes” (i.e., core genes with more than one copy in at least one strain). The “multi-copy core genes” were classified into one of three levels: “super-core genes (SCGs)”, “high-level core genes (HCGs)”, or “middle-level core genes (MCGs)”. SCGs were defined as those core genes containing more than one copy in more than 90% of strains; HCGs were defined as those core genes containing more than one copy in 50–90% of strains; and MCGs were defined as those core genes containing more than one copy in at least one strain. The color intensity indicates the copy number for each gene.
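The three-level scheme in the Figure 3 caption can be sketched as a simple threshold rule on per-strain copy numbers. The boundary handling is an assumption: the caption says "more than 90%" while the abstract says "at least 90%", and MCGs are read here as the remaining multi-copy core genes below 50%. The function name `classify_multicopy` and the toy copy numbers are hypothetical.

```python
from typing import Dict, List


def classify_multicopy(copy_numbers: List[int]) -> str:
    """Assign a core gene to SCG, HCG, MCG, or single-copy core.

    `copy_numbers` holds the gene's copy number in each strain (all >= 1,
    since the gene is core). Integer arithmetic avoids float rounding at
    the 90% and 50% boundaries.
    """
    n = len(copy_numbers)
    multi = sum(1 for c in copy_numbers if c > 1)
    if multi * 10 > n * 9:   # multi-copy in more than 90% of strains
        return "SCG"
    if multi * 2 >= n:       # multi-copy in 50-90% of strains
        return "HCG"
    if multi >= 1:           # multi-copy in at least one strain (assumed <50%)
        return "MCG"
    return "single-copy core"


# Toy copy-number table for 10 strains (not data from the paper).
toy_copies: Dict[str, List[int]] = {
    "geneA": [2] * 10,
    "geneB": [2] * 6 + [1] * 4,
    "geneC": [3] + [1] * 9,
    "geneD": [1] * 10,
}
tiers = {gene: classify_multicopy(c) for gene, c in toy_copies.items()}
```

Switching the first comparison to `>=` would implement the abstract's "at least 90%" wording instead of the caption's.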
Figure 4. Schematic diagram showing the interconversion of the 28 Mtb SCGs during evolution. Rows represent the 36 Mtb strains, 13 Mbo strains, and 5 STB strains; columns represent the 28 Mtb SCGs. The color intensity indicates the copy number of each SCG.
Figure 5. Schematic diagram showing the interconversion of SCGs, core genes, dispensable genes, and strain-specific genes following evolution under different conditions. Here, “interconversion” means that strain-specific genes, dispensable genes, core genes, and SCGs may be transformed into one another during evolution through CNVs. Increases in gene copies (Gain) lead to the conversion from strain-specific genes to dispensable genes, to core genes, and to SCGs. In contrast, decreases in gene copies (Loss) result in the transformation from SCGs to core genes, to dispensable genes, and to strain-specific genes. Previous research indicated that the interconversion among these genes might be driven by selection pressure from environment/host adaptation.
Figure 6. Ka/Ks ratios for all types of genes in Mtb strains. The schematic diagram shows the Ka/Ks ratios for all-genes, SCGs, HCGs, MCGs, single-copy core genes, and dispensable genes in Mtb strains. Different colors indicate different kinds of genes: all-genes, gray; SCGs, red; HCGs, yellow; MCGs, green; single-copy core genes, purple; dispensable genes, blue. The boxplots show the median values.