Junjie Yang, Sheng Yang.
Abstract
BACKGROUND: Corynebacterium glutamicum is a non-pathogenic bacterium widely used in industrial amino acid production and metabolic engineering research. Although genome sequences are available for some C. glutamicum strains, a comprehensive comparative genome analysis of this species has not been done. In this study, six wild-type C. glutamicum strains were sequenced using next-generation sequencing technology. Together with 20 previously reported strains, they form the basis of a comprehensive comparative analysis of C. glutamicum genomes.
Keywords: Comparative genomics; Corynebacterium glutamicum; Pan-genome; Production of amino acids
Year: 2017 PMID: 28198668 PMCID: PMC5310272 DOI: 10.1186/s12864-016-3255-4
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Detailed descriptions and allelic profiles of the strains used in this study
| No. | Group | ST | Strains | Synonym | Descriptions | Ancestor (c) | Chromosome/Draft contigs (a) | Genome size (bp) | C + G content (%) | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | ATCC13032 | | | – | NC_003450.1 | 3,309,401 | 53.81 | | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | ATCC13032 | | | – | NC_006958.1 | 3,282,708 | 53.84 | | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 3 | 1 | 1 | K51 | | Substrain of ATCC13032 | ATCC13032 | NC_020519.1 | 3,309,400 | 53.8 | | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 4 | 1 | 1 | MB001 | | prophage-free variant of ATCC 13032 with a 6% reduced genome | ATCC13032 | NC_022040.1 | 3,079,253 | 54.21 | | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 5 | 1 | 1 | ATCC21300 | | Producing lysine, derived from ATCC13032 | ATCC13032 | DDBJ SRA: DRR001643 (b) | 3,243,227 | 53.84 | | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 6 | 2 | 2 | ATCC13869 | | “wild-type” | – | LOQU01000000 | 3,311,939 | 54.25 | 4.5 kb/X03987.1 | 1 | 2 | 2 | 4 | 2 | 4 | 4 |
| 7 | 3 | 3 | ATCC13870 | | “wild-type” | – | LOQV01000000 | 3,360,227 | 54.02 | | 4 | 6 | 5 | 5 | 6 | 1 | 1 |
| 8 | 4 | 4 | ATCC14067 | | “wild-type” | – | AGQQ02000000 | 3,311,083 | 54.15 | | 3 | 2 | 4 | 6 | 2 | 2 | 2 |
| 9 | 4 | 5 | ATCC21493 | | Producing arginine, derived from ATCC 14067 (SIIM B234) | ATCC14067 | LOQX01000000 | 3,275,235 | 54.10 | | 3 | 2 | 4 | 6 | 2 | 5 | 2 |
| 10 | 4 | 11 | SYPS-062 | | L-serine overproduction | unknown | JXBH01000000 | 3,214,861 | 53.96 | | 3 | 2 | 4 | 6 | 2 | 2 | 5 |
| 11 | 4 | 11 | SYPS-062-33a | | L-serine overproduction, derived from SYPS-062 | unknown | JYEG01000000 | 3,211,995 | 53.95 | | 3 | 2 | 4 | 6 | 2 | 2 | 5 |
| 12 | 4 | 4 | ATCC15168 | | L-isoleucine production | unknown | CP011309 | 3,338,699 | 54.14 | | 3 | 2 | 4 | 6 | 2 | 2 | 2 |
| 13 | 5 | 6 | R | | | – | NC_009342.1 | 3,363,299 | 54.13 | | 5 | 3 | 7 | 3 | 3 | 2 | 1 |
| 14 | 6 | 7 | AS1.299 | | “wild-type” | – | LOQS01000000 | 3,109,311 | 54.18 | | 2 | 5 | 3 | 5 | 4 | 3 | 3 |
| 15 | 7 | 8 | 617(B1) | | A glutamate producing strain previously used in China (=CICC 10117, SIIM B1) | – | LOQY01000000 | 3,174,403 | 54.26 | 22 kb | 1 | 2 | 4 | 7 | 7 | 3 | 2 |
| 16 | 7 | 13 | B253 | | An important lysine-producing strain in China | unknown | CP010451 | 3,229,314 | 54.26 | 22 kb/CP010452 | 1 | 2 | 4 | 7 | 9 | 3 | 2 |
| 17 | 8 | 9 | T6-13 | | “wild-type” | – | LOQW01000000 | 3,263,419 | 53.98 | | 5 | 4 | 6 | 2 | 5 | 3 | 1 |
| 18 | 8 | 9 | SCgG1 | | Hyper-producing glutamate | unknown | NC_021351.1 | 3,350,620 | 53.93 | | 5 | 4 | 6 | 2 | 5 | 3 | 1 |
| 19 | 8 | 9 | SCgG2 | | Hyper-producing glutamate | unknown | NC_021352.1 | 3,350,619 | 53.93 | | 5 | 4 | 6 | 2 | 5 | 3 | 1 |
| 20 | 8 | 9 | Z188 | | Hyper-producing glutamate | unknown | AKXP01000000 | 3,283,833 | 53.93 | | 5 | 4 | 6 | 2 | 5 | 3 | 1 |
| 21 | 8 | 9 | S9114 | | A strain for industrial production of glutamate | T6-13 | AFYA01000000 | 3,262,889 | 53.90 | | 5 | 4 | 6 | 2 | 5 | 3 | 1 |
| 22 | 8 | 9 | AS1.542 | | “wild-type” | – | LOQT01000000 | 3,298,702 | 53.93 | | 5 | 4 | 6 | 2 | 5 | 3 | 1 |
| 23 | 8 | 10 | MT | | A mutant of AS1.542, producing arginine | AS1.542 | AQPS01000000 | 3,346,700 | 53.91 | | 6 | 4 | 6 | 2 | 5 | 3 | 1 |
| 24 | 8 | 10 | SYPA5-5 | | A mutant of AS1.542, producing arginine | AS1.542 | JPDH01000000 | 3,268,761 | 53.91 | | 6 | 4 | 6 | 2 | 5 | 3 | 1 |
| 25 | 9 | 12 | ATCC 21831(AR0) | | Producing L-arginine | unknown | CP007722 | 3,192,886 | 54.14 | 17 kb/CP007723 | 7 | 7 | 2 | 8 | 8 | 1 | 1 |
| 26 | 9 | 12 | AR1 | | Producing L-arginine, derived from ATCC 21831 | unknown | CP007724 | 3,162,487 | 54.13 | 17 kb/CP007725 | 7 | 7 | 2 | 8 | 8 | 1 | 1 |

(a) DDBJ/EMBL/GenBank accession number
(b) SRA: Sequence Read Archive
(c) According to references, ATCC/CGMCC record or DDBJ/EMBL/GenBank record
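
The relation between the ST column and the allelic profile can be illustrated with a short Python sketch. It assumes that the seven integers at the end of each row are allele numbers at seven loci (the loci are not named in this text), and it copies only a subset of the rows. The helper names `group_by_profile`, `profiles_consistent`, and `loci_differing` are hypothetical and used only for illustration.

```python
from collections import defaultdict

# (strain, ST, allelic profile) copied from a subset of the table rows.
# Assumption: the seven trailing integers of each row are the allele numbers.
ROWS = [
    ("ATCC13032", 1, (1, 1, 1, 1, 1, 1, 1)),
    ("ATCC14067", 4, (3, 2, 4, 6, 2, 2, 2)),
    ("ATCC21493", 5, (3, 2, 4, 6, 2, 5, 2)),
    ("SYPS-062", 11, (3, 2, 4, 6, 2, 2, 5)),
    ("ATCC15168", 4, (3, 2, 4, 6, 2, 2, 2)),
    ("AS1.542", 9, (5, 4, 6, 2, 5, 3, 1)),
    ("MT", 10, (6, 4, 6, 2, 5, 3, 1)),
]


def group_by_profile(rows):
    """Collect (strain, ST) pairs that share an identical allelic profile."""
    groups = defaultdict(list)
    for strain, st, profile in rows:
        groups[profile].append((strain, st))
    return dict(groups)


def profiles_consistent(groups):
    """True if every distinct allelic profile maps to exactly one ST."""
    return all(len({st for _, st in members}) == 1 for members in groups.values())


def loci_differing(profile_a, profile_b):
    """Number of loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


groups = group_by_profile(ROWS)
for profile, members in groups.items():
    print(profile, members)
```

In this subset, ATCC21493 differs from its ancestor ATCC14067 at a single locus, which is why the two strains share a group but carry different STs.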
Fig. 1 Phylogenetic trees based on the genome sequence of 26 C. glutamicum strains. YS314 was designated the out-group. The dendrogram was calculated by the CVTree Web interface using a composition vector (CV) approach. Figtree was used to draw the phylogenetic tree and produce the figure
Fig. 2 Pan-genome calculation of C. glutamicum using nine strains. a Core genes and pan genes calculation. The blue line shows the pan-genome development, fitted by the function $y = 1161\,x^{0.416} + 1821$. The green line shows the core genes calculation, fitted by the function $y = 1364\,e^{-0.802x} + 2359$, where 2359 is the asymptotic number of core genes regardless of how many genomes are added into the C. glutamicum pan-genome. b New (unique) genes of the pan-genome. The horizontal dashed line (orange) indicates the asymptotic value of the function $y = 612\,x^{-0.68}$. The figures were produced by PanGP
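
The three fitted curves quoted in the Fig. 2 caption can be evaluated directly. The Python sketch below is illustrative only: it assumes that x is the number of genomes included, and the function names are hypothetical. The coefficients are the ones given in the caption.

```python
import math


def pan_genome_size(n_genomes):
    """Pan-genome curve from the caption: y = 1161 * x^0.416 + 1821."""
    return 1161 * n_genomes ** 0.416 + 1821


def core_genome_size(n_genomes):
    """Core-genome curve from the caption: y = 1364 * exp(-0.802 * x) + 2359."""
    return 1364 * math.exp(-0.802 * n_genomes) + 2359


def new_genes(n_genomes):
    """New-gene curve from the caption: y = 612 * x^(-0.68)."""
    return 612 * n_genomes ** -0.68


# Assumption: x counts the genomes included, here 1 to 9 as in the figure.
for n in range(1, 10):
    print(n, round(pan_genome_size(n)), round(core_genome_size(n)), round(new_genes(n)))
```

Because the exponent 0.416 is positive, the pan-genome curve keeps growing as genomes are added, while the exponential term of the core-genome curve decays toward the constant 2359.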
Glutamate dehydrogenase (GDH) and cspB genes detected in strains
| Group | Strain | Synonym | GDH-NADP+ | GDH-NAD+ | cspB |
|---|---|---|---|---|---|
| 1 | ATCC13032 | | + | - | - |
| 2 | ATCC13869 | | + | + | + |
| 3 | ATCC13870 | | + | + | + |
| 4 | ATCC14067 | | + | + | + |
| 5 | R | | + | + | + |
| 6 | AS1.299 | | + | - | + |
| 7 | B1(617) | | + | - | + |
| 8 | T6-13 | | + | + | + |
| 8 | AS1.542 | | + | + | + |
| 9 | ATCC21831 (AR0) | | + | - | - |
SNP and InDel distribution in amino acid biosynthetic pathway
| Strains | Production | Ref. genome | SNP and InDel in genes | Gene description |
|---|---|---|---|---|
| ATCC21300 | lysine | ATCC13032 | | |
| B253 | lysine | B1 | | |
| ATCC21493 | arginine | ATCC14067 | KIQ_011285: p.Gly159Asp; | KIQ_011285: arginine repressor |
| SYPS-062 | serine | ATCC14067 | KIQ_000725: p.Leu103Phe; | KIQ_000725: serine acetyltransferase |
| SYPS-062-33a | serine | ATCC14067 | KIQ_000725: p.Leu103Phe; | |
| ATCC15168 | isoleucine | ATCC14067 | KIQ_005265: p.Ser248Phe; | KIQ_005265: 2-isopropylmalate synthase; |
| MT | arginine | AS1.542 | | |
| SYPA5-5 | arginine | AS1.542 | | |
| SCgG1 | glutamate | T6-13 | | |
| SCgG2 | glutamate | T6-13 | | |
| Z188 | glutamate | T6-13 | | |
| S9114 | glutamate | T6-13 | | |
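
Entries such as `KIQ_011285: p.Gly159Asp;` combine a locus tag with a protein-level substitution (reference residue, position, new residue). As a minimal sketch of how such cells might be read programmatically, the Python snippet below parses one entry and prints the one-letter short form. The function names and the regular expression are assumptions made for illustration and are not part of the study's pipeline.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Hypothetical pattern for cells of the form "LOCUS_TAG: p.Xxx123Yyy;".
ENTRY = re.compile(r"^\s*(\w+):\s*p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2});?\s*$")


def parse_entry(cell):
    """Return (locus_tag, ref_residue, position, alt_residue) for one table cell."""
    match = ENTRY.match(cell)
    if match is None:
        raise ValueError(f"unrecognised entry: {cell!r}")
    locus, ref, pos, alt = match.groups()
    return locus, ref, int(pos), alt


def short_form(cell):
    """Render the substitution in one-letter form, e.g. Gly159Asp as G159D."""
    _, ref, pos, alt = parse_entry(cell)
    return f"{THREE_TO_ONE[ref]}{pos}{THREE_TO_ONE[alt]}"


for cell in ("KIQ_011285: p.Gly159Asp;", "KIQ_000725: p.Leu103Phe;", "KIQ_005265: p.Ser248Phe;"):
    print(parse_entry(cell), short_form(cell))
```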
Fig. 3 Phylogenomic trees of ATCC 14067, AS1.542, T6-13, and related strains. a ATCC 14067 and related strains. b AS1.542, T6-13, and related strains. The blue lines show the branch from AS1.542 to the arginine-producing strains MT and SYPA5-5; the red lines show the branch from T6-13 to the glutamate-producing strains SCgG1, SCgG2, Z188, and S9114. Wombac was used to find genome SNPs and build phylogenomic trees for these strains. Figtree was used to draw the phylogenetic trees and produce the figures
Fig. 4 Pipeline for genome sequence analysis of amino acid producing C. glutamicum strains. The major steps are marked in red (MLST), yellow (phylogenomic analysis using SNPs) and blue (SNPs/Indels/SVs detection and annotation)