Chaoyang Li1, Yunlin Zhao1, Zhenggang Xu1,2, Guiyan Yang1, Jiao Peng1, Xiaoyun Peng2.
Abstract
The lack of complete genomic information for Vicia sepium (Fabaceae: Fabeae) precludes investigations of the evolution and population diversity of this perennial high-protein forage plant, which is suitable for cultivation in extreme conditions. Here, we present the complete and annotated chloroplast genome of this important wild resource plant. The V. sepium chloroplast genome includes 76 protein-coding genes, 29 tRNA genes, 4 rRNA genes, and 1 pseudogene. Its 124,095 bp sequence has lost one inverted repeat (IR). The GC contents of the whole genome and of the protein-coding, intron, tRNA, rRNA, and intergenic spacer regions were 35.0%, 36.7%, 34.6%, 52.3%, 54.2%, and 29.2%, respectively. Comparative analyses with plastid genomes from related genera in Fabeae demonstrated that the greatest variation in V. sepium genome length occurred in protein-coding regions. In these regions, some genes and introns were lost or gained; for example, ycf4, clpP intron, and rpl16 intron deletions and rpl20 and ORF292 insertions were observed. Twelve highly divergent regions, 66 simple sequence repeats (SSRs), and 27 repeat sequences were also found in these regions. Detailed evolutionary rate analysis of protein-coding genes showed that Vicia species exhibit additional interesting characteristics, including positive selection of ccsA, clpP, rpl32, rpl33, rpoC1, rps15, rps2, rps4, and rps7 and significantly accelerated evolutionary rates of atpA, accD, and rps2. These genes are important candidates for understanding the evolutionary strategies of Vicia and other genera in Fabeae. The phylogenetic analysis showed that Vicia and Lens are included in the same clade and that Vicia is paraphyletic. These results provide evidence regarding the evolutionary history of the chloroplast genome.
Keywords: Vicia sepium; chloroplast genome; comparative analysis; phylogenetic analysis; positive selection
Year: 2020 PMID: 32153639 PMCID: PMC7044246 DOI: 10.3389/fgene.2020.00073
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Genes predicted in the chloroplast genome of V. sepium.
| Category | Group of genes | Names of genes |
|---|---|---|
| Self-replication | Large subunit of ribosomal proteins | |
| | Small subunit of ribosomal proteins | |
| | DNA-dependent RNA polymerase | |
| | rRNA genes | |
| | tRNA genes | |
| Photosynthesis | Photosystem I | |
| | Photosystem II | |
| | NADH dehydrogenase | |
| | Cytochrome b6/f complex | |
| | ATP synthase | |
| | Rubisco | |
| Other genes | Maturase | |
| | Protease | |
| | Envelope membrane protein | |
| | Subunit of acetyl-CoA carboxylase | |
| | C-type cytochrome synthesis gene | |
| Genes of unknown function | Conserved open reading frames | |
One open reading frame, ORF292, could not be annotated. a, pseudogene; b, trans-splicing gene; c, duplicated gene.
Figure 1. Gene map of the complete chloroplast genome of V. sepium. Genes inside the circle are transcribed clockwise, and those outside are transcribed counterclockwise. The different colors of the blocks represent different functional groups. The darker gray color of the inner circle corresponds to the GC content, and the lighter gray color corresponds to the AT content.
Lengths of introns and exons of the split genes in the V. sepium complete chloroplast genome.
| Gene name | Strand | Start | End | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|---|---|
| | - | 17,922 | 20,213 | 552 | 1,200 | 540 | | |
| | + | 39,164 | 41,349 | 720 | 674 | 792 | | |
| | + | 49,205 | 50,732 | 393 | 700 | 435 | | |
| | + | 52,173 | 53,655 | 9 | 1,072 | 402 | | |
| | - | 57,360 | 58,556 | 9 | 714 | 474 | | |
| | - | 58,753 | 60,207 | 6 | 804 | 645 | | |
| | - | 74,347 | 75,592 | 168 | 670 | 411 | | |
| | - | 83,263 | 86,132 | 435 | 791 | 1,644 | | |
| | + | 92,455 | 93,604 | 363 | 559 | 228 | | |
| | + | 97,292 | 99,294 | 126 | 742 | 228 | 781 | 126 |
| | + | 9,320 | 9,976 | 39 | 581 | 37 | | |
| | - | 32,593 | 33,473 | 38 | 808 | 35 | | |
| | - | 33,539 | 34,292 | 42 | 677 | 35 | | |
| | + | 119,177 | 119,535 | 37 | 272 | 50 | | |
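
The split-gene table can be sanity-checked with simple arithmetic: if Start and End are 1-based, inclusive coordinates (an assumption, not stated in the source), the exon and intron lengths of each row should sum to End − Start + 1. The following is a minimal sketch of that check, not part of the authors' analysis. Gene names are not available in this copy of the table, so rows are keyed by their coordinates; the names `SPLIT_GENES` and `span_mismatches` are illustrative.

```python
# Rows copied from the split-gene table: (strand, start, end, [segment lengths in bp]).
# Segments are listed in table order: Exon I, Intron I, Exon II, (Intron II, Exon III).
SPLIT_GENES = [
    ("-", 17922, 20213, [552, 1200, 540]),
    ("+", 39164, 41349, [720, 674, 792]),
    ("+", 49205, 50732, [393, 700, 435]),
    ("+", 52173, 53655, [9, 1072, 402]),
    ("-", 57360, 58556, [9, 714, 474]),
    ("-", 58753, 60207, [6, 804, 645]),
    ("-", 74347, 75592, [168, 670, 411]),
    ("-", 83263, 86132, [435, 791, 1644]),
    ("+", 92455, 93604, [363, 559, 228]),
    ("+", 97292, 99294, [126, 742, 228, 781, 126]),
    ("+", 9320, 9976, [39, 581, 37]),
    ("-", 32593, 33473, [38, 808, 35]),
    ("-", 33539, 34292, [42, 677, 35]),
    ("+", 119177, 119535, [37, 272, 50]),
]


def span_mismatches(rows):
    """Return (start, end, span, total) for rows whose segments do not fill the span."""
    bad = []
    for _strand, start, end, segments in rows:
        span = end - start + 1  # assumes 1-based, inclusive coordinates
        total = sum(segments)   # exons plus introns, in bp
        if span != total:
            bad.append((start, end, span, total))
    return bad


if __name__ == "__main__":
    for start, end, span, total in span_mismatches(SPLIT_GENES):
        print(f"{start}-{end}: span {span} bp, segments sum to {total} bp")
```

Under this assumption, thirteen of the fourteen rows are internally consistent; the row at 74,347 to 75,592 has segments that sum to 1,249 bp against a span of 1,246 bp, which may reflect a transcription error in the table or a different coordinate convention for that gene.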
Figure 2. The types and distribution of SSRs along the chloroplast genome of V. sepium. Different locations, including CDS, IGS, CDS and IGS, and intron regions, are represented as colored boxes.
Characteristics of twenty-one Fabeae species and Cicer arietinum.
| Species | V. sep | V. sat | V. fab | P. aby | P. sat | P. satsub | P. ful | L. cul | L. pub | L. ven | L. pal | L. jap | L. och | L. dav | L. lit | L. inc | L. gra | L. tin | L. cly | L. sat | L. odo | C. ari |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Length (bp) | 124,095 | 122,467 | 123,722 | 122,174 | 122,169 | 121,958 | 120,837 | 122,967 | 126,421 | 125,459 | 124,287 | 124,242 | 123,911 | 123,895 | 123,734 | 123,153 | 122,438 | 122,165 | 121,263 | 121,020 | 120,289 | 125,319 |
| Genes | 110 | 109 | 110 | 108 | 110 | 108 | 107 | 108 | 108 | 110 | 109 | 109 | 109 | 108 | 108 | 108 | 109 | 108 | 110 | 109 | 107 | 108 |
| CDS genes* | 77 | 76 | 76 | 75 | 76 | 75 | 73 | 74 | 74 | 75 | 75 | 75 | 75 | 74 | 74 | 74 | 75 | 74 | 75 | 75 | 73 | 75 |
| tRNA genes | 29 | 29 | 30 | 29 | 30 | 29 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 31 | 30 | 30 | 29 |
| Introns** | 14 | 14 | 14 | 14 | 15 | 14 | 15 | 14 | 15 | 15 | 15 | 15 | 15 | 14 | 15 | 15 | 14 | 14 | 15 | 15 | 15 | 15 |
| CDS region | 53.9% | 52.9% | 53.2% | 53.9% | 54.3% | 53.9% | 52.7% | 50.9% | 53.3% | 52.2% | 53.0% | 52.9% | 53.0% | 52.6% | 53.1% | 54.0% | 53.5% | 53.5% | 53.7% | 55.3% | 54.1% | 52.3% |
| Intron region | 8.9% | 9.0% | 8.8% | 9.1% | 9.7% | 9.1% | 9.2% | 9.0% | 8.8% | 8.8% | 8.8% | 8.8% | 8.8% | 8.9% | 8.9% | 9.1% | 8.5% | 9.2% | 9.3% | 9.7% | 9.2% | 9.7% |
| IGS region | 31.9% | 32.7% | 32.6% | 31.5% | 30.4% | 31.5% | 32.5% | 34.7% | 32.6% | 33.7% | 32.7% | 32.8% | 32.8% | 33.1% | 32.6% | 31.4% | 32.5% | 31.8% | 31.5% | 29.4% | 31.0% | 32.6% |
| tRNA region | 1.7% | 1.7% | 1.8% | 1.8% | 1.9% | 1.8% | 1.8% | 1.7% | 1.7% | 1.8% | 1.8% | 1.8% | 1.8% | 1.8% | 1.8% | 1.8% | 1.8% | 1.8% | 1.9% | 1.9% | 1.8% | 1.8% |
| rRNA region | 3.6% | 3.7% | 3.6% | 3.7% | 3.7% | 3.7% | 3.7% | 3.7% | 3.6% | 3.6% | 3.7% | 3.7% | 3.6% | 3.7% | 3.6% | 3.7% | 3.7% | 3.7% | 3.7% | 3.8% | 3.7% | 3.6% |
| Genome GC | 35.0% | 35.2% | 34.6% | 34.8% | 34.8% | 34.8% | 34.9% | 34.4% | 35.0% | 34.9% | 34.9% | 34.9% | 34.9% | 34.9% | 34.8% | 34.7% | 35.0% | 34.9% | 35.0% | 35.1% | 35.2% | 33.9% |
| CDS GC | 36.7% | 36.9% | 36.6% | 36.8% | 36.8% | 36.8% | 36.9% | 36.6% | 36.7% | 36.7% | 36.7% | 36.7% | 36.7% | 36.8% | 36.6% | 36.6% | 36.7% | 36.7% | 36.9% | 36.9% | 37.0% | 36.3% |
| Intron GC | 34.6% | 34.7% | 34.5% | 34.4% | 34.1% | 34.4% | 34.3% | 34.1% | 34.6% | 34.7% | 34.7% | 34.7% | 34.7% | 34.7% | 34.6% | 34.1% | 35.0% | 34.3% | 34.5% | 34.2% | 34.3% | 33.5% |
| IGS GC | 29.2% | 29.4% | 28.0% | 28.2% | 28.1% | 28.2% | 28.5% | 28.4% | 29.4% | 29.1% | 28.8% | 28.9% | 29.0% | 28.9% | 28.8% | 28.4% | 28.9% | 28.8% | 28.6% | 28.6% | 29.0% | 27.0% |
| tRNA GC | 52.3% | 52.5% | 52.4% | 52.8% | 52.7% | 52.8% | 52.7% | 52.2% | 52.4% | 52.8% | 52.7% | 52.8% | 52.8% | 52.9% | 52.8% | 52.7% | 52.8% | 52.4% | 52.6% | 52.7% | 52.6% | 52.7% |
| rRNA GC | 54.2% | 54.2% | 54.2% | 54.0% | 53.9% | 54.0% | 54.0% | 53.8% | 54.0% | 54.2% | 54.1% | 54.1% | 54.2% | 53.9% | 54.2% | 54.1% | 54.0% | 54.0% | 54.3% | 53.6% | 53.7% | 54.2% |
| Gene losses | | | | | | | | | | | | | | | | | | | | | | |
| Gene gains | | | | | | | | | | | | | | | | | | | | | | |
All of these species contain four rRNA genes except for L. ven (L. venosus). The full names of the twenty-two species are as follows: V. sep, V. sepium; V. sat, V. sativa; V. fab, V. faba; P. aby, P. abyssinicum; P. sat, P. sativum; P. satsub, P. sativum subsp. elatius; P. ful, P. fulvum; L. cul, L. culinaris; L. pub, L. pubescens; L. ven, L. venosus; L. pal, L. palustris; L. jap, L. japonicus; L. och, L. ochroleucus; L. dav, L. davidii; L. lit, L. littoralis; L. inc, L. inconspicuus; L. gra, L. graminifolius; L. tin, L. tingitanus; L. cly, L. clymenum; L. sat, L. sativus; L. odo, L. odoratus; C. ari, C. arietinum. *Pseudogenes: rpl23 in V. sepium, V. sativa, P. abyssinicum, P. sativum, P. sativum subsp. elatius, and L. sativus; ycf1 in L. culinaris; ycf4 in P. sativum. **Intron gains: one intron added to tRNA-Gly (V. faba, P. sativum, L. sativus, C. arietinum) and ycf2 (P. fulvum, L. pubescens, L. venosus, L. palustris, L. japonicus, L. ochroleucus, L. littoralis, L. inconspicuus, L. graminifolius, L. clymenum, L. odoratus). Intron losses: one intron missing in clpP (L. graminifolius) and rpl16 (V. faba).
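
The region shares and region GC contents in this table can be cross-checked against the genome-wide GC content: if the five regions partition the genome, their shares should sum to about 100%, and the share-weighted mean of the region GC values should approximate the reported genome GC. The sketch below applies this check to the V. sepium values, which also appear in the abstract. The weighted-mean relation is a consistency check introduced here for illustration, not a method described in the source, and the names `V_SEPIUM`, `total_share`, and `weighted_gc` are hypothetical.

```python
# V. sepium values from the table (percent): region -> (share of genome length, region GC).
V_SEPIUM = {
    "CDS": (53.9, 36.7),
    "intron": (8.9, 34.6),
    "IGS": (31.9, 29.2),
    "tRNA": (1.7, 52.3),
    "rRNA": (3.6, 54.2),
}
REPORTED_GENOME_GC = 35.0  # percent, from the "Genome GC" row


def total_share(regions):
    """Sum of the regional shares of genome length, in percent."""
    return sum(share for share, _gc in regions.values())


def weighted_gc(regions):
    """Share-weighted mean GC content across regions, in percent."""
    weighted_sum = sum(share * gc for share, gc in regions.values())
    return weighted_sum / total_share(regions)


if __name__ == "__main__":
    print(f"regions cover {total_share(V_SEPIUM):.1f}% of the genome")
    print(f"weighted GC {weighted_gc(V_SEPIUM):.2f}% vs reported {REPORTED_GENOME_GC:.1f}%")
```

Because the table values are rounded to one decimal place, small differences between the weighted estimate and the reported genome GC are expected.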
Figure 3. The sequence identity of 22 Fabaceae species. The inner circle is the reference genome. The subsequent circles represent the sequence identity between V. sepium and the 21 other species. The outermost circle corresponds to the protein-coding genes and intergenic spacer regions. Genes with clockwise arrows are on the reverse strand, while genes with counterclockwise arrows are on the forward strand.
Figure 4. The Ka/Ks ratios of homologous protein-coding genes within and outside of the genus Vicia, with V. sepium as the reference. White boxes represent the mean Ka/Ks values within the genus Vicia, and black boxes indicate the mean Ka/Ks values outside of the genus Vicia. The data are the arithmetic mean ± SE. Symbols under the gene names indicate levels of statistical significance between the species within Vicia and the species outside of Vicia: no symbol, P > 0.05; open circle, P = 0.01–0.05; black circle, P < 0.01. The X-axis denotes the homologous genes.
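
The summary statistics named in the Figure 4 caption can be sketched as follows: the arithmetic mean and standard error (SE) of a set of Ka/Ks values, plus the caption's mapping from a P value to a significance symbol. This is a minimal illustration under stated assumptions. SE is taken as the sample standard deviation divided by the square root of n, the handling of P values exactly at 0.01 and 0.05 is assumed, and the input values in the usage example are hypothetical placeholders, not data from the study. The source does not state which statistical test produced the P values, so none is computed here.

```python
import math
import statistics


def mean_se(values):
    """Arithmetic mean and standard error (sample SD / sqrt(n)) of Ka/Ks values."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


def significance_symbol(p_value):
    """Map a P value to the symbol convention given in the figure caption."""
    if p_value < 0.01:
        return "black circle"
    if p_value <= 0.05:  # boundary handling is an assumption
        return "open circle"
    return ""  # no symbol: P > 0.05


if __name__ == "__main__":
    # Hypothetical Ka/Ks values for one gene across several species.
    example = [0.2, 0.4, 0.6]
    mean, se = mean_se(example)
    print(f"mean = {mean:.3f}, SE = {se:.3f}")
    print(repr(significance_symbol(0.03)))
```

A mean Ka/Ks above 1 is conventionally read as a signal of positive selection, and a value below 1 as purifying selection.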
Figure 5. Synonymous and nonsynonymous divergence in the Fabeae chloroplast genes. All tree topologies were completely constrained as described in the Methods section. All trees were drawn to the same scale representing the number of substitutions per synonymous or nonsynonymous site.
Figure 6. Phylogenetic relationships based on the conserved chloroplast protein-coding sequences of 21 Fabeae species and C. arietinum with the maximum likelihood (ML) method and the neighbor-joining (NJ) method. C. arietinum was selected as the outgroup. Numbers on the left and right sides of the branches represent bootstrap values of the ML method and the NJ method, respectively.