Yong-Bi Fu, Pingchuan Li, Bill Biligetu.
Abstract
Chloroplast (cp) genomics will play an important role in the characterization of crop wild relative germplasm conserved in worldwide gene banks, thanks to advances in genome sequencing. We applied a multiplexed shotgun sequencing procedure to sequence the cp genomes of 25 Avena species with variable ploidy levels. Bioinformatics analysis of the acquired sequences generated 25 de novo genome assemblies ranging from 135,557 to 136,006 bp. The gene annotations revealed 130 genes and their duplications, along with four to six pseudogenes, for each genome. Few differences in genome structure and gene arrangement were observed across the 25 species. Polymorphism analyses identified 1313 polymorphic sites and revealed an average of 277 microsatellites per genome. Greater nucleotide diversity was observed in the short single-copy region. Genome-wide scanning of selection signals suggested that six cp genes were under positive selection on some amino acids. These research outputs allow for a better understanding of oat cp genomes and evolution, and they form an essential set of cp genomic resources for studies of oat evolutionary biology and for oat wild relative germplasm characterization.
Keywords: Avena; Crop wild relative; chloroplast gene; chloroplast genome; positive selection
Year: 2019 PMID: 31652703 PMCID: PMC6918232 DOI: 10.3390/plants8110438
Source DB: PubMed Journal: Plants (Basel) ISSN: 2223-7747
List of 25 studied Avena species from six botanical sections and their cp genome assemblies.
| Section/Species | Pl | Raw Reads | CPG Size (bp) | LSC (bp) | IRb (bp) | SSC (bp) | IRa (bp) | GC% | NCBI Acc# | PGRC Acc# |
|---|---|---|---|---|---|---|---|---|---|---|
| Ventricosa | | | | | | | | | | |
| | 2x | 3,399,167 | 135,681 | 79,793 | 21,614 | 12,660 | 21,614 | 38.41 | MG687301 | CN21992 |
| | 2x | 3,037,449 | 135,557 | 79,667 | 21,619 | 12,652 | 21,619 | 38.43 | MG687303 | CN19205 |
| | 2x | 3,932,778 | 135,560 | 79,669 | 21,619 | 12,653 | 21,619 | 38.43 | MG687291 | CN19256 |
| Agraria | | | | | | | | | | |
| | 2x | 3,138,274 | 135,935 | 80,099 | 21,605 | 12,626 | 21,605 | 38.49 | MG687300 | CN25788 |
| | 2x | 3,145,903 | 135,939 | 80,101 | 21,606 | 12,626 | 21,606 | 38.49 | MG687310 | CN3145 |
| | 2x | 3,342,392 | 135,934 | 80,100 | 21,604 | 12,626 | 21,604 | 38.48 | MG687306 | CN79350 |
| | 2x | 2,618,401 | 135,938 | 80,102 | 21,605 | 12,626 | 21,605 | 38.48 | MG687309 | CN22002 |
| Tenuicarpa | | | | | | | | | | |
| | 2x | 4,285,394 | 135,955 | 80,147 | 21,598 | 12,612 | 21,598 | 38.52 | MG687297 | CN25449 |
| | 2x | 2,906,067 | 135,925 | 80,101 | 21,602 | 12,620 | 21,602 | 38.51 | MG687302 | CN19458 |
| | 2x | 4,092,127 | 136,006 | 80,168 | 21,606 | 12,626 | 21,606 | 38.48 | MG687299 | CN25859 |
| | 2x | 4,372,976 | 135,944 | 80,109 | 21,605 | 12,625 | 21,605 | 38.49 | MG687296 | CN24315 |
| | 2x | 4,425,054 | 135,879 | 80,166 | 21,603 | 12,507 | 21,603 | 38.52 | MG687295 | CN25936 |
| | 2x | 4,288,420 | 135,728 | 79,881 | 21,605 | 12,637 | 21,605 | 38.53 | MG687305 | CN21407 |
| | 4x | 4,432,963 | 135,945 | 80,129 | 21,602 | 12,612 | 21,602 | 38.49 | MG687294 | CN25868 |
| | 4x | 4,362,663 | 135,946 | 80,111 | 21,605 | 12,625 | 21,605 | 38.49 | MG687311 | CN24462 |
| Pachycarpa | | | | | | | | | | |
| | 4x | 4,001,844 | 135,967 | 80,130 | 21,603 | 12,631 | 21,603 | 38.50 | MG674209 | CN19178 |
| | 4x | 4,160,684 | 135,887 | 80,102 | 21,604 | 12,577 | 21,604 | 38.51 | MG687298 | CN23057 |
| | 4x | 5,199,242 | 135,892 | 80,108 | 21,604 | 12,576 | 21,604 | 38.51 | MG687312 | CN21989 |
| Ethiopica | | | | | | | | | | |
| | 4x | 3,870,454 | 135,946 | 80,111 | 21,605 | 12,625 | 21,605 | 38.49 | MG687304 | CN22413 |
| | 4x | 2,822,158 | 135,942 | 80,109 | 21,604 | 12,625 | 21,604 | 38.49 | MG687293 | CN22064 |
| Avena | | | | | | | | | | |
| | 6x | 4,120,412 | 135,889 | 80,106 | 21,604 | 12,575 | 21,604 | 38.51 | MG687307 | CN21948 |
| | 6x | 4,330,653 | 135,900 | 80,117 | 21,604 | 12,575 | 21,604 | 38.51 | MG687292 | CN24926 |
| | 6x | 2,947,248 | 135,893 | 80,114 | 21,602 | 12,575 | 21,602 | 38.51 | MG687314 | CN25946 |
| | 6x | 2,467,386 | 135,888 | 80,107 | 21,603 | 12,575 | 21,603 | 38.51 | MG687308 | CN20625 |
| | 6x | 5,329,202 | 135,886 | 80,107 | 21,602 | 12,575 | 21,602 | 38.51 | MG687313 | CN24549 |
Note: Pl is for ploidy level. CPG is chloroplast genome. LSC, SSC, and IR are large single copy, small single copy and inverted repeat regions, respectively. NCBI Acc# are the accession numbers for the cp assemblies deposited in the National Center for Biotechnology Information (NCBI). PGRC Acc# are the accession numbers for the studied samples obtained from the oat collection at Plant Gene Resources of Canada (PGRC).
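The region sizes in the table can be cross-checked against the reported genome sizes, since a quadripartite cp genome should total LSC + IRb + SSC + IRa with two inverted repeat copies of equal length. The following is a minimal sketch of that check on three rows copied from the table; the function name `check_row` and the tuple layout are illustrative assumptions, not part of the study's pipeline.

```python
# Quadripartite consistency check: total size should equal LSC + IRb + SSC + IRa.
rows = [
    # (NCBI accession, total, LSC, IRb, SSC, IRa), values copied from the table
    ("MG687301", 135_681, 79_793, 21_614, 12_660, 21_614),
    ("MG687303", 135_557, 79_667, 21_619, 12_652, 21_619),
    ("MG687295", 135_879, 80_166, 21_603, 12_507, 21_603),
]

def check_row(total, lsc, irb, ssc, ira):
    """Return True when the regions sum to the total and both IR copies match in length."""
    return lsc + irb + ssc + ira == total and irb == ira

# Map each accession to the outcome of its check.
results = {acc: check_row(*sizes) for acc, *sizes in rows}
print(results)
```

The same check extends to the remaining rows by adding their values to `rows`.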
Figure 1. Gene maps of three selected Avena chloroplast genomes: (A) A. eriantha (2x), (B) A. insularis (4x), and (C) A. sativa (6x). Each map is drawn counterclockwise, starting from the right. The larger circle shows the layout of the chloroplast genes by transcription direction: boxes outside the circle are transcribed counterclockwise and boxes inside are transcribed clockwise. The color of each gene box indicates the functional group that the gene belongs to. The smaller circle shows the GC content plot for the corresponding sample. LSC, large single-copy region; SSC, small single-copy region; IRa/b, inverted repeats. Intron-containing genes are marked with a ‘∗’ symbol and pseudogenes with a ‘Ψ’ symbol.
List of 130 genes and their duplications found in the plastids of 25 Avena species.
| Category | Gene* |
|---|---|
| Subunits of photosystem I | |
| Subunits of photosystem II | |
| Subunits of cytochrome b/f complex | |
| Subunits of ATP synthase | |
| Large subunit of rubisco | |
| Subunits of NADH-dehydrogenase | |
| Proteins of large ribosomal subunit | |
| Proteins of small ribosomal subunit | |
| Subunits of RNA polymerase | |
| Cytochrome c biogenesis | |
| Transfer RNAs | |
| Ribosomal RNAs | |
| Maturase | |
| Protease | |
| Conserved hypothetical genes | |
| Envelope membrane protein | |
| Translation initiation factor | |
* The superscript a means the gene contains intron(s). The number in parentheses after a gene shows the number of duplications for the gene in the other genome regions.
Figure 2. Percent identity plot by mVISTA of nine Avena chloroplast genome assemblies representing four diploid, three tetraploid, and two hexaploid Avena species, using A. eriantha as the reference. The vertical scale indicates the percentage of identity, ranging from 98.385% to 99.377%. Coding regions are in blue and non-coding regions are in orange.
Simple sequence repeat (SSR) polymorphism found in the plastids of 25 Avena species.
| SSR type | A | A | A | A | A | A | A | A | A | A | A | C | C | C | C | C | C | AT | AT | AT | AG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Repeat count | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 8 | 9 | 10 | 11 | 12 | 14 | 5 | 6 | 7 | 5 |
| | 62 | 33 | 14 | 5 | 1 | 1 | 2 | 0 | 0 | 0 | 1 | 4 | 3 | 1 | 0 | 0 | 0 | 4 | 0 | 1 | 1 |
| | 58 | 32 | 12 | 8 | 1 | 1 | 2 | 0 | 0 | 0 | 1 | 3 | 1 | 3 | 0 | 1 | 0 | 3 | 1 | 0 | 1 |
| | 57 | 30 | 15 | 8 | 1 | 1 | 2 | 0 | 0 | 0 | 1 | 3 | 1 | 3 | 1 | 0 | 0 | 3 | 1 | 0 | 1 |
| | 67 | 27 | 17 | 6 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 7 | 4 | 0 | 0 | 0 | 0 | 4 | 1 | 0 | 1 |
| | 67 | 27 | 16 | 7 | 1 | 0 | 2 | 1 | 1 | 0 | 0 | 7 | 2 | 2 | 0 | 0 | 0 | 4 | 1 | 0 | 1 |
| | 67 | 29 | 15 | 5 | 1 | 1 | 2 | 1 | 1 | 0 | 0 | 9 | 2 | 0 | 0 | 0 | 0 | 4 | 1 | 0 | 1 |
| | 67 | 29 | 15 | 2 | 4 | 2 | 1 | 1 | 1 | 0 | 0 | 7 | 4 | 0 | 0 | 0 | 0 | 4 | 1 | 0 | 1 |
| | 66 | 29 | 17 | 4 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 6 | 2 | 2 | 0 | 0 | 0 | 3 | 0 | 1 | 1 |
| | 66 | 27 | 13 | 7 | 4 | 1 | 1 | 0 | 0 | 1 | 0 | 5 | 4 | 0 | 0 | 0 | 1 | 3 | 0 | 1 | 1 |
| | 68 | 30 | 15 | 2 | 2 | 2 | 3 | 1 | 0 | 0 | 0 | 7 | 2 | 2 | 0 | 0 | 0 | 4 | 1 | 0 | 1 |
| | 66 | 28 | 14 | 7 | 1 | 0 | 1 | 1 | 2 | 0 | 0 | 7 | 1 | 2 | 1 | 0 | 0 | 4 | 1 | 0 | 1 |
| | 63 | 26 | 17 | 8 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 6 | 1 | 3 | 0 | 0 | 0 | 3 | 0 | 1 | 1 |
| | 63 | 28 | 12 | 9 | 3 | 0 | 2 | 0 | 0 | 0 | 0 | 7 | 1 | 0 | 2 | 0 | 0 | 3 | 0 | 1 | 1 |
| | 62 | 29 | 12 | 9 | 2 | 0 | 1 | 1 | 0 | 0 | 0 | 5 | 4 | 0 | 0 | 0 | 0 | 3 | 0 | 1 | 1 |
| | 66 | 27 | 14 | 7 | 2 | 0 | 1 | 2 | 1 | 0 | 0 | 7 | 1 | 2 | 1 | 0 | 0 | 4 | 1 | 0 | 1 |
| | 63 | 30 | 11 | 8 | 2 | 0 | 2 | 0 | 1 | 0 | 0 | 6 | 1 | 3 | 0 | 0 | 0 | 3 | 0 | 1 | 1 |
| | 66 | 29 | 11 | 8 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 5 | 3 | 0 | 2 | 0 | 0 | 3 | 1 | 0 | 1 |
| | 63 | 29 | 12 | 8 | 2 | 0 | 1 | 1 | 0 | 0 | 0 | 7 | 0 | 1 | 2 | 0 | 0 | 3 | 1 | 0 | 1 |
| | 66 | 27 | 13 | 9 | 1 | 0 | 1 | 1 | 2 | 0 | 0 | 7 | 1 | 2 | 1 | 0 | 0 | 4 | 1 | 0 | 1 |
| | 66 | 27 | 13 | 10 | 0 | 0 | 2 | 1 | 1 | 0 | 0 | 7 | 3 | 0 | 1 | 0 | 0 | 4 | 1 | 0 | 1 |
| | 64 | 30 | 9 | 9 | 2 | 0 | 1 | 1 | 0 | 0 | 0 | 7 | 1 | 0 | 2 | 0 | 0 | 3 | 1 | 0 | 1 |
| | 64 | 30 | 11 | 8 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 7 | 0 | 1 | 2 | 0 | 0 | 3 | 1 | 0 | 1 |
| | 64 | 32 | 10 | 7 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 7 | 2 | 1 | 0 | 0 | 0 | 3 | 1 | 0 | 1 |
| | 64 | 29 | 9 | 11 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 7 | 1 | 2 | 0 | 0 | 0 | 3 | 1 | 0 | 1 |
| | 64 | 29 | 10 | 9 | 2 | 0 | 1 | 1 | 0 | 0 | 0 | 7 | 3 | 0 | 0 | 0 | 0 | 3 | 1 | 0 | 1 |
| Mean | 64.4 | 28.9 | 13.1 | 7.2 | 1.6 | 0.4 | 1.5 | 0.6 | 0.5 | 0.0 | 0.1 | 6.3 | 1.9 | 1.2 | 0.6 | 0.0 | 0.0 | 3.4 | 0.7 | 0.3 | 1.0 |
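To illustrate how SSR counts of this kind can be tallied from a sequence, here is a minimal sketch that counts perfect mono- and dinucleotide repeats by motif and repeat count. The minimum repeat counts (8 for mononucleotides, 5 for dinucleotides) are assumptions inferred from the smallest values in the table columns; the source does not name the SSR detection tool or its settings, and this sketch does not merge complementary motifs (for example A with T).

```python
import re
from collections import Counter

# Assumed thresholds, inferred from the table columns: mononucleotide runs of
# at least 8 units and dinucleotide runs of at least 5 units.
MIN_UNITS = {1: 8, 2: 5}

def count_ssrs(seq):
    """Count perfect mono- and dinucleotide repeats as (motif, unit count) keys."""
    seq = seq.upper()
    counts = Counter()
    # Mononucleotide runs: one base repeated MIN_UNITS[1] or more times.
    for m in re.finditer(r"([ACGT])\1{%d,}" % (MIN_UNITS[1] - 1), seq):
        counts[(m.group(1), len(m.group(0)))] += 1
    # Dinucleotide runs: two different bases repeated MIN_UNITS[2] or more times.
    for m in re.finditer(r"(([ACGT])(?!\2)[ACGT])\1{%d,}" % (MIN_UNITS[2] - 1), seq):
        counts[(m.group(1), len(m.group(0)) // 2)] += 1
    return counts

# Toy sequence (not from the study): A x10, AT x5, and C x8, separated by G.
toy = "G" + "A" * 10 + "G" + "AT" * 5 + "G" + "C" * 8 + "G"
print(count_ssrs(toy))
```

Each key pairs a motif with its repeat count, which corresponds to one column of the table.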
Figure 3. Nucleotide diversity (Pi) from the sliding window analysis of 25 complete Avena chloroplast genome assemblies (window length: 2000 bp, step size: 200 bp). X-axis: position of the window midpoint; Y-axis: nucleotide diversity within each window.
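A sliding window analysis like the one in this figure can be sketched as follows, using the window length (2000 bp) and step size (200 bp) given in the caption. This is an illustrative implementation and not the software used in the study: the function names are hypothetical, the input is assumed to be a list of equal-length aligned sequences, and every character (including gaps) is treated as a distinct state, which dedicated tools may handle differently.

```python
from collections import Counter

def site_diversity(column):
    """Fraction of sequence pairs that differ at one alignment column."""
    n = len(column)
    sum_sq = sum(c * c for c in Counter(column).values())
    # Differing pairs = (n^2 - sum of squared state counts) / 2, out of n(n-1)/2 pairs.
    return (n * n - sum_sq) / (n * (n - 1))

def sliding_pi(alignment, window=2000, step=200):
    """Yield (window midpoint, mean per-site diversity) along an alignment."""
    length = len(alignment[0])
    per_site = [site_diversity([seq[i] for seq in alignment]) for i in range(length)]
    for start in range(0, length - window + 1, step):
        pi = sum(per_site[start:start + window]) / window
        yield start + window // 2, pi

# Toy alignment (not from the study), scanned with a small window for illustration.
toy_alignment = ["AAAAAA", "AAAAAA", "AAACAA", "AAACAT"]
windows = list(sliding_pi(toy_alignment, window=4, step=2))
print(windows)
```

Plotting the yielded midpoints against the diversity values gives a curve with the axes described in the caption.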
Log-likelihood values (lnL) and parameter estimates under models of variable ω ratios among sites of 19,941 codons in 25 Avena chloroplast genomes.
| Model Code | lnL | p-value for LRT<sup>a</sup> | Estimates of Parameters | Count of PSS<sup>b</sup> (NEB) | Count of PSS<sup>b</sup> (BEB) |
|---|---|---|---|---|---|
| M0 (One-Ratio) | -82795.020 | | ω = 0.11878 | | |
| M3 (Discrete) | -82781.913 | 0.000013 | p0 = 0.10199, p1 = 0.85429, (p2 = 0.04372), ω0 = 0.00000, ω1 = 0.00000, ω3 = 2.81808 | 0(114) | |
| M1a (Nearly neutral) | -82785.669 | | p0 = 0.90153, (p1 = 0.09847), ω0 = 0.00000, (ω1 = 1.00000) | | |
| M2a (Positive selection) | -82781.919 | 0.011758 | p0 = 0.94578, p1 = 0.02070, (p2 = 0.03352), ω0 = 0.00110, (ω1 = 1.00000), ω2 = 2.98551 | 114(0) | 6(0) |
| M7 (Beta) | -82784.258 | | p = 0.00500, q = 0.06086 | | |
| M8 (Beta and ω) | -82781.940 | 0.049250 | p0 = 0.97686, p = 0.05383, q = 1.11417, (p1 = 0.02314), ω = 3.81079 | 114(6) | 6(0) |
| M8a (Beta and ω > 1) | -82787.155 | | p0 = 0.95611, p = 0.07461, q = 0.93085, (p1 = 0.04389), ω = 1.00000 | | |
| M8 (Beta and ω) | -82781.940 | 0.000671 | p0 = 0.97686, p = 0.05383, q = 1.11417, (p1 = 0.02314), ω = 3.81079 | 114(6) | 6(0) |
<sup>a</sup> LRT = likelihood ratio test. <sup>b</sup> PSS = positively selected site; NEB = Naïve Empirical Bayes analysis; BEB = Bayes Empirical Bayes analysis. The first number is the count of PSS with posterior probabilities >50%, and the second number (in parentheses) is the count of PSS with posterior probabilities >95%.
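The likelihood ratio tests behind the table can be illustrated with a short sketch that computes 2ΔlnL for each nested model pair and its chi-square tail probability. The pairings (M0 vs M3, M1a vs M2a, M7 vs M8, M8a vs M8) and the degrees of freedom (4, 2, 2, 1) are the conventional choices for these site models and are assumptions here, as the source does not state them. The printed p-values need not match the table, since the source also does not say how its p-values were derived.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for the df values used here (1, 2, 4)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 4:
        return math.exp(-x / 2) * (1 + x / 2)
    raise ValueError("only df in {1, 2, 4} is handled in this sketch")

# Log-likelihood values copied from the table.
lnL = {"M0": -82795.020, "M3": -82781.913, "M1a": -82785.669, "M2a": -82781.919,
       "M7": -82784.258, "M8": -82781.940, "M8a": -82787.155}

# (null model, alternative model, assumed degrees of freedom)
comparisons = [("M0", "M3", 4), ("M1a", "M2a", 2), ("M7", "M8", 2), ("M8a", "M8", 1)]

def lrt(null, alt, df):
    """Return the LRT statistic 2 * (lnL_alt - lnL_null) and its chi-square p-value."""
    stat = 2 * (lnL[alt] - lnL[null])
    return stat, chi2_sf(stat, df)

for null, alt, df in comparisons:
    stat, p = lrt(null, alt, df)
    print(f"{null} vs {alt}: 2*dlnL = {stat:.3f}, df = {df}, p = {p:.6f}")
```

Closed-form tail probabilities are used so that the sketch needs only the standard library; a statistics package would serve equally well for arbitrary degrees of freedom.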