Qing Liu1,2, Xiaoyu Li3,4, Mingzhi Li5, Wenkui Xu5, Trude Schwarzacher3,6, John Seymour Heslop-Harrison7,8.
Abstract
BACKGROUND: Oat (Avena sativa L.) is a recognized health food, and the contributions of its different candidate A-genome progenitor species remain inconclusive. Here, we report chloroplast genome sequences of eleven Avena species to examine plastome evolutionary dynamics and to analyze phylogenetic relationships between oat and its wild congeneric relatives.
Keywords: Avena; Chloroplast genome; Evolution rate; Insertions/deletions; Intermolecular recombination; Phylogenomics; Single nucleotide polymorphisms; Tandem repeats
Year: 2020 PMID: 32878602 PMCID: PMC7466839 DOI: 10.1186/s12870-020-02621-y
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Fig. 1 Chloroplast genome map of Avena sativa_Liu312 (GWHAOPK01000000; the outer circle and rings “s-t”) and GView comparison [25] of thirteen Avena species and three outgroup species plastomes (rings “c-r”). Genes belonging to different functional groups are shown in different colors. Genes shown on the outside of the outer circle are transcribed counter-clockwise and those on the inside clockwise. The tRNA genes are represented by the one-letter amino acid code with anticodons. LSC, large single copy region; IR, inverted repeat; SSC, small single copy region. Rings “a-b” from the innermost ring denote GC skews and GC content deviations from A. sativa_Liu312 plastome GC content, respectively; rings “c-r” denote the plastome sequence comparison by BLAST between A. sativa_Liu312 and other species plastomes outwards in turn: A. eriantha_Liu435, A. ventricosa_Liu275, A. atlantica_Liu437, A. wiestii_Liu439, A. strigosa_Liu315, A. nuda_Liu443, A. hirtula_Liu299, A. sterilis_NC031650.1, A. sativa_NC027468.1, A. brevis_Liu289, A. sativa_Liu312, A. murphyi_Liu442, A. longiglumis_Liu438, wheat (Triticum aestivum_NC002762.1), maize (Zea mays_NC00166.1), and rice (Oryza sativa_NC008155.1); plastome similar and highly divergent locations are represented by continuous and interrupted track lines (except for 14 non-overlapping tracking bins across 16 rings “c-r”), respectively, with a heuristic considering the total number of plastome bases (135,887 bp to 135,998 bp) contributing to hits and their scores with 0–14 non-overlapping 10 kbp tracking bins. Rings “s-t” denote AT and GC content of Avena sativa_Liu312 plastome (GWHAOPK01000000) by OGDRAW [26]
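The GC skew and GC content deviation shown in rings “a-b” can be illustrated with a minimal sketch. It assumes the common definitions, skew = (G − C)/(G + C) and deviation = window GC content minus whole-sequence GC content; the function names and the window parameters are illustrative and are not taken from GView or OGDRAW.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    return (g + c) / total if total else 0.0


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def gc_deviation_windows(seq: str, window: int, step: int):
    """Per-window GC content minus the GC content of the whole sequence.

    Returns a list of (window start, deviation) pairs, 0-based starts.
    """
    genome_gc = gc_content(seq)
    return [
        (start, gc_content(seq[start:start + window]) - genome_gc)
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "GGCCAATT"  # toy sequence, not plastome data
    print(gc_content(toy), gc_skew("GGGC"))
    print(gc_deviation_windows(toy, window=4, step=4))
```

Positive deviations mark windows richer in GC than the sequence as a whole, and negative values mark AT-rich windows.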
The quantity and quality of the sequencing data and coverage depth of the assembled chloroplast genomes for eleven Avena species
| Taxa | Voucher | Raw data (Gbp) | Clean data (bp) | Clean reads count | Chloroplast genome reads | Average coverage depth (×) | Maximum coverage depth (×) | Chloroplast genome size (bp) | GC content (%) | Genome Warehouse accession |
|---|---|---|---|---|---|---|---|---|---|---|
| A. atlantica | Liu 437 | 32.6 | 26,768,534,000 | 107,074,136 | 2,903,043 | 5339 | 9373 | 135,940 | 38.48 | GWHAOPC01000000 |
| A. brevis | Liu 289 | 36.4 | 30,183,435,500 | 120,733,742 | 1,243,680 | 2288 | 2746 | 135,889 | 38.50 | GWHAOPA01000000 |
| A. eriantha | Liu 435 | 29.8 | 24,521,134,000 | 98,084,536 | 1,361,250 | 2504 | 5454 | 135,909 | 38.41 | GWHAOPE01000000 |
| A. hirtula | Liu 299 | 37.0 | 30,738,036,000 | 122,952,144 | 1,095,702 | 2015 | 2643 | 135,937 | 38.48 | GWHAOPJ01000000 |
| A. longiglumis | Liu 438 | 29.6 | 24,297,359,000 | 97,189,436 | 1,115,888 | 2052 | 4569 | 135,962 | 38.49 | GWHAOPH01000000 |
| A. murphyi | Liu 442 | 28.0 | 22,981,249,000 | 91,924,996 | 1,240,082 | 2281 | 5085 | 135,890 | 38.51 | GWHAOPF01000000 |
| A. nuda | Liu 443 | 41.9 | 33,688,961,000 | 134,755,844 | 1,773,953 | 3263 | 5956 | 135,935 | 38.48 | GWHAOPD01000000 |
| A. sativa | Liu 312 | 69.4 | 57,552,411,500 | 230,209,646 | 1,685,727 | 3101 | 4021 | 135,903 | 38.51 | GWHAOPK01000000 |
| A. strigosa | Liu 315 | 37.7 | 31,317,309,000 | 125,269,236 | 1,255,620 | 2309 | 2927 | 135,935 | 38.48 | GWHAOPI01000000 |
| A. ventricosa | Liu 275 | 17.4 | 14,267,461,000 | 57,069,844 | 1,039,219 | 1912 | 3178 | 135,910 | 38.41 | GWHAOPG01000000 |
| A. wiestii | Liu 439 | 15.4 | 12,685,761,500 | 50,743,046 | 888,788 | 1634 | 3004 | 135,998 | 38.48 | GWHAOPB01000000 |
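The columns of this table are arithmetically linked, which gives a quick consistency check. For the rows tested here, clean data divided by clean reads count gives 250 bp, and chloroplast reads × 250 bp ÷ genome size reproduces the reported average coverage depth. The sketch below checks three vouchers; the function names are illustrative, and the relation is inferred from the table values rather than stated by the authors.

```python
def mean_read_length(clean_bases: int, clean_reads: int) -> float:
    """Mean length per clean read (bp), from total clean bases and read count."""
    return clean_bases / clean_reads


def average_depth(cp_reads: int, read_length: float, genome_size: int) -> float:
    """Average coverage depth = sequenced chloroplast bases / genome size."""
    return cp_reads * read_length / genome_size


# voucher: (clean data bp, clean reads, chloroplast reads, genome size bp, reported depth)
ROWS = {
    "Liu 437": (26_768_534_000, 107_074_136, 2_903_043, 135_940, 5339),
    "Liu 289": (30_183_435_500, 120_733_742, 1_243_680, 135_889, 2288),
    "Liu 312": (57_552_411_500, 230_209_646, 1_685_727, 135_903, 3101),
}

if __name__ == "__main__":
    for voucher, (bases, reads, cp_reads, size, reported) in ROWS.items():
        length = mean_read_length(bases, reads)
        depth = average_depth(cp_reads, length, size)
        print(f"{voucher}: read length {length:.0f} bp, "
              f"depth {depth:.1f}x (reported {reported}x)")
```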
Genes present in Avena plastomes
| Category of genes | Group of genes (gene number) | Gene name |
|---|---|---|
| Self replication | Ribosomal RNAs (8) | |
| | Transfer RNAs (39) | |
| | Ribosomal protein (small subunit) (16) | |
| | Ribosomal protein (large subunit) (11) | |
| | DNA dependent RNA polymerase (4) | |
| | Translation-related gene (1) | |
| Genes for photosynthesis | Subunits of photosystem I (5) | |
| | Subunits of photosystem II (15) | |
| | Subunits of cytochrome b/f complex (6) | |
| | Subunits of ATP synthase (6) | |
| | Subunits of NADH dehydrogenase (13) | |
| | ATP-dependent protease subunit (1) | |
| | Rubisco large subunit (1) | |
| Other genes | Maturase (1) | |
| | Envelope membrane protein (1) | |
| | c-type cytochrome biogenesis (1) | |
| Genes of unknown function | Conserved open reading frames (2) | |
a Gene containing a single intron;
b Gene containing two introns;
c Two gene copies in the IRs;
d Gene divided into two independent transcription units;
e Duplicated gene in LSC region
Genes with intron(s) in Avena sativa plastome
| Gene | Region | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | LSC | 145+ | 825 | 407+ | | |
| | SSC | 550− | 1021 | 539− | | |
| | IRB | 777− | 712 | 756− | | |
| | IRA | 777+ | 712 | 756+ | | |
| | LSC | 6+ | 759 | 642+ | | |
| | LSC | 8+ | 741 | 475+ | | |
| | IRB | 391− | 663 | 431− | | |
| | IRA | 391+ | 663 | 431+ | | |
| | LSC | 9− | 1052 | 402− | | |
| | LSC + IRB | 114− | – | 232− | 799 | 29− |
| | LSC + IRA | 114− | – | 232+ | 799 | 29+ |
| | LSC | 40− | 826 | 230− | | |
| | IRB | 38+ | 811 | 35+ | | |
| | IRA | 38− | 811 | 35− | | |
| | LSC | 23− | 677 | 48− | | |
| | IRB | 37− | 807 | 35− | | |
| | IRA | 37+ | 807 | 35+ | | |
| | LSC | 37− | 2435 | 35− | | |
| | LSC | 35+ | 33 | 50+ | | |
| | IRB | 39+ | 596 | 37+ | | |
| | IRA | 39− | 596 | 37− | | |
| | LSC | 126− | 755 | 226− | 725− | 161− |
Superscript +: exon is transcribed counter-clockwise in Fig. 1;
Superscript −: exon is transcribed clockwise in Fig. 1;
Hyphen (–): spliceosomal intron;
a, b: The rps12 gene is divided into 5′-rps12 in the LSC region, with 3′-rps12 in the IRB region (a) and 3′-rps12 in the IRA region (b)
Fig. 2 Mauve alignment. a Oat, rice, wheat, and dandelion (Taraxacum amplum) plastomes from this study and NCBI, revealing similarities and differences in syntenic blocks. For the two rearrangements relative to the dicot plastome, involving LSC and IRB intermolecular recombination, see Fig. 3. b Mauve alignment of eleven Avena plastomes revealing no interspecific rearrangement. c Mauve alignment of Avena sativa (GWHAOPK01000000 and NC027468.1) plastomes revealing no intraspecific rearrangement. Each colored block is a region of collinear sequence among the investigated species plastomes. Blocks shown above and below the line are in opposite orientations
Fig. 3 Schematic diagram showing postulated intermolecular recombination events across Avena sativa (top) and Taraxacum amplum (bottom). a Gene order and orientation within LSC intermolecular recombination. Roman numeral arrows denote the sequence of recombination events: I, Duplication of trnfM gene; II, Intermolecular recombination of trnC-GCA-trnR-UCU and psbD-trnfM regions; III, Inversion of trnC-GCA-trnE-UUC region. b Gene order and orientation within IRB intermolecular recombination. Roman numeral arrows denote the sequence of recombination events: I, Loss of ycf1 gene; II, Duplication of rps15-ndhH region; III, Inversion of rps15-ndhF region. The vertical dashed lines indicate the start and end base positions of each rearrangement. Genes transcribed in the forward and reverse directions are indicated above and below the middle line, respectively
Fig. 4 Sequence comparison of thirteen Avena species, wheat, maize, rice, and Taraxacum amplum plastomes. The mVISTA-based similarity plots show sequence identity to A. sativa_Liu312 as the reference plastome. Grey arrows above the alignment denote gene orientation. A cut-off of 50% identity is used for the plots. In each plot, the Y axis represents percent identity (50 to 100%). Dashed rectangles indicate regions that are highly divergent from the reference plastome
Fig. 5 Nucleotide diversity (Pi) values from the aligned Avena plastomes for (a) the ten most polymorphic single-copy genes and (b) the ten most polymorphic intergenic regions. Regions are ordered by their midpoint positions in the plastome sequences, with the top 10 Pi values marked by red triangles
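For readers unfamiliar with Pi, the following minimal sketch computes it as the average number of pairwise nucleotide differences per site across aligned sequences. It assumes complete deletion of columns that contain a gap or ambiguous base in any sequence; that choice, the function name, and the toy alignment are assumptions for illustration and do not reproduce the software or settings used for the figure.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Pi: mean pairwise differences per site over all sequence pairs.

    Columns with a gap or non-ACGT character in any sequence are skipped
    (complete deletion), which is an assumption of this sketch.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two aligned sequences")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must have equal aligned length")
    seqs = [s.upper() for s in aligned]
    valid = [i for i in range(length) if all(s[i] in "ACGT" for s in seqs)]
    if not valid:
        return 0.0
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a[i] != b[i] for i in valid) for a, b in pairs)
    return diffs / (len(pairs) * len(valid))


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]  # toy data, not Avena sequences
    print(nucleotide_diversity(toy_alignment))
```

Applying the function to the alignment slice of each gene or intergenic region gives one Pi value per region, which can then be ranked.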
Fig. 6 Circos plot based on complete chloroplast genome alignment between the eleven Avena species presented here (available in the Genome Warehouse database) and two Avena species available in NCBI (NC031650.1 and NC027468.1), with A. sativa_Liu312 as the reference. All rings show the positional relationship between tandem repeats, insertions/deletions (indels) and single nucleotide polymorphisms (SNPs) in non-overlapping 150 bp bins, drawn with the R package circlize [35]. For the plastome alignment, rings “a-b” show the location of substitutions and SNPs, respectively. Dots at higher relative positions in rings “a-b” represent more polymorphic loci in the 150 bp window. Rings “c-d” show the location of deletions and insertions, respectively. Ring “e” shows tandem repeat (7 bp–95 bp) locations identified by Phobos [31] with parameters of repeat length ≤ 100 bp and sequence identity ≥ 85%. The relative heights of the columns in rings “c-e” represent the relative number of polymorphic loci belonging to deletions, insertions, or tandem repeats within the 150 bp window, respectively. Ring “f” shows the A. sativa_Liu312 chloroplast map with coding genes labeled in green, rRNAs in yellow and tRNAs in blue. Blue shadows denote the surrounding location of tandem repeats and deletions; red shadows denote the surrounding location of tandem repeats and insertions. LSC, SSC, IRs denote the large single-copy, small single-copy, and inverted repeat regions of the Avena plastome alignment. In total, 702 tandem repeats, 141 insertions, 100 deletions, 992 SNPs, and 33 substitutions are shown in the diagram
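The binning described in this caption can be sketched as follows: each variant position is assigned to a non-overlapping 150 bp window, and the per-window counts become the dot or column heights. The function name and the use of 1-based coordinates are assumptions for illustration; the figure itself was drawn with circlize in R.

```python
def bin_counts(positions: list[int], genome_length: int, bin_size: int = 150) -> list[int]:
    """Count 1-based positions falling in each non-overlapping bin.

    The last bin may be shorter than bin_size when genome_length is not
    a multiple of it.
    """
    n_bins = -(-genome_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        if not 1 <= pos <= genome_length:
            raise ValueError(f"position {pos} outside 1..{genome_length}")
        counts[(pos - 1) // bin_size] += 1
    return counts


if __name__ == "__main__":
    # toy positions on a 450 bp sequence: bins cover 1-150, 151-300, 301-450
    print(bin_counts([1, 150, 151, 449], genome_length=450))
    # number of 150 bp windows for a 135,903 bp plastome
    print(len(bin_counts([], genome_length=135_903)))
```

Running this separately for SNPs, insertions, deletions and tandem repeat starts yields aligned count vectors, one value per window, that can be compared window by window.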
Spearman’s Rho correlation analysis result among tandem repeats, indels and SNPs using R v.3.5.3 [36] with correlation strengths of Akoglu [37] based on plastome alignments between eleven Avena species presented here available in Genome Warehouse database and two Avena species available in NCBI (NC031650.1 and NC027468.1) with A. sativa_Liu312 as a reference (150 bp windows)
| | Tandem repeats and indels | Tandem repeats and SNPs | Indels and SNPs |
|---|---|---|---|
| Rho | 0.3585 | 0.2607 | 0.2606 |
| p-value | 2.20 × 10⁻¹⁶*** | 1.48 × 10⁻¹⁵*** | 1.53 × 10⁻¹⁵*** |
***Correlation was strongly significant at p < 0.01
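Spearman's Rho for two per-window count vectors is the Pearson correlation of their ranks, with tied values sharing an average rank. The source analysis was run in R v.3.5.3; the pure-Python sketch below only illustrates the statistic, uses hypothetical function names and toy counts, and does not compute the p-value.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks, with ties assigned the mean of the ranks they span."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / math.sqrt(vx * vy)


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman's Rho: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("vectors must have equal length")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    repeats_per_window = [0, 2, 1, 3, 0, 1]  # toy counts, not study data
    indels_per_window = [0, 1, 1, 2, 0, 0]
    print(spearman_rho(repeats_per_window, indels_per_window))
```

Because window counts contain many zeros, the tie handling in `average_ranks` matters; ranking without it would change the value.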
Fig. 7 Codon content of 20 amino acids and stop codons in protein-coding genes of Avena plastomes. The histogram on the left-hand side of each amino acid denotes codon usage within Avena plastomes, and the right-hand side denotes the codon RSCU [38] values. Colors correspond to codons listed underneath the columns
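Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below assumes the standard genetic code and in-frame coding sequences; the function name and the toy sequence are illustrative and do not reproduce the pipeline behind the figure.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_sequences: list[str]) -> dict[str, float]:
    """RSCU per codon, pooled over the given in-frame coding sequences.

    Codons containing ambiguous bases are ignored. Amino acids (or stops)
    that never occur are omitted from the result.
    """
    counts = Counter()
    for cds in cds_sequences:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        expected = total / len(codons)  # equal usage of synonymous codons
        for c in codons:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    toy = ["GCTGCTGCTGCC"]  # four alanine codons, toy data
    result = rscu(toy)
    print({c: result[c] for c in ("GCT", "GCC", "GCA", "GCG")})
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, and a value below 1 indicates one used less often.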
Fig. 8 Maximum likelihood tree inferred from Avena plastome sequences. Two clades are identified: clade I includes A-genome diploids, tetraploid A. murphyi and hexaploid A. sativa, and clade II includes C-genome diploids. Node support denotes the maximum likelihood bootstrap value. Pink, red and green taxa correspond to A-, C-genome diploid and polyploid species in Avena, respectively