Chaehee Lee, Tracey A Ruhlman, Robert K Jansen.
Abstract
Plastid genomes (plastomes) of land plants have a conserved quadripartite structure in a gene-dense unit genome consisting of a large inverted repeat that separates two single copy regions. Recently, alternative plastome structures were suggested in Geraniaceae, and the coexistence of inversion isomers has been noted in some conifers and Medicago. In this study, plastome sequences of two Cyperaceae, Eleocharis dulcis (water chestnut) and Eleocharis cellulosa (gulf coast spikerush), were completed. Unlike the conserved plastomes in basal groups of Poales, these Eleocharis plastomes have remarkably divergent features, including large plastome sizes, high rates of sequence rearrangements, low GC content and gene density, gene duplications and losses, and increased repetitive DNA sequences. A novel finding among these features was the unprecedented level of heteroplasmy with the presence of multiple plastome structural types within a single individual. Illumina paired-end assemblies combined with PacBio single-molecule real-time sequencing, long-range polymerase chain reaction, and Sanger sequencing data identified at least four different plastome structural types in both Eleocharis species. PacBio long read data suggested that one of the four E. dulcis plastome types predominates.
Keywords: RDR; chloroplast; homologous recombination; plastid genome; rearrangement; repeat
Year: 2020 PMID: 32282915 PMCID: PMC7426004 DOI: 10.1093/gbe/evaa076
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1.—Schematic representations of Eleocharis dulcis and E. cellulosa plastomes. (A) Unit-genome maps and repetitive DNA content of E. dulcis and E. cellulosa plastomes. Completed Eleocharis plastome sequences were submitted to OGDRAW (Lohse et al. 2013) to generate physical maps and Circoletto (Darzentas 2010) to visualize repetitive DNA. Structural type and plastome size (in parentheses) are shown below the species name. Syntenic blocks (numbered 1–6) detected by progressiveMauve (Darling et al. 2010) are depicted by open boxes; the negative symbol (–) indicates reverse-oriented strands relative to the reference (Typha). Red boxes indicate blocks that vary in order among different structural types, whereas blocks encompassed by black boxes maintain their positions across types. Genes indicated in red font are located near the end of each syntenic block and were employed as annealing sites for PCR confirmation of plastome arrangements. The vertical black bar at the top of each linear map is provided for scale. The circular representation of each species is below each linear map with syntenic block numbers shown. Dispersed repeats and IR are shown within the circular map in blue and red, respectively. (B) Each syntenic block is illustrated with a different color and follows the same numbering convention as in (A). Representative genes near the end of each syntenic block are shown at the top. Gene symbols in white font indicate those used in long-range PCR and correspond to the red gene symbols in (A). Different syntenic block arrangements are highlighted in red boxes for both Eleocharis species. IR, inverted repeat; SSC, small single copy region.
Verification of Multiple Structural Types in Eleocharis Plastomes
| Species | Type | Syntenic Block | Primer F | Primer R | Size (bp) | PCR | PCR Band Size (bp) | Sanger Seq. | PacBio Seq. |
|---|---|---|---|---|---|---|---|---|---|
| Eleocharis dulcis | 1 | 2 – 3 | E_petN | E_petD | 8,188 | + | 5,500/6,200/8,100 | + | |
| | | 3 – (−4) | E_rps2 | E_rpoB | 4,842 | + | ∼5,000 | + | |
| | | (−4) – (−5) | E_rpoC2 | E_rpl20 | 4,368 | + | ∼4,300 | + | |
| | | (−5) – 6 | E_psbK | E_rps8 | 8,550 | + | ∼8,500 | | |
| | 2 | 2 – 5 | E_petN | E_psbK | 11,127 | + | | + | |
| | | 5 – (−3) | E_rpl20 | E_rps2 | 4,381 | + | ∼4,300 | + | + |
| | | (−3) – 4 | E_petD | E_rpoC2 | 5,261 | + | | + | + |
| | | 4 – 6 | E_rpoB | E_rps8 | 5,181 | + | ∼5,000 | + | + |
| | 3 | 2 – 5 | E_petN | E_psbK | 11,128 | + | | + | |
| | | 5 – 4 | E_rpl20 | E_rpoC2 | 4,368 | + | ∼4,300 | + | |
| | | 4 – (−3) | E_rpoB | E_rps2 | 4,842 | + | ∼5,000 | + | |
| | | (−3) – 6 | E_petD | E_rps8 | 5,610 | + | ∼5,500 | + | |
| | 4 | 2 – (−4) | E_petN | E_rpoB | 4,869 | + | ∼4,000 | + | |
| | | (−4) – 3 | E_rpoC2 | E_petD | 5,257 | + | ∼4,300 | + | + |
| | | 3 – (−5) | E_rps2 | E_rpl20 | 4,381 | + | ∼4,300 | + | + |
| | | (−5) – 6 | E_psbK | E_rps8 | 8,550 | + | ∼8,500 | | |
| Eleocharis cellulosa | 1 | (−2) – 3 | E_ndhJ | E_petD | 5,387 | + | ∼5,400 | n/a | n/a |
| | | 3 – (−4) | E_rps2 | E_rpoB | 4,816 | + | ∼5,000 | n/a | n/a |
| | | (−4) – 5 | E_rpoC2 | E_psbK | 9,000 | + | ∼8,500 | n/a | n/a |
| | | 5 – 6 | E_rpl20 | E_rps8 | 3,025 | + | ∼4,500 | n/a | n/a |
| | 2 | (−2) – 5 | E_ndhJ | E_psbK | 8,326 | | | n/a | n/a |
| | | 5 – 4 | E_rpl20 | E_rpoC2 | 2,685 | + | ∼2,700 | n/a | n/a |
| | | 4 – (−3) | E_rpoB | E_rps2 | 4,846 | + | ∼4,500 | n/a | n/a |
| | | (−3) – 6 | E_petD | E_rps8 | 6,400 | | | n/a | n/a |
| | 3 | (−2) – 3 | E_ndhJ | E_petD | 5,387 | + | ∼5,000 | n/a | n/a |
| | | 3 – (−4) | E_rps2 | E_rpoB | 4,816 | + | ∼4,500 | n/a | n/a |
| | | (−4) – (−5) | E_rpoC2 | E_rpl20 | 2,685 | + | ∼2,700 | n/a | n/a |
| | | (−5) – 6 | E_psbK | E_rps8 | 9,339 | | | n/a | n/a |
| | 4 | (−2) – 3 | E_ndhJ | E_petD | 5,387 | + | ∼5,000 | n/a | n/a |
| | | 3 – (−5) | E_rps2 | E_rpl20 | 4,346 | + | ∼4,300 | n/a | n/a |
| | | (−5) – 4 | E_psbK | E_rpoC2 | 9,001 | + | ∼8,000 | n/a | n/a |
| | | 4 – 6 | E_rpoB | E_rps8 | 3,495/2,312 | + | ∼5,000 | n/a | n/a |
Note.—+ indicates that adjacencies of syntenic blocks were confirmed. F, forward; R, reverse; n/a, not available.
Entire junction was sequenced.
Band size in bold indicates the one with higher intensity.
Boundaries of junction were sequenced.
Almost entire junction was sequenced.
Summary of Major Features of Eleocharis, Hypolytrum, and Basal Poales Plastid Genomes
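| Family | Cyperaceae | Cyperaceae | Cyperaceae | Bromeliaceae | Typhaceae |
|---|---|---|---|---|---|
| Taxon | Eleocharis dulcis | Eleocharis cellulosa | Hypolytrum nemorum | Ananas comosus | Typha latifolia |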
| Genome size (bp) | 199,561 | 193,234 | 180,648 | 159,636 | 161,572 |
| LSC (% of genome) | 117,896 (59.1) | 112,881 (58.4) | 95,644 (52.9) | 87,482 (54.8) | 89,140 (55.2) |
| SSC (% of genome) | 9,601 (4.8) | 10,311 (5.3) | 8,150 (4.5) | 18,622 (11.7) | 19,652 (12.2) |
| IR (% of genome) | 36,032 (18.1) | 35,021 (18.1) | 38,427 (21.3) | 26,766 (16.8) | 26,390 (16.3) |
| Total number of genes | 139 | 132 | 135 | 131 | 131 |
| Number of unique genes | 105 | 105 | 110 | 113 | 113 |
| Number of unique protein-coding genes (duplicated in IR) | 72 (14) | 72 (12) | 76 (12) | 79 (6) | 79 (6) |
| Number of unique tRNA genes (duplicated in IR) | 29 (10) | 29 (9) | 30 (8) | 30 (8) | 30 (8) |
| Number of unique rRNA genes (duplicated in IR) | 4 (4) | 4 (4) | 4 (4) | 4 (4) | 4 (4) |
| Number of genes with introns | 17 | 17 | 18 | 18 | 18 |
| GC content (%) | 32.6 | 32.8 | 34.9 | 37.4 | 36.6 |
| GC content of IR/LSC/SSC (%) | 37.6/30.2/25.5 | 37.8/30.4/25.8 | 38.5/32.6/28.1 | 42.7/35.4/31.4 | 42.4/34.4/30.5 |
| Genic DNA (% of genome, GC [%]) | 76,207 (38.2, 38.4) | 74,394 (38.5, 38.5) | 98,905 (54.8, 38.1) | 91,317 (57.2, 40.2) | 91,212 (56.4, 39.9) |
| Intergenic spacers (% of genome, GC [%]) | 123,354 (61.8, 29.1) | 118,840 (61.5, 29.3) | 81,743 (45.2, 31.1) | 68,611 (42.8, 33.7) | 70,541 (43.6, 32.4) |
| Gene density | 0.70 | 0.68 | 0.75 | 0.82 | 0.81 |
| Putative gene losses | | | | — | — |
| Putative gene duplications | | | | — | — |
Note.—The number in ( ) in putative gene duplications indicates the number of copies. Asterisk (*) on rpl36 gene indicates gene duplication in only plastome type 4 of E. cellulosa.
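The derived rows of the summary table can be checked from the raw counts. The sketch below uses only the first data column (genome size 199,561 bp). It assumes that "Gene density" means total genes per kilobase, which the table does not state but which is consistent with the reported values; the helper names `percent` and `gene_density` are illustrative.

```python
# Values copied from the first data column of the summary table.
genome_size = 199_561                   # bp
lsc, ssc, ir = 117_896, 9_601, 36_032   # bp; the IR is present in two copies
total_genes = 139


def percent(part, whole):
    """Share of the genome, rounded to one decimal as in the table."""
    return round(100 * part / whole, 1)


def gene_density(genes, size_bp):
    """Assumption: density is expressed as genes per kilobase."""
    return round(genes / (size_bp / 1000), 2)


# The quadripartite structure implies LSC + SSC + 2 x IR = genome size.
assert lsc + ssc + 2 * ir == genome_size

print(percent(lsc, genome_size), percent(ssc, genome_size), percent(ir, genome_size))
print(gene_density(total_genes, genome_size))
```

The same two helpers apply unchanged to the other four columns.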
Fig. 2.—Eleocharis plastomes exhibit atypical size and GC content. Plastome size and GC content of 3,656 angiosperms in the NCBI Genome database were plotted. Parasitic (yellow) and autotrophic (teal) species are indicated with different colors. The two Eleocharis plastomes are labeled with red font. Other Cyperaceae and larger plastomes, including Pelargonium and Annona cherimola, are indicated with teal font. The Erodium plastomes with high GC content are also represented with a teal label. bp, basepairs.
Fig. 3.—Whole-plastome alignment of five Poales species. Newly completed plastomes of Eleocharis dulcis and E. cellulosa and publicly available basal Cyperaceae and Poales plastomes from NCBI (Hypolytrum nemorum, Ananas comosus, and Typha latifolia) were analyzed by progressiveMauve to identify locally collinear blocks (LCBs) with the Typha plastome as a reference. One copy of the inverted repeat was removed before the analysis, and numerals at top indicate size in kilobases (kb). The corresponding LCBs among the five plastomes are shaded and connected with a line of the same color. The histogram inside each block shows pairwise nucleotide sequence identity. LCBs that are flipped across the plane indicate an inverted strand.
Pairwise Comparison of Breakpoint and Reversal Distances for Eleocharis, Hypolytrum, and Basal Poales Plastomes
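| | E. dulcis | E. cellulosa | H. nemorum | A. comosus | T. latifolia |
|---|---|---|---|---|---|
| E. dulcis | — | | | | |
| E. cellulosa | 0/0 | — | | | |
| H. nemorum | 11/7 | 11/7 | — | | |
| A. comosus | 25/20 | 25/20 | 16/13 | — | |
| T. latifolia | 24/21 | 24/21 | 16/14 | 5/4 | — |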
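To make the notion of a breakpoint concrete, here is a minimal sketch that counts breakpoints between signed block orders. It is not the authors' pipeline and does not reproduce the whole-plastome values in the matrix. As input it uses the syntenic block orders (blocks 2 to 6) of the four structural types listed for the first species in the verification table. An adjacency counts as conserved when the same two blocks are neighbours in both orders with compatible orientation, read from either strand. The function names are illustrative.

```python
def adjacencies(order):
    """Return all signed adjacencies of an order, read from both strands."""
    adj = set()
    for a, b in zip(order, order[1:]):
        adj.add((a, b))
        adj.add((-b, -a))  # the same junction seen from the opposite strand
    return adj


def breakpoint_distance(order_a, order_b):
    """Count adjacencies of order_a that do not occur in order_b."""
    conserved = adjacencies(order_b)
    return sum((a, b) not in conserved for a, b in zip(order_a, order_a[1:]))


# Block orders read off the "Syntenic Block" column; a negative sign marks
# a block in reverse orientation relative to the reference.
type1 = [2, 3, -4, -5, 6]
type2 = [2, 5, -3, 4, 6]
type3 = [2, 5, 4, -3, 6]
type4 = [2, -4, 3, -5, 6]

for name, other in [("type2", type2), ("type3", type3), ("type4", type4)]:
    print("type1 vs", name, breakpoint_distance(type1, other))
```

Under this definition, types 1 and 3 differ at two junctions, which agrees with the verification table: those two types share the E_rps2/E_rpoB and E_rpoC2/E_rpl20 amplicons.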
Statistics of Dispersed and Tandem Repeats in Eleocharis, Hypolytrum, and Basal Poales Plastomes
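| Family | Cyperaceae | Cyperaceae | Cyperaceae | Bromeliaceae | Typhaceae |
|---|---|---|---|---|---|
| Species | Eleocharis dulcis | Eleocharis cellulosa | Hypolytrum nemorum | Ananas comosus | Typha latifolia |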
| Genome size (IRa excluded) | 163,529 | 158,213 | 142,221 | 132,862 | 134,642 |
| GC % | 31.6 | 31.8 | 34 | 36.3 | 35.5 |
| Dispersed repeats (DRs) | |||||
| Length of DR | 39,752 | 31,118 | 12,520 | 1,495 | 1,210 |
| GC % of DR | 30.9 | 30.2 | 32.6 | 35.9 | 33.7 |
| GC % without DR | 31.8 | 32 | 33.9 | 36.3 | 35.5 |
| % of DR in genome | 24.3 | 19.7 | 8.8 | 1.1 | 0.9 |
| Tandem repeats (TRs) | |||||
| Length of TR | 5,864 | 2,833 | 6,638 | 2,057 | 3,270 |
| GC % of TR | 25.9 | 25.5 | 27.8 | 18.4 | 13.5 |
| GC % without TR | 31.8 | 31.9 | 34.3 | 36.6 | 36 |
| % of TR in genome | 3.6 | 1.8 | 4.7 | 1.5 | 2.4 |
| Total repeats | |||||
| Length of total repeats | 42,216 | 32,695 | 16,718 | 3,552 | 4,436 |
| GC % of total repeats | 30.4 | 29.9 | 31.1 | 25.7 | 19.1 |
| GC % without total repeats | 32 | 32.1 | 34.3 | 36.6 | 36 |
| % of total repeats in genome | 25.8 | 20.7 | 11.8 | 2.7 | 3.3 |
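The "GC % without repeats" rows follow from a length-weighted average: removing a repeat fraction of known length and GC content from the genome leaves a remainder whose GC content is fixed by the other values. The sketch below checks this for the first data column. The inputs are the rounded table values, so agreement is expected only to about one decimal place; the function name `gc_without` is illustrative.

```python
# First data column of the repeat table (one IR copy excluded).
genome_len = 163_529   # bp
genome_gc = 31.6       # %
dr_len, dr_gc = 39_752, 30.9   # dispersed repeats: length (bp), GC %
tr_len = 5_864                 # tandem repeats: length (bp)
total_len = 42_216             # all repeats combined (bp)


def gc_without(total_bp, total_gc, part_bp, part_gc):
    """GC % of what remains after removing a part, by length-weighted average."""
    remaining_gc_bp = total_gc * total_bp - part_gc * part_bp
    return remaining_gc_bp / (total_bp - part_bp)


print(round(100 * dr_len / genome_len, 1))                        # % of DR in genome
print(round(gc_without(genome_len, genome_gc, dr_len, dr_gc), 1)) # GC % without DR

# Dispersed plus tandem length exceeds the reported total; this would be
# consistent with overlapping repeat positions being counted only once.
print(dr_len + tr_len - total_len)
```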
Fig. 4.—Repetitive DNA content in five Poales plastomes. (A) The number of dispersed repeats in different size classes. (B) The proportion of plastome that represents dispersed repeats in different size classes. bp, basepairs.