D Blaine Marchant, Emily B Sessa, Paul G Wolf, Kweon Heo, W Brad Barbazuk, Pamela S Soltis, Douglas E Soltis.
Abstract
Ferns are notorious for possessing large genomes and numerous chromosomes. Despite decades of speculation, the processes underlying the expansive genomes of ferns are unclear, largely due to the absence of a sequenced homosporous fern genome. The lack of this crucial resource has not only hindered investigations of evolutionary processes responsible for the unusual genome characteristics of homosporous ferns, but also impeded synthesis of genome evolution across land plants. Here, we used the model fern species Ceratopteris richardii to address the processes (e.g., polyploidy, spread of repeat elements) by which the large genomes and high chromosome numbers typical of homosporous ferns may have evolved and have been maintained. We directly compared repeat compositions in species spanning the green plant tree of life and a diversity of genome sizes, as well as both short- and long-read-based assemblies of Ceratopteris. We found evidence consistent with a single ancient polyploidy event in the evolutionary history of Ceratopteris based on both genomic and cytogenetic data, and on repeat proportions similar to those found in large flowering plant genomes. This study provides a major stepping-stone in the understanding of land plant evolutionary genomics by providing the first homosporous fern reference genome, as well as insights into the processes underlying the formation of these massive genomes.
Year: 2019 PMID: 31796775 PMCID: PMC6890710 DOI: 10.1038/s41598-019-53968-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Ceratopteris genome assembly statistics.
| Cytometric Genome Size | 11.25 Gbp |
| Chromosome number | 39 |
| Meraculous Contigs | 15,871,274 contigs |
| Total Size | 4.21 Gbp |
| N50 | 300 bp |
| % Gaps | 0 |
| % GC | 36 |
| 988,403 scaffolds | |
| Total Size | 2.69 Gbp |
| N50 | 3,376 bp |
| % Gaps | 0.5 |
| % GC | 36 |
| 626,576 scaffolds | |
| Total Size | 4.25 Gbp |
| N50 | 16,289 bp |
| % Gaps | 37 |
| % GC | 38 |
| 133,755 scaffolds | |
| Total Size | 2.79 Gbp |
| N50 | 22,401 bp |
| % Gaps | 44 |
| % GC | 38 |
| 35 scaffolds | |
| Total Size | 3.03 Mbp |
| N50 | 97,182 bp |
| % Gaps | 0 |
| % GC | 39 |
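N50 is the contiguity statistic reported for each assembly in the table above. As a minimal sketch of its usual definition (not the authors' pipeline), the Python below computes N50 from a list of sequence lengths. The function name and the toy lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly: total 300 bp, so half is 150 bp.
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_lengths)  # 100 + 80 = 180 >= 150, so N50 is 80
```

Because N50 depends only on the length distribution, it says nothing about how much of the genome is covered; the table reports total size separately for that reason.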
Figure 1. Polyploidy analyses of three fern species. (A) Paralog-age distribution analyses and associated SiZER plots of three fern species. Upper panels are Ks-based histograms (0.05 bins) of paralogs in Ceratopteris richardii, Azolla filiculoides, and Equisetum giganteum. Lower panels are SiZER plots of the above paralog-age distribution data and associated smoothing functions, where blue indicates significant (α = 0.05) increases, red significant decreases, purple no significant change, and gray too few data points to determine. The white lines show the effective window widths for each bandwidth. Upper and lower panels share the same x-axis. (B) MAPS analysis across land plants and the associated WGD events (shown as stars). The percentages of subtrees that contain gene duplications shared by the descendant species of a given node are above the phylogeny (connected by dotted lines). Dates are based on Testo and Sundue[70] and Morris et al.[67].
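The Ks histograms in panel (A) use bins of width 0.05. As a rough sketch of that binning step only (the SiZER smoothing is not reproduced here), the Python below counts paralog Ks values per bin. The function name, the upper cutoff `ks_max`, and the example values are assumptions for illustration, not taken from the paper.

```python
import math


def ks_histogram(ks_values, bin_width=0.05, ks_max=2.0):
    """Count Ks values in fixed-width bins covering [0, ks_max).

    ks_max is a hypothetical cutoff; values outside the range are ignored.
    """
    n_bins = round(ks_max / bin_width)
    counts = [0] * n_bins
    for ks in ks_values:
        if ks < 0:
            continue
        # Round before flooring so boundary values such as 0.15 are not
        # pushed into the lower bin by floating-point error.
        index = math.floor(round(ks / bin_width, 9))
        if index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical paralog Ks values; 2.5 falls outside the range and is dropped.
example_counts = ks_histogram([0.01, 0.04, 0.07, 0.15, 1.99, 2.5])
```

A peak in such a histogram is what a paralog-age analysis reads as a candidate burst of duplication; deciding whether a peak is significant is the job of the SiZER analysis described in the caption.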
Figure 2. Fluorescent in situ hybridizations of Ceratopteris chromosome spreads. The fluorescent probes are of 100–150 Kbp DNA fragments from BACs of Ceratopteris. Primary “diploid” localizations (red bands labeled with arrows) are shown in all four panels, while weak secondary localizations, most likely reflecting repetitive elements, are apparent in (C); note scattered faint red staining in addition to the two strong primary signals. BACs are from wells A12 (A), B3 (B), A8 (C), and B9 (D) in Plate CR_Ba #624, Green Plant BAC Library Project, Clemson University Genomics Institute.
Figure 3. Repeat composition, lengths, and insertion timing for representative embryophyte genome assemblies. (A) Genome proportions of repetitive and non-repetitive elements for seven taxa spanning land plants, as well as BAC.SubSample, using genome-based analyses. Genome sizes and N50s for analyzed genome assemblies are also provided. (B) Mean repeat element lengths based on genome assembly analyses (A) for seven embryophyte taxa and BAC.SubSample. (C) Genome proportion of repetitive and non-repetitive elements using read-based clustering analyses[111]. (D) LTR RT insertion dates in Ceratopteris based on the CFern v1.1A and BAC.SubSample assemblies. Insertion dates were inferred from the similarity of long terminal repeat regions of the LTR RTs and a neutral substitution rate of 6.5 × 10⁻⁹ per site per year.
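Panel (D) dates insertions from LTR similarity and a neutral rate of 6.5 × 10⁻⁹ substitutions per site per year. A commonly used way to do this, assumed here rather than confirmed as the authors' exact procedure, is T = K / (2r), where K is the divergence between the two LTRs of one element and r is the substitution rate; the factor of 2 reflects that both LTRs accumulate substitutions independently after insertion. The sketch below also includes an optional Jukes-Cantor correction, which is likewise an assumption. Function names and the example identity are hypothetical.

```python
import math

# Neutral substitution rate stated in the Figure 3 caption.
NEUTRAL_RATE = 6.5e-9  # substitutions per site per year


def jukes_cantor(p):
    """Correct a raw proportion of differing sites p for multiple hits."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def insertion_time_years(divergence, rate=NEUTRAL_RATE):
    """Estimate insertion time as T = K / (2r)."""
    return divergence / (2 * rate)


# Hypothetical element whose two LTRs are 98.7% identical.
raw_divergence = 1 - 0.987
uncorrected_age = insertion_time_years(raw_divergence)
corrected_age = insertion_time_years(jukes_cantor(raw_divergence))
```

Under these assumptions, LTRs that differ at 1.3% of sites correspond to an insertion roughly one million years ago, and the corrected estimate is slightly older because the correction accounts for sites that changed more than once.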
Ceratopteris repeat diversity and composition.
| Class | Order | Superfamily | Element count | Length (bp) | % Genome |
|---|---|---|---|---|---|
| Retrotransposon | LINE | Uncategorized | 1458 | 523887 | 0.02 |
| Retrotransposon | LINE | RTE-BovB | 421 | 128815 | 0.00 |
| Retrotransposon | LINE | Jockey | 434 | 127827 | 0.00 |
| Retrotransposon | LINE | R1 | 2565 | 1241869 | 0.04 |
| Retrotransposon | LINE | RTE-X | 8733 | 3937956 | 0.14 |
| Retrotransposon | LINE | L2 | 19047 | 3652063 | 0.13 |
| Retrotransposon | LINE | L1-Tx1 | 23635 | 15522256 | 0.56 |
| Retrotransposon | LINE | L1 | 47105 | 20464731 | 0.73 |
| Retrotransposon | LTR | Uncategorized | 23507 | 7859618 | 0.28 |
| Retrotransposon | LTR | DIRS | 361 | 45453 | 0.00 |
| Retrotransposon | LTR | Pao | 1494 | 462814 | 0.02 |
| Retrotransposon | LTR | Gypsy-Troyka | 5331 | 3393861 | 0.12 |
| Retrotransposon | LTR | ERV1 | 8083 | 6191165 | 0.22 |
| Retrotransposon | LTR | Gypsy | 329706 | 207014935 | 7.42 |
| Retrotransposon | LTR | Copia | 812470 | 460237954 | 16.50 |
| DNA Transposon | | Uncategorized | 3289 | 693293 | 0.02 |
| DNA Transposon | | hAT-Tip100 | 374 | 251768 | 0.01 |
| DNA Transposon | | CMC-Mirage | 416 | 153347 | 0.01 |
| DNA Transposon | | MULE-MuDR | 425 | 41133 | 0.00 |
| DNA Transposon | | TcMar | 530 | 83430 | 0.00 |
| DNA Transposon | | hAT-hATw | 622 | 343180 | 0.01 |
| DNA Transposon | | Harbinger | 627 | 224160 | 0.01 |
| DNA Transposon | | PiggyBac | 1230 | 122417 | 0.00 |
| DNA Transposon | | Dada | 1981 | 1130399 | 0.04 |
| DNA Transposon | | CMC-EnSpm | 2339 | 1276711 | 0.05 |
| DNA Transposon | | Sola | 2739 | 1565090 | 0.06 |
| DNA Transposon | | hAT | 2774 | 819873 | 0.03 |
| DNA Transposon | | Maverick | 4082 | 2443709 | 0.09 |
| DNA Transposon | | hAT-Ac | 4625 | 1727566 | 0.06 |
| DNA Transposon | | PIF-Harbinger | 4982 | 1214259 | 0.04 |
| DNA Transposon | | En-Spm | 10115 | 5487072 | 0.20 |
| DNA Transposon | | hAT-Tag1 | 15720 | 6076071 | 0.22 |
| DNA Transposon | | Helitron | 4260 | 812693 | 0.03 |
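The "% Genome" column appears to be each superfamily's total length divided by the size of the assembly that was annotated. The table does not state the denominator; the sketch below assumes it is the 2.79 Gbp assembly from the assembly statistics table, an inference that happens to reproduce the reported percentages for the rows checked. The function and constant names are hypothetical.

```python
# Assumption: the annotated assembly is the 2.79 Gbp one listed in the
# assembly statistics table; the repeat table itself does not say so.
ASSUMED_ASSEMBLY_SIZE_BP = 2.79e9


def percent_of_genome(length_bp, assembly_size_bp=ASSUMED_ASSEMBLY_SIZE_BP):
    """Express a total repeat length as a percentage of the assembly size."""
    return 100 * length_bp / assembly_size_bp


# Lengths (bp) copied from the repeat table for three superfamilies.
copia_pct = round(percent_of_genome(460237954), 2)
gypsy_pct = round(percent_of_genome(207014935), 2)
l1_pct = round(percent_of_genome(20464731), 2)
```

Under this assumption, Copia and Gypsy together account for roughly 24% of the assembled sequence, which is why LTR retrotransposons dominate the repeat landscape in this table.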
Genome composition and LTR-RT statistics in sampled land plant genomes.
| Genome Size (Gbp) | 0.11 | 0.2 | 0.48 | 0.87 | 2.1 | 11.25 | 20 |
| % GC | 45.3 | 42 | 33.7 | 37.5 | 46.9 | 37.7 | 37.6 |
| N50 (Kbp) | 1750 | 1366 | 17435 | 4927 | 217959 | 22 | 8 |
| Recent LTR RT | 166 | 30 | 1217 | 11 | 4561 | 22 | 509 |
| Ancient LTR RT | 33 | 24 | 16 | 55 | 45 | 82 | 276 |