Qin Qiao, Patrick P Edger, Li Xue, Jie Lu, Yichen Zhang, Qiang Cao, Alan E Yocca, Adrian E Platts, Steven J Knapp, Marc Van Montagu, Yves Van de Peer, Jiajun Lei, Ticao Zhang.
Abstract
Strawberry (Fragaria spp.) has emerged in recent years as a model system for fundamental and applied research. In total, the genomes of five different species have been sequenced over the past 10 y. Here, we report chromosome-scale reference genomes for five strawberry species, including three newly sequenced species' genomes, and genome resequencing data for 128 additional accessions to estimate the genetic diversity, structure, and demographic history of key Fragaria species. Our analyses yielded fully resolved and strongly supported phylogenies and divergence times for most diploid strawberry species. These analyses also uncovered a new diploid species (Fragaria emeiensis Jia J. Lei). Finally, we constructed a pan-genome for Fragaria and examined the evolutionary dynamics of gene families. Notably, we identified multiple independent single-base mutations of the MYB10 gene associated with white-pigmented fruit shared by different strawberry species. These reference genomes and datasets, combined with our phylogenetic estimates, should serve as a powerful comparative genomic platform and resource for future studies in strawberry.
Keywords: MYB transcription factors; comparative genomics; genetic differentiation; pan-genome; strawberry (Fragaria spp.)
Year: 2021 PMID: 34697247 PMCID: PMC8609306 DOI: 10.1073/pnas.2105431118
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Genome assembly and annotation of five newly sequenced species and F. iinumae (16)
| Assembly parameters | | | | | | |
| Mating system | SC | SC | SC | SI | SI | SI |
| Predicted genome size (Mb) | 265.56 | 305.90 | 290.79 | 265.75 | 282.33 | 229.61 |
| Predicted heterozygosity | 0.18% | 0.23% | 0.15% | 1.31% | 1.02% | 0.61% |
| Illumina reads (250 bp) | 14.21G | 13.72G | / | 15.25G | 19.41G | / |
| Illumina reads (450 bp) | 14.21G | 17.76G | / | 15.17G | 14.74G | / |
| PacBio reads | 45.77G | 48.97G | / | 35.46G | 37.71G | / |
| Illumina reads (350 bp) | / | / | 10.06G | / | / | 46.37G |
| Nanopore reads | / | / | 61.71G | / | / | 51.29G |
| Total reads | 74.19G | 80.45G | 71.77G | 65.88G | 71.86G | 97.66G |
| Total sequencing coverage depth (×) | 279.37 | 262.99 | 246.81 | 247.9 | 256.82 | 425.33 |
| Genome coverage ≥ 4X | 99.80% | 98.13% | 98.21% | 99.78% | 89.82% | 94.48% |
| Assembled genome size (Mb) | 240.58 | 288.43 | 288.97 | 239.83 | 279.04 | 223.08 |
| Total number of contigs | 94 | 425 | 870 | 291 | 726 | 382 |
| Length of contig N50 (Mb) | 10.67 | 3.16 | 4.29 | 1.29 | 0.908 | 9.83 |
| Number of contigs at N50 | 8 | 28 | 20 | 59 | 96 | 8 |
| Length of contig N90 (Mb) | 3.13 | 0.68 | 0.89 | 0.46 | 0.26 | 1.96 |
| Number of contigs at N90 | 22 | 93 | 75 | 176 | 314 | 26 |
| Anchored chromosome size (Mb) | 239.09 | 255.41 | 266.88 | 219.81 | 252.82 | 211.2 |
| Percent of anchored chromosomes | 99.38% | 88.55% | 92.35% | 91.96% | 90.60% | 94.67% |
| GC content | 39.70% | 39.70% | 42.76% | 38.50% | 42.76% | 38.16% |
| Gene numbers | 23,665 | 24,491 | 28,131 | 25,411 | 23,853 | 24,779 |
| BUSCO assessment | 94.80% | 94.50% | 93.30% | 90.70% | 83.40% | 94.50% |
SC: self-compatible; SI: self-incompatible.
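The contig N50 and N90 rows, and the derived depth and anchoring rows, follow standard definitions that can be illustrated with a minimal Python sketch. The function name `nx_stats` and the toy contig lengths are hypothetical and are not taken from the study. The sketch also assumes that "G" in the read rows means gigabases, that coverage depth is total reads divided by predicted genome size, and that the anchored percentage is anchored size divided by assembled size. These assumptions reproduce the first column of the table (279.37 and 99.38%), but the authors' exact procedure is not stated here.

```python
def nx_stats(contig_lengths, fraction=0.5):
    """Return (Nx length, number of contigs needed to reach it).

    Contigs are sorted from longest to shortest and summed until the
    running total reaches `fraction` of the assembly size. The length of
    the contig that crosses the threshold is Nx; the number of contigs
    summed so far is the matching count (often called Lx).
    """
    ordered = sorted(contig_lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths (arbitrary units), not data from the study.
toy_contigs = [10, 8, 6, 4, 2]
n50_length, n50_count = nx_stats(toy_contigs, 0.5)
n90_length, n90_count = nx_stats(toy_contigs, 0.9)

# First-column values copied from the table above.
total_reads_gb = 74.19       # assumed to be gigabases
predicted_size_mb = 265.56
assembled_size_mb = 240.58
anchored_size_mb = 239.09

# Assumed relationships (see the lead-in); not confirmed by the source.
depth = total_reads_gb * 1000 / predicted_size_mb
percent_anchored = 100 * anchored_size_mb / assembled_size_mb

print(n50_length, n50_count, n90_length, n90_count)
print(round(depth, 2), round(percent_anchored, 2))
```

A larger N50 reached with fewer contigs indicates a more contiguous assembly, which is why the length and count rows are best read together.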
Fig. 1. Evolution of seven diploid Fragaria genomes. (A) Circular representation of the comparative genome analysis of seven diploid Fragaria species. Collinear alignment was conducted with F. vesca as the reference. The outer layer of colored blocks represents the seven chromosomes of Fragaria, with tick marks every 5 Mb. The tracks show the seven genomes, from outside to inside: F. vesca, F. daltoniana, F. iinumae, F. mandschurica, F. nilgerrensis, F. pentaphylla, F. viridis. The plots within each track show the densities of genes (red), RNA transposons (orange), DNA transposons (blue), and other types of genome components (white), from inside to outside. (B) 4DTv of in-paralogous (solid lines) and orthologous (dashed lines) genes of Fragaria spp. 4DTv of orthologous pairs between species are shown in different colors. (C) The flower plot displays the number of core orthogroups (in the center), the orthogroups found in a subset of species (in the annulus), and the species-specific orthogroups (in the petals) for the seven Fragaria species. (D) Modeling the core and dispensable Fragaria pan-genome. Square dots show the variation of shared gene content (core genome). Triangle dots show the variation of the species-specific gene content (dispensable genome). Curves are fitted separately for the core and dispensable mean values, shown in red. (E) Expanded and contracted gene families (Left), and species-specific expanded and contracted gene families as well as positively selected genes (Right), in seven strawberry species.
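The core and species-specific counts in panels C and D can be illustrated with a small Python sketch of a common pan-genome sampling approach: for every subset of k genomes, count the orthogroups present in all of them (core) and those present in exactly one (species-specific), then average over subsets. This is an assumption about the general technique, not the authors' pipeline. The orthogroup table, genome labels, and function names below are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical presence table: orthogroup -> genomes that contain it.
presence = {
    "OG1": {"A", "B", "C"},
    "OG2": {"A", "B"},
    "OG3": {"A"},
    "OG4": {"C"},
    "OG5": {"B", "C"},
}


def core_and_specific(presence, genomes):
    """Count orthogroups shared by all sampled genomes and those in exactly one."""
    sampled = set(genomes)
    core = sum(1 for members in presence.values() if sampled <= members)
    specific = sum(1 for members in presence.values() if len(members & sampled) == 1)
    return core, specific


def accumulation_curve(presence, genomes):
    """Mean core and species-specific counts for each subset size k."""
    curve = {}
    for k in range(1, len(genomes) + 1):
        counts = [core_and_specific(presence, subset)
                  for subset in combinations(genomes, k)]
        curve[k] = (mean(c for c, _ in counts), mean(s for _, s in counts))
    return curve


curve = accumulation_curve(presence, ["A", "B", "C"])
for k, (core_mean, specific_mean) in curve.items():
    print(k, core_mean, specific_mean)
```

In this toy example the mean core count falls as genomes are added, which is the general shape expected for a core-genome curve. Curve fitting to the mean values is left out of the sketch.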
Fig. 2. Geographic distribution and phylogeny of seven diploid Fragaria species and outgroup species. (Upper Left) Geographic distribution of seven diploid Fragaria species with available reference genomes from this and our previous studies (15, 16). Mapped ranges are adapted from the Global Biodiversity Information Facility, the Chinese Virtual Herbarium, and our own collections. (Lower Right) The phylogenetic tree of Rosales with three species (C. sativa, M. notabilis, and Z. jujuba) used as outgroups. The estimated divergence times are shown above branch nodes. Red dots mark the fossil calibration points used to estimate the divergence times.
Fig. 3. Phylogenetic tree and structure of 128 samples from key diploid species in Fragaria. (A) ML tree of diploid species in Fragaria with two outgroups, based on whole-genome polymorphisms. Black dots mark the white-fruited accessions. (B) Structure bar plots showing the assignment probabilities. (C) Estimated gene flow between eight species. Heat-map colors represent the migration weight for each pairwise comparison. (D) Demographic history of eight species in Fragaria. FCH: F. chinensis; FDA: F. daltoniana; FEM: F. emeiensis; FII: F. iinumae; FMA: F. mandschurica; FNG: F. nilgerrensis; FNUr: red fruit type of F. nubicola; FNUw: white fruit type of F. nubicola; FPE: F. pentaphylla; FVE: F. vesca; FVI: F. viridis; XG: Xixiabangma glaciation.
Fig. 4. Evolution of the R2R3-MYB gene family in Fragaria. (A, Left) Phylogenetic tree of MYB1 and MYB10. (Right) Conserved anthocyanin-promoting motif “RPRPRTF” of MYB10 and conserved anthocyanin-repressive motif “LNLDLTLSIKT” of MYB1. Similar residues are highlighted, with the homology level ranging from black (100% identity) through grey (>75%) to light blue (>50%). (B) Nucleotide and amino acid variation of MYB10 between red and white fruits in four wild strawberry species. Fragaria × ananassa (Fa), F. daltoniana (FDA), F. iinumae (FII), F. mandschurica (FMA), F. nilgerrensis (FNG), F. pentaphylla (FPE), F. vesca (FVE), F. viridis (FVI), M. domestica (Md), R. chinensis (Rosa), and R. occidentalis (Rubus).