Chenxi Xu1, Chen Jiao2, Honghe Sun2, Xiaofeng Cai1, Xiaoli Wang1, Chenhui Ge1, Yi Zheng2, Wenli Liu2, Xuepeng Sun2, Yimin Xu2, Jie Deng3, Zhonghua Zhang3, Sanwen Huang3, Shaojun Dai1, Beiquan Mou4, Quanxi Wang1, Zhangjun Fei1,2,5, Quanhua Wang1.
Abstract
Spinach is an important leafy vegetable enriched with multiple necessary nutrients. Here we report the draft genome sequence of spinach (Spinacia oleracea, 2n=12), which contains 25,495 protein-coding genes. The spinach genome is highly repetitive with 74.4% of its content in the form of transposable elements. No recent whole genome duplication events are observed in spinach. Genome syntenic analysis between spinach and sugar beet suggests substantial inter- and intra-chromosome rearrangements during the Caryophyllales genome evolution. Transcriptome sequencing of 120 cultivated and wild spinach accessions reveals more than 420 K variants. Our data suggest that S. turkestanica is likely the direct progenitor of cultivated spinach and that spinach domestication has a weak bottleneck. We identify 93 domestication sweeps in the spinach genome, some of which are associated with important agronomic traits including bolting, flowering and leaf numbers. This study offers insights into spinach evolution and domestication and provides resources for spinach research and improvement.
Year: 2017 PMID: 28537264 PMCID: PMC5458060 DOI: 10.1038/ncomms15275
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Summary of spinach genome assembly.
| Statistic | Size (bp) | Number | Size (bp) | Number | Size (bp) | Number |
| --- | --- | --- | --- | --- | --- | --- |
| N90 | 1,554 | 71,235 | 5,121 | 6,093 | 11,883 | 3,878 |
| N80 | 4,762 | 40,488 | 81,129 | 2,246 | 103,609 | 1,409 |
| N70 | 8,537 | 27,590 | 155,870 | 1,489 | 205,174 | 730 |
| N60 | 12,418 | 19,540 | 229,174 | 1,033 | 395,765 | 370 |
| N50 | 16,570 | 13,759 | 319,471 | 711 | 919,290 | 201 |
| N25 | 31,281 | 4,483 | 626,780 | 218 | 3,106,702 | 51 |
| Longest | 185,618 | 1 | 3,292,865 | 1 | 9,343,782 | 1 |
| Total | 830,856,911 | 215,350 | 869,796,885 | 78,264 | 996,306,834 | 77,702 |
Only contigs and scaffolds ≥500 bp were included in the genome assembly.
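As an illustration of how Nx statistics such as those in the table are conventionally computed, the following is a minimal sketch, not the authors' pipeline. It assumes each pair of columns reports the Nx length followed by the number of sequences needed to reach that fraction of the total. The function name `nx_stat`, the `min_len` parameter (mirroring the ≥500 bp cutoff in the note above) and the toy lengths are all hypothetical.

```python
def nx_stat(lengths, x, min_len=500):
    """Return (Nx length, sequence count) for a collection of sequence lengths.

    Nx is the length L such that sequences of length >= L together cover
    at least x% of the total assembly. The count is how many sequences
    are needed to reach that coverage.
    """
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    # Assumption: sequences shorter than min_len are excluded before any statistic.
    kept = sorted((n for n in lengths if n >= min_len), reverse=True)
    if not kept:
        raise ValueError("no sequences at or above min_len")
    total = sum(kept)
    running = 0
    for count, length in enumerate(kept, start=1):
        running += length
        # Comparing scaled integers avoids rounding issues when x is an integer.
        if running * 100 >= total * x:
            return length, count


# Toy data (hypothetical): the 400 bp and 100 bp sequences are filtered out.
toy_lengths = [8000, 7000, 5000, 4000, 3000, 2000, 1000, 400, 100]
n50 = nx_stat(toy_lengths, 50)
n90 = nx_stat(toy_lengths, 90)
```

With real data, `lengths` would come from the sequence lengths in a FASTA index or similar; the sketch only covers the arithmetic.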
Figure 1. Spinach genome landscape.
(a) Ideogram of the six spinach pseudochromosomes (in Mb scale). (b) Gene density represented as number of genes per Mb. (c) Percentage of coverage of repeat sequences per Mb. (d) Transcription state. The transcription level was estimated by read counts per million mapped reads in 1-Mb windows. (e) GC content in 1-Mb windows. The six spinach pseudochromosomes represented 47% of the genome assembly. This figure was generated using Circos (http://circos.ca/).
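The windowed tracks in this figure rest on a simple idea: cut each sequence into fixed-size windows and summarize each window. A minimal sketch for the GC-content track follows. It is an illustration only; the function name `gc_content_by_window`, the handling of ambiguous bases (excluded from the denominator) and the toy sequence are assumptions, not details taken from the paper.

```python
def gc_content_by_window(sequence, window=1_000_000):
    """Return the GC fraction of each non-overlapping window of a sequence.

    Assumption: only A, C, G and T count toward the denominator, so
    windows made up entirely of other symbols (e.g. N) yield None.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = sequence.upper()
    fractions = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        called = gc + chunk.count("A") + chunk.count("T")
        fractions.append(gc / called if called else None)
    return fractions


# Toy example with 10-bp windows instead of 1-Mb windows.
toy_sequence = "GGCCAATTNN" + "ATATATATGC" + "GCGC"
toy_gc = gc_content_by_window(toy_sequence, window=10)
```

The last window may be shorter than the others; whether to keep or drop it is a design choice that the caption does not specify.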
Figure 2. Comparative genomic analysis among spinach and other plant species.
(a) Phylogenetic relationship and gene clusters of 11 plant species. A maximum parsimony (MP) species tree (left) was constructed using protein sequences of the 2,047 single-copy genes. Bars (right) represent the number of genes in different categories for each species. Common: genes that are found in at least 10 of the 11 species. Monocots: genes that are only found in the two monocots, rice and Brachypodium; Eudicots: genes that are found in at least eight of nine eudicots but not in the two monocots; Caryophyllales: genes that are only found in spinach and sugar beet; Species-specific: genes with no homologues in other species. (b) Syntenic relationships between spinach and sugar beet genomes. (c) Ks distribution of homologous gene pairs in spinach, sugar beet and Arabidopsis. The probability density of Ks was estimated using the ‘density’ function in R.
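For readers unfamiliar with kernel density estimation, the kind of smoothing behind a Ks density curve can be sketched as below. This is a plain Gaussian kernel estimate written in Python rather than R, with a rule-of-thumb bandwidth of 0.9 · min(sd, IQR/1.34) · n^(−1/5). The bandwidth choice, the function name `gaussian_kde` and the Ks values are assumptions for illustration; the caption does not state which settings the authors used.

```python
import math
import statistics


def gaussian_kde(values, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `values` at each grid point."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    if bandwidth is None:
        # Rule-of-thumb bandwidth (an assumption, not the authors' setting).
        sd = statistics.stdev(values)
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        spread = min(sd, (q3 - q1) / 1.34) or sd
        bandwidth = 0.9 * spread * n ** -0.2
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]


# Hypothetical Ks values for a handful of homologous gene pairs.
toy_ks = [0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.8, 1.5, 1.6, 1.7]
step = 0.01
toy_grid = [-1.0 + i * step for i in range(451)]  # covers -1.0 to 3.5
toy_density = gaussian_kde(toy_ks, toy_grid)
toy_area = sum(toy_density) * step  # Riemann-sum check that the curve integrates to about 1
```

Peaks in such a curve are what one inspects for evidence of whole genome duplication; the sketch covers only the density estimate itself.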
Figure 3. Geographic distribution and population structure of the 120 spinach accessions.
(a) Geographic information of the 120 accessions. The number of samples is represented by the dot size on the world map. The red star indicates the suggested origin location of spinach, and the arrows suggest the domestication history of spinach. Commercial cultivars provided by Chinese seed companies are plotted in China and commercial cultivars provided by American/European companies are plotted in the United States. (b) PCA plot of non-commercial spinach accessions. (c) Phylogenetic tree of all spinach accessions inferred from transcriptome SNPs, with S. tetrandra Sp42 and Sp43 as the outgroup. The pink arrow indicates the two S. tetrandra (Sp39 and Sp40) that were grouped with S. turkestanica and the blue arrows indicate the two S. turkestanica (Sp47 and Sp48) that were clustered with cultivars. (d) Model-based clustering analysis of the 51 non-commercial (right) and 69 commercial (left) spinach accessions, given different numbers of groups (K=2 to 6). The y axis quantifies subgroup membership, and the x axis shows different accessions. W1: S. tetrandra; W2: S. turkestanica; As1: East Asia; As2: Central/West Asia; Eu: Europe; Af: Africa; Na: North America; CH Co: companies in China; US & EU Co: companies in the United States and Europe.
Figure 4. Genome-wide scan of selective sweeps and GWAS of bolting.
(a) Distribution of XP-CLR scores across the spinach genome. The black horizontal dashed line refers to the top 1% threshold. Arrows and the short interval indicate positions of known SNP markers and QTL, respectively, for different traits. (b) Manhattan plot of the GWAS for spinach bolting trait. The significance threshold (1 × 10⁻⁴) is indicated by the black horizontal dashed line.
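The two dashed lines in this figure correspond to two simple cutoffs, sketched below under stated assumptions. The first finds the score that separates the top 1% of windows from the rest. The second flags SNPs whose P value falls below 1 × 10⁻⁴, which is 4 on the −log10 scale of a Manhattan plot. The function names, the strict `<` comparison and the toy data are hypothetical; the caption does not describe how the authors handled ties, window merging or boundary cases.

```python
import math


def top_percent_cutoff(scores, top_percent=1):
    """Return the smallest score that still falls within the top `top_percent` of scores."""
    if not scores or not 0 < top_percent <= 100:
        raise ValueError("need scores and a percentage in (0, 100]")
    ordered = sorted(scores, reverse=True)
    # Always keep at least one window, even for very small inputs.
    k = max(1, math.ceil(len(ordered) * top_percent / 100))
    return ordered[k - 1]


def significant_snps(pvalues, alpha=1e-4):
    """Map each SNP with p < alpha to its -log10(p) value."""
    return {snp: -math.log10(p) for snp, p in pvalues.items() if p < alpha}


# Toy data (hypothetical): 200 window scores and three SNP P values.
toy_scores = list(range(1, 201))
toy_cutoff = top_percent_cutoff(toy_scores)
toy_pvalues = {"snp_a": 3e-6, "snp_b": 5e-4, "snp_c": 1e-4}
toy_hits = significant_snps(toy_pvalues)
```

In the toy data only `snp_a` passes, because `snp_c` sits exactly on the threshold and the sketch uses a strict comparison.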