Honghao Lv, Yong Wang, Fengqing Han, Jialei Ji, Zhiyuan Fang, Mu Zhuang, Zhansheng Li, Yangyong Zhang, Limei Yang.
Abstract
Cabbage (Brassica oleracea var. capitata) is an important vegetable crop grown throughout the world that provides plentiful nutrients and health-promoting substances. To facilitate further genetic and genomic studies and crop improvement, we present a high-quality reference genome for cabbage: a de novo assembly of the cabbage double-haploid line D134. The draft genome was produced with a combined strategy of single-molecule real-time (SMRT) sequencing, 10× Genomics and chromosome conformation capture (Hi-C). The chromosome-level D134 assembly is 529.92 Mb in size, 135 Mb longer than the current 02-12 reference genome, with the scaffold N50 length raised by as much as 38-fold. We annotated 44,701 high-quality protein-coding genes and provided full-length transcripts for 45.59% of the predicted gene models. Moreover, we identified novel genomic features such as previously underestimated transposable elements (TEs), as well as gene families and gene family expansions and contractions during B. oleracea evolution. The D134 draft genome is a cabbage reference genome assembled by SMRT long-read sequencing combined with 10× Genomics and Hi-C technologies for scaffolding. This high-quality cabbage reference genome provides a valuable tool for the improvement of Brassica crops.
Year: 2020 PMID: 32709963 PMCID: PMC7381634 DOI: 10.1038/s41598-020-69389-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Statistics of the D134 assembly.
| | D134 final assembly (including unanchored contigs) | 10× Genomics improved assembly (preliminary assembly 1) | Hi-C improved assembly (preliminary assembly 2) |
|---|---|---|---|
| Total assembly length (Mb) | 574.91 | 575.74 | 575.39 |
| Longest scaffold (Mb) | 74.84 | 19.76 | 71.59 |
| Number of contigs | 902 | 870 | 883 |
| N50 contig length (Mb) | 3.17 | 3.68 | 3.30 |
| Number of scaffolds | 682 | 757 | 695 |
| N50 scaffold length (Mb) | 57.73 | 8.13 | 56.58 |
| N90 scaffold length (Mb) | 48.45 | 1.20 | 46.34 |
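The N50 and N90 rows above use the standard assembly-contiguity definition: the length L such that sequences of length L or longer cover at least 50% (or 90%) of the total assembly. The source does not say which tool computed these figures, so the following is only a minimal sketch of that definition; the function name `nx` and the toy lengths are illustrative and are not taken from the D134 data.

```python
def nx(lengths, x=50):
    """Return the Nx length of an assembly.

    Nx is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least x% of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in the range (0, 100]")

    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length


# Toy scaffold lengths (hypothetical, arbitrary units).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = nx(toy_lengths, 50)
n90 = nx(toy_lengths, 90)
print(f"N50 = {n50}, N90 = {n90}")
```

Because N90 requires covering more of the assembly, it can never exceed N50 for the same set of sequences, which matches the ordering of the two rows in the table.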
Comparison of basic sequence statistics among D134, 02-12, TO1000 and HDEM.
| | D134 (cabbage) | 02-12 (cabbage) | TO1000 (kale-like) | HDEM (broccoli) |
|---|---|---|---|---|
| Sequenced genome size (bp) | 529,919,152 | 514,430,932 | 488,954,160 | 554,977,060 |
| N50 contig length (bp) | 3,591,417 | 28,316 | 21,938 | 9,813,309 |
| N50 scaffold length (bp) | 61,740,326 | 1,419,759 | 48,366,697 | 58,257,932 |
| Maximum size (bp) | 74,838,096 | 7,482,359 | 64,984,695 | 73,711,317 |
| Gaps | 329,204 | 37,382,998 | 40,561,975 | 9,959,204 |
| BUSCO (complete) | 95.7% | 96.3% | 95.8% | 95.7% |
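To make the comparison easier to read, the genome sizes in the table can be converted to megabases and the gap content expressed as a share of each assembly. The sketch below assumes that the size values are base-pair counts and that the "Gaps" row is also in base pairs; the source does not state the unit for gaps, so the resulting percentages are illustrative only. The helper names `bp_to_mb` and `gap_percent` are hypothetical.

```python
# Values copied from the comparison table.
# Assumption: both "size" and "gaps" are base-pair counts.
ASSEMBLIES = {
    "D134": {"size": 529_919_152, "gaps": 329_204},
    "02-12": {"size": 514_430_932, "gaps": 37_382_998},
    "TO1000": {"size": 488_954_160, "gaps": 40_561_975},
    "HDEM": {"size": 554_977_060, "gaps": 9_959_204},
}


def bp_to_mb(bp):
    """Convert a base-pair count to megabases (1 Mb = 1,000,000 bp)."""
    return bp / 1_000_000


def gap_percent(size, gaps):
    """Return gap content as a percentage of the sequenced genome size."""
    return 100 * gaps / size


for name, stats in ASSEMBLIES.items():
    size_mb = bp_to_mb(stats["size"])
    gaps_pct = gap_percent(stats["size"], stats["gaps"])
    print(f"{name}: {size_mb:.2f} Mb, gaps {gaps_pct:.2f}% of assembly")
```

Under these assumptions, the D134 size converts to the 529.92 Mb quoted in the abstract.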
Figure 1. Distribution of genes in cabbage D134 and other representative plant species. (A) Orthologous genes found in different plant species. Ath, A. thaliana; Bdi, B. distachyon; Bol0, B. oleracea (D134); Bol1, B. oleracea (02-12); Bol2, B. oleracea (TO1000); Bra, B. rapa; Cru, C. rubella; Cpa, C. papaya; Csa, C. sativus; Dca, D. carota; Gra, G. raimondii; Hvu, H. vulgare; Osa, O. sativa; Ptr, P. trichocarpa; Rsa, R. sativus; Sly, S. lycopersicum; Vvi, V. vinifera; Zma, Z. mays. (B) Venn diagram showing unique and shared gene families among A. thaliana, B. rapa and B. oleracea (D134 and TO1000).
Figure 2. Phylogenetic tree showing divergence times and the evolution of gene family sizes. The phylogenetic tree shows the topology and divergence times for 18 plant species. MRCA, most recent common ancestor. The number in parentheses is the number of gene families in the MRCA as estimated by CAFÉ.
Figure 3. Genomic landscape of D134 and 02-12. From outside to inside, the Circos images show chromosomes, gene density, TE density, SNP density, indel density and best-hit gene pairs.