Shengyi Liu1, Yumei Liu2, Xinhua Yang3, Chaobo Tong1, David Edwards4, Isobel A P Parkin5, Meixia Zhao6, Jianxin Ma7, Jingyin Yu8, Shunmou Huang8, Xiyin Wang9, Junyi Wang10, Kun Lu11, Zhiyuan Fang12, Ian Bancroft13, Tae-Jin Yang14, Qiong Hu8, Xinfa Wang8, Zhen Yue10, Haojie Li15, Linfeng Yang10, Jian Wu12, Qing Zhou10, Wanxin Wang12, Graham J King16, J Chris Pires17, Changxin Lu10, Zhangyan Wu10, Perumal Sampath14, Zhuo Wang10, Hui Guo18, Shengkai Pan10, Limei Yang12, Jiumeng Min10, Dong Zhang18, Dianchuan Jin19, Wanshun Li10, Harry Belcram20, Jinxing Tu21, Mei Guan22, Cunkou Qi23, Dezhi Du24, Jiana Li11, Liangcai Jiang15, Jacqueline Batley25, Andrew G Sharpe26, Beom-Seok Park27, Pradeep Ruperao28, Feng Cheng12, Nomar Espinosa Waminal29, Yin Huang10, Caihua Dong8, Li Wang19, Jingping Li18, Zhiyong Hu8, Mu Zhuang12, Yi Huang8, Junyan Huang8, Jiaqin Shi8, Desheng Mei8, Jing Liu8, Tae-Ho Lee18, Jinpeng Wang19, Huizhe Jin18, Zaiyun Li21, Xun Li22, Jiefu Zhang23, Lu Xiao24, Yongming Zhou21, Zhongsong Liu22, Xuequn Liu30, Rui Qin30, Xu Tang18, Wenbin Liu10, Yupeng Wang18, Yangyong Zhang12, Jonghoon Lee14, Hyun Hee Kim31, France Denoeud32, Xun Xu10, Xinming Liang10, Wei Hua8, Xiaowu Wang12, Jun Wang33, Boulos Chalhoub20, Andrew H Paterson18.
Abstract
Polyploidization has provided much genetic variation for plant adaptive evolution, but the mechanisms by which the molecular evolution of polyploid genomes establishes genetic architecture underlying species differentiation are unclear. Brassica is an ideal model to increase knowledge of polyploid evolution. Here we describe a draft genome sequence of Brassica oleracea, comparing it with that of its sister species B. rapa to reveal numerous chromosome rearrangements and asymmetrical gene loss in duplicated genomic blocks, asymmetrical amplification of transposable elements, differential gene co-retention for specific pathways and variation in gene expression, including alternative splicing, among a large number of paralogous and orthologous genes. Genes related to the production of anticancer phytochemicals and morphological variations illustrate consequences of genome duplication and gene divergence, imparting biochemical and morphological variation to B. oleracea. This study provides insights into Brassica genome evolution and will underpin research into the many important crops in this genus.
Year: 2014 PMID: 24852848 PMCID: PMC4279128 DOI: 10.1038/ncomms4930
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Summary of genome assembly and annotation of B. oleracea.
| | N90 | N50 | Longest | Total size |
|---|---|---|---|---|
| Contig size (bp) | 3,527 | 26,828 | 199,461 | 502,114,421 |
| Contig number | 22,669 | 5,425 | | Total number (>2 kb): 27,351 |
| Scaffold size (bp) | 258,906 | 1,457,055 | 8,788,225 | 539,907,250 |
| Scaffold number | 388 | 224 | Anchored to chr.: 72% | Total number (>2 kb): 1,809 |
*WG, whole genome; reads at 20× coverage were randomly sampled from all the genomic short-read libraries.
†TE copy numbers were taken from the RepeatMasker results.
‡The ungapped regions were used to detect the percentage of TEs in the assembly. TE sizes are from the ungapped regions of B. oleracea (477,847,347 bp).
§Estimated from public Brassica ESTs and RNA-seq data.
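Assembly summaries of this kind are conventionally reported as N50 and N90 statistics. The sketch below shows the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover a given fraction of the total assembly, plus how many sequences that set contains. The function name `nx` and the toy lengths are illustrative assumptions; the record does not say which tool produced the published figures.

```python
def nx(lengths, fraction):
    """Return (Nx length, sequence count) for a list of contig or scaffold lengths.

    fraction=0.5 gives N50, fraction=0.9 gives N90. This follows the common
    convention and is not taken from the paper's own pipeline.
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that pushes the running total past the target
        # defines the statistic.
        if running >= target:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy example with invented lengths (total = 30).
toy_lengths = [10, 8, 6, 4, 2]
n50 = nx(toy_lengths, 0.5)
n90 = nx(toy_lengths, 0.9)
```

Under this definition the N90 length is never larger than the N50 length, and the N90 count is never smaller than the N50 count.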
Figure 1. Genomic structure and gene retention rates in syntenic regions of B. oleracea and B. rapa.
(a) Segmental colinearity of the genomes of B. oleracea, B. rapa and A. thaliana. Syntenic blocks are defined and labelled A to X (coloured), as previously reported for A. thaliana (ref. 20). (b) Time estimates of the WGD and the subsequent divergence of the two Brassica species. (c) Pattern of retention/loss of orthologous genes on each set of three subgenomic (LF, MF1 and MF2) blocks of B. oleracea and B. rapa corresponding to A. thaliana blocks A to X. The x axis denotes the physical position of each A. thaliana gene locus. The y axis denotes the proportion of orthologous genes retained in the B. oleracea and B. rapa subgenomic blocks around each A. thaliana gene, where 500 genes flanking each side of a given gene locus were analysed, giving a total window size of 1,001 genes.
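The windowed retention rate described for panel (c) can be sketched as follows. This is a minimal illustration, assuming each A. thaliana gene in a block is represented by a boolean saying whether an orthologue is retained in a given subgenome. The function name `retention_profile` and the truncation of windows at block ends are assumptions, as the caption does not say how edges were handled.

```python
def retention_profile(retained, flank=500):
    """Proportion of retained orthologues in a window centred on each gene.

    retained: list of bools ordered by A. thaliana gene position.
    flank:    genes taken on each side (500 gives a 1,001-gene window).
    Windows are truncated at the ends of the list (an assumption).
    """
    n = len(retained)
    # Prefix sums let every window be evaluated in constant time.
    prefix = [0]
    for flag in retained:
        prefix.append(prefix[-1] + int(flag))
    profile = []
    for i in range(n):
        lo = max(0, i - flank)
        hi = min(n, i + flank + 1)
        profile.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return profile


# Toy example with an invented retention pattern and a 3-gene window.
toy = [True, False, True, True, False]
toy_profile = retention_profile(toy, flank=1)
```

With `flank=500` and a block long enough to avoid truncation, each value is the retained fraction of exactly 1,001 genes, matching the window size given in the caption.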
Figure 2. TE comparison analyses in B. oleracea and B. rapa.
(a) TE copy number and total length in each assembly and in B. oleracea–B. rapa syntenic blocks. (b) The number of intact LTR retrotransposons (Copia-like and Gypsy-like) that arose at different times (million years ago, MYA) in the syntenic regions of B. oleracea and B. rapa. (c) Comparison of TE distribution and composition in B. oleracea–B. rapa syntenic blocks along B. oleracea chromosomes. The B. oleracea–B. rapa syntenic regions were divided into non-overlapping 200 kb windows to compare TE contents. For each window, the ratio log10(B. oleracea/B. rapa) was calculated for total syntenic block length (blue line), LTR length (purple line), gene length (yellow point), exon length (red point) and intron length (green point). If the absolute length of a component in a compared window is greater in B. oleracea than in B. rapa, the dot or line lies above the line y=0. The corresponding B. rapa chromosome segments along B. oleracea C08 are indicated by coloured bars. All other B. oleracea chromosomes are shown in Supplementary Fig. 31. (d) Phylogeny of the Copia-like elements as an example of LTR-RTs of the syntenic regions in B. rapa and B. oleracea. The neighbour-joining (NJ) trees were generated based on the conserved RT domain nucleotide sequences using the Kimura two-parameter method (ref. 68) in MEGA4 (ref. 69).
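The per-window comparison in panel (c) reduces to a log10 ratio of two lengths. The sketch below assumes each window is given as a mapping from a feature name to a pair of base-pair totals (B. oleracea, B. rapa). The names `log10_ratio` and `window_ratios`, the input layout and the choice to return `None` when either length is zero are all illustrative assumptions, not details from the paper.

```python
import math


def log10_ratio(bol_length, bra_length):
    """Return log10(B. oleracea / B. rapa), or None if either length is zero.

    Positive values mean the feature is longer in B. oleracea; the caption's
    y=0 line corresponds to equal lengths.
    """
    if bol_length <= 0 or bra_length <= 0:
        return None  # ratio undefined; handling here is an assumption
    return math.log10(bol_length / bra_length)


def window_ratios(windows):
    """Apply log10_ratio to every feature of every window.

    windows: list of dicts mapping feature name -> (bol_bp, bra_bp).
    """
    return [
        {feature: log10_ratio(bol, bra) for feature, (bol, bra) in window.items()}
        for window in windows
    ]


# Toy window with invented lengths: LTRs ten times longer in B. oleracea,
# genes of equal length in both species.
toy_windows = [{"LTR": (300, 30), "gene": (50, 50)}]
toy_ratios = window_ratios(toy_windows)
```

A log scale treats a twofold excess in either species symmetrically around zero, which is presumably why the figure uses it rather than a raw ratio.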
Figure 3. The duplicated genes derived from tandem duplication and whole-genome duplications in Brassica genomes.
(a) A Venn diagram showing shared and specific tandem duplication events in A. thaliana, B. rapa and B. oleracea. (b,c) Distribution of tandem genes and WGT/WGD-derived paralogues in the KEGG pathway maps in B. oleracea (bol), B. rapa (bra) and A. thaliana (ath). For each KEGG pathway map, the proportion of duplicated genes or paralogues among the total genes was calculated (x axis), and the number of maps whose proportion fell within a given range is shown on the y axis. (d) Oxidative phosphorylation pathway enriched in WGT-derived paralogous genes in the Brassica genomes. The gene copy numbers for each KO enzyme in B. oleracea, B. rapa and A. thaliana are shown (dash-connected) under the KO enzyme number.
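The distributions in panels (b,c) amount to a histogram of per-pathway proportions. The following sketch assumes pathway membership is available as sets of gene identifiers. The function name `pathway_histogram`, the ten equal-width bins and the gene identifiers are hypothetical; the caption does not state the bin ranges actually used.

```python
from collections import Counter


def pathway_histogram(pathway_genes, duplicated, n_bins=10):
    """Count pathway maps per bin of duplicated-gene proportion.

    pathway_genes: dict mapping pathway name -> set of gene IDs in that map.
    duplicated:    set of gene IDs classed as tandem genes or paralogues.
    Returns a list of n_bins counts; bin i covers [i/n_bins, (i+1)/n_bins),
    with a proportion of exactly 1 placed in the last bin.
    """
    counts = Counter()
    for genes in pathway_genes.values():
        if not genes:
            continue  # skip empty maps to avoid division by zero
        hits = len(genes & duplicated)
        # Integer arithmetic avoids floating-point errors at bin edges.
        index = min(hits * n_bins // len(genes), n_bins - 1)
        counts[index] += 1
    return [counts[i] for i in range(n_bins)]


# Toy data with invented pathway names and gene IDs.
toy_pathways = {
    "map_a": {"g1", "g2", "g3", "g4"},  # 2 of 4 duplicated -> 0.5
    "map_b": {"g5", "g6"},              # 0 of 2 duplicated -> 0.0
    "map_c": {"g1", "g7"},              # 1 of 2 duplicated -> 0.5
    "map_d": {"g1"},                    # 1 of 1 duplicated -> 1.0
}
toy_hist = pathway_histogram(toy_pathways, duplicated={"g1", "g2"})
```

Running the same function once per species and per duplication class would give the separate curves that the caption describes for bol, bra and ath.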
Figure 4. Divergence of Brassica paralogous and orthologous genes in B. oleracea and B. rapa.
(a) Genome-wide gene conversion in B. oleracea. The conversion in B. rapa is shown in Supplementary Fig. 64. (b) The ratio of differentially expressed duplicated gene pairs derived from different duplications: alpha whole-genome duplication (α-WGD), Brassiceae-lineage WGT and tandem duplication (TD). Bol, B. oleracea; Bra, B. rapa. C: callus; R: root; St: stem; L: leaf; F: flower; Si: silique. Differentially expressed duplicated gene pairs were defined as pairs with fold change >2 and false discovery rate (FDR) <0.05, or pairs in which expression was detected for only one gene of the pair (FDR <0.05). (c) Box and whisker plots of differential expression among the three subgenomes (LF, MF1 and MF2) in flower tissue of B. oleracea and B. rapa. For the other tissues, see Supplementary Fig. 67. (d) The duplicated gene pairs belonging to transcription factors (TFs) and their related GO terms contain a significantly lower ratio of differentially expressed duplicated gene pairs than the genome-wide average in leaf (values given) and other tissues (values not presented) (Supplementary Table 45). (e) The GO terms (left) in which the duplicated gene pairs contain a significantly higher ratio of differentially expressed duplicated gene pairs than the genome-wide average in leaf and other tissues (Supplementary Table 46). Values from one tissue are presented; the other tissues are indicated with abbreviated letters to the right if expression in these tissues is significantly higher. (f) Expression variation caused by divergence (either different variants or differential expression of the same variants) of alternative splicing (AS) variants in WGT paralogous gene pairs with identical numbers of exons and in Bol–Bra orthologous gene pairs. IRES denotes types of intron retention and exon skipping.
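The rule used in panel (b) to call a duplicated gene pair differentially expressed can be written as a small predicate. This is a sketch under stated assumptions: expression values are non-negative abundances, "detected" is read as a value greater than zero, and the FDR comes from an upstream statistical test that is not shown. The function name and parameters are hypothetical.

```python
def is_differentially_expressed(expr_a, expr_b, fdr, min_fold=2.0, max_fdr=0.05):
    """Apply the caption's rule to one duplicated gene pair in one tissue.

    expr_a, expr_b: non-negative expression values of the two genes.
    fdr:            false discovery rate from an upstream test (assumed given).
    A pair qualifies if FDR < max_fdr and either the fold change exceeds
    min_fold or only one of the two genes is expressed.
    """
    if fdr >= max_fdr:
        return False
    if expr_a == 0 and expr_b == 0:
        return False  # neither gene detected, so nothing to compare
    if expr_a == 0 or expr_b == 0:
        return True  # expression detected for only one gene of the pair
    fold_change = max(expr_a, expr_b) / min(expr_a, expr_b)
    return fold_change > min_fold


def differential_ratio(pairs):
    """Fraction of (expr_a, expr_b, fdr) pairs that are differentially expressed."""
    if not pairs:
        return 0.0
    hits = sum(is_differentially_expressed(a, b, fdr) for a, b, fdr in pairs)
    return hits / len(pairs)


# Toy pairs with invented values: 2.5-fold, 1.67-fold, one-sided, high FDR.
toy_pairs = [(10, 4, 0.01), (10, 6, 0.01), (10, 0, 0.01), (10, 1, 0.2)]
toy_ratio = differential_ratio(toy_pairs)
```

Computing `differential_ratio` separately for each duplication class and tissue would yield the kind of ratios that the panel compares.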
Figure 5. Whole-genome-wide comparison of genes involved in glucosinolate metabolism pathways in B. oleracea and its relatives.
(a) Aliphatic and indolic GSL biosynthesis and catabolism pathways in A. thaliana, B. oleracea and B. rapa. The copy numbers of GSL biosynthetic genes in A. thaliana, B. rapa and B. oleracea are listed in square brackets, respectively. Potential anticancer substances/precursors are highlighted in blue bold. Two important amino acid chain elongation and side-chain modification loci, MAMs and AOP2, are highlighted in red bold, with the number in green brackets representing the number of non-functional genes. (b,c) The neighbour-joining (NJ) trees of MAM and AOP genes were generated based on the aligned coding sequences and 100 bootstrap repeats. Silenced genes are indicated by red hollow circles; expressed functional genes are represented by red solid discs and green rectangles. In A. thaliana ecotype Columbia there are only MAM1 and MAM3. (d) Three B. oleracea AOP2 loci, of which one is a functional AOP2 and two are mutated AOP2. 1MOI3M: 1-methoxyindol-3-ylmethyl GSL; 1OHI3M: 1-hydroxyindol-3-ylmethyl GSL; 3MSOP: 3-methylsulfinylpropyl GSL; 3MTP: 3-methylthiopropyl GSL; 3PREY: 2-propenyl GSL; 4BTEY: 3-butenyl GSL; 4MOI3M: 4-methoxyindol-3-ylmethyl GSL; 4OHB: 4-hydroxybutyl GSL; 4OHI3M: 4-hydroxyindol-3-ylmethyl GSL; 4MSOB: 4-methylsulfinylbutyl GSL; 4MTB: 4-methylthiobutyl GSL; AITC: allyl isothiocyanate; I3C: indole-3-carbinol; I3M: indolyl-3-methyl GSL; DIM: 3,3′-diindolylmethane; MAM: methylthioalkylmalate; AOP: 2-oxoglutarate-dependent dioxygenase.