Min-Jie Hu1, Wei-Hong Sun2,3, Wen-Chieh Tsai4, Shuang Xiang2,3, Xing-Kai Lai5, De-Qiang Chen2,3, Xue-Die Liu2, Yi-Fan Wang2, Yi-Xun Le2, Si-Ming Chen2,6, Di-Yang Zhang3, Xia Yu3, Wen-Qi Hu3, Zhuang Zhou3, Yan-Qiong Chen3, Shuang-Quan Zou2,3, Zhong-Jian Liu3,7.
Abstract
The mangrove Kandelia obovata (Rhizophoraceae) is an important coastal shelterbelt and landscape tree distributed in tropical and subtropical areas across East Asia and Southeast Asia. Herein, a chromosome-level reference genome of K. obovata based on PacBio, Illumina, and Hi-C data is reported. The high-quality assembled genome size is 177.99 Mb, with a contig N50 value of 5.74 Mb. A large number of contracted gene families and a small number of expanded gene families, as well as a small number of repeated sequences, may account for the small K. obovata genome. We found that K. obovata experienced two whole-genome polyploidization events: one whole-genome duplication shared with other Rhizophoreae and one shared with most eudicots (γ event). We confidently annotated 19,138 protein-coding genes in K. obovata and identified the MADS-box gene class and the RPW8 gene class, which might be related to flowering and resistance to powdery mildew in K. obovata and Rhizophora apiculata, respectively. The reference K. obovata genome described here will be very useful for further molecular elucidation of various traits, the breeding of this coastal shelterbelt species, and evolutionary studies with related taxa.
Keywords: Evolution; Genome
Year: 2020 PMID: 32377365 PMCID: PMC7195387 DOI: 10.1038/s41438-020-0300-x
Source DB: PubMed Journal: Hortic Res ISSN: 2052-7276 Impact factor: 6.793
Fig. 1 Morphological features of the flower and fruit of K. obovata.
a K. obovata trees in a coastal wetland. b Flowers. c Young fruits. d Cone-like fruits
The statistical results of Hi-C assembly
| Assembly | Size (bp) |
|---|---|
| Illumina sequencing assembly | |
| Scaffold N50 | 279,548 |
| Scaffold N90 | 28,239 |
| Longest Scaffold | 1,696,757 |
| Total Scaffold length | 178,438,058 |
| PacBio sequencing assembly | |
| Contig N50 | 5,743,053 |
| Contig N90 | 2,939,642 |
| Longest Contig | 13,452,090 |
| Total Contig length | 177,986,124 |
| BUSCO | 97.3% |
| Hi-C assembly | |
| Scaffold N50 | 10,026,007 |
| Scaffold N90 | 7,500,541 |
| Longest Contig | 13,797,742 |
| Total Contig length | 178,014,124 |
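The N50 and N90 values in the table follow the usual definition: sort the sequences from longest to shortest and report the length of the sequence at which the running total first reaches 50% (or 90%) of the total assembly length. The sketch below illustrates that definition only. The function name `nx` and the toy lengths are hypothetical and are not taken from the K. obovata assembly or from the authors' pipeline.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx statistic for a list of sequence lengths (bp).

    Nx is the length L such that sequences of length >= L together
    cover at least `fraction` of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]


# Hypothetical toy contig lengths in bp (illustration only).
toy_contigs = [100, 80, 60, 40, 20]
n50 = nx(toy_contigs, 0.5)  # running total reaches 150 at the 80 bp contig
n90 = nx(toy_contigs, 0.9)  # running total reaches 270 at the 40 bp contig
print(n50, n90)
```

Because N50 depends only on the sorted length distribution, joining contigs into longer scaffolds raises it without adding sequence, which matches the larger N50 reported for the Hi-C assembly at nearly the same total length.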
Fig. 2 Intensity signal heat map of the Hi-C chromosome
Fig. 3 The expansion and contraction of gene families.
The green number indicates the number of expanded gene families, and the red number indicates the number of contracted gene families. The blue color in the circle shows the gene families whose copy numbers are constant, while the orange color represents the proportion of 11,968 gene families in the most recent common ancestor that have expanded or contracted during late differentiation
Fig. 4 Ks distributions between K. obovata and R. apiculata and K. obovata and V. vinifera and within K. obovata and R. apiculata.
Peaks of intraspecies Ks distributions indicate ancient whole-genome polyploidization events, and peaks of interspecies Ks distributions indicate speciation events
Fig. 5 Collinear point diagram and Ks values corresponding to the collinear blocks.
a The collinear point diagram of K. obovata. b Distribution of log10 (Ks) values of the collinear blocks in K. obovata. c The collinear point diagram of R. apiculata. d Distribution of log10 (Ks) values of the collinear blocks in R. apiculata. The ordinate of b, d is the number of gene pairs corresponding to the Ks value, and the abscissa is the log10 (Ks) value
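As a rough illustration of how a distribution like the one in panels b and d can be built, the sketch below bins log10(Ks) values of gene pairs and reports the most populated bin as a candidate peak. It is a minimal example under stated assumptions: the Ks values are invented, the bin width of 0.1 is arbitrary, and the function names are hypothetical. Published analyses typically fit densities or mixture models rather than reading off a single modal bin, and this is not the authors' method.

```python
import math
from collections import Counter


def log10_ks_histogram(ks_values, bin_width=0.1):
    """Count gene pairs per log10(Ks) bin.

    Returns a dict mapping an integer bin index i to a count, where bin i
    covers log10(Ks) in [i * bin_width, (i + 1) * bin_width).
    Non-positive Ks values have no logarithm and are skipped.
    """
    counts = Counter()
    for ks in ks_values:
        if ks <= 0:
            continue
        counts[math.floor(math.log10(ks) / bin_width)] += 1
    return dict(sorted(counts.items()))


def modal_bin(histogram):
    """Return the index of the bin holding the most gene pairs."""
    return max(histogram, key=histogram.get)


# Hypothetical Ks values for collinear gene pairs (illustration only).
toy_ks = [0.26, 0.27, 0.28, 0.30, 1.3, 1.5, 0.05, 0.0]
hist = log10_ks_histogram(toy_ks)
peak = modal_bin(hist)
print(hist, "peak bin starts at log10(Ks) =", round(peak * 0.1, 1))
```

In this toy input the four pairs with Ks between 0.26 and 0.30 fall into the same bin, so that bin is reported as the peak. With real data, a peak of this kind in an intraspecies comparison is what the captions interpret as a polyploidization event.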
MADS-box genes in Arabidopsis thaliana, Oryza sativa, Phalaenopsis equestris, K. obovata, and R. apiculata
| Category | A. thalianaᵃ | O. sativaᵇ | P. equestrisᶜ | K. obovata | R. apiculataᵈ |
|---|---|---|---|---|---|
| Type II (total) | 45 | 44 | 29 | 31 | 34 |
| MIKCc | 39 | 39 | 28 | 27 | 31 |
| MIKC* | 6 | 5 | 1 | 4 | 3 |
| Type I (total) | 61 | 31 | 22 | 12 | 31 |
| Mα | 25 | 12 | 10 | 6 | 19 |
| Mβ | 20 | 9 | 0 | 1 | 6 |
| Mγ | 16 | 10 | 12 | 5 | 6 |
| Total | 106 | 75 | 51 | 43 | 65 |
ᵃ The whole-genome sequence of A. thaliana was extracted from the NCBI database, BioProject: PRJNA477266 (ref. [14])
ᵇ The whole-genome sequence of O. sativa was extracted from rice.plantbiology.msu.edu/
ᶜ The whole-genome sequence of P. equestris was extracted from the NCBI database, BioProject: PRJNA192198 (ref. [15])
ᵈ The whole-genome sequence of R. apiculata was extracted from http://evolution.sysu.edu.cn/Sequences.html
Fig. 6 Phylogenetic analysis of MADS-box genes from A. thaliana, O. sativa, P. equestris, K. obovata, and R. apiculata.
a Phylogenetic tree of type I MADS-box genes. b Phylogenetic tree of type II MADS-box genes. The number on the left in parentheses represents the homologous MADS genes of K. obovata, and the number on the right represents the homologous MADS genes of R. apiculata. The bolded gene ID numbers beginning with “Ko” represent the gene IDs of K. obovata; those beginning with “Ra” represent the gene IDs of R. apiculata
Fig. 7 Phylogenetic reconstruction of the NLR proteins in K. obovata and R. apiculata.
The NBS domain of human apoptotic protease-activating factor-1 (APAF-1) is located at the root of the tree. The bolded gene ID numbers beginning with “Ko” represent the gene IDs of K. obovata; those beginning with “Ra” represent the gene IDs of R. apiculata