Yong-Min Kim, Seungill Kim, Namjin Koo, Ah-Young Shin, Seon-In Yeom, Eunyoung Seo, Seong-Jin Park, Won-Hee Kang, Myung-Shin Kim, Jieun Park, Insu Jang, Pan-Gyu Kim, Iksu Byeon, Min-Seo Kim, JinHyuk Choi, Gunhwan Ko, JiHye Hwang, Tae-Jin Yang, Sang-Bong Choi, Je Min Lee, Ki-Byung Lim, Jungho Lee, Ik-Young Choi, Beom-Seok Park, Suk-Yoon Kwon, Doil Choi, Ryan W Kim.
Abstract
Hibiscus syriacus (L.) (rose of Sharon) is one of the most widespread garden shrubs in the world. We report a draft of the H. syriacus genome comprising a 1.75 Gb assembly that covers 92% of the genome with only 1.7% (33 Mb) gap sequences. Gene modeling predicted 87,603 genes, most of them supported by deep RNA sequencing data. To define gene family distribution among relatives of H. syriacus, orthologous gene sets containing 164,660 genes in 21,472 clusters were identified by OrthoMCL analysis of five plant species: H. syriacus, Arabidopsis thaliana, Gossypium raimondii, Theobroma cacao and Amborella trichopoda. We inferred their evolutionary relationships based on divergence times among Malvaceae plant genes and found that gene families involved in flowering regulation and disease resistance were more highly divergent and expanded in H. syriacus than in its close relatives, G. raimondii (DD) and T. cacao. Clustered gene families and gene collinearity analysis revealed that two recent rounds of whole-genome duplication (WGD) were followed by diploidization of the H. syriacus genome after speciation. Copy number variation and phylogenetic divergence indicate that WGDs and subsequent diploidization led to unequal duplication and deletion of flowering-related genes in H. syriacus and may affect its unique floral morphology.
Keywords: Diploidization; Hibiscus syriacus; Homeolog; Multivoltinism; Whole Genome Duplication
Year: 2017; PMID: 28011721; PMCID: PMC5381346; DOI: 10.1093/dnares/dsw049
Source DB: PubMed; Journal: DNA Res; ISSN: 1340-2838; Impact factor: 4.458
Summary of H. syriacus genome assembly
| Statistic | Value |
|---|---|
| Number of scaffolds | 77,492 |
| Total length of scaffolds | 1,748 Mb |
| N50 of scaffolds | 140 kb |
| Longest (shortest) length of scaffolds | 1.54 Mb (500 bp) |
| Number of contigs | 172,672 |
| Total length of contigs | 1,715 Mb |
| N50 of contigs | 30.0 kb |
| Longest (shortest) length of contigs | 643 kb (87 bp) |
| Total length of gap sequences | 33 Mb (1.9%) |
| GC content | 34.04% |
| Total size of TEs | 1,095 Mb (57.6%) |
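
The assembly figures above rely on two simple calculations: the N50 statistic and percentages of a total length. The sketch below illustrates both. The `n50` helper and the toy scaffold lengths are hypothetical and are not the authors' pipeline. The second part uses only values printed in the abstract and the table. It suggests that the 1.9% gap figure is relative to the scaffold total, while the 1.7% gap figure in the abstract and the 57.6% TE figure are consistent with a denominator of about 1.9 Gb, the genome size implied by 92% coverage. The record does not state this, so treat the denominator as an assumption.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths, for illustration only (not from the paper).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)

# Values copied from the abstract and the assembly summary table (Mb).
scaffold_total_mb = 1748
gap_total_mb = 33
te_total_mb = 1095
assembly_coverage = 0.92  # "covers 92% of the genome"

# Gap share relative to the assembled scaffolds.
gap_pct_of_assembly = 100 * gap_total_mb / scaffold_total_mb

# Assumption: genome size estimated as assembly length / coverage.
estimated_genome_mb = scaffold_total_mb / assembly_coverage
gap_pct_of_genome = 100 * gap_total_mb / estimated_genome_mb
te_pct_of_genome = 100 * te_total_mb / estimated_genome_mb

print(f"toy N50: {toy_n50}")
print(f"gaps / assembly: {gap_pct_of_assembly:.1f}%")
print(f"gaps / estimated genome: {gap_pct_of_genome:.1f}%")
print(f"TEs / estimated genome: {te_pct_of_genome:.1f}%")
```

Sorting in descending order and accumulating until half of the total is reached is the usual way to compute N50. On real data the input would be the list of scaffold or contig lengths read from the assembly FASTA.
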
Statistics of H. syriacus gene models
| Species | Protein-coding loci | Total CDS length (bp) | Avg. CDS length (bp) | Avg. Exon length (bp) | Avg. Intron length (bp) |
|---|---|---|---|---|---|
| H. syriacus | 87,603 | 104,087,809 | 1,188 | 239 | 383 |
| T. cacao (a) | 28,798 | 33,494,538 | 1,857 | 231 | 502 |
| G. raimondii (b) | 40,976 | 45,237,504 | 1,104 | 244 | 339 |
| A. thaliana (c) | 27,206 | 24,861,465 | 1,212 | 265 | 164 |

(a) Cacao genome paper (ref. 19)
(b) Cotton genome paper (ref. 10)
(c) TAIR10 annotation (http://www.arabidopsis.org)
Figure 1. Distribution of orthologous gene families of H. syriacus, G. raimondii, T. cacao, A. trichopoda and A. thaliana, from which 169,570 sequences were clustered into 9,076 groups. The number of clustered groups and genes in each species are shown on the left and center, and total gene numbers are shown on the right.
Figure 2. Collinearity block detection and calculation of gene duplication times. (A) Collinearity blocks of the T. cacao genome were detected in G. raimondii and H. syriacus. (B) Calculation of divergence times of individual gene families. Circles and triangles indicate H. syriacus and G. raimondii, respectively, and shaded boxes indicate each WGD. (C) Divergence time of Malvales plants. H. syriacus diverged from the H. syriacus-G. raimondii common ancestor 22.28 MYA. Red (H. syriacus) and green (G. raimondii) stars indicate WGD and blue circles indicate diploidization events.
Figure 3. Phylogenetic tree of photoperiod/circadian clock genes. (A) The evolutionary history of these genes was inferred using the minimum evolution method. Blue (A. trichopoda), green (A. thaliana), pink (G. raimondii), orange (T. cacao) and red (H. syriacus) indicate genes from each species. (B) Expression patterns of photoperiod/circadian clock genes in petals, ovaries, roots, and leaves.
Comparison of flowering-time gene copy numbers
| Regulators | Copy number | ||||
|---|---|---|---|---|---|
| AT5G15840 | 6 | 1 | 1 | 0 | |
| AT2G40080 | 7 | 1 | 4 | 1 | |
| AT4G16280 | 2 | 1 | 1 | 1 | |
| AT1G68050 | 4 | 1 | 2 | 1 | |
| AT3G04610 | 3 | 1 | 2 | 1 | |
| AT1G65480 | 2 | 1 | 1 | 1 | |
| AT1G22770 | 5 | 1 | 2 | 0 | |
| AT5G61850 | 4 | 1 | 1 | 1 | |
| AT1G01060 | 7 | 1 | 3 | 1 | |
| AT5G57380 | 7 | 1 | 2 | 1 | |
| AT2G45660 | 4 | 1 | 2 | 1 | |
| AT5G03840 | 4 | 2 | 2 | 0 | |
| AT1G24260 | 3 | 2 | 2 | 0 | |
| AT1G09570 | 4 | 1 | 1 | 1 | |
| AT2G18790 | 4 | 0 | 1 | 1 | |
| AT5G35840 | 2 | 1 | 1 | 1 | |
| AT4G18130 | 4 | 1 | 0 | 0 | |
Comparative NBS-LRR gene family numbers
| Predicted domain | Class | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| TIR-NBS-LRR | TNL | 68 | 26 | 14 | 19 | 87 | 19 | 0 | 9 |
| TIR-NBS | TN | 9 | 1 | 3 | 6 | 17 | 4 | 2 | 2 |
| % on NBS genes | | 17 | 9 | 6 | 9 | 61 | 7 | 0.4 | 10 |
| CC-NBS-LRR | CNL | 183 | 220 | 202 | 116 | 52 | 138 | 337 | 27 |
| CC-NBS | CN | 77 | 24 | 25 | 37 | 3 | 19 | 104 | 27 |
| NBS-LRR | NL | 81 | 28 | 34 | 39 | 8 | 110 | 70 | 18 |
| NBS | N | 54 | 4 | 9 | 50 | 3 | 32 | 14 | 29 |
| % on NBS genes | | 84 | 91 | 94 | 91 | 39 | 93 | 99.6 | 90 |
| Total NBS genes | | 472 | 303 | 287 | 267 | 170 | 322 | 527 | 112 |
| % on total genes | | 0.53 | 0.81 | 0.97 | 0.77 | 0.63 | 1.22 | 1.35 | 0.41 |
| Total no. of genes | | 87,603 | 37,505 | 29,452 | 34,727 | 27,206 | 26,346 | 39,049 | 26,846 |
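
The summary rows of the NBS-LRR table follow from the class counts by simple arithmetic, which the sketch below reproduces. Because the species names are missing from the table header in this record, columns are addressed by position only. The variable names are illustrative, and the counts are copied from the table. Recomputed percentages can differ from the printed ones in the last digit, presumably because of rounding or truncation in the source.

```python
# Class counts per column, copied from the NBS-LRR table (column order as printed).
classes = {
    "TNL": [68, 26, 14, 19, 87, 19, 0, 9],
    "TN": [9, 1, 3, 6, 17, 4, 2, 2],
    "CNL": [183, 220, 202, 116, 52, 138, 337, 27],
    "CN": [77, 24, 25, 37, 3, 19, 104, 27],
    "NL": [81, 28, 34, 39, 8, 110, 70, 18],
    "N": [54, 4, 9, 50, 3, 32, 14, 29],
}
total_nbs = [472, 303, 287, 267, 170, 322, 527, 112]
total_genes = [87603, 37505, 29452, 34727, 27206, 26346, 39049, 26846]

# Sum of the six classes for each column; should match "Total NBS genes".
column_sums = [sum(column) for column in zip(*classes.values())]

# Share of TIR-type genes (TNL + TN) among all NBS genes, in percent.
tir_share = [
    100 * (tnl + tn) / total
    for tnl, tn, total in zip(classes["TNL"], classes["TN"], total_nbs)
]

# Share of NBS genes among all annotated genes, in percent.
nbs_share = [100 * nbs / genes for nbs, genes in zip(total_nbs, total_genes)]

for index, (tir, nbs) in enumerate(zip(tir_share, nbs_share), start=1):
    print(f"column {index}: TIR {tir:.1f}% of NBS genes, NBS {nbs:.2f}% of all genes")
```
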
Figure 4. Phylogenetic relationships of NBS-LRR genes with >80% bootstrap values. (A) Phylogenetic relationships of predicted NBS-LRR genes in H. syriacus. Red (TNL-A, TNL-B, CNL-5 and RPW-CNL subgroups) indicates expanded subgroups of the H. syriacus genome compared to other plant genomes (Supplementary Table S8). (B) Detailed phylogenetic relationships of expanded TNL subgroups are shown. Intact NB-ARC domains of H. syriacus (red), G. raimondii (green), T. cacao (blue), S. lycopersicum (orange), A. thaliana (light blue) and V. vinifera (purple) were used in the phylogenetic construction.