Ziwen He1, Zhang Zhang1, Wuxia Guo1, Ying Zhang2, Renchao Zhou1, Suhua Shi1.
Abstract
Nypa fruticans (Arecaceae) is the only monocot species among the true mangroves, and it represents the earliest mangrove in the fossil record. How N. fruticans adapts to the harsh and unstable intertidal zone is an open question, but the 60 gene segments deposited in NCBI are insufficient to address it. In this study, we sequenced, assembled and annotated the transcriptome of N. fruticans using next-generation sequencing technology. A total of 19,918,800 clean paired-end reads were de novo assembled into 45,368 unigenes with an N50 length of 1,096 bp. Of these unigenes, 41.35% were functionally annotated using Blast2GO. Many genes were annotated to "response to stress", and 15 putative positively selected genes were identified. Simple sequence repeats were identified and compared with those of other palms. The divergence time between N. fruticans and other palms was estimated at 75 million years ago using the genomic data, which is consistent with the fossil record. After calculating the synonymous substitution rate between paralogs, we found that two whole-genome duplication events were shared by N. fruticans and other palms. These duplication events provided a large amount of raw material for the more than 2,000 later speciation events in Arecaceae. This study provides a high-quality resource for further functional and evolutionary studies of N. fruticans and of palms in general.
Year: 2015 PMID: 26684618 PMCID: PMC4684314 DOI: 10.1371/journal.pone.0145385
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Distributions of the mean coverage of unigenes (a) and the length of unigenes and contigs (b).
Summary statistics of assembly and annotation for Nypa fruticans.
| Statistic | Value |
|---|---|
| Assembly | |
| Total number of raw reads | 24,607,427×2 |
| Total number of clean reads | 19,918,800×2 |
| Total number of contigs | 51,702 |
| Unigenes (contigs after removing redundancy) | 45,368 |
| Mean length of unigenes (bp) | 722 |
| Median length of unigenes (bp) | 460 |
| N50 value of unigenes (bp) | 1,096 |
| Longest unigene (bp) | 9,795 |
| GC content | 47.7% |
| Annotation | |
| NR-blast | 32,260 (71.11%) |
| InterProScan | 24,650 (54.33%) |
| Blast2GO | 18,761 (41.35%) |
| KEGG pathway | 2,956 (6.52%) |
| SSR | |
| Di-nucleotide motifs | 4,436 (55.74%) |
| Tri-nucleotide motifs | 3,323 (41.76%) |
| Tetra-nucleotide motifs | 174 (2.19%) |
| Penta-nucleotide motifs | 10 (0.13%) |
| Hexa-nucleotide motifs | 15 (0.19%) |
The N50 value indicates that 50% of the entire assembly is contained in sequences equal to or larger than this value. NR: NCBI non-redundant protein database, InterProScan: a protein signature recognition tool, KEGG: Kyoto Encyclopedia of Genes and Genomes, SSR: simple sequence repeat.
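The N50 definition in the note above can be made concrete with a short calculation. The following is a minimal sketch, not the assembler's own implementation: the function name `n50` and the toy length list are assumptions introduced for illustration, and the real unigene lengths are not reproduced here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2            # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not data from the study.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Because long sequences dominate the cumulative sum, N50 typically exceeds the median length, as it does in the table above (1,096 bp against a median of 460 bp).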
Fig 2. GO functional classification of N. fruticans.
GO functional classification (level 2) of the annotated 18,761 unigenes.
Putative positively selected genes in N. fruticans.
| Gene of N. fruticans | Orthologous gene in O. sativa | Orthologous gene in A. thaliana | Annotations |
|---|---|---|---|
| Nfr_c12132_g1_i1 | LOC_Os06g36170.1 | AT2G40570 | initiator tRNA phosphoribosyl transferase family protein, putative, expressed |
| Nfr_CL1573Contig1 | LOC_Os01g14610.2 | AT3G12530 | PSF2—putative GINS complex subunit, expressed |
| Nfr_CL1871Contig1 | LOC_Os01g64990.1 | -- | GPI transamidase subunit PIG-U domain containing protein, expressed |
| Nfr_c17518_g1_i1 | LOC_Os09g25950.1 | AT4G28020 | regulator protein, putative, expressed |
| Nfr_c32630_g1_i1 | LOC_Os01g67100.1 | AT1G65470 | expressed protein |
| Nfr_c20436_g1_i2 | LOC_Os01g46580.1 | AT1G30825 | actin-related protein 2/3 complex subunit 2, putative, expressed |
| Nfr_c22406_g1_i2 | LOC_Os08g40590.2 | AT4G25850, AT4G25860, AT5G57240 | oxysterol-binding protein, putative, expressed |
| Nfr_c20352_g2_i1 | LOC_Os11g30560.1 | AT5G50600, AT5G50700 | dehydrogenase/reductase, putative, expressed |
| Nfr_c20446_g3_i1 | LOC_Os02g03220.1 | AT4G00800 | protein-binding protein, putative, expressed |
| Nfr_c21523_g1_i2 | LOC_Os07g17400.1 | AT5G58787, AT5G01520, AT1G24440 | zinc finger, RING-type, putative, expressed |
| Nfr_c19525_g1_i1 | LOC_Os05g41100.1 | AT3G24080 | protein kri1, putative, expressed |
| Nfr_CL39Contig1 | LOC_Os02g46750.1 | AT5G16810 | expressed protein |
| Nfr_c19922_g2_i1 | LOC_Os02g51480.1 | AT4G39620 | PPR repeat domain containing protein, putative, expressed |
| Nfr_c22757_g2_i2 | LOC_Os06g06370.1 | AT1G08030 | expressed protein |
| Nfr_CL325Contig1 | LOC_Os11g43610.1 | AT4G02485 | oxidoreductase, 2OG-Fe oxygenase family protein, putative, expressed |
The positively selected genes were identified using the improved branch-site model. The Benjamini-Hochberg correction for multiple testing was used (FDR < 0.05).
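The Benjamini-Hochberg step mentioned in the note can be sketched as follows. This is a generic illustration of the step-up procedure at FDR < 0.05, not the authors' pipeline: the function name `benjamini_hochberg` and the example p-values are assumptions, since the per-gene p-values from the branch-site tests are not given here.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which tests are significant.

    Step-up procedure: sort the m p-values, find the largest rank k
    with p_(k) <= (k / m) * fdr, and reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank  # keep the largest rank that passes
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical p-values for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_calls = benjamini_hochberg(example_p, fdr=0.05)
```

In this toy input only the two smallest p-values pass, although five are below 0.05 when taken individually, which is the kind of filtering a multiple-testing correction applies when many genes are tested at once.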
a These genes were functionally described in detail in the main text.
Fig 3. Frequency of different SSR motif types.
Fig 4. Phylogenetic analysis and divergence time estimation.
(A) The phylogenetic tree of the five monocots. All nodes received 100% support in the bootstrap analysis with 1,000 replicates. (B) Estimation of divergence time. Blue bars indicate 95% confidence intervals.
Fig 5. Whole-genome duplication events of four palms.
(A) The distributions of the synonymous substitution rate (Ks) of paralogs are shown in red for N. fruticans and in blue for P. dactylifera. Peaks I and II indicate two whole-genome duplication events in both species. The yellow bars show the Ks distribution of orthologs between the two species. Peak III indicates that the two palms diverged close in time to the more recent duplication event (peak II). (B) The Ks distributions of paralogs of E. guineensis and C. nucifera. The same two whole-genome duplication events are evident in these two species.