Ya-Yi Huang, Chueh-Pai Lee, Jason L Fu, Bill Chia-Han Chang, Antonius J M Matzke, Marjori Matzke.
Abstract
Coconut palm (Cocos nucifera) is a symbol of the tropics and a source of numerous edible and nonedible products of economic value. Despite its nutritional and industrial significance, coconut remains under-represented in public repositories for genomic and transcriptomic data. We report de novo transcript assembly from RNA-seq data and analysis of gene expression in seed tissues (embryo and endosperm) and leaves of a dwarf coconut variety. Assembly of 10 GB of sequencing data for each tissue resulted in 58,211 total unigenes in embryo, 61,152 in endosperm, and 33,446 in leaf. Within each unigene pool, 24,857 could be annotated in embryo, 29,731 in endosperm, and 26,064 in leaf. A KEGG analysis identified 138, 138, and 139 pathways, respectively, in transcriptomes of embryo, endosperm, and leaf tissues. Given the extraordinarily large size of coconut seeds and the importance of small RNA-mediated epigenetic regulation during seed development in model plants, we used homology searches to identify putative homologs of factors required for RNA-directed DNA methylation in coconut. The findings suggest that RNA-directed DNA methylation is important during coconut seed development, particularly in maturing endosperm. This dataset will expand the genomics resources available for coconut and provide a foundation for more detailed analyses that may assist molecular breeding strategies aimed at improving this major tropical crop.
Keywords: RNA-seq; coconut; endosperm; epigenetics; monocot
Year: 2014 PMID: 25193496 PMCID: PMC4232540 DOI: 10.1534/g3.114.013409
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Putative RdDM factors and related RNAi proteins identified in coconut tissues
| Name | Transcript ID | Length (nt) | Length (aa) | BLASTX best hit length (aa) | Coverage (%) | Identity (%) | Accession |
|---|---|---|---|---|---|---|---|
| RPA1 | Endo_Locus_19151 | 5417 | 1661 | 1748 | 95 | 56 | KJ851186 |
| RPB1 | Leaf_211 | 5318 | 1536 | 1855 | 86 | 94 | KJ851187 |
| NRPD1 | Leaf_Locus_3193 | 1712 | 568 | 1386 | 99 | 57 | KJ851188 |
| NRPE1 | Endo_Locus_10009 | 5244 | 1672 | 2017 | 95 | 52 | KJ851189 |
| DCL1 | Endo_Locus_3719 | 5490 | 1534 | 1971 | 93 | 85 | KJ851190 |
| DCL3a | Endo_Locus_25736 | 3777 | 1206 | 1648 | 95 | 61 | KJ851191 |
| DCL3b | Endo_Locus_6769 | 3715 | 966 | 1116 | 78 | 43 | KJ851192 |
| DCL4 | Embryo_Locus_7881 | 5576 | 1617 | 1632 | 86 | 63 | KJ851193 |
| AGO1 | Endo_Locus_673 | 3410 | 864 | 1085 | 76 | 89 | KJ851194 |
| AGO2 | Embryo_Locus_4735 | 918 | 251 | 1034 | 82 | 46 | KJ851195 |
| AGO4 | Endo_Locus_25417 | 2258 | 659 | 913 | 87 | 77 | KJ851196 |
| AGO10 | Embryo_Locus_15517 | 1079 | 313 | 995 | 31 | 48 | KJ851197 |
| RDR1 | Endo_Locus_54052 | 406 | 134 | 1121 | 99 | 74 | KJ851198 |
| RDR2 | Embryo_Locus_43307 | 2041 | 616 | 1127 | 90 | 67 | KJ851199 |
| RDR5 | Leaf_Locus_25001 | 1809 | 590 | 899 | 97 | 60 | KJ851200 |
| RDR6 | Endo_Locus_13061 | 3140 | 711 | 1197 | 75 | 34 | KJ851201 |
| MET1 | Endo_Locus_26121 | 4775 | 1518 | 1543 | 98 | 97 | KJ851202 |
| DRM | Leaf_Locus_2627 | 2022 | 554 | 591 | 82 | 86 | KJ851203 |
| CMT | Endo_Locus_4191 | 2245 | 603 | 925 | 94 | 89 | KJ851204 |
| ROS1 | Endo_Locus_23310 | 620 | 108 | 1573 | 52 | 67 | KJ851205 |
| DRD1 | Endo_Locus_31250 | 445 | 93 | 899 | 62 | 62 | KJ851206 |
Note: In a search of the list of date palm expressed genes (Supplementary Data 1, Al-Mssallem et al. 2013), we found one RPA1 (KacstDP.mRNA.S000004.203), three AGO proteins (KacstDP.mRNA.S000249.22, KacstDP.mRNA.S001251.1 and KacstDP.mRNA.S000009.162) and one RDR2 (KacstDP.mRNA.S000670.5). In a BLAST search of unannotated, de novo assembled oil palm transcripts (Supplemental Data 1, 2 and 3, Dussert et al. 2013), we identified four RPB1 (CL1714Contig3, CLContig342, CL1Contig406 and CL1Contig7380), one NRPE1 (CL1715Contig1), one DCL3b (CL1841Contig3), five AGO1 (CL1Contig1334, CL1Contig4087, CL1Contig7749, CL1Contig2199 and CL1721Contig1), four AGO4 (CL1Contig2911, CL1Contig7017, CL665Contig1 and CL665Contig3), one RDR1 (CL1Contig877), one DRM (CL1Contig4186) and one DRD1 (CL3348Contig3).
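One rough way to read the table of putative RdDM factors is to compare each coconut predicted protein length with the length of its best BLASTX hit, since a low ratio hints that the assembled transcript may be partial. The sketch below uses a subset of rows from the table; the 0.5 cutoff and the helper name `length_ratio` are illustrative assumptions, not part of the original analysis.

```python
# Subset of rows from the table of putative RdDM factors:
# name -> (coconut predicted protein length, best BLASTX hit length), in amino acids.
rows = {
    "RPA1": (1661, 1748),
    "MET1": (1518, 1543),
    "DRM": (554, 591),
    "AGO2": (251, 1034),
    "AGO10": (313, 995),
    "ROS1": (108, 1573),
    "DRD1": (93, 899),
}


def length_ratio(query_aa, hit_aa):
    """Fraction of the best hit's length recovered by the coconut sequence."""
    return query_aa / hit_aa


CUTOFF = 0.5  # hypothetical threshold, chosen only for illustration

# Names whose predicted protein is shorter than half of the best hit.
likely_partial = sorted(
    name for name, (query, hit) in rows.items() if length_ratio(query, hit) < CUTOFF
)

for name, (query, hit) in rows.items():
    print(f"{name}: {length_ratio(query, hit):.2f}")
print("Possibly partial:", likely_partial)
```

A ratio near 1 (as for MET1) is consistent with a near full-length transcript, whereas short fragments such as DRD1 would need further assembly or cloning before any functional comparison.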
Sequencing results and de novo assembly
| Parameter | Embryo | Endosperm | Leaf |
|---|---|---|---|
| Sequencing results | | | |
| Total Illumina reads (No) | 81,128,552 | 103,080,366 | 121,151,552 |
| Average read length (bp) | 168.5 | 101 | 126 |
| Total base (No) | 13,670,161,012 | 10,411,116,966 | 15,265,095,552 |
| Total reads after QT (No) | 78,553,435 | 99,253,456 | 120,061,360 |
| Average read length after QT (bp) | 158.3 | 97.9 | 120.36 |
| Total clean base (No) | 12,435,008,760 | 9,682,020,834 | 13,609,335,965 |
| Insert size (bp) | 430 | 306 | 286 |
| Total transcripts (No) | 86,254 | 229,866 | 159,509 |
| Total unigenes (No) | 58,211 | 61,152 | 33,446 |
| Average contig length of unigene (bp) | 732 | 684 | 744 |
| Unigenes with multiple hits (No) | 24,857 | 29,731 | 26,064 |
| Unigenes with unique hits (No) | 23,836 | 22,278 | 20,844 |
| N50 | 951 | 969 | 912 |
| GC content of unigene (%) | 41 | 45 | 48 |
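The raw sequencing figures in the table can be cross-checked with simple arithmetic: total bases should equal the number of reads multiplied by the average read length. The sketch below performs that check and also computes the fraction of reads retained after QT (assumed here to mean quality trimming). All numbers are copied from the table; the function names are illustrative.

```python
# Values copied from the sequencing table (raw reads and reads after QT).
samples = {
    "embryo": {"reads": 81_128_552, "mean_len": 168.5,
               "bases": 13_670_161_012, "reads_qt": 78_553_435},
    "endosperm": {"reads": 103_080_366, "mean_len": 101,
                  "bases": 10_411_116_966, "reads_qt": 99_253_456},
    "leaf": {"reads": 121_151_552, "mean_len": 126,
             "bases": 15_265_095_552, "reads_qt": 120_061_360},
}


def expected_bases(reads, mean_len):
    """Total bases implied by read count and average read length."""
    return round(reads * mean_len)


def retained_fraction(reads, reads_qt):
    """Share of raw reads that remain after QT."""
    return reads_qt / reads


report = {}
for tissue, s in samples.items():
    report[tissue] = {
        "bases_match": expected_bases(s["reads"], s["mean_len"]) == s["bases"],
        "retained": retained_fraction(s["reads"], s["reads_qt"]),
    }
    print(tissue, report[tissue]["bases_match"], f"{report[tissue]['retained']:.3f}")
```

The same product does not reproduce the post-QT base counts exactly for endosperm and leaf, presumably because the reported post-QT average read lengths are rounded or computed differently, so the check is applied to the raw values only.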
Figure 1. Venn diagram showing unique and shared unigenes found in three coconut transcriptomes.
Figure 2. Expression profile of top 50 expressed genes in the three tissues. The colors denote absence (white) and presence (red) of a particular gene transcript. Photosynthetic genes are almost exclusively found in the leaf transcriptome. Seed storage (7S globulin) and heat shock proteins are prominent in the embryo. Translational and cytoskeleton proteins are abundant in embryo and in endosperm, but rarely found in leaf. Cell wall–associated hydrolase and major intrinsic proteins are evenly distributed across the three tissues. Uncharacterized proteins exist in all three tissues, but unigenes without matched sequences in GenBank are found only in embryo and in leaf, not in endosperm.
Top expressed genes identified in three tissues of coconut
| Embryo: transcript annotation | Embryo: FPKM | Endosperm: transcript annotation | Endosperm: FPKM | Leaf: transcript annotation | Leaf: FPKM |
|---|---|---|---|---|---|
| Metallothionein type 2a-FL | 12328.2 | Alpha-tubulin | 11020.0 | Chloroplast RubisCO small subunit | 71647.2 |
| 7S globulin | 8069.0 | Dehydration responsive protein | 7930.5 | Os08g0560900 | 29745.4 |
| Aldose reductase-like protein | 2412.7 | Unnamed protein product | 6377.5 | Ubiquitin | 17929.8 |
| Long chain acyl-CoA synthetase 4-like | 2410.7 | Elongation factor 1-alpha | 5994.5 | Mitochondrial protein | 17284.7 |
| Cell wall–associated hydrolase | 1910.6 | Translationally controlled tumor protein | 5012.9 | ASCAB9 | 13504.6 |
| GA-stimulated transcript-like protein 6 | 1817.0 | Cell wall–associated hydrolase | 4972.3 | Chloroplast chlorophyll a/b binding protein | 12838.5 |
| Zn3H1 domain-containing protein 49-like | 1391.0 | SORBIDRAFT_04g032970 | 4285.5 | Photosystem I reaction center subunit XI | 12695.6 |
| 1-Cys peroxiredoxin | 1322.4 | Sorbitol dehydrogenase-like protein | 4042.6 | No annotation | 8909.8 |
| Eukaryotic translation initiation factor 1A-like | 1316.9 | RRNA intron-encoded homing endonuclease | 3847.1 | Unknown | 8812.3 |
| OsI_08334 | 1304.3 | Ribosomal protein L32 | 3650.3 | Early light-induced protein 2 | 8496.1 |
| AKIN beta1 | 1239.4 | Polyubiquitin | 3581.0 | Unnamed protein product | 6946.9 |
| Heat shock protein 17a | 1179.5 | Metallothionein type 2a-FL | 3522.5 | Oxygen-evolving enhancer protein 2 | 5528.4 |
| Tonoplast intrinsic protein | 1175.2 | Early nodulin 55-2 precursor | 3063.2 | Predicted protein | 5031.4 |
| Actin | 1089.8 | Unknown | 2991.1 | POPTRDRAFT_726168 | 5019.5 |
| ZEAMMB73_780902 | 1013.8 | Thiazole biosynthetic enzyme | 2620.5 | Glycine hydroxymethyltransferase | 4332.1 |
| No annotation | 974.0 | Predicted protein | 2568.3 | Cell wall–associated hydrolase | 4316.6 |
| Aldose reductase isoform 1 | 963.7 | Glutaredoxin-1 | 2499.4 | Lipid transfer protein | 4310.7 |
| ZEAMMB73_726804 | 881.6 | Annexin | 2453.6 | Probable histone H2A.4 isoform 1 | 4241.2 |
| ZEAMMB73_749085 | 857.3 | SORBIDRAFT_01g005010 | 2298.0 | LOC100788142 | 4080.4 |
| Histidine decarboxylase | 820.8 | Thioredoxin h1 | 2243.8 | MTR_5g051050 | 3984.3 |
| OsI_32485 | 764.1 | 1-Cys peroxiredoxin | 2138.4 | 40S ribosomal protein s2 | 3525.6 |
| Unnamed protein product | 736.0 | Enolase | 2113.7 | Senescence-associated protein 4 | 3431.4 |
| 60S ribosomal protein L7-like | 693.9 | Calmodulin | 2091.7 | Glycolate oxidase | 3425.2 |
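FPKM (fragments per kilobase of transcript per million mapped fragments) normalizes raw counts for both transcript length and sequencing depth, which is what makes values comparable across the three tissues. A minimal sketch of the standard definition follows; the input numbers are made up for illustration, and the source does not state which tool or settings produced the reported values.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical inputs: 500 fragments on a 1,000 bp transcript,
# in a library with 10 million mapped fragments.
example = fpkm(fragments=500, transcript_length_bp=1_000,
               total_mapped_fragments=10_000_000)
print(example)
```

Because the denominator includes transcript length, a long transcript needs proportionally more fragments than a short one to reach the same FPKM.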
Figure 3. Species distribution of coconut transcripts (FPKM >1.00) resulting from de novo assembly. Sections <2% are not labeled.
Figure 4. Analysis of GO enrichment at level eight.
Figure 5. Relative abundance of RdDM-associated gene transcripts found in three tissues of coconut.
Figure 6. (A) Relative quantity of four RdDM-related genes (DRM, NRPD1, NRPE1, and MET1) in three different developmental stages of endosperm. (B) Relative quantity of the same four genes in two different developmental stages of embryo.