Rosario Carmona, Adoración Zafra, Pedro Seoane, Antonio J Castro, Darío Guerrero-Fernández, Trinidad Castillo-Castillo, Ana Medina-García, Francisco M Cánovas, José F Aldana-Montes, Ismael Navas-Delgado, Juan de Dios Alché, M Gonzalo Claros.
Abstract
Plant reproductive transcriptomes have been analyzed in different species because of the agronomical and biotechnological importance of plant reproduction. Here we present an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and from leaf and root as control vegetative tissues (http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads and 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from the TAIR and RefSeq databases to enable Linked Data integration. The result is a reproductive transcriptome comprising 72,846 contigs with an average length of 686 bp, of which 63,965 (87.8%) include at least one functional annotation and 55,356 (75.9%) have an ortholog. A minimum of 23,568 different TTs was identified, and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues (as controls) were also constructed. ReprOlive provides free access to these results and allows them to be downloaded. Retrieval mechanisms for sequences and transcript annotations are provided, and annotated enzymes can be located graphically in KEGG pathways. Finally, ReprOlive includes a semantic conceptualisation by means of the Resource Description Framework (RDF), allowing Linked Data searches that extract the most up-to-date information related to enzymes, interactions, allergens, structures, and reactive oxygen species.
Keywords: annotation; database; olive; pistil; pollen; reproduction; transcriptome
Year: 2015 PMID: 26322066 PMCID: PMC4531244 DOI: 10.3389/fpls.2015.00625
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Gene libraries used in ReprOlive.
| Gene library | Tissue | Developmental stage | Sequencing method | Raw reads | Useful reads |
|---|---|---|---|---|---|
| PM-Subs | Pollen | Mature | Sanger | 666 | 518 |
| PM | Pollen | Mature | Pyrosequencing | 216,497 | 111,242 |
| PG1 | Pollen | 1 h germination | Pyrosequencing | 258,167 | 141,232 |
| PG5 | Pollen | 5 h germination | Pyrosequencing | 233,921 | 120,276 |
| S2 | Pistil | Stage 2 | Pyrosequencing | 257,813 | 138,077 |
| S3 | Pistil | Stage 3 | Pyrosequencing | 247,401 | 141,903 |
| S4 | Pistil | Stage 4 | Pyrosequencing | 262,269 | 149,929 |
| S4-Subs | Pistil | Stage 4 | Sanger | 480 | 256 |
| L | Leaf | Mature | Pyrosequencing | 223,399 | 41,178 |
| L-Subs | Leaf | Mature | Sanger | 403 | 251 |
| R1 | Root | Mature | Pyrosequencing | 231,237 | 25,899 |
| R2 | Root | Radicle | Pyrosequencing | 145,204 | 22,075 |
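The per-tissue totals of useful reads reported in the transcriptome features table can be reproduced by summing the libraries listed above. The following is a minimal sketch of that bookkeeping, not part of the ReprOlive workflow itself. The names `LIBRARIES`, `useful_reads_by_group`, and `useful_fraction` are illustrative, and grouping leaf and root libraries as "Vegetative" is an assumption based on the abstract's description of them as control vegetative tissues.

```python
# Read counts copied from the gene library table above:
# (library, tissue, raw_reads, useful_reads)
LIBRARIES = [
    ("PM-Subs", "Pollen", 666, 518),
    ("PM", "Pollen", 216_497, 111_242),
    ("PG1", "Pollen", 258_167, 141_232),
    ("PG5", "Pollen", 233_921, 120_276),
    ("S2", "Pistil", 257_813, 138_077),
    ("S3", "Pistil", 247_401, 141_903),
    ("S4", "Pistil", 262_269, 149_929),
    ("S4-Subs", "Pistil", 480, 256),
    ("L", "Leaf", 223_399, 41_178),
    ("L-Subs", "Leaf", 403, 251),
    ("R1", "Root", 231_237, 25_899),
    ("R2", "Root", 145_204, 22_075),
]


def useful_reads_by_group(libraries):
    """Sum useful reads per transcriptome group.

    Assumption: leaf and root libraries form the 'Vegetative' group,
    and 'Reproductive' is pollen plus pistil.
    """
    totals = {"Pollen": 0, "Pistil": 0, "Vegetative": 0}
    for _name, tissue, _raw, useful in libraries:
        group = tissue if tissue in ("Pollen", "Pistil") else "Vegetative"
        totals[group] += useful
    totals["Reproductive"] = totals["Pollen"] + totals["Pistil"]
    return totals


def useful_fraction(libraries):
    """Return the fraction of raw reads kept as useful, per library."""
    return {name: useful / raw for name, _tissue, raw, useful in libraries}


if __name__ == "__main__":
    for group, total in useful_reads_by_group(LIBRARIES).items():
        print(f"{group}: {total:,} useful reads")
    for name, fraction in useful_fraction(LIBRARIES).items():
        print(f"{name}: {fraction:.1%} of raw reads retained")
```

Run on the table values, the group totals come out as 373,268 (pollen), 430,165 (pistil), 89,403 (vegetative), and 803,433 (reproductive), which agrees with the "Number of useful reads" row of the transcriptome features table. The per-library fractions also show that the leaf and root pyrosequencing libraries retained a much smaller share of their raw reads than the pollen and pistil libraries.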
Main features of transcriptomes in ReprOlive based on Full-LengtherNext analyses.
| Feature | Pollen | Pistil | Vegetative | Reproductive |
|---|---|---|---|---|
| Number of useful reads | 373,268 | 430,165 | 89,403 | 803,433 |
| Mean read length (nt) | 383 | 385 | 545 | 384 |
| Number of MIRA3 contigs | 54,754 | 73,823 | 42,310 | 116,298 |
| Number of Euler-SR contigs | 4,807 | 15,216 | 490 | 16,211 |
| Number of contigs after CAP3 reconciliation | 28,094 | 60,964 | 39,425 | 73,589 |
| Number of TTs without chimeras and artifacts∗ | 27,823 | 60,400 | 38,919 | 72,846 |
| Mean TT length (nt) | 608 | 678 | 664 | 686 |
| N50 (nt) | 661 | 780 | 683 | 798 |
| Longest TT (nt) | 7,016 | 7,757 | 2,865 | 7,950 |
| Number of ncRNAs | 31 | 17 | 265 | 45 |
| Number of TTs with annotation | 24,861 | 54,129 | 36,700 | 63,965 |
| Number of TTs with ortholog | 21,607 | 46,910 | 32,076 | 55,356 |
| With unique ortholog IDs | 11,672 | 21,326 | 15,003 | 23,568 |
| With ortholog from RefSeq | 21,233 | 46,924 | 31,945 | 54,890 |
| Unique RefSeq IDs | 9,769 | 16,565 | 12,489 | 17,612 |
| With ortholog from TAIR10 | 21,312 | 47,038 | 31,980 | 55,067 |
| Unique TAIR10 IDs | 8,922 | 14,656 | 11,247 | 15,503 |
| Number of TTs coding a complete protein | 2,809 | 7,137 | 3,559 | 9,157 |
| Unique, complete proteins | 1,976 | 4,822 | 2,220 | 5,835 |
| Number of TTs without ortholog | 6,185 | 13,473 | 6,578 | 17,445 |
| Likely coding for a complete protein | 170 | 446 | 242 | 628 |
| Likely coding for an incomplete protein | 2,610 | 5,312 | 2,523 | 6,486 |
| Number of representative TTs | 13,589 | 25,720 | 17,340 | 28,972 |
| With ortholog from RefSeq | 10,878 | 20,612 | 14,576 | 22,565 |
| Unique RefSeq IDs | 8,281 | 13,901 | 10,349 | 14,706 |
| With ortholog from TAIR10 | 10,900 | 20,658 | 14,581 | 22,638 |
| Unique TAIR10 IDs | 7,842 | 12,883 | 9,756 | 13,584 |
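Two of the summary figures above lend themselves to a short worked example: the N50 statistic and the share of TTs carrying an annotation. The sketch below uses the common definition of N50 (the length at which sequences of that length or longer cover at least half of all assembled bases). The source does not state how Full-LengtherNext computes N50, so this is an assumption, and the function names `n50` and `percent` as well as the toy length list are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    they cover at least half of the total number of bases; the length of
    the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


if __name__ == "__main__":
    # Toy lengths for illustration only; they are not ReprOlive data.
    toy_lengths = [100, 200, 300, 400, 500]
    print("N50 of toy lengths:", n50(toy_lengths))

    # Reproductive transcriptome counts taken from the table above.
    total_tts = 72_846
    annotated_tts = 63_965
    print(f"Annotated TTs: {percent(annotated_tts, total_tts):.1f}%")
```

For the toy lengths, the total is 1,500 bases, so the 500 nt and 400 nt sequences together (900 bases) are the first to cover half of it and the N50 is 400. With the reproductive transcriptome counts, the annotated share evaluates to 87.8%, the figure quoted in the abstract.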