Fitri Indriani, Ulfah J Siregar, Deden D Matra, Iskandar Z Siregar.
Abstract
Shorea balangeran Burk, locally known as balangeran, has been widely used as a recommended species for tropical peat swamp forest restoration because of its ability to grow in both waterlogged and dry areas. However, information on the genetic basis of its adaptation to varying ecological conditions is limited, and no transcriptome study has been reported in this context. Here we report two sets of transcriptome data, from leaf and basal stem samples taken from seedlings grown in potted media containing peat and mineral soil. The raw reads are stored in the DDBJ platform under accession number DRA008633.
Keywords: Adaptation; RNA-seq; Shorea balangeran; Transcriptome
Year: 2019 PMID: 32226802 PMCID: PMC7093797 DOI: 10.1016/j.dib.2019.104998
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
The properties of reads and assembled sequences of balangeran.
| Features | Leaf | Basal Stem | Merged |
|---|---|---|---|
| Reads | | | |
| Number of reads | 64,101,942 | 56,537,051 | 120,638,993 |
| Number of bases | 9,615,291,300 | 8,480,557,650 | 18,095,848,950 |
| Number of post-trimming reads | 62,400,243 | 54,917,915 | 117,318,158 |
| Number of post-trimming bases | 9,360,036,450 | 8,237,687,250 | 17,597,723,700 |
| Transcripts | | | |
| Number of transcripts | 279,598 | 574,875 | – |
| Number of bases | 175,610,736 | 342,696,076 | – |
| Length range (bp) | 201-16,510 | 201-16,960 | – |
| Average length (bp) | 628.08 | 596.12 | – |
| N50 (bp) | 940 | 839 | – |
| GC content (%) | 42.28 | 45.56 | – |
| Contigs | | | |
| Number of contigs | 187,297 | 440,665 | 180,291 |
| Number of bases | 118,677,247 | 252,486,917 | 197,305,352 |
| Length range (bp) | 201-16,510 | 201-16,960 | 201-17,014 |
| Average length (bp) | 633.63 | 572.97 | 1,094.37 |
| N50 (bp) | 918 | 762 | 1,489 |
| GC content (%) | 42.6 | 46.2 | 44.3 |
Transcripts were constructed with the Trinity program.
Contigs were constructed with the CAP3, cd-hit-est, and Corset programs (Corset only for the merged contigs).
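The summary statistics reported above (average length, N50, GC content) can be recomputed from any set of assembled sequences. The following is a minimal sketch of the standard definitions, not the authors' pipeline; the function names `n50`, `gc_percent`, and `summarize` are hypothetical and the input is assumed to be a plain list of nucleotide strings.

```python
from typing import Dict, Iterable, List


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


def gc_percent(sequences: Iterable[str]) -> float:
    """Return the GC content (%) pooled over all sequences."""
    gc = total = 0
    for seq in sequences:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        total += len(upper)
    return 100.0 * gc / total if total else 0.0


def summarize(sequences: List[str]) -> Dict[str, float]:
    """Collect the per-assembly features listed in the table."""
    lengths = [len(s) for s in sequences]
    total_bases = sum(lengths)
    return {
        "count": len(lengths),
        "bases": total_bases,
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "average": total_bases / len(lengths) if lengths else 0.0,
        "n50": n50(lengths),
        "gc_percent": gc_percent(sequences),
    }


if __name__ == "__main__":
    # Toy sequences only; real input would be parsed from a FASTA file.
    toy = ["GGCC", "AATT", "GCGCAT"]
    print(summarize(toy))
```

As a consistency check on the table, the average length is the number of bases divided by the number of sequences: for the leaf transcripts, 175,610,736 / 279,598 rounds to 628.08 bp, matching the reported value.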
Functional annotation of balangeran contigs using several databases.
| Database Source | Number (percentage) |
|---|---|
| Contig Number | 180,291 |
| Non-redundant protein (nr) NCBI | 113,998 (63.62) |
| Non-redundant Nucleotide (nt) NCBI | 53,407 (29.62) |
| Swiss-Prot UniProt | 78,407 (43.49) |
| TrEMBL UniProt | 90,875 (50.40) |
Open Reading Frames (ORFs) prediction characteristics of balangeran contigs using TransDecoder.
| Features | Contigs Number (percentage) |
|---|---|
| ORF contig | 130,314 |
| ORF type: | |
| a. 5prime_partial | 31,209 (23.95) |
| b. 3prime_partial | 17,633 (13.53) |
| c. Internal | 17,104 (13.13) |
| d. Complete | 64,374 (49.40) |
Number and motif types of microsatellites in balangeran contigs.
| Motifs | Leaf, contigs (%) | Basal Stem, contigs (%) | Merged, contigs (%) |
|---|---|---|---|
| Mononucleotide | 26,259 (72.93) | 48,786 (68.83) | 44,626 (70.30) |
| Dinucleotide | 3,939 (10.94) | 6,943 (9.80) | 6,270 (9.88) |
| Trinucleotide | 5,192 (14.42) | 13,443 (18.97) | 11,160 (17.58) |
| Tetranucleotide | 421 (1.17) | 1,221 (1.72) | 995 (1.57) |
| Pentanucleotide | 142 (0.39) | 292 (0.41) | 267 (0.42) |
| Hexanucleotide | 54 (0.15) | 193 (0.27) | 164 (0.26) |
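The motif classes in the table correspond to the length of the repeated unit (1 to 6 bp). The source does not state which tool or thresholds were used to detect the microsatellites, so the following is only an illustrative sketch: the minimum repeat counts in `MIN_REPEATS` are assumptions, and `find_ssrs` and `is_primitive` are hypothetical helper names.

```python
import re
from collections import Counter

# Assumed minimum number of repeats per motif length (illustrative only).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

CLASS_NAMES = {
    1: "Mononucleotide",
    2: "Dinucleotide",
    3: "Trinucleotide",
    4: "Tetranucleotide",
    5: "Pentanucleotide",
    6: "Hexanucleotide",
}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit
    (for example, 'AA' or 'ATAT' are not primitive)."""
    n = len(motif)
    for size in range(1, n):
        if n % size == 0 and motif[:size] * (n // size) == motif:
            return False
    return True


def find_ssrs(sequence):
    """Return (class name, motif, start, repeat count) for each perfect
    tandem repeat that meets the assumed thresholds."""
    seq = sequence.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                repeats = len(match.group(0)) // k
                hits.append((CLASS_NAMES[k], motif, match.start(), repeats))
    return hits


if __name__ == "__main__":
    # Toy contig containing one mono-, one di-, and one trinucleotide repeat.
    contig = "GG" + "A" * 12 + "CT" + "AG" * 7 + "TTC" + "ACG" * 5 + "GT"
    found = find_ssrs(contig)
    for hit in found:
        print(hit)
    print(Counter(name for name, _, _, _ in found))
```

This sketch reports only perfect repeats and does not handle compound or interrupted microsatellites, so its counts would not be expected to reproduce the table.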
Specifications Table
| Subject | Agricultural and Biological Sciences: Forestry |
|---|---|
| Specific subject area | Molecular study in Forestry |
| Type of data | RNA Sequencing Data |
| How data were acquired | Illumina HiSeq 4000 |
| Data format | Raw sequencing reads and assembled contigs |
| Parameters for data collection | Leaf and basal stem of balangeran seedlings planted in waterlogged peat, dry peat, waterlogged mineral soil, and dry mineral soil |
| Description of data collection | Total RNA was sequenced using the Illumina HiSeq 4000 platform at NovogeneAIT, Singapore |
| Data source location | Bogor, West Java, Indonesia |
| Data accessibility | Repository name: DDBJ (DNA Data Bank of Japan) |
| Related research article | F. Indriani, D.D. Matra, U.J. Siregar, I.Z. Siregar |
This is the first transcriptome data of [...]
This data is beneficial to elucidate the molecular mechanism and gene pathway of [...]
This data allows further analysis to identify genes of interest that play roles in [...]