Xudong Jiao1,2, Jiaxin Shi3, Song Qin1,2, Dong Huang1,4, Yinchu Wang1,2.
Abstract
Urechis unicinctus contains a wide range of bioactive polypeptides and has high edible, economic and medicinal value. Artificial breeding is the key technical breakthrough needed for its cultivation. Seedling transport, however, has become a primary concern, which makes it essential to understand how Urechis unicinctus responds to different conditions. We compared the transcriptomes of Urechis unicinctus under desiccation, ultraviolet irradiation and different temperatures. The dataset was generated on the Illumina HiSeq X Ten system and will help in understanding the adaptation of Urechis unicinctus to changing temperature (low, high and room temperature) and to open air (ultraviolet and desiccation). The transcriptome was assembled using the isoform sequencing (Iso-seq) method. The functions of the expressed genes were annotated and categorized, and the differentially expressed genes (DEGs) are presented.
Keywords: RNA-seq; Transcriptome assembly; Urechis unicinctus
Year: 2021 PMID: 33842678 PMCID: PMC8020418 DOI: 10.1016/j.dib.2021.106941
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Statistics of length distribution before and after transcript corrections.
| Sample | Type | Total nucleotides | Total_number | Mean length | Min length | Max length | N50 | N90 |
|---|---|---|---|---|---|---|---|---|
| UU2019 | Before correction | 505,187,047 | 216,918 | 2329 | 177 | 14,121 | 2518 | 1493 |
| UU2019 | After correction | 504,952,370 | 216,918 | 2328 | 176 | 14,194 | 2516 | 1492 |
Sample: the name of the sample.
Type: whether the statistics were taken before or after correction.
Total nucleotides: the total number of bases in the consensus sequences.
Total_number: the number of consensus sequences.
Mean length: the average length of the consensus sequences.
Min length: the length of the shortest consensus sequence.
Max length: the length of the longest consensus sequence.
N50/N90: the consensus sequences are sorted from longest to shortest and their lengths are summed cumulatively; N50 (N90) is the length of the sequence at which the running total first reaches at least 50% (90%) of the total nucleotides.
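The N50/N90 definition can be made concrete with a small Python sketch. This is an illustration under stated assumptions, not the pipeline used to produce the table: the function name `nxx` and the toy length list are hypothetical, and the real statistics would be computed over all 216,918 consensus sequences.

```python
def nxx(lengths, percent):
    """Return the Nxx statistic (e.g. percent=50 for N50) of a list of sequence lengths.

    Sequences are sorted from longest to shortest and their lengths are summed;
    the result is the length of the sequence at which the running total first
    reaches at least `percent` % of the total number of bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]  # unreachable for 0 < percent <= 100, kept for safety


# Toy example (hypothetical lengths, not taken from the dataset).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
n50 = nxx(toy_lengths, 50)
n90 = nxx(toy_lengths, 90)
print(n50, n90)
```

Because the sequences are ranked from longest to shortest, N90 can never exceed N50, which matches the table (N50 = 2516, N90 = 1492 after correction).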
Read alignment summary of the transcriptomes of Urechis unicinctus under desiccation, ultraviolet irradiation, and high, low and room temperature.
| Sample | Raw Reads | Clean reads | Clean bases | Error(%) | Q20(%) | Q30(%) | GC(%) | Total mapped |
|---|---|---|---|---|---|---|---|---|
| DRY_1 | 49,895,240 | 48,303,224 | 7.25 G | 0.02 | 98.36 | 95.09 | 47.73 | 43,022,064(89.07%) |
| DRY_2 | 59,168,576 | 57,459,078 | 8.62 G | 0.02 | 98.33 | 95.00 | 47.13 | 50,889,032(88.57%) |
| DRY_3 | 53,897,626 | 51,886,052 | 7.78 G | 0.02 | 98.30 | 94.98 | 47.04 | 45,759,512(88.19%) |
| UV_1 | 56,072,498 | 54,080,562 | 8.11 G | 0.02 | 98.20 | 94.73 | 46.58 | 47,364,024(87.58%) |
| UV_2 | 59,620,598 | 58,202,014 | 8.73 G | 0.02 | 98.36 | 95.09 | 47.26 | 51,773,516(88.95%) |
| UV_3 | 62,171,266 | 60,422,202 | 9.06 G | 0.02 | 98.39 | 95.15 | 47.18 | 53,806,486(89.05%) |
| RT_1 | 46,423,864 | 45,102,198 | 6.77 G | 0.02 | 98.16 | 94.64 | 47.41 | 39,850,676(88.36%) |
| RT_2 | 61,925,716 | 60,173,960 | 9.03 G | 0.02 | 98.13 | 94.54 | 47.08 | 53,398,352(88.74%) |
| RT_3 | 52,836,082 | 51,009,134 | 7.65 G | 0.03 | 97.91 | 94.07 | 46.86 | 45,008,268(88.24%) |
| HT_1 | 60,880,848 | 59,486,478 | 8.92 G | 0.02 | 98.37 | 95.11 | 46.92 | 52,707,958(88.60%) |
| HT_2 | 63,003,754 | 61,551,050 | 9.23 G | 0.02 | 98.35 | 95.10 | 47.18 | 54,330,180(88.27%) |
| HT_3 | 57,418,292 | 55,689,904 | 8.35 G | 0.02 | 98.33 | 95.01 | 47.05 | 49,213,370(88.37%) |
| LT_1 | 51,508,474 | 50,009,866 | 7.50 G | 0.03 | 97.47 | 93.07 | 46.87 | 44,093,334(88.17%) |
| LT_2 | 58,596,068 | 57,487,684 | 8.62 G | 0.03 | 97.43 | 92.96 | 46.92 | 50,676,846(88.15%) |
| LT_3 | 59,399,558 | 58,374,036 | 8.76 G | 0.02 | 98.43 | 95.30 | 47.39 | 51,741,846(88.64%) |
Q20, Q30: Proportion of bases with a Phred quality score (Qphred) above 20 or 30, where Qphred = −10 log10(e) and e is the base-calling error rate.
Raw reads: Original data from sequencing.
Clean bases: Number of clean reads multiplied by the read length, reported in G (10^9 bases).
Error: Average sequencing error rate, related to the quality score by Qphred = −10 log10(e).
GC: Proportion of G and C in total bases.
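The relationships in these footnotes can be checked against a table row with a short Python sketch. It is only an illustration: the function names are hypothetical, and the 150 bp read length is an assumption that the source does not state, although it reproduces the reported clean-base values. The percentage in the "Total mapped" column is taken here to be mapped reads divided by clean reads, which is consistent with the DRY_1 row.

```python
import math

# Assumed read length in bases; not stated in the source.
READ_LENGTH = 150


def phred_from_error(error_rate):
    """Qphred = -10 * log10(e), with e given as a fraction (not a percentage)."""
    return -10 * math.log10(error_rate)


def error_from_phred(q):
    """Inverse relation: e = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def clean_bases_g(clean_reads, read_length=READ_LENGTH):
    """Clean bases in G (10^9 bases): clean reads multiplied by read length."""
    return clean_reads * read_length / 1e9


def mapped_percent(mapped_reads, clean_reads):
    """Share of clean reads that were mapped, as a percentage."""
    return 100 * mapped_reads / clean_reads


# DRY_1 row: 48,303,224 clean reads, 43,022,064 mapped, error 0.02%.
print(f"Q20 error rate: {error_from_phred(20):.4f}")    # 1 error in 100 bases
print(f"Q30 error rate: {error_from_phred(30):.4f}")    # 1 error in 1000 bases
print(f"Q for 0.02% error: {phred_from_error(0.0002):.2f}")
print(f"DRY_1 clean bases: {clean_bases_g(48_303_224):.2f} G")
print(f"DRY_1 mapped: {mapped_percent(43_022_064, 48_303_224):.2f}%")
```

Under these assumptions the DRY_1 row gives 7.25 G clean bases and an 89.07% mapping rate, in agreement with the table.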
| Subject | Biochemistry, Genetics and Molecular Biology |
| Specific subject area | Transcriptomics, Genomics |
| Type of data | Fastq read files |
| How data were acquired | Illumina HiSeq X Ten |
| Data format | Raw sequencing reads (fastq) |
| Parameters for data collection | Total RNA was collected from 6-month old |
| Description of data collection | Total RNA was obtained separately from 5 groups exposed to UV, DRY, HT, LT and RT conditions; the RT group served as the control, and each group had 3 parallel experiments (replicates). Sequencing was performed on the Illumina HiSeq X Ten platform. Clean reads were obtained by removing reads containing adapters and low-quality bases, and were then mapped to the reference assembled by Trinity, a transcriptome assembly program made up of 3 separate software modules. The DEGs were analysed with DESeq2. |
| Data source location | Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai, Shandong, China |
| Data accessibility | The complete RNA-seq data of |