Chaowei Zhou1,2, Shijun Xiao1,3, Yanchao Liu1, Zhenbo Mou1, Jianshe Zhou1, Yingzi Pan1, Chi Zhang1, Jiu Wang1, Xingxing Deng2, Ming Zou3, Haiping Liu4.
Abstract
The Schizothoracinae fishes, which are endemic to the Tibetan Plateau, are considered ideal models for investigating highland adaptation and speciation. Although several transcriptome studies of highland fishes have been reported, transcriptome information for Schizothoracinae is still lacking. To obtain comprehensive transcriptome data for Schizothoracinae, the transcriptomes of 183 samples from 14 representative Schizothoracinae species were sequenced and de novo assembled. In total, about 1,363 Gb of clean transcriptome data were obtained. After assembly, we obtained 76,602-154,860 unigenes per species, with N50 lengths of 1,564-2,143 bp. More than half of the unigenes were functionally annotated against public databases. The Schizothoracinae fishes in this work exhibit diverse ecological distributions, phenotypic characters and feeding habits; therefore, the comprehensive transcriptome data of these species provide valuable information for studying the environmental adaptation and speciation of Schizothoracinae in the Tibetan Plateau.
Year: 2020 PMID: 31964888 PMCID: PMC6972879 DOI: 10.1038/s41597-020-0361-6
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Table 1 Sample information for the species in the study.
| Genus | Species | Abbreviation | Geographic region | Drainage | Morphology: pairs of whiskers | Morphology: body scales |
|---|---|---|---|---|---|---|
| Schizothorax | Schizothorax oconnori | Soco | Gongga, Tibet, China | Yarlung Zangbo River | 2 | small scale |
| Schizothorax | Schizothorax lissolabiata | Slis | Changdu, Tibet, China | Lancang River | 2 | small scale |
| Schizopyge | Schizopyge nukiangensis | Snuk | Bomi, Tibet, China | Nujiang River | 2 | small scale |
| Schizothorax | Schizothorax plagiostomus | Spla | Ali, Tibet, China | Shiquan River | 2 | small scale |
| Schizothorax | Schizothorax labiatus | Slab | Ali, Tibet, China | Shiquan River | 2 | small scale |
| Schizothorax | Schizothorax davidi | Sdav | Ganzi, Sichuan, China | Jinsha River | 2 | small scale |
| Ptychobarbus | Ptychobarbus kaznakovi | Pkaz | Changdu, Tibet, China | Lancang River | 1 | moderate degeneration |
| Gymnocypris | Gymnocypris namensis | Gnam | Bange, Tibet, China | Lake Namtso | 0 | absence |
| Gymnocypris | Gymnocypris przewalskii | Gprz | Haibei, Qinghai, China | Lake Qinghai | 0 | absence |
| Gymnocypris | Gymnocypris eckloni | Geck | Xunhua, Qinghai, China | Yellow River | 0 | absence |
| Gymnocypris | Gymnocypris selincuoensis | Gsel | Bange, Tibet, China | Lake Siling Co | 0 | absence |
| Schizopygopsis | Schizopygopsis younghusbandi | Syou | Lazi, Tibet, China | Yarlung Zangbo River | 0 | absence |
| Schizopygopsis | Schizopygopsis pylzovi | Spyl | Xunhua, Qinghai, China | Yellow River | 0 | absence |
| Platypharodon | Platypharodon extremus | Pext | Gonghe, Qinghai, China | Yellow River | 0 | absence |
Samples collected for transcriptome sequencing (number of samples per tissue).
| Species | Muscle | Liver | Spleen | Skin | Swim bladder | Gut | Eye | Gill | Kidney | Heart | Brain | Gonads | Vibrissa | Fat | Blood | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 15 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | 1 | 14 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | 1 | 1 | 1 | — | 1 | 13 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | — | — | 1 | 12 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | 1 | 1 | — | 1 | 13 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | 1 | 14 |
| | 1 | 1 | 1 | 1 | — | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | 1 | 13 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | — | 1 | 13 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | 1 | — | 13 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | 1 | 14 |
| | 1 | 1 | 1 | — | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | 1 | 1 | 13 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | — | — | 12 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | — | — | 12 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | — | — | — | 12 |
| Total | 14 | 14 | 14 | 13 | 13 | 14 | 14 | 14 | 14 | 13 | 13 | 13 | 7 | 3 | 10 | 183 |
Species abbreviations are the same as in Table 1. A dash indicates that no sample of that tissue was sequenced for the species.
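The row and column totals of the sample table can be checked mechanically. The following is a minimal sketch, not part of the original study: each species row is transcribed as a 15-character string in the tissue order of the table header, with `1` for a sequenced tissue and `-` for an absent one. The names `TISSUES`, `ROWS`, `row_totals` and `column_totals` are illustrative.

```python
# Tissue order follows the header of the sample table.
TISSUES = ["Muscle", "Liver", "Spleen", "Skin", "Swim bladder", "Gut", "Eye",
           "Gill", "Kidney", "Heart", "Brain", "Gonads", "Vibrissa", "Fat", "Blood"]

# One string per species row, in table order: '1' = sequenced, '-' = absent.
ROWS = [
    "111111111111111",
    "1111111111111-1",
    "111111111-111-1",
    "11111111111---1",
    "1111111111-11-1",
    "1111111111111-1",
    "1111-11111111-1",
    "111111111111--1",
    "111111111111-1-",
    "1111111111111-1",
    "111-11111111-11",
    "111111111111---",
    "111111111111---",
    "111111111111---",
]


def row_totals(rows):
    """Number of sequenced tissues per species (the table's Total column)."""
    return [row.count("1") for row in rows]


def column_totals(rows):
    """Number of species sequenced per tissue (the table's Total row)."""
    return [sum(row[i] == "1" for row in rows) for i in range(len(TISSUES))]


if __name__ == "__main__":
    print(row_totals(ROWS))
    print(dict(zip(TISSUES, column_totals(ROWS))))
    print(sum(row_totals(ROWS)))  # 183, the total number of samples
```

The grand total agrees with the 183 samples stated in the abstract.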
Fig. 1 Sample sites of the 14 Schizothoracinae species in our study. Species abbreviations are the same as in Table 1. Altitude is indicated by the color bar, from white (high altitude) to green (low altitude).
Statistics of the de novo transcriptome assemblies.
| Species | Total size (Mb) | GC content (fraction) | Unigene: sequence number | Unigene: N50 length (bp) | Unigene: longest (bp) | Transcript: sequence number | Transcript: N50 length (bp) | Transcript: longest (bp) |
|---|---|---|---|---|---|---|---|---|
| | 117.00 | 0.415 | 88,676 | 1,948 | 36,581 | 831,353 | 1,527 | 36,694 |
| | 104.06 | 0.422 | 79,073 | 1,946 | 33,187 | 667,802 | 1,573 | 33,187 |
| | 107.46 | 0.419 | 84,638 | 1,835 | 30,806 | 743,518 | 1,420 | 30,806 |
| | 98.95 | 0.419 | 83,169 | 1,725 | 17,902 | 736,405 | 1,255 | 17,910 |
| | 99.98 | 0.416 | 76,602 | 1,905 | 43,720 | 670,792 | 1,432 | 43,720 |
| | 109.44 | 0.42 | 83,757 | 2,043 | 24,328 | 689,222 | 1,589 | 24,340 |
| | 173.48 | 0.409 | 154,860 | 1,564 | 77,434 | 1,363,461 | 1,198 | 77,434 |
| | 107.09 | 0.415 | 84,464 | 1,825 | 23,933 | 813,474 | 1,294 | 23,933 |
| | 105.49 | 0.413 | 78,762 | 1,974 | 28,230 | 751,137 | 1,409 | 28,231 |
| | 113.00 | 0.412 | 87,248 | 1,891 | 23,925 | 849,836 | 1,411 | 23,925 |
| | 122.36 | 0.406 | 106,851 | 1,588 | 25,730 | 1,187,251 | 914 | 25,730 |
| | 101.23 | 0.414 | 81,029 | 1,820 | 23,570 | 723,624 | 1,329 | 23,570 |
| | 97.96 | 0.418 | 80,542 | 1,724 | 26,467 | 751,215 | 1,202 | 26,467 |
| | 101.78 | 0.417 | 85,919 | 1,674 | 24,119 | 843,423 | 1,122 | 24,119 |
| | 106.52 | 0.422 | 77,069 | 2,143 | 25,942 | 639,444 | 1,920 | 25,942 |
Note that the total size is the total number of bases across all transcripts for each species.
# The transcriptome data for Oxygymnocypris stewartii were reported in our previous study [17].
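The total size, N50 and GC columns of the assembly table are all simple functions of the assembled sequences. The sketch below assumes the conventional N50 definition (the length L such that sequences of length at least L together contain at least half of all bases) and treats GC content as a fraction, matching values such as 0.415 in the table. The source does not state which tool produced these statistics, and the function names and toy inputs here are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; N50 is the length of the
    sequence at which the running sum first reaches half of the total bases.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be a non-empty collection")


def gc_fraction(sequence):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    if not sequence:
        raise ValueError("sequence must be non-empty")
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


if __name__ == "__main__":
    toy_lengths = [2000, 1500, 1000, 800, 500, 200]  # made-up lengths, in bp
    print(sum(toy_lengths))                 # total size in bases: 6000
    print(n50(toy_lengths))                 # 1500
    print(round(gc_fraction("ATGCGC"), 3))  # 0.667
```

In the toy input, the two longest sequences hold 3,500 of the 6,000 bases, so the N50 is the length of the second one, 1,500 bp.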
Fig. 2 Length distribution of unigenes for all species.
Fig. 3 BUSCO statistics of the assembled transcripts for each species. The proportions of single-copy, duplicated, fragmented and missing BUSCO genes are colored purple, blue, green and pink, respectively.
Functional annotation summary for each species.
| Species | NR | Swiss-Prot | KOG | GO | KEGG | Total | Ratio |
|---|---|---|---|---|---|---|---|
| | 45,296 | 29,701 | 40,793 | 28,842 | 28,816 | 46,972 | 52.97% |
| | 45,091 | 30,793 | 41,064 | 30,203 | 29,922 | 46,516 | 58.83% |
| | 46,557 | 31,077 | 42,380 | 30,450 | 30,185 | 48,122 | 56.86% |
| | 49,111 | 33,194 | 44,034 | 34,896 | 32,267 | 51,264 | 61.64% |
| | 43,749 | 29,702 | 39,846 | 28,668 | 28,837 | 44,956 | 58.69% |
| | 47,898 | 32,467 | 42,544 | 35,628 | 31,610 | 50,962 | 60.85% |
| | 58,392 | 34,174 | 49,960 | 33,669 | 33,253 | 62,216 | 40.18% |
| | 44,310 | 29,970 | 40,147 | 28,721 | 29,102 | 45,732 | 54.14% |
| | 43,104 | 29,502 | 39,141 | 28,524 | 28,628 | 44,387 | 56.36% |
| | 45,847 | 31,699 | 41,648 | 30,754 | 30,813 | 47,353 | 54.27% |
| | 49,768 | 32,165 | 44,381 | 31,049 | 31,239 | 51,828 | 48.50% |
| | 46,369 | 33,008 | 42,487 | 31,612 | 32,070 | 47,533 | 58.66% |
| | 44,777 | 31,296 | 41,101 | 30,088 | 30,408 | 46,094 | 57.23% |
| | 46,694 | 32,136 | 42,756 | 30,766 | 31,231 | 48,074 | 55.95% |
| | 43,212 | 29,426 | 38,495 | 32,099 | 28,597 | 46,009 | 59.70% |
The numbers of unigenes with hits in the NR, Swiss-Prot, KOG, GO and KEGG databases are summarized. The ratio is the percentage of annotated unigenes among all assembled unigenes.
# The transcriptome data for Oxygymnocypris stewartii were reported in our previous study [17].
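The ratio column can be reproduced by dividing the annotated total by the unigene count from the assembly statistics table. The sketch below is illustrative only and assumes that the rows of the two tables are listed in the same species order; the function name `annotation_ratio` is hypothetical. The pairs used are the first, second and seventh rows of both tables.

```python
def annotation_ratio(annotated_unigenes, total_unigenes):
    """Percentage of unigenes with a hit in at least one annotation database."""
    if total_unigenes <= 0:
        raise ValueError("total_unigenes must be positive")
    return 100.0 * annotated_unigenes / total_unigenes


if __name__ == "__main__":
    # (annotated total, unigene sequence number), paired by row position.
    pairs = [(46972, 88676), (46516, 79073), (62216, 154860)]
    for annotated, total in pairs:
        print(f"{annotation_ratio(annotated, total):.2f}%")
    # prints 52.97%, 58.83% and 40.18%, matching the Ratio column
```

That the computed values match the listed ratios supports the assumption that both tables use the same row order.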
| Measurement(s) | RNA • transcriptome • sequence_assembly • sequence feature annotation |
| Technology Type(s) | RNA sequencing • sequence assembly process • sequence annotation |
| Sample Characteristic - Organism | Schizothorax oconnori • Schizothorax lissolabiata • Schizopyge nukiangensis • Schizothorax plagiostomus • Schizothorax labiatus • Schizothorax davidi • Ptychobarbus kaznakovi • Gymnocypris namensis • Gymnocypris przewalskii • Gymnocypris eckloni • Gymnocypris selincuoensis • Schizopygopsis younghusbandi • Schizopygopsis pylzovi • Platypharodon extremus • Oxygymnocypris stewartii |
| Sample Characteristic - Environment | lake • drainage basin |
| Sample Characteristic - Location | Tibetan Plateau |