Sylvia Hofmann, Heiner Kuhl, Chitra Bahadur Baniya, Matthias Stöck.
Abstract
The Himalayas are one of Earth's biodiversity hotspots. Among its many cryptic and undiscovered organisms, including vertebrates, this complex high-mountain ecosystem is expected to harbour many species with adaptations to life at high altitudes. However, modern evolutionary genomic studies of Himalayan vertebrates are still in their infancy. Moreover, in organisms with relatively high DNA content, such as most amphibians, whole-genome sequencing remains bioinformatically challenging, and no complete nuclear genomes are available for Himalayan amphibians. Here, we present the first well-annotated multi-tissue transcriptome of a Greater Himalayan species, the lazy toad Scutiger cf. sikimmensis (Anura: Megophryidae). Applying Illumina NextSeq 500 RNA-seq to six tissues, we obtained 41.32 Gb of sequences, assembled into ~111,000 unigenes, translating into 54,362 known genes as annotated in seven functional databases. We tested 19 genes known to play roles in anuran and reptile adaptation to high elevations and detected potential diversifying selection for two (TGS1, SENP5) in Scutiger. From a list of 37 genes, we also identified 27 candidate genes for sex determination or sexual development, all of which provide the first such data for this non-model megophryid species. These transcriptomes will serve as a valuable resource for further studies on amphibian evolution in the Greater Himalaya as a biodiversity hotspot.
Keywords: Megophryidae; RNA-seq; high-altitude-specific genes; sex determination
Year: 2019 PMID: 31683620 PMCID: PMC6895926 DOI: 10.3390/genes10110873
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Scutiger cf. sikimmensis found in the Central Himalaya at 3289 m.
Summary of transcriptome sequencing based on RNA samples from six different tissues.
| | Brain | Heart | Kidney | Liver | Lung | Testis |
|---|---|---|---|---|---|---|
| Number of raw reads | 62,192,592 | 73,368,700 | 66,429,410 | 64,241,792 | 87,261,442 | 71,858,834 |
| Clean Reads | 60,440,140 | 70,845,324 | 64,422,228 | 63,141,364 | 85,147,870 | 69,426,236 |
| Clean Bases (Gb) | 6.04 | 7.08 | 6.44 | 6.31 | 8.52 | 6.94 |
| Clean Reads Q20 (%) | 97.08 | 97.32 | 97.11 | 96.93 | 97.06 | 96.84 |
| Clean Reads Q30 (%) | 93.51 | 93.86 | 93.55 | 93.23 | 93.44 | 93.00 |
| GC Clean Reads (%) | 44.78 | 45.53 | 46.19 | 45.65 | 45.48 | 45.90 |
N = reads containing >5% unknown nt; Q20 = reads with base call accuracy of 99%; Q30 = reads with base call accuracy of 99.9%.
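
The Q20 and Q30 accuracies in the footnote follow from the standard Phred scale, on which a quality score Q corresponds to a base-call error probability of 10^(−Q/10). The following is a minimal sketch of that conversion; the function name `phred_to_accuracy` is illustrative and does not come from the source.

```python
def phred_to_accuracy(q: float) -> float:
    """Return the base-call accuracy implied by a Phred quality score q."""
    # Phred definition: Q = -10 * log10(P_error)  =>  P_error = 10 ** (-Q / 10)
    error_probability = 10 ** (-q / 10)
    return 1.0 - error_probability


# Report the accuracy for the two thresholds used in the table.
for q in (20, 30):
    print(f"Q{q}: accuracy = {phred_to_accuracy(q):.3%}")
```
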
Summary of assembled unigenes per tissue.
| | Brain | Heart | Kidney | Liver | Lung | Testis | All Unigenes |
|---|---|---|---|---|---|---|---|
| Number of unigenes | 63,189 | 50,037 | 67,145 | 44,915 | 62,363 | 53,203 | 110,889 |
| Total length | 57,057,129 | 40,054,175 | 51,103,731 | 34,377,766 | 49,335,973 | 43,320,879 | 104,314,420 |
| Mean length | 902 | 800 | 761 | 765 | 791 | 814 | 940 |
| N50 length | 1740 | 1448 | 1301 | 1310 | 1426 | 1461 | 1926 |
| GC content % | 44.51 | 44.73 | 45.75 | 44.20 | 44.34 | 44.55 | 44.70 |
| 300–500 bp | 34,328 (54.33%) | 28,492 (56.94%) | 38,309 (57.05%) | 25,402 (56.56%) | 35,569 (57.04%) | 29,194 (54.87%) | 60,933 (54.95%) |
| 600–1000 bp | 11,604 (18.36%) | 9438 (18.86%) | 13,503 (20.11%) | 8933 (19.89%) | 11,547 (18.52%) | 10,152 (19.08%) | 19,325 (17.43%) |
| 1100–2000 bp | 9286 (14.70%) | 7310 (14.61%) | 9800 (14.60%) | 6744 (15.02%) | 9297 (14.91%) | 8476 (15.93%) | 15,479 (13.96%) |
| 2100–3000 bp | 4476 (7.08%) | 2970 (5.94%) | 3576 (5.33%) | 2651 (5.90%) | 3851 (6.18%) | 3680 (6.92%) | 7659 (6.91%) |
| ≥ 3000 bp | 3495 (5.53%) | 1827 (3.65%) | 1957 (2.91%) | 1185 (2.64%) | 2099 (3.37%) | 1701 (3.20%) | 7493 (6.76%) |
N50 length = weighted median statistic such that 50% of the total assembled length is contained in unigenes equal to or longer than this value.
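
The N50 definition above can be made concrete with a short calculation: sort the sequence lengths from longest to shortest and accumulate them until at least half of the total length is covered. This is a minimal sketch, not the assembler's own implementation; the function name `n50` and the toy lengths are assumptions made for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence where the running sum reaches half the total.
        if 2 * running >= total:
            return length


# Hypothetical toy data, not taken from the assembly.
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))
```

With these toy lengths the total is 20, and the two longest sequences (6 and 5) together cover 11, so the N50 is 5.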
Figure 2. Length distribution of all unigenes based on RNA samples from different tissues of Scutiger cf. sikimmensis. The x-axis represents the sequence length (base pairs), while the y-axis represents the number of transcripts.
Figure 3. Venn diagram of shared and unique unigenes in Scutiger cf. sikimmensis among the five most informative databases used for annotation. GO = Gene Ontology; InterPro = integrative protein signature database; KEGG = Kyoto Encyclopedia of Genes and Genomes; KOG = EuKaryotic Orthologous Groups; NR = non-redundant protein database; NT = non-redundant nucleotide database; SwissProt = Swiss Protein Sequence Database.
Sex-determining and sex differentiation gene inventory and homologous Scutiger cf. sikimmensis transcripts. Gene expression is calculated as FPKM based on RNA-seq data from six tissues; the highest gene expression level per gene over all tissues is indicated in bold. CL = cluster of several unigenes.
| Gene | Organism | Accession No. | mRNA length, nt | Protein length, amino acids | Unigene | E-value | Length, nt | Brain (FPKM) | Heart (FPKM) | Kidney (FPKM) | Liver (FPKM) | Lung (FPKM) | Testis (FPKM) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ALDH1A2 | | AAI57514.1 | 633 | 211 | Unigene17092_All | 0.00 | 2208 | 3.88 | 1.33 | 1.25 | 1.56 | 6.60 | |
| ALDH1A3 | | XP_002939310.1 | 4056 | 512 | Unigene17754_All | 0.00 | 2443 | 3.87 | 10.22 | | 6.00 | 7.19 | 6.42 |
| AR | | XP_002941888.2 | 3497 | 788 | Unigene59002_All | 2.00 × 10^−33 | 247 | 0.00 | | 0.00 | 0.39 | 0.00 | 0.00 |
| AR | | XP_002941888.2 | 3497 | 788 | Unigene61378_All | 9.00 × 10^−55 | 247 | 0.00 | | 0.00 | 0.00 | 0.00 | 0.00 |
| CTNNB1 | | NP_001016958.1 | 3382 | 781 | Unigene8277_All | 0.00 | 3640 | | 88.56 | 60.64 | 29.71 | 57.76 | 83.16 |
| CTNNB1 | | NP_001016958.1 | 3382 | 781 | Unigene10320_All | 1.00 × 10^−4 | 815 | 7.56 | 1.32 | 2.78 | 1.17 | 3.02 | 1.13 |
| CXCR4B | | NP_001080681.1 | 2115 | 358 | Unigene12380_All | 0.00 | 3085 | 8.85 | 3.18 | 5.29 | 11.14 | | 2.93 |
| CYP26A1 | | AAI71087.1 | 1458 | 492 | Unigene67459_All | 0.00 | 1560 | 0.45 | 0.00 | 0.00 | 0.00 | 0.00 | |
| CYP26B1 | | AAI35552.1 | 6137 | 511 | Unigene661_All | 0.00 | 3759 | | 4.97 | 0.07 | 0.43 | 0.14 | 0.17 |
| CYP26C1 | | XP_002939137.2 | 4692 | 533 | Unigene43005_All | 0.00 | 2481 | | 0.07 | 0.03 | 0.03 | 0.13 | 0.03 |
| DHH | | NM_001097169.1 | 4372 | 396 | Unigene27079_All | 2.00 × 10^−130 | 1359 | 0.53 | 0.09 | 2.18 | 0.10 | 0.99 | |
| DHH | | NM_001097169.1 | 4372 | 396 | Unigene63854_All | 5.00 × 10^−71 | 366 | 0.75 | 0.00 | 0.90 | 0.23 | 0.69 | 1.55 |
| DMRT1 | | XP_012808036.1 | 1011 | 337 | Unigene30377_All | 2.00 × 10^−33 | 312 | 0.61 | 0.47 | 0.00 | 0.00 | | 0.00 |
| FGF9 | | XP_002938621.1 | 624 | 208 | Unigene13573_All | 0.00 | 933 | | 0.00 | 0.23 | 0.16 | 0.00 | 0.60 |
| FOXL2 | | XP_004917868.1 | 978 | 326 | Unigene52644_All | 1.00 × 10^−19 | 271 | | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| GATA-4 | | NP_001016949.1 | 1599 | 394 | CL10318.Contig1_All | 5.00 × 10^−178 | 4385 | 0.00 | | 0.00 | 3.04 | 0.00 | 0.00 |
| GATA-4 | | NP_001016949.1 | 1599 | 394 | CL10318.Contig2_All | 0.00 | 4283 | 0.04 | | 0.03 | 3.56 | 0.02 | 0.29 |
| HHIP | | NM_001007190.1 | 2717 | 669 | Unigene21373_All | 0.00 | 861 | 0.47 | 0.00 | 0.17 | | 1.61 | 0.08 |
| HHIP | | NM_001007190.1 | 2717 | 669 | Unigene57656_All | 3.00 × 10^−93 | 448 | 0.00 | 0.00 | 0.00 | 0.72 | 0.41 | 0.00 |
| LRPPRC | | NP_001039203.1 | 4347 | 1391 | Unigene19789_All | 0.00 | 4425 | 16.93 | 23.44 | 18.77 | 9.94 | 9.55 | |
| NR0B1 | | XP_002933661.1 | 834 | 278 | CL4129.Contig1_All | 1.00 × 10^−122 | 772 | 0.00 | 0.33 | 0.00 | | 0.07 | 0.00 |
| NR0B1 | | XP_002933661.1 | 834 | 278 | CL4129.Contig2_All | 4.00 × 10^−179 | 1108 | 1.11 | 0.00 | 0.00 | | 0.00 | 0.00 |
| NR0B1 | | XP_002933661.1 | 834 | 278 | Unigene53513_All | 4.00 × 10^−46 | 540 | 0.26 | 0.00 | 0.00 | | 0.00 | 0.00 |
| PDGFa | | NM_001170497.1 | 1574 | 660 | Unigene15273_All | 2.00 × 10^−116 | 1671 | | 0.87 | 0.80 | 0.79 | 0.25 | 2.46 |
| PDGFb | | AAI60575.1 | 2140 | 240 | Unigene19730_All | 2.00 × 10^−81 | 1802 | 5.74 | 3.99 | 3.56 | 5.27 | | 1.19 |
| PTCH2 | | XP_002937129.2 | 6786 | 1423 | Unigene8457_All | 0.00 | 2946 | | 0.12 | 0.86 | 0.12 | 0.41 | 1.12 |
| RSPO-1 | | NP_001121500.1 | 1946 | 257 | Unigene19575_All | 3.00 × 10^−118 | 1913 | 1.27 | 0.32 | 2.84 | 2.10 | 1.69 | |
| SOX10 | | NP_001093691.1 | 2895 | 436 | Unigene39328_All | 0.00 | 3641 | | 0.28 | 0.04 | 0.06 | 0.14 | 0.16 |
| SOX8 | | XP_002932315.2 | 2389 | 466 | Unigene17029_All | 0.00 | 3334 | | 1.44 | 0.55 | 0.29 | 0.41 | 0.40 |
| SOX9 | | AAT72000.1 | 2538 | 482 | Unigene17030_All | 0.00 | 1955 | | 1.20 | 1.14 | 1.73 | 0.00 | 0.34 |
| SRD5A1 | | NP_001006841.1 | 1537 | 257 | CL1692.Contig1_All | 9.00 × 10^−5 | 1964 | | 0.92 | 2.07 | 1.12 | 0.15 | 1.51 |
| SRD5A1 | | NP_001006841.1 | 1537 | 257 | Unigene12175_All | 1.00 × 10^−125 | 1559 | 2.33 | 0.70 | 3.87 | 5.56 | 0.79 | |
| SRD5A3 | | AAH42255.1 | 957 | 319 | CL10211.Contig1_All | 1.00 × 10^−32 | 295 | | 0.00 | 0.00 | 0.00 | 0.63 | 0.29 |
| SRD5A3 | | AAH42255.1 | 957 | 319 | CL10211.Contig2_All | 1.00 × 10^−95 | 931 | 3.37 | 3.70 | 3.26 | 1.48 | | 3.69 |
| WNT4 | | NP_001239015.1 | 1962 | 351 | Unigene64864_All | 2.00 × 10^−44 | 216 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| WT1 | | NP_001135625.1 | 6193 | 413 | CL1216.Contig4_All | 0.00 | 3364 | 0.86 | 1.28 | 2.58 | 0.55 | 6.20 | |
| WT1 | | NP_001135625.1 | 6193 | 413 | CL4987.Contig2_All | 3.00 × 10^−33 | 308 | 0.00 | 0.00 | | 0.00 | 0.00 | 0.28 |
| WT1 | | NP_001135625.1 | 6193 | 413 | Unigene71862_All | 1.00 × 10^−26 | 264 | 0.00 | 0.58 | | 0.00 | 0.00 | 0.00 |
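
The expression values in the table are reported as FPKM (fragments per kilobase of transcript per million mapped fragments). The source does not state which tool or exact procedure produced them, so the following is only a sketch of the standard FPKM definition; the function name `fpkm` and the example numbers are hypothetical.

```python
def fpkm(fragments_on_gene: int, gene_length_nt: int, total_mapped_fragments: int) -> float:
    """Return FPKM using the standard definition.

    FPKM = fragments_on_gene * 1e9 / (gene_length_nt * total_mapped_fragments)
    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    """
    if gene_length_nt <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and total mapped fragments must be positive")
    return fragments_on_gene * 1e9 / (gene_length_nt * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2000 nt transcript
# in a library with 50 million mapped fragments.
print(fpkm(500, 2000, 50_000_000))
```

Because FPKM normalizes by both transcript length and library size, values can be compared across the six tissues even though their read counts differ.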