Sylvia Hofmann, Chitra Bahadur Baniya, Matthias Stöck, Lars Podsiadlowski.
Abstract
The Himalayan Arc is recognized as a global biodiversity hotspot. Among its numerous cryptic and undiscovered organisms, this composite high-mountain ecosystem harbors many taxa with adaptations to life at high elevations. However, evolutionary patterns and genomic features have rarely been studied in Himalayan vertebrates. Here, we provide the first well-annotated transcriptome of a Greater Himalayan reptile species, the Ladakh ground skink Asymblepharus ladacensis (Squamata: Scincidae). Based on tissues from the brain, an embryonic disc, and pooled organ material, sequenced as paired-end Illumina NextSeq 500 RNA-seq, we assembled ~77,000 transcripts, which were annotated using seven functional databases. We tested ~1600 genes known to be under positive selection in anurans and reptiles adapted to high elevations, and detected potential positive selection for 114 of these genes in Asymblepharus. Even though the strength of these results is limited by the single-animal approach, our transcriptome resource may provide valuable data for further studies on squamate reptile evolution in the Himalayas as a hotspot of biodiversity.
Keywords: Himalayas; Scincidae; adaptation; evolution; genomic; high elevation
Year: 2021 PMID: 34573405 PMCID: PMC8466045 DOI: 10.3390/genes12091423
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Map of Asymblepharus species based on GBIF (www.gbif.org; accessed on 20 July 2021) records of preserved specimens and georeferenced localities in the taxonomic reptile database (https://reptile-database.reptarium.cz/; accessed on 20 July 2021). The location of our RNA sample of the female A. ladacensis (photo) is indicated by a green circle with a dot in the middle and an arrow. * Note: according to a large-scale phylogeny of squamates, A. sikimmensis is nested within Scincella; however, it remains unclear whether the single “A. sikimmensis” specimen on which the sequence data are based was taxonomically correctly identified. Therefore, we also show the GBIF records of specimens collected as A. sikimmensis. Records of A. eremchenkoi in the databases could not be georeferenced due to insufficient information on the collection site. Photo credit: Sylvia Hofmann.
Summary of sequencing data used to obtain the de novo transcriptome assemblies of Asymblepharus ladacensis based on paired-end Illumina sequencing. Final assemblies based on four unique assemblies per sample generated by ORP using different assemblers and k-mers.
| | Brain Tissue | Embryonic Disc Tissue | Pooled Tissue |
|---|---|---|---|
| Number of paired-end raw reads | 74,780,628 | 73,557,120 | 64,195,486 |
| Number of cleaned reads | 73,139,294 | 71,887,892 | 61,397,610 |
| Number of base pairs in final assembly | 102,605,079 | 98,917,807 | 47,613,446 |
| Number of transcripts in final assembly | 151,718 | 105,133 | 66,696 |
| Average transcript length (bp) | 676 | 940 | 712 |
| Minimum transcript length (bp) | 131 | 131 | 131 |
| Maximum transcript length (bp) | 17,543 | 18,168 | 15,866 |
| N50 | 1215 | 2052 | 1194 |
| N90 | 257 | 311 | 278 |
| GC% content of the final ORP assembly | 0.48 | 0.48 | 0.48 |
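
The N50 and N90 rows summarize how contiguous each assembly is: N50 is the transcript length at which transcripts of that length or longer contain at least half of all assembled bases, and N90 is the analogous value for 90%. The following is a minimal sketch of that calculation, not the routine used by ORP; the function name `nx` and the toy length list are hypothetical and are not data from this study.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx statistic for a list of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    contain at least `fraction` of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    ordered = sorted(lengths, reverse=True)  # longest sequences first
    threshold = fraction * sum(ordered)      # bases that must be covered
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length


# Hypothetical toy lengths (bp), chosen only to illustrate the calculation.
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = nx(toy_lengths, 0.5)
toy_n90 = nx(toy_lengths, 0.9)
```

Because a few long transcripts can dominate the base count, N50 usually exceeds the average transcript length, as it does in all three assemblies in the table.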
Benchmarking Universal Single-Copy Orthologs (BUSCO) results based on searches of the eukaryote (EU, eukaryota_odb10; 255 BUSCOs), vertebrate (VB, vertebrata_odb10; 3354 BUSCOs), and tetrapod (TP, tetrapoda_odb10; 5310 BUSCOs) databases. BUSCO reports complete (single-copy and duplicated), fragmented, and missing orthologs for a given assembly.
| BUSCO Statistics | Brain (EU) | Brain (VB) | Brain (TP) | Embryonic Disc (EU) | Embryonic Disc (VB) | Embryonic Disc (TP) | Pooled Tissues (EU) | Pooled Tissues (VB) | Pooled Tissues (TP) |
|---|---|---|---|---|---|---|---|---|---|
| Complete | 220/255 | 2052/3354 | 2696/5310 | 250/255 | 2743/3354 | 3778/5310 | 178/255 | 1405/3354 | 1750/5310 |
| Single-copy | 189/255 | 1759/3354 | 2305/5310 | 185/255 | 1871/3354 | 2555/5310 | 148/255 | 1150/3354 | 1421/5310 |
| Duplicated | 31/255 | 293/3354 | 391/5310 | 65/255 | 872/3354 | 1223/5310 | 30/255 | 255/3354 | 329/5310 |
| Fragmented | 23/255 | 589/3354 | 709/5310 | 3/255 | 221/3354 | 323/5310 | 47/255 | 654/3354 | 645/5310 |
| Missing | 12/255 | 713/3354 | 1905/5310 | 2/255 | 390/3354 | 1209/5310 | 30/255 | 1295/3354 | 2915/5310 |
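
The BUSCO counts can be checked for internal consistency and turned into completeness percentages. The sketch below does this for the tetrapod (TP) columns only, using counts copied from the table; the dictionary layout and the helper names `is_consistent` and `percent` are illustrative assumptions, not part of BUSCO itself.

```python
TOTAL_TP = 5310  # number of BUSCOs in tetrapoda_odb10, as stated in the caption

# Counts copied from the TP columns of the table above.
busco_tp = {
    "brain": {"complete": 2696, "single": 2305, "duplicated": 391,
              "fragmented": 709, "missing": 1905},
    "embryonic_disc": {"complete": 3778, "single": 2555, "duplicated": 1223,
                       "fragmented": 323, "missing": 1209},
    "pooled": {"complete": 1750, "single": 1421, "duplicated": 329,
               "fragmented": 645, "missing": 2915},
}


def is_consistent(row, total):
    """Check single + duplicated = complete, and that all classes sum to total."""
    return (row["single"] + row["duplicated"] == row["complete"]
            and row["complete"] + row["fragmented"] + row["missing"] == total)


def percent(count, total):
    """Express a count as a percentage of total, rounded to one decimal."""
    return round(100.0 * count / total, 1)


consistent = {s: is_consistent(r, TOTAL_TP) for s, r in busco_tp.items()}
completeness = {s: percent(r["complete"], TOTAL_TP) for s, r in busco_tp.items()}
```

On these counts, the embryonic disc assembly recovers about 71% of the tetrapod BUSCOs as complete, compared with about 51% for brain and 33% for the pooled tissue.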
Number (N) of transcripts identified in Asymblepharus ladacensis that are shared (= Intersection) and unique among seven annotation database resources. GO: Gene Ontology; InterPro: integrative protein signature database; KEGG: Kyoto Encyclopedia of Genes and Genomes; KOG: EuKaryotic Orthologous Groups; NR: non-redundant protein database; NT: non-redundant nucleotide database; SwissProt: Swiss Protein Sequence Database.
| | Total | NR | NT | SwissProt | KEGG | KOG | InterPro | GO | Intersection | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| N | 76,968 | 33,444 | 34,114 | 30,994 | 28,961 | 26,608 | 27,013 | 11,010 | 7292 | 39,975 |
| % | 100 | 43.45 | 44.32 | 40.27 | 37.63 | 34.57 | 35.10 | 14.30 | 9.47 | 51.94 |
Figure 2. Annotation of the Asymblepharus ladacensis transcriptome. (a) Species distribution of the top BLASTx hits against the NR database; (b) GO (Gene Ontology) assignments as predicted by Blast2GO; (c) functional distribution of KOG (EuKaryotic Orthologous Groups) annotations; and (d) KEGG (Kyoto Encyclopedia of Genes and Genomes) classifications of assembled transcripts.
Summary of the positive selection analysis for high-altitude candidate genes of a toad-headed agama (Phrynocephalus vlangalii) [15] that likewise tested positive in Asymblepharus (A), using the BUSTED (B; p-value < 0.05) [55], FUBAR (FB; number of sites under positive selection) [56], and aBSREL (aB) [57,58] methods. Positively selected genes (PSGs) are represented by the last six digits of the anole lizard (Anolis carolinensis) Ensembl gene and transcript identifiers (starting with ENSACAG00000 and ENSACAT00000, respectively).
| GeneID | Gene | | Gene Description | Transcript | B | FB | aB |
|---|---|---|---|---|---|---|---|
| 000773 | IL1RAP | 3.58 × 10−2 | Interleukin 1 receptor accessory protein | 000813 | <0.00 × 10−5 | 2 | A. |
| 000907 | MICU1 | 1.79 × 10−2 | Mitochondrial calcium uptake 1 | 000909 | 1.30 × 10−3 | 1 | yes |
| 001142 | TARBP1 | 1.39 × 10−6 | TAR RNA binding protein 1 | 001104 | <0.00 × 10−5 | 1 | yes |
| 002254 | MIA3 | 1.83 × 10−2 | Melanoma inhibitory activity family member 3 | 002276 | 1.93 × 10−2 | 1 | yes |
| 002549 | RPS2 | 1.99 × 10−2 | Ribosomal protein S2 | 002541 | 5.00 × 10−4 | 2 | yes |
| 002995 | RNF10 | 4.38 × 10−2 | Ring finger protein 10 | 003046 | 2.63 × 10−2 | 5 | yes |
| 003987 | NUP107 | 1.54 × 10−4 | Nucleoporin 107 kDa | 004158 | <0.00 × 10−5 | 1 | yes |
| 006133 | GRK6 | 1.07 × 10−2 | G protein-coupled receptor kinase 6 | 006252 | 1.16 × 10−2 | 2 | A. |
| 007074 | SMC4 | 1.56 × 10−3 | Structural maintenance of chromosomes 4 | 007191 | 1.00 × 10−4 | 1 | yes |
| 015860 | SH3RF1 | 4.55 × 10−3 | SH3 domain containing ring finger 1 | 015968 | 8.90 × 10−3 | 1 | yes |
Summary of the positive selection analysis for candidate genes of lineages of dicroglossid frogs and toad-headed agamas [7] identified across an elevational gradient that likewise tested positive in Asymblepharus (A), using the BUSTED (B; p-value < 0.05) [55], FUBAR (FB; number of sites under positive selection) [56], and aBSREL (aB; number of branches with positive selection) [57,58] methods. Ensembl gene and transcript identifiers (ENSACAG00000, ENSACAT00000) refer to Anolis carolinensis.
| GeneID | Gene | Gene Description | Transcript | B | FB | aB |
|---|---|---|---|---|---|---|
| Lowland | | | | | | |
| 000146 | PCSK9 | Proprotein convertase subtilisin/kexin type 9 | 000163 | 4.40 × 10−3 | 1 | 2 |
| 000201 | NPC1 | NPC intracellular cholesterol transporter 1 | 000261 | 5.10 × 10−3 | 2 | 1 |
| 000264 | LAMP1 | Lysosomal associated membrane protein 1 | 000247 | 2.90 × 10−3 | 8 | 2 |
| 000531 | SEPT12 | Septin 12 | 000602 | 1.29 × 10−2 | 1 | 1 |
| 000768 | SYK | Spleen associated tyrosine kinase | 000802 | <0.00 × 10−5 | 1 | 3 |
| 000798 | WBP4 | WW domain binding protein 4 | 000804 | 2.00 × 10−2 | 3 | 1 |
| 000955 | QSOX1 | Quiescin sulfhydryl oxidase 1 | 000962 | 9.20 × 10−3 | 3 | 1 |
| 001070 | CADM1 | Cell adhesion molecule 1 | 001188 | 1.00 × 10−4 | 1 | 1 |
| 002270 | KANK1 | KN motif and ankyrin repeat domains 1 | 002294 | 8.00 × 10−4 | 2 | 2 |
| 003015 | VLDLR | Very low-density lipoprotein receptor | 003096 | 1.66 × 10−2 | 3 | 1 |
| 003460 | ANKRD12 | Ankyrin repeat domain 12 | 003484 | 1.80 × 10−3 | 1 | 1 |
| 003908 | MDM1 | Mdm1 nuclear protein | 003923 | 2.30 × 10−3 | 2 | 2 |
| 005884 | CCDC66 | Coiled-coil domain containing 66 | 025744 | <0.00 × 10−5 | 1 | 1 |
| 006279 | GLYR1 | Glyoxylate reductase 1 homolog | 006325 | <0.00 × 10−5 | 1 | 3 |
| 007694 | ACOX1 | Acyl-CoA oxidase 1 | 007823 | <0.00 × 10−5 | 8 | 1 |
| 008005 | PHACTR2 | Phosphatase and actin regulator 2 | 008047 | 4.30 × 10−3 | 5 | 1 |
| 008420 | VTA1 | Vesicle trafficking 1 | 008463 | 3.36 × 10−2 | 1 | 1 |
| 009938 | | | 010074 | <0.00 × 10−5 | 6 | 3 |
| 013301 | COL1A2 | Collagen type I α 2 chain | 013614 | 1.10 × 10−3 | 17 | 3 |
| 013917 | MSH2 | MutS homolog 2 | 014076 | 1.11 × 10−2 | 1 | 1 |
| 013984 | LRRCC1 | Leucine-rich repeat and coiled-coil centrosomal protein 1 | 014100 | 2.50 × 10−2 | 1 | 2 |
| 016683 | TENT2 | Terminal nucleotidyltransferase 2 | 016777 | <0.00 × 10−5 | 2 | 2 |
| 017936 | ATP6V0A1 | ATPase H+ transporting V0 subunit a1 | 018008 | 3.50 × 10−3 | 1 | 1 |
| Up to 2000 m | ||||||
| Gene ID | Gene | Gene description | Transcript | B | FB | aB |
| 000608 | FBXL3 | F-box and leucine-rich repeat protein 3 | 000548 | 3.47 × 10−2 | 2 | 1 |
| 000768 | SYK | Spleen associated tyrosine kinase | 000802 | <0.00 × 100 | 1 | 3 |
| 000837 | KATNB1 | Katanin p80 (WD repeat containing) subunit B 1 | 000896 | 4.96 × 10−2 | 1 | 1 |
| 000955 | QSOX1 | Quiescin sulfhydryl oxidase 1 | 000962 | 8.90 × 10−3 | 3 | 1 |
| 002090 | KIAA0232 | KIAA0232 | 002069 | <0.00 × 100 | 5 | 2 |
| 002091 | RANBP2 | RAN binding protein 2 | 002100 | 7.00 × 10−3 | 2 | 1 |
| 002556 | SLC25A1 | Solute carrier family 25 member 1 | 002543 | 5.00 × 10−3 | 2 | 1 |
| 002948 | RSPH1 | Radial spoke head component 1 | 002963 | 1.59 × 10−2 | 2 | 1 |
| 003975 | TOGARAM1 | TOG array regulator of axonemal microtubules 1 | 004004 | <0.00 × 10−5 | 1 | 3 |
| 003987 | NUP107 | Nucleoporin 107 | 004158 | <0.00 × 10−5 | 1 | 2 |
| 004938 | RARS1 | Arginyl-tRNA synthetase 1 | 005003 | <0.00 × 10−5 | 3 | 2 |
| 005569 | CCT4 | Chaperonin containing TCP1 subunit 4 | 005683 | 2.00 × 10−4 | 2 | 1 |
| 006189 | PARP1 | Poly(ADP-ribose) polymerase 1 | 006356 | 4.11 × 10−2 | 4 | 1 |
| 006926 | CTNND1 | Catenin delta 1 | 007006 | 3.00 × 10−4 | 1 | 1 |
| 007100 | ZNF277 | Zinc finger protein 277 | 007173 | 1.17 × 10−2 | 3 | 3 |
| 007489 | PRDX4 | Peroxiredoxin 4 | 007498 | 2.20 × 10−3 | 1 | 1 |
| 007985 | POLR3A | RNA polymerase III subunit A | 008077 | 1.37 × 10−2 | 1 | 1 |
| 008005 | PHACTR2 | Phosphatase and actin regulator 2 | 008047 | 3.90 × 10−3 | 5 | 1 |
| 008475 | SNTA1 | Syntrophin α 1 | 008544 | 5.20 × 10−3 | 5 | 1 |
| 008991 | KEAP1 | Kelch-like ECH-associated protein 1 | 009010 | 2.00 × 10−4 | 3 | 1 |
| 009213 | SWAP70 | Switching B cell complex subunit SWAP70 | 009242 | 8.70 × 10−3 | 1 | 1 |
| 013059 | JPH1 | Junctophilin 1 | 013093 | 1.09 × 10−2 | 4 | 1 |
| 013301 | COL1A2 | Collagen type I α 2 chain | 013614 | 1.10 × 10−3 | 17 | 3 |
| 013326 | ADGRF5 | Adhesion G protein-coupled receptor F5 | 013405 | 4.00 × 10−4 | 1 | 1 |
| 015062 | COL3A1 | Collagen type III α 1 chain | 015539 | 3.00 × 10−2 | 11 | 4 |
| 015374 | RANBP17 | RAN binding protein 17 | 015630 | 7.90 × 10−3 | 3 | 2 |
| 015422 | BAIAP2 | BAR/IMD-domain-containing adaptor protein 2 | 015540 | 4.96 × 10−2 | 2 | 2 |
| 016662 | PGS1 | Phosphatidylglycerophosphate synthase 1 | 016740 | <0.00 × 10−5 | 1 | 1 |
| 017208 | FLOT1 | Flotillin 1 | 017291 | <0.00 × 10−5 | 4 | 2 |
| 017228 | SLC4A1 | Solute carrier family 4 member 1 (Diego blood group) | 017345 | 4.36 × 10−2 | 2 | 1 |
| 017316 | PTBP3 | Polypyrimidine tract binding protein 3 | 017407 | 6.00 × 10−4 | 2 | 3 |
| 018003 | FGFR1 | Fibroblast growth factor receptor 1 | 018080 | 6.00 × 10−4 | 2 | 1 |
| | | | | | | |
| Gene ID | Gene | Gene description | Transcript | B | FB | aB |
| 000306 | ZNF622 | Zinc finger protein 622 | 000291 | 8.50 × 10−3 | 5 | 1 |
| 000907 | MICU1 | Mitochondrial calcium uptake 1 | 000909 | 1.30 × 10−3 | 1 | 1 |
| 001396 | ABCC3 | ATP-binding cassette subfamily C member 3 | 001480 | <0.00 × 10−5 | 1 | 1 |
| 002254 | MIA3 | Melanoma inhibitory activity family member 3 | 002276 | 2.22 × 10−2 | 1 | 2 |
| 002779 | CDH1 | Cadherin 1 | 003031 | 1.50 × 10−3 | 1 | 1 |
| 004137 | STARD13 | StAR-related lipid transfer domain containing 13 | 004235 | 2.00 × 10−4 | 3 | 1 |
| 005084 | COL1A1 | Collagen type I α 1 chain | 005298 | 5.00 × 10−4 | 18 | 2 |
| 006739 | RALGAPB | Ral GTPase-activating protein, β subunit | 006803 | 9.40 × 10−3 | 1 | 1 |
| 006920 | NEO1 | Neogenin 1 | 007074 | <0.00 × 10−5 | 1 | 1 |
| 006926 | CTNND1 | Catenin delta 1 | 007006 | 3.00 × 10−4 | 1 | 1 |
| 007694 | ACOX1 | Acyl-CoA oxidase 1 | 007823 | 0.00 × 100 | 8 | 1 |
| 007907 | FLOT2 | Flotillin 2 | 008015 | 0.00 × 100 | 1 | 1 |
| 008206 | ADGRG6 | Adhesion G protein-coupled receptor G6 | 008405 | 1.21 × 10−2 | 4 | 1 |
| 009366 | PLEKHG3 | Pleckstrin homology and RhoGEF domain containing G3 | 026649 | 1.43 × 10−2 | 1 | 2 |
| 010640 | MYLK | Myosin light chain kinase | 010735 | 3.00 × 10−4 | 2 | 1 |
| 011707 | FYN | FYN proto-oncogene, Src family tyrosine kinase | 011760 | 5.00 × 10−3 | 1 | 1 |
| 013938 | SPEG | SPEG complex locus | 023342 | 3.99 × 10−2 | 2 | 2 |
| 014232 | RBBP5 | RB binding protein 5 | 014319 | <0.00 × 10−5 | 5 | 2 |
| 014373 | CD82 | Tetraspanin | 014458 | 2.92 × 10−2 | 5 | 1 |
| 015062 | COL3A1 | Collagen type III α 1 chain | 015539 | 2.92 × 10−2 | 11 | 4 |
| 015121 | SLC26A4 | Solute carrier family 26 member 4 | 015204 | 3.02 × 10−2 | 2 | 1 |
| 015894 | PNN | Pinin, desmosome associated protein | 015973 | 1.56 × 10−2 | 1 | 1 |
| 016077 | | | 016133 | 1.00 × 10−3 | 4 | 3 |
| 017036 | NIF3L1 | NGG1 interacting factor 3 like 1 | 017110 | 8.50 × 10−3 | 4 | 1 |
| | | | | | | |
| GeneID | Gene | Gene description | Transcript | B | FB | aB |
| 000773 | IL1RAP | Interleukin 1 receptor accessory protein | 000813 | <0.00 × 10−5 | 2 | 5 |
| 001545 | ZCCHC8 | Zinc finger CCHC-type containing 8 | 001573 | 2.78 × 10−2 | 1 | 2 |
| 002034 | CCDC138 | Coiled-coil domain containing 138 | 002022 | 4.00 × 10−4 | 1 | 1 |
| 003612 | NR3C2 | Nuclear receptor subfamily 3 group C member 2 | 003696 | 4.97 × 10−2 | 7 | 1 |
| 004231 | COL6A3 | Collagen type VI α 3 chain | 004512 | <0.00 × 10−5 | 6 | 2 |
| 005058 | RPL11 | Ribosomal protein L11 | 005076 | 1.67 × 10−2 | 1 | 1 |
| 005562 | PFKM | Phosphofructokinase, muscle | 005957 | 4.83 × 10−2 | 3 | 2 |
| 006776 | FUS | FUS RNA-binding protein | 006895 | 6.70 × 10−3 | 1 | 1 |
| 006926 | CTNND1 | Catenin delta 1 | 007006 | 3.00 × 10−4 | 1 | 1 |
| 007250 | ATP11B | ATPase phospholipid transporting 11B (putative) | 007392 | 1.40 × 10−3 | 12 | 1 |
| 007887 | ABHD3 | Abhydrolase domain containing 3, phospholipase | 007896 | 4.00 × 10−3 | 2 | 2 |
| 009164 | NADK2 | NAD kinase 2, mitochondrial | 009211 | 1.18 × 10−2 | 1 | 1 |
| 009700 | | | 009686 | 5.50 × 10−3 | 1 | 1 |
| 009800 | FLNA | Filamin A | 010200 | 0.00 × 100 | 2 | 2 |
| 011843 | SENP7 | SUMO specific peptidase 7 | 011850 | 1.43 × 10−2 | 1 | 1 |
| 013301 | COL1A2 | Collagen type I α 2 chain | 013614 | 1.20 × 10−3 | 17 | 3 |
| 013313 | AGAP2 | ArfGAP with GTPase domain, ankyrin repeat, PH domain 2 | 013455 | 3.40 × 10−3 | 1 | 1 |
| 014695 | RFWD2 | COP1 E3 ubiquitin ligase | 014789 | 1.00 × 10−3 | 2 | 2 |
| 014919 | ALDOA | Aldolase, fructose-bisphosphate A | 014984 | 2.99 × 10−2 | 2 | 1 |
| 015785 | CDK14 | Cyclin-dependent kinase 14 | 015872 | 6.10 × 10−3 | 1 | 1 |
| 016662 | PGS1 | Phosphatidylglycerophosphate synthase 1 | 016740 | <0.00 × 10−5 | 1 | 1 |
| 017054 | NFIX | Nuclear factor I X | 017136 | 9.50 × 10−3 | 1 | 1 |
| 017166 | FARSA | Phenylalanyl-tRNA synthetase subunit α | 017239 | 2.58 × 10−2 | 1 | 2 |
| 017936 | ATP6V0A1 | ATPase H+ transporting V0 subunit a1 | 018008 | 3.50 × 10−3 | 1 | 1 |
| | | | | | | |
| Gene ID | Gene | Gene description | Transcript | B | FB | aB |
| 016662 | PGS1 | Phosphatidylglycerophosphate synthase 1 | 016740 | <0.00 × 10−5 | 1 | 1 |
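
The B column gives p-values in a flattened notation in which the exponent follows "10" directly (for example, "3.58 × 10−2"). For readers who want to filter or sort these values, the sketch below converts such strings to floats. It assumes that the characters after "10" are the exponent (so "10−2" means 10 to the power −2 and "100" means 10 to the power 0) and that a leading "<" marks a value reported only as an upper bound; the function name `parse_flat_sci` is hypothetical.

```python
import re

# Optional "<", a decimal mantissa, the multiplication sign, then "10"
# followed directly by the exponent (ASCII hyphen or Unicode minus allowed).
_FLAT_SCI = re.compile(r"^(<)?\s*([0-9]+(?:\.[0-9]+)?)\s*×\s*10([−-]?[0-9]+)$")


def parse_flat_sci(text):
    """Parse a string such as '3.58 × 10−2' into (value, is_upper_bound).

    Assumption: the characters after '10' are the exponent, because the
    superscript formatting is not preserved in the table text.
    """
    match = _FLAT_SCI.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized value: {text!r}")
    is_upper_bound = match.group(1) is not None
    mantissa = float(match.group(2))
    exponent = int(match.group(3).replace("−", "-"))  # normalize Unicode minus
    return mantissa * 10.0 ** exponent, is_upper_bound


# Strings copied from the B column of the tables above.
value_a, bound_a = parse_flat_sci("3.58 × 10−2")
value_b, bound_b = parse_flat_sci("<0.00 × 10−5")
value_c, bound_c = parse_flat_sci("0.00 × 100")
```

Returning the upper-bound flag separately keeps entries such as "<0.00 × 10−5" distinguishable from an exact zero when the parsed values are compared against a significance threshold.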