Dwin G B Grashof, Harald M I Kerkkamp, Sandra Afonso, John Archer, D James Harris, Michael K Richardson, Freek J Vonk, Arie van der Meijden.
Abstract
BACKGROUND: Venom has evolved in parallel in multiple animals for the purpose of self-defense, prey capture or both. These venoms typically consist of highly complex mixtures of toxins: diverse bioactive peptides and/or proteins each with a specific pharmacological activity. Because of their specificity, they can be used as experimental tools to study cell mechanisms and develop novel medicines and drugs. It is therefore potentially valuable to explore the venoms of various animals to characterize their toxins and identify novel toxin-families. This study focuses on the annotation and exploration of the transcriptomes of six scorpion species from three different families. The transcriptomes were annotated with a custom-built automated pipeline, primarily consisting of Basic Local Alignment Search Tool searches against UniProt databases and filter steps based on transcript coverage.
Keywords: Scorpion; Transcriptome; Venom
Year: 2019 PMID: 31409288 PMCID: PMC6693263 DOI: 10.1186/s12864-019-6013-6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Phylogenetic position of the species used in this study, indicated in bold font. Other species mentioned in the manuscript are indicated with an asterisk. Phylogeny and taxonomy largely based on Sharma et al. [16] and Santibáñez-López [34]. Some taxa, such as Nebo hierichonticus, were placed in the tree based only on taxonomic affiliation
Assembly statistics and coverage cutoff statistics of the chela and telson transcriptomes per scorpion species
| NCBI accession number | # of transcripts | Average coverage | # of transcripts after coverage cutoff | Average coverage after coverage cutoff |
|---|---|---|---|---|
| SAMN12385121 | 92,307 | 33.5 | 28,563 (30%) | 103.4 |
| SAMN12385122 | 66,949 | 25.5 | 20,970 (31%) | 76.6 |
| SAMN12385123 | 49,557 | 66.5 | 17,357 (35%) | 185.8 |
| SAMN12385124 | 65,083 | 29.7 | 19,125 (30%) | 96.2 |
| SAMN12385125 | 58,014 | 66.5 | 18,414 (32%) | 205.0 |
| SAMN12385126 | 46,313 | 39.0 | 15,863 (34%) | 109.8 |
| SAMN12385127 | 70,182 | 95.8 | 37,545 (54%) | 176.2 |
| SAMN12385128 | 50,776 | 69.7 | 28,630 (56%) | 121.1 |
| SAMN12385129 | 64,672 | 97.1 | 35,220 (54%) | 175.6 |
| SAMN12385130 | 46,614 | 74.3 | 25,872 (56%) | 131.2 |
| SAMN12385131 | 68,269 | 29.0 | 21,201 (31%) | 88.6 |
| SAMN12385132 | 55,549 | 30.9 | 16,621 (30%) | 98.2 |
Toxin expression levels in the telson transcriptomes of the six scorpion species after the coverage cutoff of 5 and the orthologue cutoff, together with the expression levels of transcripts labelled as “physiological”, “toxin” or “unidentified” by the bioinformatics pipeline described in the method section
| # of transcripts after cutoffs | # of “physiological” labelled transcripts | # of “toxin” labelled transcripts | # of “unidentified” labelled transcripts |
|---|---|---|---|
| 20,048 | 5,100 | 247 | 14,701 |
| 8,477 | 3,240 | 134 | 5,103 |
| 10,937 | 3,733 | 179 | 7,025 |
| 26,005 | 5,178 | 317 | 20,510 |
| 22,848 | 4,516 | 130 | 18,202 |
| 12,857 | 3,750 | 79 | 9,028 |
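The filtering and tallying behind a table like this can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record layout, the function name `filter_and_count`, and the inclusive comparison (`>=`) are assumptions made for illustration, and the orthologue cutoff mentioned in the caption is not modelled. Only the cutoff value of 5 comes from the caption.

```python
from collections import Counter

# Cutoff value taken from the table caption; treating it as inclusive (>=) is an assumption.
COVERAGE_CUTOFF = 5


def filter_and_count(transcripts, cutoff=COVERAGE_CUTOFF):
    """Keep transcripts at or above the coverage cutoff and tally their labels."""
    kept = [t for t in transcripts if t["coverage"] >= cutoff]
    counts = Counter(t["label"] for t in kept)
    return kept, counts


# Hypothetical toy records, not data from the study.
toy = [
    {"id": "t1", "coverage": 120.0, "label": "toxin"},
    {"id": "t2", "coverage": 3.2, "label": "unidentified"},
    {"id": "t3", "coverage": 48.5, "label": "physiological"},
    {"id": "t4", "coverage": 5.0, "label": "unidentified"},
]

kept, counts = filter_and_count(toy)
print(len(kept), dict(counts))
```

Because every retained transcript carries exactly one label, the three label counts in each row of the table should add up to the number of transcripts after cutoffs, which is the case for all six rows.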
Fig. 2 Coverage of the transcripts in the transcriptomes grouped by their label, per scorpion species. The coverage is an indicator of the expression in the telson tissue
Composition of the transcripts labelled as toxin, shown per scorpion species
| Toxin family | | | | | | |
|---|---|---|---|---|---|---|
| α-NaTx | 25 | 9 | 8 | 25 | 2 | 1 |
| β-NaTx | 17 | 4 | 14 | 27 | 4 | 4 |
| α-KTX | 20 | 11 | 8 | 24 | 2 | 2 |
| β-KTX | 5 | 2 | 2 | 8 | 3 | 3 |
| γ-KTX | 10 | 7 | 5 | 19 | 2 | 2 |
| κ-KTX | 0 | 0 | 0 | 1 | 1 | 1 |
| Chlorotoxin | 2 | 2 | 4 | 7 | 2 | 2 |
| CaTx | 2 | 3 | 0 | 7 | 2 | 0 |
| Kunitz-type | 9 | 11 | 3 | 15 | 6 | 1 |
| M-theraphotoxin | 3 | 6 | 4 | 4 | 1 | 3 |
| Bradykinin potentiating peptide (BPP) | 3 | 1 | 1 | 6 | 10 | 8 |
| BmKa2-like | 30 | 6 | 14 | 33 | 3 | 1 |
| Phospholipase A2 (PLA2) | 10 | 7 | 3 | 6 | 18 | 5 |
| Other toxins | 111 | 65 | 113 | 135 | 74 | 46 |
| Total toxins | 247 | 134 | 179 | 317 | 130 | 79 |
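As a quick consistency check, the per-family counts can be summed per column and compared with the "Total toxins" row. The short Python sketch below does this with the numbers copied from the table. The variable names are illustrative, and the columns are simply the six species in table order.

```python
# Counts per toxin family, copied from the table above.
# Each list holds the six species columns in table order.
rows = {
    "α-NaTx": [25, 9, 8, 25, 2, 1],
    "β-NaTx": [17, 4, 14, 27, 4, 4],
    "α-KTX": [20, 11, 8, 24, 2, 2],
    "β-KTX": [5, 2, 2, 8, 3, 3],
    "γ-KTX": [10, 7, 5, 19, 2, 2],
    "κ-KTX": [0, 0, 0, 1, 1, 1],
    "Chlorotoxin": [2, 2, 4, 7, 2, 2],
    "CaTx": [2, 3, 0, 7, 2, 0],
    "Kunitz-type": [9, 11, 3, 15, 6, 1],
    "M-theraphotoxin": [3, 6, 4, 4, 1, 3],
    "BPP": [3, 1, 1, 6, 10, 8],
    "BmKa2-like": [30, 6, 14, 33, 3, 1],
    "PLA2": [10, 7, 3, 6, 18, 5],
    "Other toxins": [111, 65, 113, 135, 74, 46],
}

# "Total toxins" row as reported in the table.
totals_reported = [247, 134, 179, 317, 130, 79]

# Sum each species column across all toxin families.
column_sums = [sum(column) for column in zip(*rows.values())]

assert column_sums == totals_reported
print(column_sums)
```

The totals also match the "toxin" labelled transcript counts reported per species in the preceding table.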
Fig. 3 Toxin-family coverage based expression of six scorpion species. The coverage is shown as a percentage of the total expression for that scorpion
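The percentage shown in such a figure can be illustrated as the summed coverage of a toxin family divided by the summed coverage of all considered transcripts for one species, times 100. The Python sketch below works under that assumption. Whether the denominator in the study covers only toxin transcripts or the whole transcriptome is not stated here, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def family_expression_percent(records):
    """Return {family: percent of summed coverage} for a single species.

    `records` is an iterable of (family, coverage) pairs. The denominator is
    the summed coverage of all records passed in (an assumption).
    """
    totals = defaultdict(float)
    for family, coverage in records:
        totals[family] += coverage
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Avoid division by zero when no coverage was recorded.
        return {family: 0.0 for family in totals}
    return {family: 100.0 * cov / grand_total for family, cov in totals.items()}


# Hypothetical (family, coverage) pairs, not values from the study.
toy = [("α-NaTx", 300.0), ("α-KTX", 100.0), ("α-NaTx", 100.0), ("PLA2", 500.0)]
shares = family_expression_percent(toy)
print(shares)
```

Normalising per species in this way makes the family profiles comparable between species even when their overall sequencing depth differs.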
Additional information about the identification of the clusters done with BLASTp searches against NCBI’s non-redundant database
| Cluster name | Best BLASTp hit | Signal peptide | C pattern | Conserved residues | New cluster/singlet label |
|---|---|---|---|---|---|
| Cluster 1 | Lambda-potassium channel toxin (ADT64271.1) | Some | Yesᵃ | High | New toxins in the lambda-potassium channel toxin-family |
| Cluster 2 | Hypothetical secreted protein (ADY39531.1) | High | Yesᵃ | High | Novel putative toxin-family 1 |
| Cluster 3 | U6-buthitoxin-Hj1a (ADY39519.1) | High | Yesᵃ | High | New toxins in the buthitoxin family |
| Cluster 4 | Orphan peptide AbOp-11 (AIX87714.1) | High | N.A. | High | Novel putative secreted proteins |
| Cluster 5a | Hypothetical secreted protein (ADY39514.1) | High | Yesᵃ | High | Novel putative toxin-family 2 |
| Cluster 6 | Venom peptide HtC4Tx1 (AOF40173.1) | Low | Yesᵃ | Low | Novel putative toxin-family 3 |
| Cluster 7 | Hypothetical protein (WP_063562212.1) | Low | Noᵃ | Low | Novel putative toxin-family 4 |
| Cluster 8 | Orphan peptide AbOp-18 (AIX87708.1) | High | N.A. | High | New toxins in the neuropeptide toxin-families |
| Cluster 9 | Venom toxin meuTx23 (AMX81480.1) | Low | N.A. | High | New toxins related to meuTx23 |
| Cluster 10 | Hypothetical secreted protein (ADY39511.1) | High | N.A. | Low | Novel putative secreted proteins |
| Cluster 11 | RNA-binding protein, putative (SCO66159.1) | Low | N.A. | Some | Novel putative secreted proteins |
| Cluster 12 | Uncharacterized protein (XP_023221782.1) | High | Yesᵃ | High | Novel putative toxin-family 5 |
| Cluster 13 | Potassium channel toxin alpha-KTx 4.5 (Q5G8B6.1) | Low | Yesᵃ | High | New toxins in the potassium channel toxin alpha-KTx 4.5 toxin family |
| Cluster 14 | Hypothetical protein (AEX09189.1) | High | N.A. | Some | Novel putative secreted proteins |
| Cluster 15 | Hypothetical protein (GAU10035.1) | Low | Yesᵃ | Some | Novel putative short toxin family 6 |
| Singlet 1 | No Hit | N.A. | N.A. | N.A. | Novel putative secreted protein |
| Singlet 2 | No Hit | N.A. | N.A. | N.A. | Novel putative secreted protein |
| Singlet 3 | Hypothetical protein (AEX09189.1) | High | N.A. | Some | Novel putative secreted protein |
| Singlet 4 | Orphan peptide AbOp-11 (AIX87714.1) | High | N.A. | Low | Novel putative secreted protein |
| Singlet 5 | Potassium channel toxin kappa-KTx (P0DJ41.1) | Some | Yesᵃ | Low | New potassium channel toxin |
| Singlet 6 | Venom peptide Htgkr2 (AOF40260.1) | Some | N.A. | Low | Novel putative secreted protein |
| Singlet 7 | No Hit | N.A. | N.A.ᵃ | N.A. | Novel putative toxin 1 |
| Singlet 8 | SH3 domain and tetratricopeptide repeat-containing protein (XP_004574858.2) | Low | N.A. | Low | Novel putative secreted protein |
| Singlet 9 | Putative antimicrobial peptide (AEX09192.1) | High | N.A. | High | Novel putative AMP |

ᵃ Indicates clusters or singlets with a conserved C pattern