Nicolas Langenegger, Wolfgang Nentwig, Lucia Kuhn-Nentwig.
Abstract
This review gives an overview of the development of research on spider venoms, with a focus on the structure and function of venom components and on techniques of analysis. The major venom component groups are small molecular mass compounds, antimicrobial (also called cytolytic or cationic) peptides (only in some spider families), cysteine-rich (neurotoxic) peptides, and enzymes and proteins. Cysteine-rich peptides are reviewed with respect to various structural motifs, their targets (ion channels, membrane receptors), nomenclature, and molecular binding. We further describe the latest findings concerning the maturation of antimicrobial and cysteine-rich peptides, which in most known cases are expressed as propeptide-containing precursors. Today, venom research increasingly employs transcriptomic and mass spectrometric techniques. Pros and cons of venom gland transcriptome analysis with Sanger, 454, and Illumina sequencing are discussed, and an overview of the transcriptome studies published so far is given. In this respect, we also discuss the only recently described cross-contamination arising from multiplexing in Illumina sequencing and its possible impacts on venom studies. High-throughput mass spectrometric analyses of venom proteomes (bottom-up, top-down) are reviewed.
Keywords: Araneae; bioinformatics; mass spectrometry; neurotoxins; proteomics; spiders; transcriptomics; venomics
Year: 2019 PMID: 31652611 PMCID: PMC6832493 DOI: 10.3390/toxins11100611
Source DB: PubMed Journal: Toxins (Basel) ISSN: 2072-6651 Impact factor: 4.546
Figure 1. Antimicrobial peptides (AMPs) and their proposed mechanism of action. (A) Model of interaction between an AMP and phospholipids. AMPs assume an amphipathic α-helical structure in proximity to cellular membranes. The hydrophobic side of the helix (white) inserts into the membrane and interacts with the phospholipid side chains. The positively charged side (blue) interacts with negatively charged lipid head groups. (B) Models of membranolytic actions of AMPs. (C) NMR-based 3D structures of two antimicrobial peptides from spider venom. Electrostatics were computed using PDB2PQR [70]. Blue surfaces are positively charged, red surfaces negatively charged, and white surfaces neutral.
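The amphipathic character described in the caption above can be estimated from sequence alone. The following is a minimal sketch, not a method taken from the review: it assumes an ideal α-helix with 100° of rotation per residue, uses the Kyte-Doolittle hydropathy scale (any other scale could be substituted), and counts charge crudely from Lys/Arg and Asp/Glu only. The function names and the test sequence are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumption: chosen for illustration only).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, angle_deg=100.0):
    """Mean hydrophobic moment per residue, assuming an ideal helix.

    Each residue contributes a vector of length KD[aa] pointing in the
    direction i * angle_deg around the helix axis; the moment is the
    length of the vector sum divided by the number of residues.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(KD[aa] * math.sin(i * delta) for i, aa in enumerate(seq))
    cos_sum = sum(KD[aa] * math.cos(i * delta) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)


def net_charge(seq):
    """Crude net charge at neutral pH: +1 per K/R, -1 per D/E.

    Histidine and the termini are ignored (simplifying assumption).
    """
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


if __name__ == "__main__":
    # Hypothetical designed sequence, not a real venom peptide:
    # Leu on one face of the helix, Lys on the opposite face.
    designed = "LKKLLKKLLKLLKKLLKK"
    print(hydrophobic_moment(designed), net_charge(designed))
```

A high moment combined with a positive net charge is what the model in panel (A) would lead one to expect for a cationic, membrane-active helix; the same residues arranged in two blocks give a much lower moment.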
Figure 2. Structural motifs of cysteine-rich spider venom peptides. Disulphide-connected cysteines are numbered and disulphide bridges are shown in yellow. A linear schematic representation of the disulphide bridge pattern is shown below the corresponding 3D structure. (A) Inhibitor cystine knot motif. The disulphide bridge colored in orange penetrates the ring opened by the peptide backbone and the two other disulphide bridges. (B) Disulphide-directed β-hairpin motif. The disulphide bridge colored in blue is optional for this motif. (C) Kunitz-type motif. (D) Colipase or MIT1-like motif. (E) Helical arthropod-neuropeptide-derived motif.
Activities and nomenclature [119] of main spider venom toxins.
| Prefix | Target | Action | Example |
|---|---|---|---|
| ω (omega) | CaV channels | Inhibits CaV channels | omega-agatoxin-1a, P15969 |
| κ (kappa) | KV channels | Inhibits KV channels | kappa-hexatoxin-Hv1c, P82228 |
| β (beta) | NaV channels | Shifts voltage dependence of NaV channel activation | beta-hexatoxin-Mg1a, P83561 |
| δ (delta) | NaV channels | Delays inactivation of NaV channels | delta-miturgitoxin-Cp1b, C0HKG8 |
| µ (mu) | NaV channels | Inhibits NaV channels | mu-diguetoxin-Dc1a, P49126 |
| M | membrane | Membranolytic activity | M-ctenitoxin-Cs1a, P83619 |
| U | unknown | Unknown activity | U2-ctenitoxin-Cs1a, P83919 |
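The prefix scheme in the table above lends itself to a small lookup. The sketch below is illustrative only: it assumes names are written with the Greek letter spelled out and separated by hyphens, as in the Example column, and that a trailing number on the prefix (as in `U2`) is an index that can be stripped. The function name `describe_toxin` is hypothetical.

```python
# Activity prefixes copied from the table above (Greek letters spelled out).
PREFIX_ACTIVITY = {
    "omega": ("CaV channels", "Inhibits CaV channels"),
    "kappa": ("KV channels", "Inhibits KV channels"),
    "beta": ("NaV channels", "Shifts voltage dependence of NaV channel activation"),
    "delta": ("NaV channels", "Delays inactivation of NaV channels"),
    "mu": ("NaV channels", "Inhibits NaV channels"),
    "M": ("membrane", "Membranolytic activity"),
    "U": ("unknown", "Unknown activity"),
}


def describe_toxin(name):
    """Split a toxin name at the first hyphen and look up its activity prefix.

    The lookup is case-sensitive so that "M" (membranolytic) and "mu"
    (NaV inhibition) stay distinct.
    """
    prefix, _, remainder = name.partition("-")
    key = prefix.rstrip("0123456789")  # "U2" -> "U" (assumed index suffix)
    if key not in PREFIX_ACTIVITY:
        raise ValueError(f"unrecognized activity prefix: {prefix!r}")
    target, action = PREFIX_ACTIVITY[key]
    return {"prefix": key, "target": target, "action": action, "remainder": remainder}


if __name__ == "__main__":
    for example in ("kappa-hexatoxin-Hv1c", "U2-ctenitoxin-Cs1a"):
        print(example, describe_toxin(example))
```

Only the seven prefixes listed in the table are covered; any other prefix raises an error rather than being guessed.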
Figure 3. Binding of neurotoxins to a voltage-gated ion channel. Peptides are shown space-filling, ion channels as cartoons with every domain in a different color. (A) Hydrophobic patch with surrounding ring of charged amino acids. Hydrophobic amino acids are shown in green, negatively charged in blue, and positively charged in red. (B) Four ProTx-2 toxins bound to the human NaV1.7 channel in side view. Toxin surface represents surface charge. Eye icons and arrows indicate the viewing angles in panels B1 and B2. (B1) View of one voltage-sensing domain of the channel bound to the toxin, from the inner side of the membrane outwards. (B2) View of the toxin-channel complex from the top. Helices of the voltage-sensing domain are labeled S1–S4.
Figure 4. 3D structures and schematic structures of modular toxins (multi-domain toxins). Antimicrobial domains are highlighted in blue, disulphide bridges in yellow. (A) Neurotoxin-AMP. (B) AMP-neurotoxin. The structure of spiderine-1a is assembled from two experimental 3D structures of the N- and C-terminal toxin parts, respectively. (C) Neurotoxin-neurotoxin. (D) AMP-AMP.
Figure 5. Maturation of venom peptides. (A) Stepwise schematic representation of the general maturation process, using CsTx-13 as an example (see panel D). (1) The Processing Quadruplet Motif (PQM) protease cleaves C-terminally of the Arg residue of the PQM or inverted Processing Quadruplet Motif (iPQM). (2) A so far uncharacterized carboxypeptidase subsequently removes the C-terminally exposed Arg, if present, and (3) if a C-terminally exposed Gly is present, it is eliminated with formation of a C-terminal amide. (B–E) Schematic representation of selected precursor sequences (top) and the corresponding mature sequences of neurotoxins. Maturation steps are indicated by arrows and triangles (see legend in the top right corner). (B) Precursor of a monomeric neurotoxin. (C,D) Precursors of heterodimeric neurotoxins. (E) N-terminal segment of a complex AMP precursor. Amino acids with positive side-chain charges at pH 7 are shown in blue, those with negative charges at pH 7 in red. All precursors shown here also comprise signal peptides, which are omitted for reasons of space.
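The three steps in panel (A) can be written as a short string-processing sketch. This is a simplified illustration under stated assumptions, not the authors' procedure: the caption does not define the PQM/iPQM sequence pattern, so the cleavage positions are supplied by the caller instead of being detected; steps 2 and 3 are applied to every fragment; and the function names and the toy precursor are hypothetical.

```python
def trim_c_terminus(fragment):
    """Apply steps 2 and 3 to one fragment.

    Returns (sequence, amidated), where amidated is True when a
    C-terminal Gly was consumed to form a C-terminal amide.
    """
    seq = fragment
    if seq.endswith("R"):  # step 2: carboxypeptidase removes the exposed Arg
        seq = seq[:-1]
    amidated = False
    if seq.endswith("G"):  # step 3: exposed Gly is eliminated, leaving an amide
        seq = seq[:-1]
        amidated = True
    return seq, amidated


def mature(precursor, cut_after):
    """Step 1 followed by steps 2 and 3.

    cut_after holds 1-based positions of the Arg residues after which the
    PQM protease is assumed to cut (caller-supplied, see assumptions above).
    """
    for pos in cut_after:
        if precursor[pos - 1] != "R":
            raise ValueError(f"position {pos} is not Arg")
    bounds = [0] + sorted(cut_after) + [len(precursor)]
    fragments = [precursor[a:b] for a, b in zip(bounds, bounds[1:])]
    return [trim_c_terminus(f) for f in fragments if f]


if __name__ == "__main__":
    # Toy precursor, not a real toxin sequence: cuts after Arg4 and Arg9.
    print(mature("SEERACKGRLLW", [4, 9]))
```

In the toy example the middle fragment loses its terminal Arg, then its Gly, and is reported as amidated, which mirrors the order of steps 2 and 3 in the caption.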
Overview of selected transcriptomic studies of spider venom glands.
| Spider Species | Sex/Number of Specimens/Time Since Last Milking | Sequencing Method | Total EST/Contigs | Isolation of Sequences Based on | Identified Toxins/Transcripts | Ref. |
|---|---|---|---|---|---|---|
| | | Illumina HiSeq 4000 (2 × 100 bp) 2 replicates | NA, aligned to genome assembly | differential expression analysis | 1318 upregulated transcripts | [ |
| | | Sanger (1305 clones) | 106 contigs, 189 singletons | Basic Local Alignment Search Tool (BLAST) (UniProtKB) non-hits manual | 63 transcripts of cys-rich peptides | [ |
| | | 454 GS-FLX | 34,107 contigs | BLAST (UniProtKB), signal peptide, HMM, cys-pattern | 81 transcripts of cys-rich peptides, 56 mature peptides | [ |
| | NA/6/4 d | Sanger (1299 clones) | 752 ESTs, 61 contigs, 196 singletons | BLAST (UniProtKB, nrNCBI), signal peptide | 257 transcripts, 99 mature peptides | [ |
| | | Illumina (2 × 50 bp) HiSeq 2500 | NA, 9177 dimorphic, 1404 non-dimorphic | | | [ |
| | | Illumina HiSeq | 75,980 contigs | BLAST (nrNCBI, ntNCBI, SwissProt), domain prediction, GO, cys pattern | 48 potential peptide toxins | [ |
| | | Illumina (2 × 151 bp) HiSeq 1500 Sanger (1476 electrograms) | 49,992 contigs | BLAST (UniProtKB, TSA), domain prediction, GO, expression level BLAST (UniProtKB), ORF, signal peptide, propeptide | 99 or 98 cys-rich peptide toxins | [ |
| | NA/9 ad./ | Ion Torrent | 94,148 | private HMMs based on ArachnoServer sequences, read count threshold | 37 toxins | [ |
| | | Illumina (2 × 101 bp) HiSeq 2000 | 57,181 contigs | BLAST (Toxprot, UniProtKB), cys count ≥ 5, domain prediction | 201 potential toxins | [ |
| | | Sanger (500 clones) | NA | NA | 51 toxin-like peptides | [ |
| | NA/NA/2 d | 454 GS-FLX | 65,432 | BLAST (EST NCBI, nrNCBI), ORF, cys count ≥ 4, length ≥ 45 amino acids | 1136 | [ |
| | | Sanger (500 clones) | 267 ESTs, | BLAST (nrNCBI, UniProtKB) | 127 putative toxin precursors, 90 mature peptides | [ |
| | | Sanger (1717 clones) | 307 ESTs, | BLAST (ArachnoServer, NCBI, private dbs) | 19 putative peptide transcripts | [ |
| | NA/3/2 d | 454 GS FLX Titanium | 4224 contigs | BLAST (UniProtKB, ToxRelDB, Repbase), | 626 toxin precursors, 90 mature peptides | [ |
| | | Illumina (2 × 100 bp) | 85,193 contigs | BLAST (UniProtKB), ORF prediction, differential expression analysis | 695 venom gland specific transcripts | [ |
| | NA/several/7 d | Sanger (5952 clones) | 451 contigs | cys pattern search, signal | 451 transcripts, 163 mature peptides | [ |
| | NA/2/NA | 454 GS-FLX | 136,469 six-frame translated | ORF prediction, BLAST (ArachnoServer) signal peptide, propeptide, cys count | 970 mature (likely to be an overestimate) | [ |
| | | 454 GS FLX Titanium | 4711 contigs | BLAST UniProtKB | 46 full-length toxin precursors | [ |
| | | Sanger | 356 ESTs, | BLAST (nrNCBI, UniProtKB) signal peptide, SpiderP | 53 or 55 cys-knot toxin precursors, 48 mature peptides | [ |
| | NA/3 ad./NA | Illumina (2 × 90 bp) HiSeq 2000 Sanger | 34,334 contigs | ORF prediction, BLAST (UniProtKB), Cys-pattern, domain prediction (SMART/Pfam) | 146 toxin-like proteins | [ |
| | | Sanger | 886 ESTs | ≥4 cys, signal peptide | 200 toxin-like precursors | [ |
| | NA/30/NA | Sanger (1500 clones) | 869 ESTs | BLAST | 48 peptides | [ |
| | NA/20/4 d | Sanger | 833 ESTs | BLAST (nrNCBI, UniProtKB) | 223 toxin-like transcripts | [ |
| | | Sanger (1049 clones) | NA | BLAST, pairing with proteomic data from N-term | 88 peptide toxins | [ |
| | | Sanger (2400 clones) | 1843 ESTs, | BLAST(GenBank) | 88 contigs, 80 singletons (toxin sequences) | [ |
| | NA/1/2 d | Sanger (282 clones) | 236 ESTs, | BLAST (GenBank, ArachnoServer) | 11 toxin-like, 3 putative toxin transcripts | [ |
| | NA/20/4 d | Sanger | 468 ESTs, | BLAST (nrNCBI, UniProtKB) | 31 mature peptides | [ |
| | | Sanger | 3008 ESTs, | BLAST, domain prediction (SMART/Pfam), signal peptide | 93 clusters of known toxins, 117 clusters of possible toxins | [ |
| | NA/NA/NA | Sanger (2166 clones) | 37 contigs, | BLAST (GenBank, peptide sequence databases) | 48 toxin-like structures | [ |
| | NA/2 glands/NA | Sanger (300 clones) | NA | NA | 10 multi-cys peptides | [ |
* Studies include proteomic experiments; ** Studies include further tissues besides venom glands. The column “Isolation of Sequences Based on” describes the methods applied to retrieve toxin sequences from the contig database. Databases used for BLAST are given in brackets. Data not available from the cited reference are designated as NA (not available). The following abbreviations are used: TSA = transcriptome shotgun assembly protein database, GO = gene ontology annotation, ORF = open reading frame prediction, HMMs = Hidden Markov Models, ad. = adult, d = days, h = hour.
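Several studies in the table above isolate candidates with simple sequence criteria such as "cys count ≥ 4, length ≥ 45 amino acids". A minimal sketch of such a filter is shown here; it covers only those two criteria, with defaults taken from that table row, and leaves out BLAST, signal peptide prediction, and the other listed methods. The function names and the toy sequences are hypothetical.

```python
def is_toxin_candidate(protein, min_cys=4, min_length=45):
    """True if a translated ORF meets the length and cysteine-count thresholds."""
    return len(protein) >= min_length and protein.count("C") >= min_cys


def filter_candidates(orfs, min_cys=4, min_length=45):
    """Keep only the ORFs (name -> protein sequence) that pass both thresholds."""
    return {
        name: seq
        for name, seq in orfs.items()
        if is_toxin_candidate(seq, min_cys=min_cys, min_length=min_length)
    }


if __name__ == "__main__":
    # Toy sequences for illustration only, not real transcripts.
    orfs = {
        "orf_a": "C" * 4 + "A" * 41,   # 45 residues, 4 Cys: kept
        "orf_b": "M" + "A" * 50,       # long enough, no Cys: dropped
        "orf_c": "C" * 4 + "A" * 40,   # 44 residues: dropped
    }
    print(sorted(filter_candidates(orfs)))
```

Raising `min_cys` to 5 reproduces the stricter "cys count ≥ 5" criterion listed for another study in the same table.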
Overview of selected high-throughput mass spectrometric studies of spider venom.
| Spider Species | Top-Down | Bottom-Up | Venom Pre-Fractionation | Digest | Post-Fractionation | Data Analysis Type | Database | Identified Toxins | Reference |
|---|---|---|---|---|---|---|---|---|---|
| X | SDS-PAGE | in-gel trypsin | online RP-HPLC | db-dependent | predicted transcripts from genome assembly | ≥99 distinct proteins | [ | ||
| X | SEC, desalting | trypsin, chymotrypsin | online RP-HPLC | db-dependent | UniProtKB Araneae sequences + identified transcripts | 49 peptides, | [ | ||
| X | SEC, desalting, online RP-HPLC | - | - | db-dependent | |||||
| X | SDS-PAGE | in-gel trypsin | NA, (probably online RP-HPLC) | db-dependent | translated transcriptome sequences + NCBI chelicerate sequences | 62 distinct proteins in 31 clusters | [ | ||
| X | - | trypsin | MudPIT | db-dependent | predicted proteins from transcriptome | 29 cys-rich peptide toxins | [ | ||
| X | - | trypsin | online RP-HPLC | db-dependent | UniProtKB Arthropoda sequences | Ontogenetic analysis, depending on group: 12, 11, 8, or 8 | [ | ||
| X | SDS-PAGE | in-gel trypsin | online RP-HPLC | ||||||
| X | RP-HPLC | Lys-C/trypsin | online RP-HPLC | db-dependent | some with sequences from UniProt, NCBI or the ArachnoServer | 16 peptides | [ | ||
| X | 10 kDa cutoff spin-column | trypsin, pepsin, chymotrypsin | online RP-HPLC/ | de-novo, spectral-network algorithms | 190 proteins | [ | |||
| X | solid phase extraction | NA | online RP-HPLC | db-dependent | assembled contigs from transcriptome sequencing | 103 | [ | ||
| X | SDS-PAGE | in-gel trypsin | - | db-dependent | plectreurid transcriptome, private haplogyne vg cDNA libraries, chelicerate seq. in NCBI | 7 astacin-like groups, | [ | ||
| X | NA | trypsin | MudPIT | db-dependent | 61 | [ | |||
| X | NA | trypsin | online RP-HPLC | db-dependent | translated cDNA library | 45 | [ | ||
| X | online RP-HPLC | - | - | ||||||
| X | 2D gel electrophoresis (106 spots) | in-gel trypsin | online RP-HPLC | db-dependent, de-novo | NCBInr with animal species restriction, de-novo sequences matched against transcriptome | db-dependent (functional analysis): 65 de-novo: 130 | [ | ||
| X | - | trypsin | online RP-HPLC | ||||||
| X | SDS-PAGE | in-gel trypsin | online RP-HPLC | db-dependent, | NCBInr with animal species restriction | 75 venom proteins | [ | ||
| X | - | trypsin | online RP-HPLC | ||||||
| X | SEC, > 10 kDa: 2D gel electrophoresis, | in-gel trypsin | online RP-HPLC, - | db-dependent manually de-novo | raw genome data of the arthropod; de-novo: | 47 from in-gel digestion | [ | ||
Data not available from the cited reference are designated as NA (not available). The following abbreviations are used: SDS-PAGE = sodium dodecyl sulphate polyacrylamide gel electrophoresis, RP-HPLC = reverse phase high performance liquid chromatography, SEC = size exclusion chromatography, db = database, MudPIT = multi-dimensional protein identification technology, vg = venom gland. * Studies include transcriptomics experiments.
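The combination of a trypsin digest with database-dependent analysis, which recurs in the table above, can be illustrated with an in silico digest. The sketch below is a strong simplification under stated assumptions: it uses the common rule that trypsin cuts after Lys or Arg unless the next residue is Pro, allows no missed cleavages, and matches peptides as exact strings, whereas real search engines compare measured spectra against theoretical fragment masses. The function names and the toy database are hypothetical.

```python
def tryptic_peptides(protein, min_length=6):
    """In silico trypsin digest: cut after K or R, but not before P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):  # trailing peptide without a C-terminal K/R
        peptides.append(protein[start:])
    return [p for p in peptides if len(p) >= min_length]


def identify(observed, database, min_length=6):
    """Map each database entry to the observed peptides it can explain.

    observed: iterable of peptide strings; database: name -> protein sequence.
    """
    observed_set = set(observed)
    hits = {}
    for name, sequence in database.items():
        matched = observed_set & set(tryptic_peptides(sequence, min_length))
        if matched:
            hits[name] = sorted(matched)
    return hits


if __name__ == "__main__":
    # Toy database for illustration only, not real toxin sequences.
    database = {"tox_a": "AAAAAKCCCCCCR", "tox_b": "GGGGGGK"}
    print(identify(["CCCCCCR"], database))
```

The choice of database matters as much as the matching step: a peptide can only be identified if its parent sequence is present, which is why many of the listed studies search against a transcriptome from the same species.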