Choo Hock Tan.
Abstract
Venomic research, powered by techniques adapted from proteomics, transcriptomics, and genomics, seeks to unravel the diversity and complexity of venom, knowledge that can be applied to the treatment of envenoming, biodiscovery, and conservation. Snake venom proteomics is the most extensively studied, but methods vary widely, generating a massive amount of information that complicates data comparison and interpretation. Advances in mass spectrometry technology, accompanied by growing databases and sophisticated bioinformatic tools, have overcome earlier limitations of protein identification. Progress, however, remains challenged by limited accessibility to samples, non-standardized quantitative methods, and biased interpretation of -omic data. Next-generation sequencing (NGS) technologies enable high-throughput venom-gland transcriptomics and genomics, complementing venom proteomics by providing deeper insights into the structural diversity, differential expression, regulation, and functional interaction of the toxin genes. Venomic tissue sampling is, however, difficult due to strict regulations on wildlife use and transfer of biological materials in some countries. Limited resources for techniques and funding are among other pertinent issues that impede the progress of venomics, particularly in less developed regions and for neglected species. Genuine collaboration between international researchers, due recognition of regional experts by global organizations (e.g., WHO), and improved distribution of research support should be embraced.
Keywords: genomics; next-generation sequencing; protein decomplexation; proteomics; toxin; transcriptomics; venom
Year: 2022 PMID: 35448856 PMCID: PMC9028316 DOI: 10.3390/toxins14040247
Source DB: PubMed Journal: Toxins (Basel) ISSN: 2072-6651 Impact factor: 5.075
Figure 1. Venomics: advancing proteomic, transcriptomic, and genomic platforms, supported by high-throughput sequencing techniques for protein/peptide, RNA and DNA, and by growing databases, knowledge-bases and bio-computing algorithms, drive the advancement of venomics. Venomics contributes toward the knowledge of venom evolution, toxin functionality, pathophysiology, and treatment of envenomation, and paves the way for biodiscovery, as well as improvement of antivenom production.
Figure 2. Venomic workflow incorporating proteomics, transcriptomics, and genomics. Proteomics utilizes venom (proteins) and adopts various profiling approaches, which can be briefly classified into decomplexation (involving venom fractionation by chromatography and gel electrophoresis) and non-decomplexation strategies (using unfractionated whole venom), followed by amino acid sequencing applying mass spectrometry. Bottom-up proteomics is the conventional and most commonly used technique, whereas top-down (and middle-down) sequencing are emerging methods that offer new insights in recent venomics. Transcriptomics and genomics require tissue samples from the venomous animals for RNA/DNA extraction. Next-generation sequencing (NGS) of nucleotides is a massively parallel sequencing technology that offers ultra-high throughput, scalability, and speed for transcriptome and genome assembly.
Figure 3. Snake venom proteomes of selected major cobra species in Asia (genus: Naja, subgenus: Naja), investigated with venomic approaches that allow differentiation of three-finger toxin subtypes (e.g., SNTX, LNTX, CTX) and quantitation of relative protein abundances (in terms of % of total venom proteins). Genus-wide comparison and geographical mapping reveal a phenotypic venom dichotomy, characterized by the dominant expression of either SNTX (short-chain alpha-neurotoxins) or LNTX (long-chain alpha-neurotoxins) as the principal lethal toxins that mediate neuromuscular paralysis in envenoming caused by cobras. The neurotoxicity of Naja naja (Indian Cobra) venom is induced primarily by LNTX, whereas, as cobras dispersed eastward, this functional role appears to have been gradually taken over by the evolutionarily more derived short-chain form of alpha-neurotoxins (SNTX). On at least four occasions, only SNTX and no LNTX were found in the venom proteomes: Naja atra of Taiwan, Naja kaouthia of Vietnam, and Naja philippinensis and Naja samarensis of the Philippines. The LNTX/SNTX dichotomy has evolutionary significance and medical implications (see text). SNTX: Short-chain alpha-neurotoxin; LNTX: Long-chain alpha-neurotoxin; CTX: Cardiotoxin or cytotoxin; Other proteins include non-conventional three-finger toxins (dotted grey). Inset shows a simplified phylogenetic tree of Naja cobras modified from Wallach et al. [70] and Kazemi et al. [56], illustrating the relative phylogeographical positions of Asiatic cobras (note: N. atra and N. kaouthia are considered to have partially evolved spitting behaviors). Representative structures of LNTX and SNTX were from the PDB Database (PDB entries: 1CTX and 1COE, respectively). References for proteomes: N. naja (Pakistan [71], Rajasthan of India [72], Tamil Nadu of India (unpublished), Sri Lanka [73]), N. kaouthia (Thailand, Malaysia, Vietnam) [29], N. sputatrix (Java of Indonesia) [53], N. atra (China [74], Taiwan [75]), N. philippinensis (northern Philippines) [51], N. samarensis (southern Philippines) [65].
Figure 4. A generic venom decomplexation strategy for proteomics. In step 1, the snake venom is fractionated by reverse-phase HPLC using a C18 column with varying concentration gradients of solvent B (mobile phase) for 180 min (solvent B is acetonitrile with 1% trifluoroacetic acid). The chromatographic fractions are collected manually with monitoring at 215 nm (absorbance of the peptide bond) and lyophilized. Proteins in the fractions are then subjected to SDS-PAGE as in step 2 (lower panel, under reducing conditions). Numbers 1–17 denote the chromatographic fractions collected. A protein marker is used for molecular weight calibration. The protein bands are visualized by Coomassie blue staining (image reproduced with reference to Tan et al. [86]).
Comparison of decomplexation and non-decomplexation venom proteomics.
| | Decomplexation | Non-Decomplexation |
|---|---|---|
| Sample requirement | Moderate to large amount especially if chromatography is involved, typically in milligrams of protein. | Minute amount, typically in as low as micrograms of protein. |
| Techniques | Protein separation methods applying chromatography, e.g., reverse-phase/ion-exchange/size-exclusion HPLC, and gel electrophoresis techniques (1D or 2D). | Proteins in venom sample are subjected to mass spectrometry (including the preparative work of protein digestion) without prior biochemical separation *. |
| Downstream experiment | Proteins eluted from chromatography can be readily collected for further purification and characterization. | Limited. |
| Advantages | Provides additional information regarding protein characteristics, e.g., hydrophobicity, pI, and molecular size. Further downstream studies, e.g., toxin-specific neutralization and antivenomics, are possible. | Technically less demanding. Time-saving and profiling can be achieved fast with fewer resources. Useful when venom sample is limited. |
| Disadvantages | Laborious and time-consuming. Large amount of sample and more resources are required. | Limited information of protein characteristics. |
| Examples | HPLC [ | [ |
* In top-down proteomics, while the proteins are not subjected to digestion prior to mass spectrometry (MS) analysis, they are fractionated whole by nano-scale liquid chromatography coupled to tandem MS. HPLC: High-performance liquid chromatography; SDS-PAGE: Sodium dodecyl sulfate-polyacrylamide gel electrophoresis.
Figure 5. Bottom-up and top-down proteomics in snake venomics. The key difference between the two approaches is whether or not proteins in the venom are subjected to proteolytic digestion prior to mass spectrometry (MS) analysis. In bottom-up proteomics, the proteins are digested enzymatically into short peptides that are then ionized in MS and fragmented, and the peptide masses are deduced. These empirical peptide masses act like “fingerprints” that are subsequently correlated with known proteins in databases using search engines, such as Mascot or Sequest. Proteins are identified indirectly from the sequences of the tryptic peptides, which are assembled to reconstruct the protein, albeit incompletely. In top-down proteomics, the intact proteins are ionized whole and then fragmented by MS, and the masses of the ionized proteins and fragments are analyzed to inform on the full sequence of the proteins along with important post-translational modifications (PTM).
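The bottom-up step described above, digesting a protein into tryptic peptides whose masses serve as "fingerprints", can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the workflow of any particular search engine: it uses the common simplified trypsin rule (cleave after K or R unless the next residue is P), allows no missed cleavages or modifications, and the input sequence is a made-up string rather than a real toxin. The function names `tryptic_digest` and `peptide_mass` are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a free peptide


def tryptic_digest(sequence):
    """Split a sequence after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):  # trailing peptide without a C-terminal K/R
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


# Hypothetical sequence, for illustration only.
for pep in tryptic_digest("MKWVTFRPLK"):
    print(pep, round(peptide_mass(pep), 4))
```

A real search additionally accounts for missed cleavages, fixed and variable modifications, and fragment-ion matching, all of which are omitted here.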
Comparison of bottom-up and top-down proteomics used for protein identification in snake venomics.
| | Bottom-Up | Top-Down |
|---|---|---|
| Protein truncation | Yes, achieved by proteolytic digestion with enzymes, e.g., trypsin and chymotrypsin. Commonly performed as in-solution or in-gel digestion. | Venom proteins are not subjected to proteolytic digestion. |
| Protein/peptide size | Peptides of ~7–20 amino acid residues (0.8–2 kDa) are analyzed. | The intact protein is analyzed whole. |
| Ionization and fragmentation | Peptides from proteolytic cleavage are ionized by ESI/MALDI techniques. | Intact protein is fragmented in the mass spectrometer. Fragmentation is accomplished by ECD or ETD. |
| Advantages | Technique is mature, commonly used, and widely available. | Technique avoids time-consuming protein digestion (typically overnight). |
| Disadvantages | A low percentage coverage of the amino acid sequence. | Instrumentation is expensive and operation is technically sophisticated. Not commonly available. |
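The "low percentage coverage" noted for bottom-up proteomics refers to the fraction of a protein's residues that the identified peptides actually span. A minimal sketch of that calculation follows, assuming exact string matches between peptides and the protein sequence; the function name `sequence_coverage` and the example sequence are hypothetical.

```python
def sequence_coverage(protein, peptides):
    """Percentage of protein residues covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical example: only one short peptide was identified.
print(sequence_coverage("MKWVTFRPLK", ["MK"]))
```

Overlapping peptides are counted once per residue, so the value never exceeds 100%.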
Snake genomes available to date as deposited in the public database.
| No. | Date of Submission | Common Name | Scientific Name | Family | Sex | Assembly Type | Genome Representation | Notes |
|---|---|---|---|---|---|---|---|---|
| 1 | 15/09/2013 | Burmese python | | Pythonidae | Female | Scaffold | Full | GCA_000186305.2 |
| 2 | 11/12/2013 | King Cobra | | Elapidae | Male | Scaffold | Full | GCA_000516915.1 |
| 3 | 01/08/2014 | Southwestern Speckled Rattlesnake | | Viperidae | Female | Scaffold | Full | GCA_000737285.1 |
| 4 | 10/12/2014 | European Adder | | Viperidae | Female | Scaffold | Full | GCA_000800605.1 |
| 5 | 26/06/2015 | Common Garter Snake | | Colubridae | Female | Scaffold | Full | GCA_001077635.2 |
| 6 | 22/01/2016 | Brown Spotted Pitviper; Taiwanese Habu | | Viperidae | Not stated | Scaffold | Full | GCA_001527695.3 |
| 7 | 21/04/2016 | Timber Rattlesnake | | Viperidae | Female | Scaffold | Full | GCA_001625485.1 |
| 8 | 02/08/2018 | Okinawa Habu | | Viperidae | Female | Scaffold | Full | GCA_003402635.1 |
| 9 | 05/09/2018 | Xizang Hot-spring Keel-back | | Colubridae | Female | Scaffold | Full | GCA_003457575.1 |
| 10 | 24/09/2018 | Eastern Brown Snake | | Elapidae | Not stated | Scaffold | Full | GCA_900518735.1 |
| 11 | 24/09/2018 | Mainland Tiger Snake | | Elapidae | Not stated | Scaffold | Full | GCA_900518725.1 |
| 12 | 08/01/2019 | Prairie Rattlesnake | | Viperidae | Male | Chromosome | Full | GCA_003400415.2 |
| 13 | 09/01/2019 | Ijima’s Turtleheaded Sea Snake | | Elapidae | Not stated | Scaffold | Full | GCA_004319985.1 |
| 14 | 09/01/2019 | Yellow-Lipped Sea Krait | | Elapidae | Not stated | Scaffold | Full | GCA_004320045.1 |
| 15 | 09/01/2019 | Blue-ringed Sea Krait | | Elapidae | Not stated | Scaffold | Full | GCA_004320025.1 |
| 16 | 15/01/2019 | Asian Annulated Sea Snake | | Elapidae | Not stated | Scaffold | Full | GCA_004023725.1 |
| 17 | 15/01/2019 (latest) | Hardwick’s Sea Snake | | Elapidae | Not stated | Scaffold | Full | GCA_004023765.1 |
| 18 | 13/02/2019 | Slender-necked Sea Snake | | Elapidae | Not stated | Scaffold | | GCA_004320005.1 |
| 19 | 11/12/2019 | Indian Cobra | | Elapidae | Male | Chromosome | Full | GCA_009733165.1 |
| 20 | 19/12/2019 | Western Terrestrial Garter Snake | | Colubridae | Female | Type: alternate-pseudohaplotype | Full | GCA_009769695.1 |
| 21 | 23/12/2019 | Western Terrestrial Garter Snake | | Colubridae | Female | Assembly type: | Full | GCA_009769535.1 |
| 22 | 13/04/2020 | Corn Snake | | Colubridae | Male | Scaffold | Full | GCA_001185365.2 |
| 23 | 22/04/2020 | Western Rat Snake | | Colubridae | Female | Scaffold | Full | GCA_012654085.1 |
| 24 | 22/04/2020 | Dhaman; Oriental Ratsnake | | Colubridae | Female | Scaffold | Full | GCA_012654045.1 |
| 25 | 04/11/2020 (latest) | Yellow-Lipped Sea Krait | | Elapidae | Not stated | Scaffold | Full | GCA_015471245.1 |
| 26 | 22/11/2020 | Eastern Brown Snake | | Elapidae | Not stated | Scaffold | Full | GCA_900608585.1 |
| 27 | 22/11/2020 (latest) | Mainland Tiger Snake | | Elapidae | Not stated | Scaffold | Full | GCA_900608555.1 |
| 28 | 06/01/2021 | Tiger Rattlesnake | | Viperidae | Not stated | Contig | Full | GCA_016545835.1 |
| 29 | 01/04/2021 | Mud Snake | | Homalopsidae | Male | Scaffold | Full | GCA_017656035.1 |
| 30 | 19/04/2021 | Indian Cobra | | Elapidae | Female | Scaffold | Full | GCA_018093825.1 |
| 31 | 11/05/2021 | Jararaca | | Viperidae | Female | Scaffold | Full | GCA_018340635.1 |
| 32 | 25/05/2021 | Eastern Diamondback Rattlesnake | | Viperidae | Female | Scaffold | Full | GCA_018446365.1 |
| 33 | 06/08/2021 | Golden Tree Snake | | Colubridae | Not stated | Scaffold | Full | GCA_019457695.1 |
| 34 | 09/08/2021 | Shaw’s Sea Snake | | Elapidae | Male | Chromosome | Full | GCA_019472885.1 |
| 35 | 09/08/2021 | Annulated Sea Snake | | Elapidae | Male | Chromosome | Full | GCA_019473425.1 |
| 36 | 18/08/2021 | Gopher Snake | | Colubridae | Female | Scaffold | Full | GCA_019677565.1 |
| 37 | 23/02/2022 | Prong-snouted blind snake | | Typhlopidae | Not stated | Sca | Full | GCA_022379055.1 |
| 38 | 15/03/2022 | Glossy snake | | Colubridae | Not stated | Scaffold (alternate-pseudohaplotype) | Full | GCA_022578425.1 |
| 39 | 15/03/2022 | Glossy snake | | Colubridae | Not stated | Scaffold (haploid, principal pseudohaplotype of diploid) | Full | GCA_022577455.1 |
Figure 6. Protein quantitation in snake venom proteomics. Proteomics is performed either with or without protein decomplexation (by HPLC and/or gel electrophoresis) prior to mass spectrometry analysis for protein identification and quantitation. The label-free, relative quantitation approach is the most commonly used. The relative abundance of each protein in the venom is derived from its spectral intensity, spectral count, or spectral total ion current (TIC), integrated where relevant with quantitative parameters from venom decomplexation (HPLC peak area and/or gel band intensity). Images of HPLC, gels and pie chart for illustration were adapted from previous studies [73,93,162].
Comparison of toxins identified in Trimeresurus puniceus venoms by protein families, subtypes, and relative abundances.
| Protein Family/Subtype a | Accession No. b,c | Relative Protein Abundance d (%), Method 1 | Relative Protein Abundance d (%), Method 2 |
|---|---|---|---|
| Alpha-fibrinogenase albofibrase | P0CJ41 | 3.74 | 1.69 |
| Alpha-fibrinogenase shedaoenase | Q6T5L0 | 1.20 | 1.69 |
| Beta-fibrinogenase mucrofibrase-2 | Q91508 | 1.08 | 1.69 |
| Snake venom serine protease 1 | Unigene42520_TWM | 1.46 | 1.69 |
| Snake venom serine protease 2 | CL403.contig2_TWM | 0.58 | 1.69 |
| Snake venom serine protease 2A homolog | O13060 | 1.01 | 1.69 |
| Snake venom serine protease 2C | O13062 | 0.53 | 1.69 |
| Snake venom serine protease KN14 | Q71QH9 | 1.70 | 1.69 |
| Snake venom serine protease KN8 | Q71QH5 | 0.87 | 1.67 |
| Snake venom serine protease pallase | O93421 | 1.34 | 1.69 |
| Snake venom serine protease salmonase | Q9PTL3 | 1.16 | 1.69 |
| Thrombin-like enzyme 1 | A7LAC6 | 0.80 | 1.69 |
| Thrombin-like enzyme 2 | A7LAC7 | 1.19 | 1.69 |
| Thrombin-like enzyme calobin-1 | Q91053 | 0.65 | 1.69 |
| Venom plasminogen activator TSV-PA | Q91516 | 1.71 | 1.69 |
| Zinc metalloproteinase/disintegrin | Unigene5053_TWM | 0.34 | 1.69 |
| Zinc metalloproteinase homolog-disintegrin albolatin | P0C6B6 | 0.92 | 1.69 |
| Zinc metalloproteinase/disintegrin | P0C6E8 | 1.30 | 1.69 |
| Zinc metalloproteinase-disintegrin-like ACLD | CL1397.contig5_TWM | 1.73 | 1.69 |
| Zinc metalloproteinase-disintegrin-like ACLD | O42138 | 0.17 | 1.69 |
| Zinc metalloproteinase-disintegrin-like HF3 | Q98UF9 | 1.81 | 1.69 |
| Zinc metalloproteinase-disintegrin-like stejnihagin-A | Q3HTN1 | 1.70 | 1.69 |
| Zinc metalloproteinase-disintegrin-like stejnihagin-B | Q3HTN2 | 0.57 | 1.69 |
| Zinc metalloproteinase-disintegrin-like TSV-DM | J3RYA3 | 0.16 | 1.69 |
| Zinc metalloproteinase-disintegrin-like VMP-III | CL1397.contig1_TWM | 8.41 | 1.69 |
| Zinc metalloproteinase-disintegrin-like VMP-III | C9E1S0 | 0.18 | 1.69 |
| Disintegrin albolabrin | P62384 | 4.74 | 1.69 |
| Disintegrin trigramin-gamma | P62383 | 11.08 | 1.69 |
| C-type lectin 6 | Unigene46336_TWM | 1.01 | 1.69 |
| C-type lectin TsL | Q9YGP1 | 1.85 | 1.69 |
| Snaclec alboaggregin-D subunit alpha | P0DM38 | 0.88 | 1.69 |
| Snaclec clone 2100755 | Q8JIV8 | 0.85 | 1.69 |
| Snaclec coagulation factor IX/factor X-binding protein subunit A | Q71RR4 | 0.70 | 1.69 |
| Snaclec convulxin subunit alpha | Unigene46337_TWM | 0.35 | 1.69 |
| Snaclec purpureotin subunit alpha | P0DJL2 | 1.48 | 1.69 |
| Snaclec purpureotin subunit beta | P0DJL3 | 1.27 | 1.69 |
| Snaclec stejaggregin-A subunit alpha | CL746.contig2_TWM | 0.53 | 1.69 |
| Acidic phospholipase A2 | P20249 | 2.54 | 1.69 |
| Acidic phospholipase A2 6 | P70088 | 2.77 | 1.69 |
| Acidic phospholipase A2 Tpu-E6c | P0DJP4 | 4.77 | 1.69 |
| Basic phospholipase A2 daboxin P | C0HK16 | 1.18 | 1.69 |
| Basic phospholipase A2 homolog Tpu-K49a | Q2YHJ9 | 7.50 | 1.69 |
| Basic phospholipase A2 homolog Tpu-K49b | Q2YHJ8 | 4.45 | 1.69 |
| Basic phospholipase A2 Tpu-G6D49 | Q2YHJ7 | 4.33 | 1.69 |
| Cysteine-rich secretory protein | P60623 | 0.91 | 1.69 |
| Cysteine-rich secretory protein | Unigene30615_TWM | 0.26 | 1.69 |
| Cysteine-rich secretory protein triflin | Q8JI39 | 0.52 | 1.69 |
| L-amino-acid oxidase | B0VXW0 | 0.66 | 1.69 |
| L-amino-acid oxidase | Q6WP39 | 1.72 | 1.69 |
| L-amino-acid oxidase | Q90W54 | 0.87 | 1.69 |
| L-amino-acid oxidase | Unigene40029_TWM | 1.57 | 1.69 |
| Phosphodiesterase | Unigene5177_TWM | 0.65 | 1.69 |
| Venom phosphodiesterase 1 | J3SEZ3 | 0.71 | 1.69 |
| Venom phosphodiesterase 2 | J3SBP3 | 0.90 | 1.69 |
| Snake venom vascular endothelial growth factor toxin | Unigene25068_TWM | 1.54 | 1.69 |
| Snake venom 5′-nucleotidase | Unigene721_TWM | 0.38 | 1.69 |
| Snake venom 5′-nucleotidase | B6EWW8 | 0.33 | 1.69 |
| Nerve growth factor | CL2590.contig1_TWM | 0.18 | 1.69 |
| Phospholipase B | Unigene25350_TWM | 0.18 | 1.69 |
a,b Protein identification, accession numbers, and corresponding species were derived from databases based on best homology match. Numbers in parentheses: total number of distinct proteins matched for each protein/toxin family. c Accession numbers with the suffix “_TWM” were based on an in-house transcript database specific for Trimeresurus wiroti (Malaysia). d Relative abundance is calculated by two different methods: (1) Method 1: by combining the relative spectral intensity (of non-redundant peptides belonging to an individual protein) with the area under the curve of the chromatographic fraction. (2) Method 2: by dividing the number of individual proteins by the total number of all proteins identified in the venom. In this example, the total number of proteins was 59, which served as the denominator. Proteomic data for the species and Method 1 were derived from the author’s previously published work [47].
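The contrast between the two methods in footnote d can be made concrete with a short sketch. It rests on assumptions: Method 1 is read here as weighting each protein's share of spectral intensity within a chromatographic fraction by that fraction's percentage of the total peak area, then summing over fractions; the fraction data are invented for illustration and are not taken from the table; and the function names are hypothetical. Method 2 follows the footnote directly: the number of proteins divided by the total count of 59.

```python
from collections import defaultdict


def method1_abundance(fractions):
    """Weight spectral-intensity shares by fraction peak area (assumed reading).

    fractions: list of (auc_percent, {protein: spectral_intensity}) tuples.
    """
    abundance = defaultdict(float)
    for auc_percent, intensities in fractions:
        total = sum(intensities.values())
        if total == 0:
            continue  # fraction with no identified signal contributes nothing
        for protein, intensity in intensities.items():
            abundance[protein] += auc_percent * intensity / total
    return dict(abundance)


def method2_abundance(n_proteins, n_total):
    """Protein-count proportion, independent of any signal intensity."""
    return 100.0 * n_proteins / n_total


# Invented fractions: (percent of total peak area, intensities per protein).
fractions = [
    (60.0, {"toxin_A": 3.0, "toxin_B": 1.0}),
    (40.0, {"toxin_B": 1.0, "toxin_C": 1.0}),
]
print(method1_abundance(fractions))

# Method 2 with the denominator stated in the footnote (59 proteins).
print(round(method2_abundance(1, 59), 2))   # every single protein
print(round(method2_abundance(15, 59), 2))  # the 15 serine-protease-type rows
```

Because Method 2 assigns every identified protein the same share (1/59, about 1.69%), a family's apparent abundance grows with the number of isoforms matched rather than with the amount of protein present, which is why the Method 2 column is nearly constant while Method 1 values range widely.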
Figure 7. Left panel: The streetlight effect, illustrated by a man searching for lost keys where the light is better. This is a metaphor for cognitive availability bias or observational bias. Right panel: The flashlight, symbolizing the venomic tool, sheds light on the venom composition (proteome) of a cobra venom. The metaphoric cartoon shows how data could be misinterpreted: (1) The real dataset is under-represented when whatever is revealed under the light (genes and proteins) is concluded to be all that a species/specimen has, while ignoring what possibly lies beyond the edge of the light. In this case, the Kunitz-type serine protease inhibitor, muscarinic toxin-like proteins, and nerve growth factor, somehow undetected, were simply left out. (2) The dataset is over-interpreted when the large volume of data illuminated by the light is not carefully filtered and validated to represent the species/specimen studied. In this example of a cobra’s venomics, the detection of alpha-bungarotoxin, a krait-specific three-finger toxin, should have raised suspicion of a false identification.