
Discovery Methodology of Novel Conotoxins from Conus Species.

Ying Fu¹, Cheng Li², Shuai Dong³, Yong Wu⁴, Dongting Zhangsun⁵, Sulan Luo⁶.

Abstract

Cone snail venoms are an ideal resource for discovering neuropharmacological tools and drug candidates, and they have become a research hotspot in neuroscience and new drug development. More than 1,000,000 natural peptides are estimated to be produced by cone snails, but less than 0.1% of these conotoxins have been characterized to date. Hence, discovering novel conotoxins from this huge resource with high-throughput and sensitive methods is crucial for conotoxin-based drug development. In this review, we introduce the methodology for discovering new conotoxins from various Conus species. We focus on obtaining full N- to C-terminal sequences, regardless of disulfide bond connectivity, through crude venom purification, conotoxin precursor gene cloning, venom duct transcriptomics, venom proteomics, and multi-omic methods. The protocols, advantages, disadvantages, and developments of the different approaches during the last decade are summarized, and their prospects are discussed.


Keywords: crude venom purification; discovery; gene cloning; methodology; multi-omics; novel conotoxins; proteomics; transcriptomics

Substances：
Conotoxins
Neurotoxins

Year: 2018 PMID： 30380764 PMCID： PMC6266589 DOI： 10.3390/md16110417

Source DB: PubMed Journal: Mar Drugs ISSN： 1660-3397 Impact factor: 5.118

1. Introduction

Cone snails (Conus) are carnivorous mollusks of the family Conidae (Figure 1). They live in tropical oceans around the world and, although they are slow-moving, hunt fish (piscivorous), worms (vermivorous), or molluscs (molluscivorous) [1]. To compensate for their slow movement, cone snails have evolved a specialized envenomation apparatus that delivers bioactive venom to fast-moving prey, competitors, and/or predators [2,3]. Cone snail venom peptides are secreted by the epithelial secretory cells of the long, convoluted venom duct [2,4]. The venom is pushed by muscular peristalsis of the venom bulb and loaded into the harpoon-like radula tooth for envenomation [5]. These venoms first attracted researchers' interest in their toxicity and bioactivity because of the human casualties caused by cone snail stings in the 1960s [6].

Figure 1

Representative Conus species native to Hainan China (shot by Cheng Li).

Early studies confirmed that these bioactive venoms are cocktails of neuroactive peptides, termed conopeptides or conotoxins, which can cause paralysis, shuddering, and even death of the prey within seconds [1,5]. Subsequent research has revealed that conopeptides selectively modulate voltage-gated ion channels [7] (Table 1), including sodium channels [8,9], potassium channels [10], and calcium channels [11,12]; ligand-gated ion channels (Table 1), such as nAChRs [13,14,15], the serotonin receptor [16], the NMDA receptor [17], and the GABA receptor [18]; GPCRs [19] (α1-adrenoceptors [20,21], the vasopressin receptor [22], and the neurotensin receptor [23]); and neurotransmitter transporters (the noradrenaline transporter [21,24]). These are key targets in chronic conditions such as neuralgia [8,25,26], addiction [27], epilepsy [17,28], cancer [29], heart disease [30,31], and others [32,33,34].

Table 1

Target and clinical potential of representative conotoxins.

Target/Mode of Action		Conotoxin	Clinical Potential	Ref.
Voltage-gated Ion Channels	Ca_v 2.2 inhibitor	MVIIA	Analgesia (On Market)	[40]
	Na_v 1.8 inhibitor	MrVIB	Analgesia	[45]
	K_v inhibitor	PVIIA	Cardiac reperfusion	[42]
Ligand-gated Ion Channels	α9α10 nAChRs inhibitor	Vc1.1	Analgesia (Phase II)	[43]
	NMDA-R inhibitor	Conantokin G	Analgesia/anti-epileptic	[17]
	5-HT₃ inhibitor	GVIIIA	—	[16]
GPCRs	α₁-adrenoceptor inhibitor	TIA	Cardiovascular/Benign Prostate Hyperplasia	[20,46]
	vasopressin receptor agonist	Conopressin-G	Cardiovascular/mood	[22]
	neurotensin receptor agonist	Contulakin-G	Analgesia (Phase Ia)	[23]
Neurotransmitter Transporters	noradrenaline transporter	MrIA	Analgesia (Phase I)	[47]

Additionally, the venom peptides interact with their targets with high selectivity and efficacy, which results in minor side effects in disease treatment [35]. Hence, cone snail venoms are an ideal resource for screening neuropharmacological tools and drug candidates, and they have become a research hotspot in neuroscience and new drug development [36,37,38] (Table 1). For instance, the ω-conotoxin MVIIA (Ziconotide, Prialt) from Conus magus, which blocks voltage-gated calcium channels, was approved by the FDA in 2004 for the treatment of chronic pain [39,40]. At present, more than 10 conotoxins, including Xen2174 (MrIA) [41], CGX-1007 (Conantokin G) [17], CGX-1051 (κ-PVIIA) [42], ACV1 (Vc1.1) [43], and CGX-1160 (contulakin-G) [44], have entered preclinical or clinical research, which shows good prospects for conotoxin drug discovery. There are more than 700 Conus species in the world [48], and each can secrete over 1000 conotoxins [49]. In particular, 3305 novel conopeptide precursors were discovered from one Conus episcopatus specimen [50]. Owing to the small overlap of conopeptides between different Conus species [51], cone snails are estimated to produce about 1,000,000 natural peptides. However, <0.1% of the estimated conopeptides have been characterized to date [36,52]. Therefore, high-throughput and sensitive methods are crucial for discovering novel conotoxins and for developing conotoxin-based drug screening from this enormous peptide reservoir. In this review, we summarize the methodology for discovering novel conotoxins from Conus species, focusing on obtaining full N- to C-terminal sequences, regardless of disulfide bond connectivity, through crude venom purification, conotoxin precursor gene cloning, venom duct transcriptomics, venom proteomics, and multi-omic methods. The protocols, advantages, disadvantages, and developments of the different approaches during the last ten years are reviewed, and the prospects of these methods are discussed.
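The scale of the uncharacterized reservoir follows from simple arithmetic on the figures quoted above. The sketch below only restates those figures; the function names are illustrative, and the product of species and peptides per species is a rough lower bound rather than the method by which the 1,000,000 estimate was derived.

```python
# Figures quoted in the text; nothing here is measured or newly derived.
N_SPECIES = 700               # "more than 700 Conus species"
PEPTIDES_PER_SPECIES = 1000   # "over 1000 conotoxins" per species
ESTIMATED_TOTAL = 1_000_000   # estimate quoted in the text
CHARACTERIZED_PER_MILLE = 1   # "<0.1%" is less than 1 per 1000


def lower_bound_peptides(n_species: int, per_species: int) -> int:
    """Rough lower bound assuming no overlap between species."""
    return n_species * per_species


def characterized_upper_bound(total: int, per_mille: int) -> int:
    """Upper bound on characterized peptides, using integer arithmetic."""
    return total * per_mille // 1000


if __name__ == "__main__":
    print(lower_bound_peptides(N_SPECIES, PEPTIDES_PER_SPECIES))
    print(characterized_upper_bound(ESTIMATED_TOTAL, CHARACTERIZED_PER_MILLE))
```

Under these figures, fewer than about a thousand of the estimated peptides have been characterized.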

2. Diversity of Conotoxins

Conotoxins normally consist of 10 to 40 amino acid residues with two to four or more disulfide bonds. They are expressed as RNA-encoded precursor proteins, which are processed into mature peptides in the endoplasmic reticulum (ER) and Golgi apparatus. A typical conopeptide precursor is composed of a highly conserved ER signal region, a pro-region, and a highly variable mature peptide region [52]. Conotoxins are classified into gene superfamilies according to the similarity of their ER signal sequences [35]. Generally, conotoxin-encoding transcripts produce diverse precursors by hypermutation, fragment insertion/deletion, and mutation-induced premature termination [53]. One precursor can produce far more than one mature peptide because of various posttranslational modifications (PTMs) and variable peptide processing (VPP), which multiply the diversity of conopeptides [53,54]. For example, an average of 20 different conopeptide variants per precursor were detected and characterized in the venom duct transcriptomic study of Conus marmoreus [49]. VPP refers to the C- and N-terminal truncations of the precursor by proteolytic cleavage at alternative sites [53,54]. These variations, generated by interrupting, deleting, or elongating partial sequences and cysteine frameworks, produce highly variable mature peptides or isoforms with multiple primary sequences [54]. PTMs are ubiquitous and play a key role in the structure and activity of conotoxins [55]. Many types of PTMs occur during conotoxin maturation, such as oxidative folding (disulfide bond formation, the most common PTM), C-terminal amidation, hydroxylation of proline, valine, and lysine, carboxylation of glutamate, cyclization of N-terminal glutamine, glycosylation, sulfation, bromination, and residue epimerization [53,55]. These diversification mechanisms (transcript variation, VPP, and PTMs) explain how thousands of specific conopeptides are produced from a limited number of gene precursors in a single Conus species, and they account for the inter- and intra-specific variability [49,53,56].
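To illustrate how terminal truncation alone multiplies the number of peptides derived from one precursor, the following sketch enumerates N- and C-terminal truncation variants of a mature sequence. It is a deliberate simplification: real VPP occurs at specific proteolytic cleavage sites, whereas this code trims residues one by one. The function name and the trimming limits are assumptions made for illustration.

```python
def truncation_variants(mature: str, max_n_trim: int = 2,
                        max_c_trim: int = 2, min_len: int = 8) -> list[str]:
    """Enumerate peptides obtained by trimming up to max_n_trim residues
    from the N-terminus and up to max_c_trim from the C-terminus."""
    variants = set()
    for n in range(max_n_trim + 1):
        for c in range(max_c_trim + 1):
            peptide = mature[n:len(mature) - c]
            if len(peptide) >= min_len:
                variants.add(peptide)
    # Longest first, then alphabetical, so the untrimmed peptide leads.
    return sorted(variants, key=lambda s: (-len(s), s))


if __name__ == "__main__":
    # Sequence of alpha-LsIA as listed in Table 2 of this review.
    for peptide in truncation_variants("SGCCSNPACRVNNPNIC", 1, 1):
        print(peptide)
```

Combining each truncation variant with optional PTMs at several positions multiplies the count further, which is consistent with the many variants per precursor reported above.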

3. Conotoxins Purified from Crude Venom

Conotoxins have been obtained by isolation from the crude venom of cone snails since the 1970s [57] (Figure 2). The envenomation apparatus of the snail is dissected first. Only about 10 to 50 µL of crude venom can be squeezed from each specimen; alternatively, the dissected tissues are extracted directly. Sometimes tens to hundreds of collected snails have to be dissected to obtain enough venom for conopeptide isolation. Because the sampling is destructive, the venom source is non-renewable, and the approach wastes a precious resource, especially for rare species.

Figure 2

Purification workflow of native conotoxins obtained from crude venom.

The purification process has remained almost unchanged for decades (Figure 2). Crude venom or dissected tissue is extracted with aqueous acetonitrile containing 1% TFA. The crude extract is fractionated by size exclusion chromatography (SEC) and then purified by C18 reversed-phase chromatography, with an acetonitrile/water gradient containing 0.1% TFA as the mobile phase. Obtaining enough of the target conopeptide for subsequent characterization requires sufficient crude venom and rigorous purification. The purified conotoxins are subjected to de novo sequencing by Edman degradation [58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87] or MS sequencing [81,88,89,90] after sequential disulfide bond reduction, thiol alkylation, and enzymatic digestion, which make the sequencing much easier. PTMs are assigned with the aid of MS techniques [59,65,66,69,70,71,75,76,77,78,80,83,89,90,91,92]. The targeted conotoxins are then chemically synthesized by solid-phase peptide synthesis (SPPS), followed by oxidative folding. HPLC co-elution of the synthesized peptide with the purified native conotoxin validates the sequencing result [69,73,75,76,81,83,84,89]. To identify the gene superfamily of a native peptide, its precursor gene or cDNA can be cloned by various PCR methods or identified by venom transcriptome sequencing; the gene superfamily is then assigned from the signal peptide homology of the precursor [31,62,65,68,74,75,79,82,86]. Conotoxins purified from Conus venom during the last ten years are summarized in Table 2. Fifty conotoxins in total were discovered during this period, only five per year on average, which indicates that conotoxin discovery by crude venom isolation has stagnated, while more efficient omic approaches have developed rapidly in recent years. It is difficult to isolate a novel conotoxin from a limited amount of crude venom that contains more than 1000 venom peptides. A blind search makes the isolation of native peptides time-consuming and laborious. Therefore, only a limited number of conotoxins, belonging to a few gene superfamilies, were discovered (Table 2). More effective bioassay-guided and MS-sequence-tag-guided fractionation methods are therefore under development to facilitate the rapid discovery of novel native conotoxins from different crude venoms [62,86,87].
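The reduction and alkylation step also yields a useful check in MS analysis, because the mass shift between the native and the treated peptide reports the number of cysteines. The sketch below computes that shift from monoisotopic residue masses. It assumes alkylation with iodoacetamide (carbamidomethylation, +57.02146 Da per cysteine), which the text does not specify, and it ignores other PTMs such as C-terminal amidation. All function names are illustrative.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
HYDROGEN = 1.007825
CARBAMIDOMETHYL = 57.02146  # assumption: iodoacetamide as alkylating agent


def reduced_mass(seq: str) -> float:
    """Neutral mass of the linear peptide with all cysteines as free thiols."""
    return sum(MONO[res] for res in seq) + WATER


def oxidized_mass(seq: str, n_disulfides: int) -> float:
    """Neutral mass of the native peptide; each disulfide removes two H."""
    return reduced_mass(seq) - 2 * HYDROGEN * n_disulfides


def alkylated_mass(seq: str) -> float:
    """Neutral mass after full reduction and carbamidomethylation."""
    return reduced_mass(seq) + CARBAMIDOMETHYL * seq.count("C")


if __name__ == "__main__":
    # RegIIA from Table 2, treated here as a free acid for simplicity.
    seq = "GCCSHPACNVNNPHIC"
    shift = alkylated_mass(seq) - oxidized_mass(seq, n_disulfides=2)
    print(f"mass shift after reduction/alkylation: {shift:.4f} Da")
```

For a peptide with four cysteines in two disulfide bonds, the expected shift is four carbamidomethyl groups plus four hydrogens, so an observed shift of that size supports the cysteine count inferred from the sequence.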

Table 2

Conotoxins isolated from cone snail venom during the last ten years.

Name	Species	Super-Family	Cystine Pattern	Sequence	Target/IC₅₀	Year	Ref.
RegIIA	C. regius	A	I	GCCSHPACNVNNPHIC #	nAChR: α7/103 nM, α3β2/33 nM, α3β4/97 nM	2011	[60]
α-LsIA	C. limpusi	-	I	SGCCSNPACRVNNPNIC	nAChRs: α3β2/10 nM, α3α5β2/31 nM, α7/10 nM	2013	[87]
α-RgIB	C. regius	-	I	TWEECCKNPGCRNNHVDRCRGQV	α3β4 and/or α3β4α5 nAChRs	2013	[61]
α-BruIB	C. brunneus	-	I	DYCCRROTCIPIC #	Dα7 nAChR	2014	[62]
α-AusIA	C. australis	-	I	SCCARNPACRHNHPCV	α7 nAChR: 11.68 mM for AusIA (g), 9.67 mM for AusIA (r)	2014	[63]
Lo1a	C. longurionis	A	I	EGCCSNPACRTNHPEVCD	α7 nAChR/3.24 μM	2014	[64]
BnIA	C. bandanus	A	I	GCCSHPACSVNNPDIC #	α7 nAChR	2014	[65]
Im10A	C. imperialis	T	I	NTICCEGCMCY #	unknown	2016	[91]
α-EII_B	C. ermineus	-	I	ZTOGCCWHPACGKNRC #	nAChRs	2017	[66]
PIC	C. purpurascens	A	I	SGCCKHPACGKNRC	rα1β1δε nAChR	2017	[67]
PIC[O7]	C. purpurascens	A	I	SGCCKHOACGKNRC	rα1β1δε nAChR	2017	[67]
lt3a	C. litteratus	M	III	DγCCγOQWCDGACDCCS	unknown	2009	[68]
κ-RIIIJ	C. radiatus	M	III	LOSCCSLNLRLCOVOACKRNOCCT #	hK_v1.2 channels/33 nM	2010	[69]
pr3a	C. parius	M	III	CCNWPCSFGCIPCCY	unknown	2010	[70]
pr3b	C. parius	M	III	ERVCCGYOMSCKSRACKOSYCC #	unknown	2010	[70]
CnIIIC	C. consors	M	III	ZGCCNGPKGCSSKWCRDHARCC #	Na_v1.4/1.3 nM; α3β2 nAChR/450 nM	2012	[71]
BnIIID	C. bandanus	M	III	CCDBγNCDHLCSCCD #	unknown	2014	[72]
Asi3a	C. asiaticus	M	III	CCQWPCSHGCIPCCY #	unknown	2016	[91]
bt5a	C. betulinus	T	V	SγCCIRNFLCC	unknown	2010	[73]
pr6a	C. parius	O	VI/VII	TCLARDELCGASFLSNFLCCDGLCLLICV	unknown	2010	[70]
pr6b				FGSFIOCAHKGEOCTICCROLRCHEEKTOTCV
pr6c				DQCTYCGIYCCPPKFCTSSGCRSP
pr6d				YGNFOTCSETGEDCSAMHCCRSMTCRNNICAD
MfVIA	C. imperialis	O	VI/VII	RDCQEKWEYCIVPILGFVYCCPGLICGPFVCV	Na_v1.8/95.9 nM, Na_v1.4/81 nM	2012	[88]
ge6b	C. geneis	O2O2	VI/VII	ACGGGGAPCGSSLDCCYPFECSYNSCG	unknown	2015	[74]
ge6c	C. geneis	O2O2	VI/VII	ACGGGGAPCGSSLDCCYPFγCSYNSCG	unknown	2015	[74]
PiVIIA	C. princeps	O2	VI/VII	CDAOTHYCTNYWγCCSGYCγHSHCW	unknown	2016	[75]
vi6a	C. virgo	O1	VI/VII	DCGGQGEGCYTQOCCOGLRCRGGGTGGGVCQL	unknown	2016	[76]
Lo6/7a	C. longurionis	-	VI/VII	DQCSYCGIYCCPPKFCTSAGCRSP #	unknown	2016	[91]
Lo6/7b	C. longurionis	-	VI/VII	SCLSSGALCGIDSNCCNGCNVPRNQCY #	unknown	2016	[91]
fu6a	C. fulgetrum	O	VI/VII	TCREKGEOCSVYVγCCSRICGYYACA	unknown	2016	[77]
α-GVIIIB	C. geographus	S	VIII	SGSTCTCFTSTNCQGSCECLSPPGCYCSNNGIRQPGCSCTCPGT #G	α9α10 nAChR/9.8 nM	2015	[92]
lt9a	C. litteratus	P	IX	IWFCASRTCSAPADCNPCTCESGVCVDWL	tetrodotoxin-sensitive sodium channels/300 nM	2017	[78]
lt9b	C. litteratus	P	IX	IWFCASRTCSAOADCNOCTCγSGVCVDWL	tetrodotoxin-sensitive sodium channels/504 nM	2017	[78]
Ca11a	C. caracteristicus	I	XI	AWPCGGVRASCSRHDDCCGSLCCFGTSTGCRVAVRPCW	unknown	2009	[79]
Ca11b	C. caracteristicus	I	XI	ALLCGGTHARCNRDNDCCGSLCCFGTCISAFVPC	unknown	2009	[79]
ts14a	C. tessulatus	A	XIV	DGCPPHPVPGMHPCMCTNTC	unknown	2010	[80]
Asi14a	C. asiaticus	-	XIV	SCGYPCSHCGIPGCYPG #	unknown	2016	[92]
pc16a	C. pictus	M	XVI	SCSCKRNFLCC #	unknown	2011	[81]
qc16a	C. quercinus	-	XVI	DCQPCGHNVCC	unknown	2011	[82]
αD-Ms	C. mustelinus	D	XX	DVRECQVNTPGSKWGKCCMTRMCGTMCCARSGCTCVYHWRRGHGCSCPG	nAChR: α7/0.12 nM, α3β2/1.08 nM, α4β2/4.5 nM	2009	[31]
αD-Cp	C. capitaneus	D	XX	EVQECQVDTPGSSWGKCCMTRMCGTMCCSRSVCTCVYHWRRGHGCSCPG	showed the same selectivity profile as αD-Ms, but has a lower potency	2009	[31]
α-GeXXA	C. generalis	D	XX	DVHRPCQSVRPGRVWGKCCLTRLCSTMCCARADCTCVYHTWRGHGCSCVM (dimer)	α9α10 nAChR	2015	[83]
im23a	C. imperialis	K	XXIII	IPYCGQTGAECYSWCIKQDLSKDWCCDFVKDIRMNPPADKCP	unknown	2012	[84]
im23b	C. imperialis	K	XXIII	IPYCGQTGAECYSWCIKQDLSKDWCCDFVKTIARLPPAHICSQ	unknown	2012	[84]
as25a	C. cancellatus	-	XXV	CKCPSCNFNDVTENCKCCIFRQP #	unknown	2013	[85]
as25b	C. cancellatus	-	XXV	CKCOSCNFNDVTENCKCCIFRQO?	unknown	2013	[85]
RsXXIVA	C. regularis	-	XXVI	CKGQSCSSCSTKEFCLSKGSRLMYDCCTGSCCGVKTAGVT	Ca_v2.2	2013	[89]
GeXXVIIA	C. generalis	O	-	ALMSTGTNYRLLKTCRGSGRYCRSPYDCRRRYCRRISDACV	α9α10 nAChR/16.2 nM	2017	[93]
p21a	C. purpurascens	-	-	FELLPSQDRSCCIQKTLECLENYOGQASQRAHYCQQDATTNCODTYYFGCCPGYATCMSINAGNNVRSAFDKCINRLCFDPGH #	unknown	2011	[86]

#, [O], [γ], [B] represent C-terminal amidation, hydroxyproline, carboxyglutamate and bromotryptophan, respectively. Dα7 nAChR means the receptor is expressed in the CNS of the Drosophila melanogaster fly. The sequence of α-GeXXA (a dimer) presents one subchain of the dimer. ? indicates that the amidation of the C-terminus was not directly confirmed. Dash (-) means undetermined or none.
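The one-letter notation used in Table 2 can be decoded programmatically. The sketch below maps the symbols defined in the footnote (O, γ, B, and a trailing #) back to their parent residues and lists the PTMs. It also derives the cysteine pattern by joining runs of adjacent cysteines, which appears to be what the framework numerals in the table encode (for example, CC-C-C for framework I). The function names are illustrative. Symbols not defined in the footnote, such as Z, are passed through unchanged.

```python
import re

# Symbols defined in the Table 2 footnote: symbol -> (parent residue, PTM).
MODIFIED = {
    "O": ("P", "hydroxyproline"),
    "γ": ("E", "carboxyglutamate"),
    "B": ("W", "bromotryptophan"),
}


def parse_conotoxin(entry: str) -> tuple[str, list[tuple[int, str]]]:
    """Return the unmodified sequence and a list of (position, PTM)."""
    text = entry.strip()
    amidated = text.endswith("#")
    core = text.rstrip("#").strip()
    residues, ptms = [], []
    for pos, symbol in enumerate(core, start=1):
        if symbol in MODIFIED:
            parent, name = MODIFIED[symbol]
            residues.append(parent)
            ptms.append((pos, name))
        else:
            residues.append(symbol)
    if amidated:
        ptms.append((len(core), "C-terminal amidation"))
    return "".join(residues), ptms


def cysteine_pattern(seq: str) -> str:
    """Join runs of adjacent cysteines with hyphens, e.g. 'CC-C-C'."""
    return "-".join(re.findall(r"C+", seq))


if __name__ == "__main__":
    for entry in ("GCCSHPACNVNNPHIC #", "DγCCγOQWCDGACDCCS"):
        sequence, ptms = parse_conotoxin(entry)
        print(sequence, cysteine_pattern(sequence), ptms)
```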

4. Gene Cloning to Discover New Conotoxins

To overcome the limitations of crude venom purification, gene cloning emerged in the 1990s as a strategy for discovering novel conotoxins [94]. Since each conotoxin is encoded by a specific gene, it can be amplified by PCR with specific primers [53,95,96,97,98]. Generally, genomic DNA is extracted from the tissue of an individual specimen, or cDNA is prepared by reverse transcription of mRNA extracted from the dissected venom duct. The resulting genomic DNA or cDNA serves as the template for PCR amplification with forward and reverse primers, including 3′- and 5′-RACE. The primers are designed on the basis of the conserved sequence in the signal region (Figure 3, primer 1), the 3′- or 5′-untranslated regions (UTRs; Figure 3, primers 2 and 3) of a known conotoxin precursor, or its relatively conserved intron (Figure 3, primer 4). The PCR products are purified by agarose gel electrophoresis and ligated into a plasmid vector for sequencing. Candidate conotoxin-encoding genes are annotated by homology searching. The resulting conotoxin sequences are aligned and analyzed with CLUSTALX [99]. The signal regions can be predicted with the SignalP 3.0 server (http://www.cbs.dtu.dk/services/SignalP/) [99,100,101].

Figure 3

PCR amplification strategy to clone conotoxin precursor genes from genomic DNA.

Primers make it possible to amplify specific conotoxin-encoding genes from the total genomic DNA or RNA of a cone snail, so primer design is a key factor in conotoxin discovery by gene cloning. Generally, the PCR product generated by pairing primer 1 or primer 2 with primer 3 contains a complete open reading frame (ORF), which includes a signal region, a pro-region, and a mature peptide region (Figure 3). When primer 4 is paired with primer 3, the cloned sequence starts with a partial pro-region and lacks the signal peptide. Representative α-family (α*-) conotoxins discovered by gene cloning during the last ten years are shown in Table 3. The sequences of Pu14.1 and GeXIVA are complete precursors that include the signal region, which makes it easy to assign the gene superfamily. A previous study showed that the sequence of the α-conotoxin intron in the pro-region is highly conserved [97]. Many new α-conotoxins, such as TxIB, TxID, and LvIA, have been discovered in our lab by PCR with primers for this conserved intron and the 3′-UTR; these sequences do not contain signal regions (Table 3). Similarly, a forward primer and its paired reverse primer can be designed from the conserved intron and the 3′-UTR of a known gene superfamily, such as the A- or O-superfamily, to clone novel conotoxin precursor genes. Random cDNA sequencing can also yield the complete precursor sequence, e.g., VxXXIVA, but this method is less targeted than the use of carefully designed primers.

Table 3

Representative α*-conotoxins discovered by gene cloning during the last ten years.

Conotoxin	Super-Family	Primer	Sequence	Target (nAChRs)/IC₅₀	Ref.
Pu14.1	A	signal sequence & 3′-UTR	MGMRMMFAVFLLVVLATTVVSFNSDRASDGRNAAANVKASDLMARVLEKDCPPHPVPGMHKCVCLKTC	rα1β1δε > rα6α3β2 > rα3β2	[73]
GeXIVA	O1	signal sequence	MKLTCVLIITVLFLTACQLTTAVTYSRGEHKHRALMSTGTNYRLPKTCRSSGRYCRSPYDRRRRYCRRITDACV	rα9α10/4.6 nM	[102]
TxIB	-	intron & 3′-UTR	FDGRNTSANNKATDLMALPVRGCCSDPPCRNKHPDLC #	rα6/α3β2β4/28 nM	[103]
TxID	-	intron & 3′-UTR	FDGRNAAGNDKMSALMALTTRGCCSHPVCSAMSPIC	rα3β4/12.5 nM, rα6/α3β4/94 nM	[104]
LvIA	-	intron & 3′-UTR	FRGRDAAAKASGLVGLTDRRGCCSHPACNVDHPEIC #	rα3β2 (8.7 nM) > rα6/α3β2β3 ≈ rα6/α3β4 ≈ rα3β4 > α7	[105]
Lt1.3	-	intron & 3′-UTR	FDGRNAAPSDKASDLISLAVRGCCSHPACSGNNPYFC #	α3β2/44.8 nM	[106]
VxXXIVA	B	cDNA sequencing	METLTLLWRASSSCLLVVLSHSLLRLLGVRCLEKSGAQPNKLFRPPCCQKGPSFARHSRCVYYTQSRE	rα9α10/1.2 μM, Mouse α1β1γδ/6.6 μM	[107]

The signal region is shaded, the pro-region is in italics, and the mature conotoxin sequence is underlined. # represents C-terminal amidation. “r” indicates rat.

Compared with crude venom purification, the gene cloning strategy is more resource-saving: several specimens, or even one, are enough for conotoxin gene cloning. However, the mature peptide sequences are inferred from their precursor genes, so PTMs cannot be identified. Gene cloning is also relatively low-throughput compared with the transcriptomic approach that arose in the 2010s. In addition, because the primers are designed from the conserved sequences of known families or superfamilies, conotoxins of new families or superfamilies are difficult to discover in this way.

5. Cone Snail Multi-Omics

Although great efforts have been made to discover novel conotoxins by crude venom purification and gene cloning, most of the estimated conotoxins have not yet been characterized [108]. More efficient, resource-saving, and high-throughput methods urgently need to be developed. “Omics” approaches such as transcriptomics and proteomics, and “multi-omics” approaches that integrate them, have opened a new era and rapidly accelerated the rate of conotoxin discovery [108,109,110].

5.1. Transcriptomics—A Useful Pathway to Identify Putative Conotoxins

Transcriptomics aims to identify and profile global gene transcription and expression (including that of conotoxin-encoding genes) at the RNA level. The venom duct is an ideal material for transcriptomic analysis, because the number and abundance of conotoxin-encoding transcripts are much higher in the venom duct than in other tissues [50,111]. Conus venom duct transcriptomics describes conotoxin expression and offers a rapid way to identify putative conopeptide sequences. In addition, next generation sequencing (NGS) technology [112] makes large-scale sequencing time- and cost-effective. The transcriptomic pipeline (Figure 4) starts with the extraction of total RNA from the dissected venom duct. The mRNAs then serve as templates for reverse transcription to construct a cDNA library. PCR amplification is conducted with the cDNA as template and specific sequences as primers. The resulting cDNA or the raw RNA is sequenced on platforms such as 454 (Roche, Branford, CT, USA), Illumina (Illumina, San Diego, CA, USA), Ion Torrent Personal Genome Machine (Thermo Fisher, Waltham, MA, USA), Nanopore (Oxford, UK), ABI 3730 Series (Applied Biosystems, Foster City, CA, USA), and PacBio (Pacific Biosciences, Menlo Park, CA, USA) [108,113]. Illumina and Roche 454 are the most widely used platforms at present (Table 4). The raw reads require processing and assembly to remove artifacts, poor-quality reads, and redundant or aberrant sequences [114]. The trimmed sequences are then translated into peptide primary sequences according to their open reading frames (ORFs) [112] by ConoPrec [1,53,54,111,115,116] or SignalP 4.0 [115,116,117,118], which can locate the signal peptides and predict their cleavage sites. ConoSorter [1,110,118,119], which uses profile hidden Markov models (pHMMs) [1,111,118,119], identifies putative conopeptide precursors and assigns them to superfamilies. Homology searching with BLAST against online databases such as ConoServer (The University of Queensland, Brisbane, Australia) (http://research1t.imb.uq.edu.au/conoserver/) [120,121], UniProtKB/Swiss-Prot (http://www.uniprot.org/downloads) [122,123], and NCBI (http://www.ncbi.nlm.nih.gov/) enables the rapid identification of known and novel conotoxins. ConoSorter also reports the relative sequence frequency, length, number of cysteines, N-terminal hydrophobicity, and sequence similarity score [118]. In this way, a unique transcriptomic dataset can be established for an individual specimen of a specific Conus species.
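The step from assembled transcript to putative precursor can be sketched in a few lines. The code below translates the three forward reading frames with the standard genetic code, keeps the longest ORF that starts with methionine and ends at a stop codon, and reports a few descriptors of the kind listed above (length, number of cysteines, N-terminal hydrophobicity). It does not reproduce ConoPrec, SignalP, or ConoSorter: the hydrophobic residue set, the 20-residue N-terminal window, and the function names are assumptions, and the reverse strand is not searched.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def longest_orf(cdna: str) -> str:
    """Longest Met-initiated, stop-terminated ORF in the forward frames."""
    seq = cdna.upper().replace("U", "T")
    best = ""
    for frame in range(3):
        protein = "".join(
            CODON_TABLE.get(seq[i:i + 3], "X")
            for i in range(frame, len(seq) - 2, 3)
        )
        # Drop the last segment: it is not terminated by a stop codon.
        for segment in protein.split("*")[:-1]:
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


def precursor_summary(protein: str, n_term: int = 20) -> dict:
    """Simple descriptors; the hydrophobic set is an assumed definition."""
    hydrophobic = set("AVLIMFW")
    head = protein[:n_term]
    fraction = sum(r in hydrophobic for r in head) / len(head) if head else 0.0
    return {
        "length": len(protein),
        "cysteines": protein.count("C"),
        "n_term_hydrophobic_fraction": fraction,
    }


if __name__ == "__main__":
    orf = longest_orf("CCATGAAATGCTGCTAAGG")  # toy sequence, not a real transcript
    print(orf, precursor_summary(orf))
```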

Figure 4

Multi-omic pipeline of conotoxin discovery.

Table 4

The reported transcriptomic and proteomic data from various cone snails during the past decade.

Species	Number of Precursors	Number of Gene Superfamilies	Sequencing Platforms	Number of Confirmed Conotoxins by Proteomics	MS Instruments	Year	Ref.
C. textile	-	-	-	31	ESI-LTQ-Orbitrap	2010	[128]
C. bullatus	30	6	Illumina, Roche 454	-	-	2011	[129]
C. consors	53	11	Roche 454	-	-	2012	[130]
C. pulicarius	82 (79 new)	14	Roche 454	-	-	2012	[131]
C. marmoreus	105	13	Roche 454	2710–6254	MALDI-TOF, ESI-Q-TOF	2013	[49]
C. marmoreus	158	13 new	Roche 454	106	ESI-MS/MS	2013	[118]
C. miles	662	16 (8 new)	Roche 454	48	ESI-Q-TOF	2013	[54]
C. flavidus	-	-	-	31	ESI-LTQ-Orbitrap	2013	[53]
C. victoriae	113	20	Roche 454	-	-	2014	[119]
C. geographus	127	16 (4 new)	Roche 454	43	ESI-TripleTOF	2014	[3]
C. catus	104	11	Roche 454	51	ESI-Q-TOF	2015	[115]
C. episcopatus	3305	25 (16 new)	Illumina	1,448	ESI-MS/MS, ESI-Q-TOF	2015	[50]
C. tribblei	136	30 (6 new)	Illumina, Roche 454	-	-	2015	[132]
C. tribblei; C. lenavati	100 (45 new); 132	39; 40	ABI 3730XL	-	-	2015	[116]
C. planorbis	182	25	Roche 454	23	ESI-TripleTOF	2015	[133]
C. betulinus	215 (183 new)	9 new	Illumina	-	-	2016	[111]
C. vexillum; C. capitaneus	220	19 (4 new)	Roche 454	24	ESI-Q-TOF, MALDI-TOF	2016	[1]
C. gloriamaris	108 (98 new)	31	Illumina	-	-	2017	[134]

Dash (-) means undetermined.

Compared with traditional isolation and gene cloning, the venom duct transcriptomic approach is a rapid, efficient, resource-saving, and high-throughput way to identify large numbers of conotoxins from different cone snails, and it has greatly extended our knowledge of the conotoxin resource (Table 4). During the last decade, many putative conopeptide precursors have been identified from the transcriptomes of different Conus species (Table 4). At least 30 conopeptide precursors were discovered from C. bullatus by transcriptome sequencing. Surprisingly, as many as 3305 novel conopeptide precursors were discovered from a single Conus episcopatus specimen by sequencing its transcriptome (Table 4). Phylogeny-based conotoxin discovery uses known conserved sequences to design specific primers for PCR amplification, which makes it possible to find more conopeptides of known superfamilies in different Conus species [124]. Additionally, specific PCR primers can be designed from incomplete sequences obtained by MS sequence tags or Edman degradation [124,125] and then used to clone new conotoxin precursors from the venom transcriptome, cDNA, or genomic DNA of various Conus species. This provides a feasible way to explore novel conotoxins belonging to new superfamilies [124]. cDNA library normalization is an effective and commonly used method to equalize the abundance of different cDNAs, which helps to identify conotoxin genes expressed at relatively low levels [114,124]. Normalization suppresses highly abundant transcripts and enriches rare ones, so as to maximize the number of unique conotoxins identified [119]. Thanks to transcriptomic studies, venom insulins, which target the heterospecific insulin receptors of prey, predators, and competitors, have been shown to be expressed in many worm- and snail-hunting cone snails [126]. Six insecticidal conotoxins were screened out of a transcriptomic dataset of 215 precursors by homology searching with α-conotoxin ImI and then validated [127]. These findings show that Conus transcriptomic databases can extend our knowledge and reveal various new conopeptides.
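Screening a transcriptomic dataset with a known conotoxin as the query amounts to ranking candidates by sequence similarity. The sketch below uses the Python standard library's `difflib.SequenceMatcher` as a crude stand-in for a real homology search. It is not BLAST and has no statistical model, so the scores are only a rough ordering. The query and candidates are mature sequences taken from Table 2, and the threshold of 0.7 is an arbitrary assumption.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude similarity in [0, 1]; not an alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def screen(query: str, candidates: dict[str, str],
           threshold: float = 0.7) -> list[tuple[str, float]]:
    """Return (name, score) pairs at or above threshold, best first."""
    scored = [(name, round(similarity(query, seq), 3))
              for name, seq in candidates.items()]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    query = "GCCSHPACNVNNPHIC"            # RegIIA (Table 2)
    candidates = {
        "BnIA": "GCCSHPACSVNNPDIC",        # Table 2
        "Lo1a": "EGCCSNPACRTNHPEVCD",      # Table 2
        "pr3a": "CCNWPCSFGCIPCCY",         # Table 2
    }
    for name, score in screen(query, candidates):
        print(name, score)
```

In practice, the same ranking would be done with BLAST against ConoServer or a local database, as described in the text.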

5.2. Proteomics—An Effective Approach to Discover Natural Conotoxins

Traditional proteomic identification depends on Edman degradation and amino acid composition analysis to assign peptide sequences, but it is sample-consuming and low-throughput, which limits its wide application. With the advent of high-resolution MS instruments [135], venom proteomics aided by modern MS technology has proven to be an effective and high-throughput approach for novel conotoxin discovery [108,109]. The general proteomic procedure is presented in Figure 4. Briefly, the venom sample is prepared either by squeezing the venom from the dissected venom duct (a one-off operation) or by collecting the venom secreted by living cone snails when induced with prey (a repeatable operation) [1]. As MS techniques have developed, the amount of venom required for proteomic analysis has decreased; about 7% or less of the crude venom from one specimen is enough [136]. The proteomic data from different Conus species, especially from cone snails hunting different prey, differ considerably, because species and food preference are key factors in the evolution of venom diversity [137]. In traditional bottom-up proteomics, the venom sample is pretreated by reduction, alkylation, and enzymatic digestion before HPLC-MS analysis to eliminate the influence of disulfide bonds, although this leads to partial loss of conopeptides during processing [50,53,111,115,138]. In the top-down approach, the disulfide-bridged venom peptides are kept intact, which makes it more suitable for simple peptide mixtures, such as highly purified venom subfractions, whereas the bottom-up approach is more suitable for complex crude venom [50,53,139]. Combining the top-down and bottom-up approaches has enabled the identification of several unexpected cleavage sites during conotoxin maturation [53]. The vast MS data generated by LC-MS analysis are processed and mined with bioinformatic tools. Raw MS data are input into Mascot for peptide mass fingerprinting. ProteinPilot™ [1,49,54,115] is used for sequence identification and the annotation of precursor ions by searching the MS/MS mass list obtained at relatively high precision [54]. Parameters for enzymatic digestion and various types of PTMs are imported into ProteinPilot to identify PTMs and fragment splicing. ConoMass [49,115] and ProteinPilot can identify nearly all PTMs except glycosylation, which must be assigned by de novo sequencing. The sequences are searched against databases such as ConoServer, UniProtKB/Swiss-Prot, NCBI, and the transcriptomic dataset of the same species, to identify known and novel conopeptides as well as their gene superfamilies. The results are presented as peptide sequences with a series of statistical data that profile the venom components. Advanced mass analyzers such as TOF, especially quadrupole-TOF (Q-TOF), offer rapid acquisition, high resolution, high sensitivity, and excellent mass accuracy. Ionization methods such as ESI and MALDI and fragmentation methods such as CID, ETD, and EThcD provide options for obtaining different mass data for different purposes. ESI and MALDI are generally used in proteomic studies, whereas CID, ETD, and EThcD are commonly used for de novo MS sequencing, because their different dissociation patterns yield complementary peptide fragments. More mature peptides can be detected with a superior MS instrument that combines an advanced mass analyzer with an efficient ionization technique. For instance, in the venom proteomics of Conus marmoreus, 2710 peptide sequences were revealed by MALDI-TOF, 3172 by ESI-Q-TOF with regular electrospray, and 6254 by ESI-Q-TOF equipped with a DuoSpray ionization source [49]. ETD combined with targeted chemical derivatization has been applied to increase the charge state of conopeptides and thereby maximize the detectable mass range, because the molecular masses of conotoxins usually exceed the optimal detection range [132]. Superior mass analyzers and various ionization methods are combined to expand the accessible venom repertoire. Modern venom proteomics thus provides a methodology not only for rapidly detecting and characterizing specific conotoxins but also for profiling the complex venom components as a whole.
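The benefit of a higher charge state follows directly from the relation between neutral mass and mass-to-charge ratio, m/z = (M + z·m_p)/z, where m_p is the proton mass. The sketch below lists the charge states at which a peptide of a given mass falls inside an instrument's m/z window and computes the ppm error used when matching observed and theoretical masses. The window limits and the example mass are hypothetical values chosen for illustration, not instrument specifications.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (neutral_mass + charge * PROTON) / charge


def charges_in_window(neutral_mass: float, mz_min: float, mz_max: float,
                      max_charge: int = 10) -> list[int]:
    """Charge states whose m/z lies inside [mz_min, mz_max]."""
    return [z for z in range(1, max_charge + 1)
            if mz_min <= mz(neutral_mass, z) <= mz_max]


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    mass = 3000.0  # hypothetical conopeptide mass in Da
    print(charges_in_window(mass, mz_min=400.0, mz_max=1200.0, max_charge=6))
```

A 3000 Da peptide is outside such a window as a singly or doubly charged ion but inside it at higher charge states, which is why raising the charge state extends the range of detectable masses.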

5.3. Bioinformatics: An Efficient Tool for Massive Data Processing and Integration

Bioinformatics is an efficient tool for processing and integrating massive datasets. Sophisticated analytical software and algorithms, together with integrated databases, now underpin raw data processing, sequence identification, and superfamily classification [108,140]. Venom duct transcriptomics and venom proteomics have both benefited from the emergence and development of bioinformatics, especially from improvements in software and algorithms and from the expansion of searchable databases. The functions of the frequently used tools for transcriptomics and proteomics are presented in Table 5. Transcriptomic and proteomic studies rely heavily on the underlying databases, which provide templates for sequence searching, matching, and annotation. The databases used for sequence identification and BLAST searches should be the latest versions and should contain complete or partial natural precursor and mature toxin sequences derived from conotoxin genes, transcripts, or proteins, as well as artificially synthesized conotoxins. In return, the discovery of novel sequences by the different approaches expands the content of the databases.

Table 5

Frequently-used bioinformatic tools for cone snail venom transcriptomics and proteomics.

Tool	Developer	Function
Tools for transcriptomics
SignalP	Technical University of Denmark, Denmark	Predict and locate the signal peptides and their cleavage sites
ConoPrec	The University of Queensland, Australia	Identify ORFs and analyze contigs coding for conopeptide precursors; predict signal peptides and their cleavage sites; superfamily categorization
ConoSorter	—	Identify and classify precursor conotoxins into gene superfamilies; Provide relevant information (frequency of protein sequences, length, number of cysteine residues, hydrophobicity rate of N-terminal region etc.)
pHMMs	Technical University of Denmark, Denmark	Identify precursor peptides and classify the sequences into gene superfamily
Tools for proteomics
ConoMass	The University of Queensland, Australia	Match the experimental proteomic mass list against masses predicted from transcripts; mass spectrometry comparison; PTM identification
Mascot	Matrix Science, UK	Peptide mass fingerprinting; MS/MS database searches
ProteinPilot	AB SCIEX, USA	Searching and identification of mass sequences; Identification of PTMs
MaxQuant	Max Planck Institute of Biochemistry, Germany	Quantitative analysis of label-free and SILAC-based data; PTM identification
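The mass-matching function that Table 5 lists for ConoMass (comparing an experimental mass list with masses predicted from transcripts) can be sketched in a few lines. This is a minimal illustration of the general idea, not ConoMass's actual algorithm: the function names, the 10 ppm tolerance, the toy sequence "GCCS", and the observed masses are all hypothetical. The residue masses are standard monoisotopic values, and only two modifications common in conotoxins (disulfide bonds and C-terminal amidation) are modeled.

```python
# Standard monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056       # added once per linear peptide
DISULFIDE = -2.01565   # loss of two hydrogens per disulfide bond
AMIDATION = -0.98402   # C-terminal -OH replaced by -NH2


def peptide_mass(seq: str, disulfides: int = 0, amidated: bool = False) -> float:
    """Predict the monoisotopic mass of a mature peptide from its sequence."""
    mass = sum(MONO[aa] for aa in seq) + WATER
    mass += disulfides * DISULFIDE
    if amidated:
        mass += AMIDATION
    return mass


def match_masses(observed, predicted, tol_ppm: float = 10.0):
    """Pair each observed mass with every predicted mass within tol_ppm."""
    hits = []
    for obs in observed:
        for name, theo in predicted.items():
            if abs(obs - theo) / theo * 1e6 <= tol_ppm:
                hits.append((obs, name))
    return hits


if __name__ == "__main__":
    # Hypothetical predicted forms of one toy transcript-derived sequence.
    predicted = {
        "toy_linear": peptide_mass("GCCS"),
        "toy_1SS": peptide_mass("GCCS", disulfides=1),
        "toy_1SS_amide": peptide_mass("GCCS", disulfides=1, amidated=True),
    }
    observed = [366.0670, 500.0000]  # hypothetical experimental masses
    print(match_masses(observed, predicted))
```

Enumerating several modified forms per precursor, as in the `predicted` dictionary, reflects the fact that one transcript can correspond to more than one mature mass.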

5.4. Multi-Omics Integration

A comprehensive strategy that integrates transcriptomics with proteomics through bioinformatics, termed “multi-omics” or “venomics” [108,140], is popular in the field of conotoxin research [110]. Although venom duct transcriptomics and venom proteomics have both proven to be effective, high-throughput methods for identifying large numbers of conopeptide sequences, the conotoxin sequences generated by transcriptomics are putative precursor peptides whose existence at the protein level still needs to be confirmed. Furthermore, PTMs cannot be predicted from the precursor sequences. Venom proteomics, however, can validate the putative peptides at the protein level (Table 4) and identify nearly all PTMs [49,128]. The validated sequence and PTM data help to illustrate the processing mechanisms (transcript variation, VPP, PTMs) that lead from precursor peptides (transcriptomic data) to the corresponding mature peptides (proteomic data). Just as not every putative precursor can be validated by proteomic data, not every peptide sequence in the proteomic data can be matched to a corresponding precursor (Table 4). In fact, the two datasets overlapped and matched at a very small percentage, 9.98%, for Conus episcopatus [50]. The transcriptomic and proteomic datasets therefore differ substantially, and the overlapping data (hit sequences) remain limited. How can the datasets be extended to give access to the complete repertoire of conotoxins? How can the variation be decoded so that more precursors can be matched with their corresponding mature peptides? Because one precursor can generate various mature conopeptides through different PTMs, proteomics should in theory detect more mature peptides. In practice, the number detected varies greatly with the MS instrument, bioinformatic analysis methods, fractionation, and sample pretreatment. Rare transcripts at a low translational level are difficult to detect, which also contributes to the disparity. Thus, standardized processing protocols, reliable detection methods, dedicated integrated databases, and robust data analysis tools are needed.
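The overlap between transcriptomic and proteomic datasets discussed above can be quantified with simple set arithmetic. The following is a minimal sketch, assuming both datasets have already been reduced to comparable mature peptide sequences; the function name and the toy sequences are hypothetical. Because the text does not state which dataset served as the denominator for the reported percentage, the sketch reports the overlap relative to each dataset.

```python
def overlap_stats(transcriptomic, proteomic):
    """Summarize exact-sequence overlap between two peptide collections.

    Sequences are compared case-insensitively after stripping whitespace.
    Percentages are reported against each dataset separately, since the
    choice of denominator is an assumption of this sketch.
    """
    t = {s.strip().upper() for s in transcriptomic}
    p = {s.strip().upper() for s in proteomic}
    shared = t & p
    return {
        "shared": len(shared),
        "transcript_only": len(t - p),
        "proteome_only": len(p - t),
        "pct_of_transcriptomic": 100.0 * len(shared) / len(t) if t else 0.0,
        "pct_of_proteomic": 100.0 * len(shared) / len(p) if p else 0.0,
    }


if __name__ == "__main__":
    # Toy placeholder sequences, not real conotoxins.
    predicted_from_transcripts = ["AAA", "CCC", "GGG", "KKK"]
    identified_by_ms = ["ccc", "KKK", "WWW", "YYY", "PPP"]
    for key, value in overlap_stats(predicted_from_transcripts, identified_by_ms).items():
        print(key, value)
```

Exact string matching is the strictest possible criterion; in real data, PTMs and variable processing mean that a proteomic sequence may differ from its predicted precursor form, which is one reason the measured overlap tends to be small.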

6. Conclusions and Prospects

In this review, we introduced the discovery methodology of novel conotoxins from various Conus species. It focused on obtaining full N- to C-terminal sequences, regardless of disulfide connectivity, through crude venom purification, conotoxin precursor gene cloning, venom duct transcriptomics, venom proteomics, and multi-omic methods. The protocols, advantages, disadvantages, and developments of the different approaches during the last decade were summarized, and their prospects were discussed. To overcome the limitations of the crude venom purification strategy, gene cloning techniques have been developed, which temporarily slow the depletion of native cone snail resources. To improve efficiency, high-throughput omic and multi-omic strategies have opened a new era for conotoxin discovery. Transcriptomics and proteomics are now acknowledged to be effective, resource-saving, and high-throughput approaches for novel conotoxin discovery. A multi-omic strategy is more efficient than using transcriptomics or proteomics alone. Efforts should be made to decode and reduce the variations between transcriptomic and proteomic data in order to expand the accessible repertoire of known conotoxins. The precursor processing mechanisms also need to be elucidated. Thus, standardized processing protocols, reliable detection methods, dedicated integrated databases, and robust data analysis tools for transcriptomic, proteomic, and multi-omic studies are required to speed up novel conotoxin discovery.

137 in total

1. The synthesis, structural characterization, and receptor specificity of the alpha-conotoxin Vc1.1.

Authors: Richard J Clark; Harald Fischer; Simon T Nevin; David J Adams; David J Craik
Journal: J Biol Chem Date: 2006-06-05 Impact factor: 5.157

Review 2. Ziconotide: neuronal calcium channel blocker for treating severe chronic pain.

Authors: G P Miljanich
Journal: Curr Med Chem Date: 2004-12 Impact factor: 4.530

3. A novel alpha-conotoxin, PeIA, cloned from Conus pergrandis, discriminates between rat alpha9alpha10 and alpha7 nicotinic cholinergic receptors.

Authors: J Michael McIntosh; Paola V Plazas; Maren Watkins; María E Gomez-Casati; Baldomero M Olivera; A Belén Elgoyhen
Journal: J Biol Chem Date: 2005-06-27 Impact factor: 5.157

Review 4. Ziconotide: a clinical update and pharmacologic review.

Authors: Jason E Pope; Timothy R Deer
Journal: Expert Opin Pharmacother Date: 2013-03-28 Impact factor: 3.889

Review 5. Next-generation sequencing platforms.

Authors: Elaine R Mardis
Journal: Annu Rev Anal Chem (Palo Alto Calif) Date: 2013 Impact factor: 10.745

6. Purification and properties of a myotoxin from Conus geographus venom.

Authors: L J Cruz; W R Gray; B M Olivera
Journal: Arch Biochem Biophys Date: 1978-10 Impact factor: 4.013

7. Screening and Validation of Highly-Efficient Insecticidal Conotoxins from a Transcriptome-Based Dataset of Chinese Tubular Cone Snail.

Authors: Bingmiao Gao; Chao Peng; Bo Lin; Qin Chen; Junqing Zhang; Qiong Shi
Journal: Toxins (Basel) Date: 2017-07-06 Impact factor: 4.546

8. Identification of a Novel O-Conotoxin Reveals an Unusual and Potent Inhibitor of the Human α9α10 Nicotinic Acetylcholine Receptor.

Authors: Shantong Jiang; Han-Shen Tae; Shaoqiong Xu; Xiaoxia Shao; David J Adams; Chunguang Wang
Journal: Mar Drugs Date: 2017-06-09 Impact factor: 5.118

9. A Conus regularis conotoxin with a novel eight-cysteine framework inhibits CaV2.2 channels and displays an anti-nociceptive activity.

Authors: Johanna Bernáldez; Sergio A Román-González; Oscar Martínez; Samanta Jiménez; Oscar Vivas; Isabel Arenas; Gerardo Corzo; Roberto Arreguín; David E García; Lourival D Possani; Alexei Licea
Journal: Mar Drugs Date: 2013-04-08 Impact factor: 5.118

10. α-Conotoxin Vc1.1 inhibits human dorsal root ganglion neuroexcitability and mouse colonic nociception via GABA_B receptors.

Authors: Joel Castro; Andrea M Harrington; Sonia Garcia-Caraballo; Jessica Maddern; Luke Grundy; Jingming Zhang; Guy Page; Paul E Miller; David J Craik; David J Adams; Stuart M Brierley
Journal: Gut Date: 2016-02-17 Impact factor: 23.059
