Ilona Urbarova1, Bård Ove Karlsen1, Siri Okkenhaug1, Ole Morten Seternes2, Steinar D Johansen1,3, Åse Emblem1.
Abstract
Marine bioprospecting is the search for new marine bioactive compounds, and large-scale screening of extracts represents the traditional approach. Here, we report an alternative, complementary protocol, called digital marine bioprospecting, based on deep sequencing of transcriptomes. We sequenced the transcriptomes from the adult polyp stage of two cold-water sea anemones, Bolocera tuediae and Hormathia digitata. We generated approximately 1.1 million quality-filtered sequencing reads by 454 pyrosequencing, which were assembled into approximately 120,000 contigs and 220,000 single reads. Based on annotation and gene ontology analysis, we profiled the expressed mRNA transcripts according to known biological processes. As a proof of concept, we identified polypeptide toxins with potential blocking activity on voltage-gated sodium and potassium channels from the digital transcriptome libraries.
Keywords: deep sequencing; drug discovery; marine bioprospecting; neurotoxin; sea anemone; transcriptomics
Year: 2012 PMID: 23170083 PMCID: PMC3497022 DOI: 10.3390/md10102265
Source DB: PubMed Journal: Mar Drugs ISSN: 1660-3397 Impact factor: 6.085
Figure 1. (A) The cold-water sea anemone species B. tuediae and H. digitata included in this study; (B) Flowchart describing the digital bioprospecting pipeline, from RNA extraction to prediction of candidate biomolecules, which can be expressed in functional trials. Photo by SDJ.
Transcriptome sequencing and assembly a.
| Species | Reads/Contigs | Number | Average size (nt) | Total nt |
|---|---|---|---|---|
| B. tuediae | Raw reads | 547,061 | 547 | 299,232,484 |
| B. tuediae | Trimmed reads | 546,903 | 333 | 182,128,133 |
| B. tuediae | All contigs | 64,442 | 591 | 38,101,858 |
| B. tuediae | Large contigs | 5072 | 1380 | 6,997,895 |
| B. tuediae | Single reads | 118,104 | 279 | 33,008,862 |
| H. digitata | Raw reads | 546,974 | 543 | 296,833,666 |
| H. digitata | Trimmed reads | 546,846 | 331 | 181,169,361 |
| H. digitata | All contigs | 54,293 | 613 | 33,255,104 |
| H. digitata | Large contigs | 5083 | 1430 | 7,272,471 |
| H. digitata | Single reads | 105,695 | 260 | 27,786,964 |
a Number of sequencing reads obtained from 454 pyrosequencing of the transcriptomes of the two sea anemones B. tuediae and H. digitata. Raw reads represent all sequence reads obtained from the transcriptome sequencing. Trimmed reads represent raw reads after trimming of the key tag (TCAG) at the 5′ end and removal of low-quality and adapter sequences. All contigs represent all contigs assembled by MWG Eurofins. Large contigs represent assembled contigs larger than 1000 bases. Single reads represent reads that are found in only one copy in the dataset.
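The categories defined in this footnote can be illustrated with a minimal Python sketch. It assumes the sequences are already loaded as plain strings, and the function names are hypothetical. It covers only the key-tag trimming, the size threshold for large contigs, and the three summary columns of the table; the quality and adapter filtering is not reproduced, and this is not the pipeline used by the authors or by the sequencing provider.

```python
KEY_TAG = "TCAG"          # 454 key tag named in the table footnote
LARGE_CONTIG_MIN = 1000   # large contigs are larger than 1000 bases


def trim_key_tag(read, tag=KEY_TAG):
    """Remove the key tag if it is present at the 5' end of a read."""
    return read[len(tag):] if read.startswith(tag) else read


def large_contigs(contigs, min_len=LARGE_CONTIG_MIN):
    """Keep only contigs strictly longer than min_len bases."""
    return [c for c in contigs if len(c) > min_len]


def summarize(seqs):
    """Return (number, average size in nt, total nt) for a list of sequences."""
    lengths = [len(s) for s in seqs]
    total = sum(lengths)
    average = round(total / len(lengths)) if lengths else 0
    return len(lengths), average, total
```

Applied to the trimmed reads or contigs of one species, `summarize` would yield the kind of values reported in the Number, Average size and Total nt columns.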
Figure 2. Gene Ontology (GO) assignment for B. tuediae and H. digitata from 454 pyrosequencing. All assembled contigs together with single reads were searched with BLAST and annotated. For the 182,546 and 159,988 sequences (contigs plus single reads) of B. tuediae and H. digitata, respectively, 104,622 and 128,814 GO terms in total were assigned. Furthermore, 127 GO slim ancestor terms were assigned to both species. Transcripts were annotated in three main categories: cellular component, molecular function and biological process. The top 15 classes from each GO category were chosen as representatives for transcriptome comparison. A single transcript could be assigned to more than one category.
Figure 3. Representative examples of predicted neurotoxin candidates in H. digitata transcriptome libraries. (A) Recognition of one sodium channel (HdNa3) and one potassium channel (HdK2a) neurotoxin candidate from H. digitata based on amino acid sequence alignments. Observed cysteine residues involved in disulfide bridges are indicated. The N-terminal leader peptide sequences (italics) are proposed to be cleaved off at the tandem cleavage sequence (KR). (B) Structure predictions of the HdNa3 and HdK2a mature peptide regions. Predictions were made in SWISS-MODEL. The sodium channel neurotoxin prediction contains only β-sheets and loops, in contrast with the potassium channel neurotoxin, which also contains an α-helix motif. Disulfide bridges are indicated by white lines between β-sheet motifs. (C) Two additional potassium channel neurotoxin candidates from group II, one predicted for Bolocera (BtK2) and one for Hormathia (HdK2b). The 3D structure predictions of both of these type II potassium channel toxins are similar to that of the HdK2a potassium channel neurotoxin from H. digitata. Note that a star (*) below the alignments in (A,C) indicates identical amino acids. Conserved amino acid changes are indicated by (: or ·).
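The proposed leader peptide removal at the KR tandem sequence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it cuts after the first KR found in the precursor (a real mature peptide may itself contain KR, so the correct site has to be judged from the alignment), the function names are hypothetical, and the example sequence is an invented placeholder, not a sequence from this study.

```python
def split_at_kr(precursor):
    """Split a precursor after the first KR dibasic site.

    Returns (leader, mature), or None if no KR site is present.
    Assumption: the first KR is the cleavage site.
    """
    site = precursor.find("KR")
    if site == -1:
        return None
    cut = site + 2  # cleavage occurs after the KR pair
    return precursor[:cut], precursor[cut:]


def cysteine_positions(peptide):
    """Return 1-based positions of cysteine residues in a peptide."""
    return [i for i, aa in enumerate(peptide, start=1) if aa == "C"]


# Invented placeholder sequence, for illustration only.
toy_precursor = "MKTLLVAAKRGCPCAGDCK"
leader, mature = split_at_kr(toy_precursor)
```

Listing the cysteine positions of the mature region gives a quick check on whether a candidate has the even number of cysteines needed to pair into disulfide bridges.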
Conserved domain recognition.
| CDD, input and output a | B. tuediae | H. digitata |
|---|---|---|
| Query amino acid sequences | 864 | 1236 |
| Queries with domain hits | 131 | 229 |
| Total number of domain hits | 151 | 267 |
| KU (Kunitz-type) | 135 | 211 |
| Toxin4 | - | 23 |
| KAZAL_FS | 6 | 23 |
| Antistatin | 6 | - |
| WAP | 1 | 3 |
| TY | 2 | 1 |
| ShK | - | 1 |
| VMA21-like | - | 1 |
| NTR | - | 1 |
a Conserved domain recognition in transcriptome data from B. tuediae and H. digitata. A neurotoxin-enriched portion of the 454 transcriptome raw reads was translated into six reading frames and run through NCBI's Conserved Domain Database (CDD). Recognized superfamily domains included: KU, Kunitz-type toxins (serine proteinase inhibitor); Toxin4, sea anemone neurotoxin; KAZAL_FS, serine protease inhibitor; Antistatin, serine protease inhibitor; WAP, whey acidic protein-type four-disulfide core domains; TY, thyroglobulin type I; ShK, three disulfide bridges, potassium channel inhibitor; VMA21-like, two potential transmembrane helices; NTR-like, beta barrel.
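The six-frame translation step mentioned in this footnote can be sketched as follows. This is a minimal Python illustration, not the authors' tooling: it assumes the standard genetic code, translates each frame straight through stop codons (written as `*`), and marks any codon containing an ambiguous base as `X`. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate a DNA sequence in frame 1; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Return translations for frames +1..+3 and -1..-3, keyed by frame."""
    frames = {}
    for sign, strand in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames
```

Each of the six translated sequences per read would then be submitted as a separate protein query to the domain search.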