
Global transcriptome profiling of the pine shoot beetle, Tomicus yunnanensis (Coleoptera: Scolytinae).

Jia-Ying Zhu, Ning Zhao, Bin Yang.

Abstract

BACKGROUND: The pine shoot beetle Tomicus yunnanensis (Coleoptera: Scolytinae) is an economically important pest of Pinus yunnanensis in southwestern China. Resistance to insecticides, developed through long-term use of chemical pesticides, contributes to the serious damage it causes and poses a challenge for management. In addition, its highly efficient adaptation to divergent ecological environments makes this pest a great potential threat to pine forests. However, the underlying molecular mechanisms remain unknown, as only limited nucleotide sequence data are available for this species.
METHODOLOGY/PRINCIPAL FINDINGS: In this study, we applied next-generation sequencing (Illumina sequencing) to sequence the adult transcriptome of T. yunnanensis. A total of 51,822,230 reads were obtained and assembled into 140,702 scaffolds and 60,031 unigenes. The unigenes were functionally annotated with gene descriptions, Gene Ontology (GO), Clusters of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG). In total, 80,932 GO annotations were assigned to the unigenes, 13,599 unigenes were assigned to COG, and 33,875 unigenes were found in KO categories. A biochemical pathway database containing 219 predicted pathways was also created based on the annotations. In-depth analysis of the data revealed a large number of genes related to insecticide resistance, as well as heat shock protein genes associated with environmental stress.
CONCLUSIONS/SIGNIFICANCE: The results facilitate investigation of the molecular mechanisms of resistance to insecticides and environmental stress. This study lays the foundation for future functional genomics studies of important biological questions in this pest.

Year:  2012        PMID: 22384206      PMCID: PMC3285671          DOI: 10.1371/journal.pone.0032291

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The bark beetle genus Tomicus (Coleoptera: Curculionidae: Scolytinae) comprises seven described species [1]. Among them, T. piniperda and T. minor have a Palearctic distribution from Europe to Japan and China and have been repeatedly introduced to North America; T. destruens is found only around the Mediterranean Basin; and T. yunnanensis, T. brevipilosus, T. pilifer and T. puellus are restricted to southwestern, eastern, and central China [2]. Tomicus species damage various pines mainly through shoot feeding and trunk attack, and they are among the most damaging pests of pine forests in the countries where outbreaks occur. In particular, some Tomicus species have evolved different ecological strategies of host use, wider distribution ranges, and high levels of resistance to major classes of insecticides owing to biological and genetic factors. As exotic pests, they frequently invade new areas and can inflict serious damage on pine forests. Because of its global economic importance and ecological threat, the genus has been well studied in terms of general biology and ecology [3]–[5], but it remains poorly investigated at the molecular level. Only a few studies have addressed its population genetics and species evolution [2], [6]. As of September 26, 2011, only 719 nucleotide sequences had been deposited at the National Center for Biotechnology Information (NCBI) for Tomicus. Owing to this lack of genetic information, the molecular mechanisms underlying the role of Tomicus in forest ecosystems are poorly understood. Next-generation (high-throughput deep) sequencing now offers an efficient, large-scale, genome-wide approach to gene discovery and expression profiling, and to functional, comparative and evolutionary genomics, in non-model organisms for which little or no genomic information exists [7], [8]. In the past few years, several studies based on this technology have enabled efficient, large-scale investigation of molecular mechanisms in insect species lacking genome information, such as Bemisia tabaci, Nilaparvata lugens, and Trialeurodes vaporariorum [9]–[11].
T. yunnanensis is a newly described pine shoot beetle found only in Yunnan, Sichuan, and Guizhou Provinces in southwestern China, and it has affected 200,000 ha of Pinus yunnanensis forests over the past 30 years [12]. Until now, control of this pest has depended largely on chemical insecticides, and it has evolved a high level of resistance to them. This might be due to increased metabolic capability of detoxification systems and/or reduced target-site sensitivity. In addition, this pest is distributed across divergent ecological environments: its range is divided by high mountains and deep valleys that act as geographical barriers, and ecological conditions differ greatly among areas. This suggests that T. yunnanensis has adapted efficiently to environmental stress in order to survive in diverse ecologies. Although the resistance of T. yunnanensis to insecticides and to severe environmental stress is an ongoing challenge for pest management, no information is available on the underlying molecular mechanisms. In the present work, we characterized the first global transcriptome of a Tomicus species, T. yunnanensis, using Illumina technology. A systematic bioinformatics strategy was used to functionally annotate the transcriptome data. In addition, important homologues of genes involved in insecticide resistance and heat shock protein (HSP) genes associated with environmental stress were identified. This study provides a foundation for further functional genomics studies of T. yunnanensis and other Tomicus species.

Materials and Methods

Ethics statement

Regarding the field study, no specific permits were required. The location is not privately-owned or protected in any way. The field studies did not involve endangered or protected species.

Insects

Adult beetles in the trunk-attacking phase on P. yunnanensis were collected from Qujing city, Yunnan province, China, and identified as T. yunnanensis based on morphological characters [1]. The samples were stored frozen at −80°C until use.

cDNA library and Illumina sequencing

Total RNA was extracted from 20 female and 20 male adult beetles using TRIzol® Reagent (Invitrogen) according to the manufacturer's instructions. RNA quality and yield were assessed with a 2100 Bioanalyzer (Agilent Technologies), requiring a minimum RNA integrity number (RIN) of 8. The cDNA library was prepared using Illumina's kit following the manufacturer's recommendations. Briefly, messenger RNA was isolated from 20 µg of total RNA (pooled from female and male adults) using oligo(dT) magnetic beads and fragmented into short sequences using divalent cations at elevated temperature. First- and second-strand cDNA were synthesized from the cleaved RNA fragments. After end repair and adaptor ligation, the products were cleaned up with a QIAquick PCR Purification Kit to create the final cDNA library. The library was sequenced on the Illumina sequencing platform (GAII) to obtain short sequences from both ends.

Bioinformatic analysis

Base calling from the images and quality value calculation were performed by the Illumina data processing pipeline (version 1.6). Before assembly, the raw reads were cleaned by removing adaptor sequences, empty reads and low-quality sequences (reads with unknown bases 'N') to obtain high-quality clean reads. The clean reads were then assembled into contigs, scaffolds, and unigenes using SOAPdenovo software [13], and clustered using TGI Clustering tools [14]. All assembled unigenes were searched against the NCBI nr database, Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Clusters of Orthologous Groups (COG) with the BLASTX algorithm. The E-value cut-off was set at 1.0E−5. Genes were tentatively identified according to the best hits against known sequences. Blast2GO [15] was used to predict the functions of the sequences, assign Gene Ontology (GO) terms, and predict the metabolic pathways in the COG and KEGG databases. Amino acid sequences were aligned with the Clustal X (v1.83) program [16]. Phylogenetic trees were constructed from these alignments with the MEGA 5.0 package [17] using the neighbour-joining method with the Poisson correction model. Bootstrap analysis with 1,000 replicates was performed to evaluate the support for each branch.
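The best-hit selection described above can be sketched in a few lines. This is a minimal illustration only, assuming the BLASTX results were exported in the standard 12-column tabular layout (query ID in column 1, subject ID in column 2, E-value in column 11); the function `best_hits`, the helper `_row` and the sample rows are hypothetical and are not taken from the study.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # cut-off stated in the text


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: (subject_id, e_value)} with the lowest E-value hit per query.

    Assumes the 12-column BLAST tabular layout (an assumption, not stated
    in the paper): column 1 = query, column 2 = subject, column 11 = E-value.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, e_value = row[0], row[1], float(row[10])
        if e_value > cutoff:
            continue  # hit is weaker than the cut-off
        if query not in best or e_value < best[query][1]:
            best[query] = (subject, e_value)
    return best


def _row(query, subject, e_value):
    # Hypothetical helper: builds one made-up 12-column row for the demo.
    return "\t".join([query, subject, "50.0", "100", "50", "0",
                      "1", "300", "1", "100", e_value, "100"])


SAMPLE = "\n".join([
    _row("Unigene1", "protA", "2e-28"),
    _row("Unigene1", "protB", "1e-40"),
    _row("Unigene2", "protC", "0.01"),
])

if __name__ == "__main__":
    print(best_hits(SAMPLE))
```

In this made-up sample, the second query has no hit passing the cut-off and therefore stays unannotated, which mirrors how unigenes without a significant match are treated in the results.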

Data availability

The Illumina reads of T. yunnanensis have been submitted to NCBI Short Read Archive under the accession number of SRA047283.

Results and Discussion

Sequencing and de novo assembly

Illumina sequencing resulted in 51,822,230 raw reads, corresponding to an accumulated length of 4,664,000,700 bp (Table 1). The average read length was 90 bp, which is consistent with the Illumina sequencing capacity. After adaptor sequences, empty reads and low-quality sequences were filtered out, the reads were assembled with SOAPdenovo into 642,521 contigs with a mean length of 117 bp. Contig lengths ranged from 50 to 2953 bp. Although the majority of the contigs were between 50 and 400 bp, 9,331 contigs were longer than 400 bp. The size distribution of these contigs is shown in Figure S1. Using paired-end joining and gap filling, the contigs were further assembled into 140,702 scaffolds with a mean length of 233 bp and a length range of 100–2953 bp; 2,293 scaffolds were longer than 800 bp (Figure S2). Using TGI software, the scaffold sequences were assembled into clusters, yielding 60,031 unigenes with a mean length of 355 bp. Of these, 8,953 unigenes were ≥500 bp and 1,055 were ≥1000 bp, so about 85% of the unigenes fell between 100 and 500 bp in length (Figure 1). The scaffold and unigene length distributions closely followed the contig length distribution, with the majority being shorter sequences with relatively little redundancy, similar to other insect transcriptome projects using this technology [9]–[11], [18], [19]. The majority of scaffolds and unigenes after assembly were still shorter than 500 bp, which might be due to the short read length of Illumina sequencing and/or the low coverage of the transcriptome represented in this dataset [20]. The abundant assembled sequence data provide a rich source of information for further investigation, allowing rapid characterization of a large portion of the transcriptome and a better reference for genes of interest.
Table 1

Sequence statistics of the Illumina sequencing assembly.

Statistic | Reads | Contigs | Scaffolds | Unigenes
Number of sequences | 51,822,230 | 642,521 | 140,702 | 60,031
Mean length (bp) | 90 | 117 | 233 | 355
Total length (bp) | 4,664,000,700 | 74,951,788 | 32,749,102 | 21,338,135
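As a quick consistency check on Table 1, the mean length in each column should equal the total length divided by the number of sequences, up to rounding. The short sketch below recomputes those means from the reported counts and totals; the dictionary `TABLE_1` and the function `mean_length` are illustrative names, not part of the original analysis.

```python
# Counts and total lengths (bp) copied from Table 1.
TABLE_1 = {
    "reads": (51_822_230, 4_664_000_700),
    "contigs": (642_521, 74_951_788),
    "scaffolds": (140_702, 32_749_102),
    "unigenes": (60_031, 21_338_135),
}


def mean_length(count, total_bp):
    """Mean sequence length in bp: total bases divided by sequence count."""
    return total_bp / count


if __name__ == "__main__":
    for name, (count, total_bp) in TABLE_1.items():
        print(f"{name}: mean length = {mean_length(count, total_bp):.2f} bp")
```

Rounded to whole base pairs, the recomputed means agree with the values reported in the table (90, 117, 233 and 355 bp).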
Figure 1

Length distribution of unigenes.

The y-axis is shown on a logarithmic scale.

Annotation of predicted proteins

The assembled unigenes were used as queries for BLASTX searches against the NCBI nr protein database with an E-value cut-off of 1.0E−5. The search produced 34,702 hits, corresponding to 57.81% of all unigenes (Table S1). The remaining unigenes (about 42%) had no significant match in any of the existing databases, indicating that many of them may represent novel sequences, including Coleoptera- or species-specific transcripts or transcript parts (orphan UTRs). This is expected, as very little sequence information is available from closely related species. The E-value distribution of the top hits in the nr database showed that 17% of the mapped sequences had strong homology (E-value smaller than 1.0E−49), whereas 83% of the homologous sequences had E-values between 1.0E−5 and 1.0E−49 (Figure 2A). The species distribution of the best match for each sequence showed that 62.48% of the T. yunnanensis sequences matched sequences from the coleopteran species Tribolium castaneum, while only very low proportions (<5%) matched other insects (Figure 2B). This suggests that T. yunnanensis is evolutionarily close to T. castaneum.
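The two summary figures in this paragraph (the share of unigenes with a hit, and the split of top-hit E-values into strong and moderate homology) follow from simple counting. The sketch below shows one way to compute them, assuming a list of top-hit E-values is available; the function names and the example E-values in the usage note are hypothetical.

```python
def annotation_rate(n_hits, n_unigenes):
    """Percentage of unigenes with at least one hit passing the cut-off."""
    return 100.0 * n_hits / n_unigenes


def bin_evalues(e_values, strong=1e-49, cutoff=1e-5):
    """Count top hits with strong (< strong) or moderate (strong..cutoff) E-values.

    Hits weaker than the cut-off are ignored, mirroring the 1.0E-5 filter
    in the text. The thresholds are the ones quoted for Figure 2A.
    """
    counts = {"strong": 0, "moderate": 0}
    for e_value in e_values:
        if e_value > cutoff:
            continue
        if e_value < strong:
            counts["strong"] += 1
        else:
            counts["moderate"] += 1
    return counts


if __name__ == "__main__":
    # Counts reported in the text: 34,702 hits among 60,031 unigenes.
    print(f"{annotation_rate(34_702, 60_031):.2f}% of unigenes had a hit")
```

For example, `bin_evalues([1e-60, 1e-20, 1e-3])` counts one strong hit and one moderate hit and ignores the third value, which is weaker than the cut-off.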
Figure 2

Characteristics of homology search of Illumina sequences against the nr database.

(A) E-value distribution of BLAST hits for each unique sequence with a cut-off E-value of 1.0E-5. (B) Species distribution of the BLASTX results. We used the first hit of each sequence for analysis.

GO assignments

Gene Ontology is widely used to standardize the representation of genes across species and provides a controlled vocabulary of terms for describing gene products [21]. In total, 80,932 GO annotations were assigned to the unigenes based on BLASTX matches with sequences of known function (Figure 3, Table S2). These annotations were summarized into the three main GO categories (biological process, cellular component, and molecular function) and 47 sub-categories according to the standard GO terms. Compared with the GO annotations of the Drosophila melanogaster genome [22], our sequence data do not show any notable bias towards particular categories of genes. Biological process accounted for the largest share of the GO annotations (38,578; 47.67% of the total), followed by cellular component (26,119; 32.27%) and molecular function (16,235; 20.06%). Within the biological process category, cellular process (17.97%) and metabolic process (15.12%) were the dominant subcategories, suggesting that the analyzed tissues were undergoing rapid growth and extensive metabolic activity. The next most represented subcategories were multicellular organismal process (8.54%), developmental process (8.21%), biological regulation (7.85%), localization (6.88%), and cellular component organization (5.81%). The biological process category covered all of the major cellular processes, from transport and cellular organization to transcription, translation, and metabolism. Within the cellular component category, cell (29.91%), cell part (29.91%), and organelle (17.75%) were the most highly represented subcategories. Most of the unigenes annotated with a cellular component are localized to plastids or mitochondria. The molecular function category mainly comprised proteins involved in binding (46.54%), predominantly heat shock proteins (HSPs), and catalytic activity (36.91%), including hydrolases, kinases, and transferases, allowing for the identification of genes involved in secondary metabolite synthesis pathways. Similar observations for metabolic processes were reported in transcriptomic studies of other insects [18]. These GO annotations represent a general gene expression profile for T. yunnanensis adults and show that the expressed genes in this species encode diverse structural, regulatory and stress proteins.
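The percentages quoted for the three main GO categories can be reproduced from the reported counts, which also shows that the three counts sum to the reported total of 80,932. The snippet below is only an arithmetic check; the names `GO_COUNTS` and `category_shares` are illustrative.

```python
# Annotation counts per main GO category, as reported in the text.
GO_COUNTS = {
    "biological process": 38_578,
    "cellular component": 26_119,
    "molecular function": 16_235,
}


def category_shares(counts):
    """Return each category's share of all annotations, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    print("total annotations:", sum(GO_COUNTS.values()))
    for name, share in category_shares(GO_COUNTS).items():
        print(f"{name}: {share:.2f}%")
```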
Figure 3

Distribution of second level GO of Tomicus yunnanensis transcriptome.

(A) biological process, (B) cellular component and (C) molecular function. The percentage of a specific category of genes in that main category is shown.

COG classification

A total of 13,599 unigenes were assigned to COG clusters (Figure 4). These COG classifications were grouped into 25 functional categories that correspond to the categories observed in the GO analysis. The cluster for general function prediction only (16.10%) was the largest group. The next three largest categories were (1) translation, ribosomal structure and biogenesis (9.85%), (2) posttranslational modification, protein turnover, chaperones (8.54%), and (3) replication, recombination and repair (7.54%). The category of secondary metabolites biosynthesis, transport and catabolism (2.86%) is highlighted because of the relevance of these processes to insecticide metabolism in insects. The most abundant sequences in this category are cytochrome P450 monooxygenases. To some extent, the COG classifications shed light on specific responses and functions involved in the molecular processes of T. yunnanensis.
Figure 4

COG function classification of the Tomicus yunnanensis transcriptome.

All putative proteins were aligned to the COG database and can be classified functionally into at least 25 molecular families.

KEGG analysis

To identify the biological pathways that are active in T. yunnanensis, we mapped the 34,702 annotated sequences to the reference canonical pathways in KEGG. A total of 16,727 unigenes were assigned to 219 KEGG pathways. All the pathways are summarized in Table S3. The top 10 pathways are metabolic pathways (3,339 members), spliceosome (665 members), Huntington's disease (663 members), pathways in cancer (586 members), lysosome (527 members), purine metabolism (509 members), regulation of actin cytoskeleton (472 members), Alzheimer's disease (471 members), ubiquitin mediated proteolysis (467 members), and focal adhesion (466 members). These annotations provide a valuable resource for investigating specific processes, functions and pathways in T. yunnanensis.

Putative insecticide resistance-related genes

Cytochrome P450 (P450)

Because of their genetic diversity, broad substrate specificity, and catalytic versatility, P450s can mediate resistance to all classes of insecticides [23]. Approximately 146 P450-related unigenes were identified in the transcriptome. Of these, 23 were longer than 600 bp (Table 2); the majority were short fragments, which are listed in Table S4. In general, insect genomes harbor ∼100 different P450s, which can be divided into four clades (CYP2, CYP3, CYP4 and mito) [24], [25]. The number of P450 unigenes in the T. yunnanensis transcriptome falls within the range identified in other insect species, although the exact gene number remains to be determined by gene cloning based on the fragments obtained here. The 23 longer unigenes could be assigned to the four clades by phylogenetic analysis. Most of these P450s belonged to the CYP3 and CYP4 clades rather than the CYP2 and mito clades, in agreement with Tribolium castaneum and other insect systems [25], [26]. Because some functions can be attributed to each of the four clades based on P450s characterized in other insects, interesting candidates can be selected for further investigation according to the phylogenetic results. Until recently, increased production of P450s in resistant insects had been shown to occur almost exclusively through up-regulation via changes in cis- or trans-acting regulatory loci, but gene duplication or amplification of P450s has now been implicated in the resistance of four insect species [27]. In the phylogeny, gene duplication events specific to T. yunnanensis are apparent when sequences are considered duplicates if they pair with bootstrap support greater than 70%; the best examples are the two CYP3-type sequences Unigene58724 and Unigene58733 and the two CYP4-type sequences Unigene59997 and Unigene8705. However, the relevance of these duplications to insecticide resistance in T. yunnanensis requires further investigation.
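The phylogenetic analysis relies on neighbour-joining with the Poisson correction model (see Materials and Methods). As a rough illustration of what that correction does, the sketch below computes the Poisson-corrected distance d = −ln(1 − p) for two aligned protein sequences, where p is the proportion of differing sites. It is not the MEGA implementation: the pairwise exclusion of gapped columns is an assumption made here, and the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing residues between two aligned sequences.

    Columns containing a gap in either sequence are skipped (pairwise
    deletion); this handling is an assumption, not stated in the paper.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected amino acid distance: d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Toy aligned peptides, invented for illustration only.
    print(poisson_distance("MKTA", "MKTG"))
```

A matrix of such pairwise distances is the input that a neighbour-joining algorithm would cluster into a tree.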
Table 2

Putatively identified P450 genes in Tomicus yunnanensis.

Unigene ID | Putative assignment | Length (bp) | First hit | Identity (%) | E-value | BLAST annotation/Organism
Unigene12883 | CYP3 | 612 | XP_975568 | 44 | 2.00E-28 | PREDICTED: similar to cytochrome P450 [Tribolium castaneum]
Unigene15066 | CYP4 | 819 | EFA01330 | 58 | 9.00E-59 | cytochrome P450-like protein [Tribolium castaneum]
Unigene1549 | CYP4 | 679 | AAT38513 | 42 | 8.00E-43 | ubiquitous cytochrome P450 [Phyllopertha diversa]
Unigene1697 | CYP3 | 902 | EFA04564 | 39 | 5.00E-56 | cytochrome P450 347A1 [Tribolium castaneum]
Unigene21111 | CYP3 | 771 | XP_972794 | 56 | 9.00E-34 | PREDICTED: similar to Probable cytochrome P450 9f2 (CYPIXF2) [Tribolium castaneum]
Unigene22371 | CYP4 | 694 | XP_001814854 | 45 | 1.00E-45 | PREDICTED: similar to cytochrome P450 [Tribolium castaneum]
Unigene23650 | mito | 645 | XP_974252 | 60 | 7.00E-65 | PREDICTED: similar to CYP302a1 [Tribolium castaneum]
Unigene23993 | CYP2 | 1092 | NP_503303 | 38 | 2.00E-57 | CYtochrome P450 family member (cyp-33C11) [Caenorhabditis elegans]
Unigene24457 | CYP3 | 1192 | EEZ99338 | 62 | 2.00E-63 | cytochrome P450 6BQ13 [Tribolium castaneum]
Unigene24875 | CYP2 | 1173 | XP_968477 | 53 | 1.00E-98 | PREDICTED: similar to cytochrome P450, partial [Tribolium castaneum]
Unigene551 | mito | 640 | NP_001123894 | 57 | 2.00E-54 | cytochrome P450 CYP314A1 [Tribolium castaneum]
Unigene58533 | CYP4 | 603 | ABF06546 | 67 | 2.00E-44 | CYP4BD1 [Ips paraconfusus]
Unigene58724 | CYP3 | 627 | NP_001164248 | 50 | 5.00E-50 | cytochrome P450 9Z4 [Tribolium castaneum]
Unigene58733 | CYP3 | 628 | NP_001164248 | 47 | 4.00E-51 | cytochrome P450 9Z4 [Tribolium castaneum]
Unigene59248 | CYP4 | 720 | ABF06544 | 46 | 5.00E-59 | CYP4AY1 [Ips paraconfusus]
Unigene59398 | CYP4 | 760 | XP_973153 | 45 | 1.00E-44 | PREDICTED: similar to cytochrome P450 [Tribolium castaneum]
Unigene59490 | CYP4 | 788 | XP_001602395 | 36 | 7.00E-41 | PREDICTED: similar to cytochrome P450 [Nasonia vitripennis]
Unigene59572 | CYP3 | 818 | EFA02819 | 45 | 6.00E-67 | cytochrome P450 6BQ5 [Tribolium castaneum]
Unigene59684 | CYP3 | 871 | NP_496108 | 42 | 4.00E-47 | CYtochrome P450 family member (cyp-13A1) [Caenorhabditis elegans]
Unigene59753 | CYP2 | 916 | XP_969587 | 69 | 1.00E-115 | cytochrome P450 307A1 [Tribolium castaneum]
Unigene59997 | CYP4 | 1443 | EEZ99364 | 45 | 1.00E-100 | cytochrome P450-like protein [Tribolium castaneum]
Unigene8212 | CYP4 | 650 | ABF06550 | 60 | 6.00E-71 | CYP4BG1 [Ips paraconfusus]
Unigene8705 | CYP4 | 686 | EEZ99364 | 32 | 4.00E-32 | cytochrome P450-like protein [Tribolium castaneum]

Fragments shorter than 600 bp are listed in Table S4.

Glutathione S-transferase (GST)

GSTs, which are widespread in both prokaryotes and eukaryotes, catalyse the conjugation of reduced glutathione (GSH) to their substrates, yielding less toxic, water-soluble products that can eventually be excreted [28]. Increased expression and activity of GSTs, and amplification of the structural GST genes, have been documented as mechanisms of insect resistance [27]. Seventeen putative GST unigenes were identified in the T. yunnanensis transcriptome (Table 3). Based on the closest BLAST hits in the NCBI nr database and, when possible, on phylogenetic analysis, 14, 2, and 1 unigenes were assigned to the Epsilon, Sigma, and Delta classes, respectively. In insects, there are two ubiquitously distributed, distantly related groups of GSTs, classified according to their location within the cell: microsomal and cytosolic [29]. The microsomal class contains few gene duplicates, while the cytosolic class is a larger, highly diverse gene family divided into six major subclasses: Delta, Epsilon, Sigma, Omega, Theta, and Zeta [30]. The microsomal class is absent from the current transcriptomic dataset and may await discovery, but this group has not been implicated in the metabolism of insecticides [29]. In the cytosolic class, only three subclasses were identified in the current dataset, which is similar to some insect species that, unlike Drosophila melanogaster, do not have all six subclasses [31]. For instance, only Sigma, Delta, and Theta type GSTs were found in Nasonia vitripennis, and Epsilon, Sigma, and Omega type GSTs were found in T. castaneum based on genomic analysis [30]. Although the Delta subclass is absent in T. castaneum, one unigene assigned to the Delta subclass was identified in T. yunnanensis. In addition, the majority of T. yunnanensis GSTs were assigned to the Epsilon subclass, in accordance with the extensive gene expansions of Epsilon and Delta types reported for insect GSTs [32].
Table 3

Putatively identified GST genes in Tomicus yunnanensis.

Unigene ID | Putative assignment | Length (bp) | First hit | Identity (%) | E-value | BLAST annotation/Organism
Unigene22819 | Epsilon | 521 | XP_971136 | 49 | 2.00E-25 | PREDICTED: similar to Glutathione S transferase E8 CG17533-PA [Tribolium castaneum]
Unigene23097 | Epsilon | 542 | ACU09495 | 41 | 9.00E-23 | glutathione S-transferase 16 [Helicoverpa armigera]
Unigene25020 | Sigma | 693 | XP_002633726 | 35 | 2.00E-16 | C. briggsae CBR-GST-3 protein [Caenorhabditis briggsae]
Unigene25058 | Epsilon | 715 | XP_966966 | 41 | 1.00E-37 | PREDICTED: similar to Glutathione S transferase E6 CG17530-PA [Tribolium castaneum]
Unigene28476 | Epsilon | 209 | XP_971389 | 65 | 4.00E-15 | PREDICTED: similar to Glutathione S transferase E5 CG17527-PA [Tribolium castaneum]
Unigene31256 | Epsilon | 218 | XP_966787 | 42 | 7.00E-10 | PREDICTED: similar to Glutathione S transferase E7 CG17531-PA [Tribolium castaneum]
Unigene34958 | Epsilon | 230 | XP_971268 | 37 | 2.00E-07 | PREDICTED: similar to Glutathione S transferase E7 CG17531-PA [Tribolium castaneum]
Unigene35169 | Epsilon | 231 | XP_971449 | 54 | 2.00E-11 | PREDICTED: similar to Glutathione S transferase E7 CG17531-PA [Tribolium castaneum]
Unigene37693 | Sigma | 241 | NP_001165920 | 52 | 6.00E-14 | glutathione S-transferase S3 [Nasonia vitripennis]
Unigene40017 | Epsilon | 252 | NP_611325 | 56 | 3.00E-19 | glutathione S transferase E3 [Drosophila melanogaster]
Unigene44809 | Epsilon | 279 | ADD19697 | 62 | 2.00E-27 | glutathione S-transferase [Glossina morsitans morsitans]
Unigene44817 | Epsilon | 279 | ADD19697 | 61 | 1.00E-27 | glutathione S-transferase [Glossina morsitans morsitans]
Unigene49693 | Epsilon | 323 | XP_967234 | 43 | 3.00E-19 | PREDICTED: similar to glutathione S-transferase 6A [Tribolium castaneum]
Unigene50873 | Delta | 339 | XP_974204 | 76 | 4.00E-45 | PREDICTED: similar to GST [Tribolium castaneum]
Unigene54951 | Epsilon | 415 | BAE80117 | 45 | 4.00E-26 | glutathione S-transferase [Plutella xylostella]
Unigene5052 | Epsilon | 443 | XP_971136 | 45 | 2.00E-31 | PREDICTED: similar to Glutathione S transferase E8 CG17533-PA [Tribolium castaneum]
Unigene8150 | Epsilon | 416 | NP_611328 | 44 | 2.00E-14 | glutathione S transferase E6 [Drosophila melanogaster]
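The class counts given in the text (14 Epsilon, 2 Sigma, 1 Delta) can be checked directly against the putative assignments in Table 3. The tally below copies the unigene IDs and class labels from the table; the variable names are illustrative.

```python
from collections import Counter

# Putative class assignment for each GST unigene, copied from Table 3.
GST_ASSIGNMENTS = {
    "Unigene22819": "Epsilon", "Unigene23097": "Epsilon",
    "Unigene25020": "Sigma", "Unigene25058": "Epsilon",
    "Unigene28476": "Epsilon", "Unigene31256": "Epsilon",
    "Unigene34958": "Epsilon", "Unigene35169": "Epsilon",
    "Unigene37693": "Sigma", "Unigene40017": "Epsilon",
    "Unigene44809": "Epsilon", "Unigene44817": "Epsilon",
    "Unigene49693": "Epsilon", "Unigene50873": "Delta",
    "Unigene54951": "Epsilon", "Unigene5052": "Epsilon",
    "Unigene8150": "Epsilon",
}

# Number of unigenes assigned to each GST class.
CLASS_COUNTS = Counter(GST_ASSIGNMENTS.values())

if __name__ == "__main__":
    for gst_class, n in CLASS_COUNTS.most_common():
        print(f"{gst_class}: {n}")
```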

Other candidates

In addition to the detailed analyses above, further unigenes were identified with high sequence similarity to important genes related to insecticide metabolism and insecticide targets. As shown in Table 4, the T. yunnanensis transcriptome contained a number of unigenes annotated as enzymes related to metabolic resistance to insecticides, such as carboxylesterase and superoxide dismutase, and as insecticide targets, such as acetyl-CoA carboxylase, acetylcholinesterase, and γ-aminobutyric acid (GABA) receptor. Although most of these unigenes are not full length, they will facilitate further characterisation of these targets by using RACE to retrieve the full-length cDNAs. The abundance of these transcripts demonstrates the quality of our sequencing data. These sequences provide new leads for functional studies dissecting the potential role of each of these genes in insecticide resistance.
Table 4

Putative genes of interest related to insecticide resistance.

Gene name | Number of unigenes with a hit in the nr database
Cytochrome P450 | 146
Glutathione S-transferase | 17
Carboxylesterase | 128
Superoxide dismutase | 20
Acetyl-CoA carboxylase | 25
Acetylcholinesterase | 6
γ-aminobutyric acid (GABA) receptor | 8
Nicotinic acetylcholine receptor | 17
Sodium channel | 5
Chloride channel | 63
Ryanodine receptor | 36

Detection of HSP genes

Heat shock protein 10 (HSP10)

HSPs are highly conserved molecules that play vital roles in all cells. Among them, HSP10 is a highly conserved protein of approximately 10 kDa. In eukaryotes, HSP10 was originally identified as a mitochondrial chaperone, but it is now also known to be present in other locations such as the cytosol, cell surface, and extracellular space [33]. Here, six unigenes were identified as putatively encoding HSP10 (Table S5). Two of them (Unigene4430 and Unigene52967) appeared to be complete. Amino acid alignment revealed low sequence identity between these HSP10s and HSP10s from Acyrthosiphon pisum (46%), Glossina morsitans morsitans (49%), Nasonia vitripennis (51%), and Tribolium castaneum (41%) (Figure 5). HSP10 is known as a co-chaperone for heat shock protein 60 (HSP60) and exerts immunosuppressive activity in mammals [34]. To the best of our knowledge, HSP10 in insects has not been structurally and functionally studied in detail.
Figure 5

Alignment of the predicted amino acid sequences of Tomicus yunnanensis HSP10s with that of other insects.

Identical residues are shaded black, conserved substitutions are shaded grey. Dash (–) indicates insertion or deletion. Api, Acyrthosiphon pisum (NP_001119666); Gmo, Glossina morsitans morsitans (ADD19718); Nvi, Nasonia vitripennis (XP_001599992); Tca, Tribolium castaneum (XP_969732).

Small heat shock protein (sHSP)

sHSPs are a family of molecular chaperones with molecular masses ranging from 12 kDa to 43 kDa, usually below 30 kDa, that are involved in cellular defense under environmental stress conditions [35], [36]. To date, much less is known about insect sHSPs than about sHSPs in plants and vertebrates. Twenty-three unigenes with similarity to sHSPs were found (Table S5). Among them, five (Unigene24062, Unigene56764, Unigene56997, Unigene58448 and Unigene59017) appeared to be complete. These five sHSPs showed variable sequence identity, ranging from 29% to 82%, to their first hits in the BLAST search, suggesting that sHSPs are diverse. An alignment of the predicted proteins deduced from the complete unigenes is shown in Figure 6. Interestingly, the alignment indicated very low similarity among them, revealing that they are not highly conserved. Previous studies have demonstrated that sHSPs are abundant and ubiquitous in almost all organisms, in varying numbers, from bacteria and single-celled algae to higher organisms including humans [37]. The many characterized sHSPs share a conserved sequence of approximately 90 amino acid residues, termed the α-crystallin domain, which is responsible for dimer formation; sHSPs also form large multimeric complexes that are known to be crucial to their chaperone activities [38]–[40]. An α-crystallin domain is located in the C-terminal region of the predicted T. yunnanensis sHSPs. The N-terminal coding sequences and N-terminal ends of sHSPs are more variable across species than the C-terminal region containing the α-crystallin domain [41]. As expected, the deduced amino acid sequences of the T. yunnanensis sHSPs showed high conservation in the α-crystallin domain, and high divergence in the N-terminal region and N-terminal end. However, some conserved amino acid residues are still present in the N-terminal region.
Figure 6

Alignment of the predicted amino acid sequences of Tomicus yunnanensis sHSPs. The conserved α-crystallin domain is underlined.

Identical residues are shaded black, conserved substitutions are shaded grey. Dash (–) indicates insertion or deletion.

Heat shock protein 40 (HSP40)

The HSP40 family comprises HSPs that contain the J domain, a region homologous to the DnaJ protein of Escherichia coli [42]. HSP40 performs an essential molecular chaperone function in protein translation, folding, unfolding, translocation across membranes and degradation, and in protecting cells from the effects of heat and other stress factors, primarily acting as a cofactor or regulator of heat shock protein 70 (HSP70) [43]. Thirty-seven unigenes with similarity to HSP40 were present in the T. yunnanensis transcriptome. Among them, Unigene59953 contains a full open reading frame. Its amino acid identity to the HSP40s of T. castaneum, Liriomyza sativae, Locusta migratoria, and Bombyx mori is about 30% (Figure 7). Typically, HSP40 proteins have three distinct domains: (1) the J domain, 70 amino acid residues in length, which is the most conserved region of these proteins, interacts with HSP70 and stimulates its ATPase activity; (2) a glycine- and phenylalanine-rich region (G/F domain), postulated to act as a flexible hinge needed to activate the substrate-binding properties of HSP70 when it interacts with HSP40; and (3) a cysteine-rich region (C domain) resembling a zinc-finger-like structure, suggested to mediate dimer formation and molecular chaperone–peptide interactions [44]–[47]. All three domains were found in the predicted amino acid sequence of T. yunnanensis HSP40. However, not every HSP40 contains all three domains. Based on differences in these regions, HSP40 proteins can be categorized into three groups: Type I homologs have all three domains (J, G/F, and C), Type II have the J and G/F domains but not the C domain, and Type III have the J domain alone [48], [49]. Accordingly, the T. yunnanensis HSP40 obtained here belongs to Type I.
Figure 7

Alignment of the Tomicus yunnanensis HSP40 amino acid sequence with HSP40 amino acid sequences from other insects.

The conserved J domain is underlined. The Gly/Phe-rich domain is double underlined. The cysteine-rich domain is indicated by a dotted line. Identical residues are shaded black, conserved substitutions are shaded grey. Dash (–) indicates insertion or deletion. Tca, Tribolium castaneum (EFA11191); Lsa, Liriomyza sativae (ABE57132); Lmi, Locusta migratoria (ABC84495); Bmo, Bombyx mori (BAD90846).
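
The Type I/II/III scheme described for HSP40 is a simple decision rule on which domains are present, and it can be written out as a short Python function. This is only an illustrative sketch: the function name `classify_hsp40` and its boolean parameters are hypothetical, and detecting the domains themselves (for example by a domain search) is assumed to have been done beforehand.

```python
def classify_hsp40(has_j: bool, has_gf: bool, has_c: bool) -> str:
    """Classify an HSP40 by the presence of its J, G/F and C domains.

    Follows the three-group scheme in the text:
      Type I   = J + G/F + C
      Type II  = J + G/F, no C
      Type III = J only
    """
    if not has_j:
        # The J domain defines the family, so without it the scheme does not apply.
        return "not HSP40"
    if has_gf and has_c:
        return "Type I"
    if has_gf:
        return "Type II"
    if not has_c:
        return "Type III"
    # J + C without G/F is not covered by the scheme as described.
    return "unclassified"


# A sequence with all three domains, as reported for the T. yunnanensis HSP40.
hsp40_type = classify_hsp40(has_j=True, has_gf=True, has_c=True)
```

Under this rule a sequence carrying all three domains is returned as Type I, matching the assignment given in the text.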

Heat shock protein 60 (HSP60)

HSP60, a major group of the heat shock proteins, includes stress-inducible and constitutively expressed members. It is believed to be predominantly mitochondrial, although some members have also been reported in the cytosol and in extracellular compartments [50]. Eighteen unigenes coding for putative HSP60 were identified in the database (Table S5). Compared with orthologues from other organisms, the deduced amino acid sequences of these unigenes show high identity, between 45% and 93%, confirming the remarkable conservation of this family. Among them, only Unigene59944 was longer than 1 kb; the others were shorter than 600 bp. Alignment of the predicted amino acid sequence of Unigene59944 with those of other insects revealed that it contains the signature peptide known as the conserved ATP-binding motif [51] (Figure 8). The classical mitochondrial HSP60 signature motif was also found in the deduced amino acid sequence [52], suggesting that the HSP60 encoded by Unigene59944 is a member of the mitochondrial HSP60 chaperone family. Because Unigene59944 is truncated at both the 5′ and 3′ ends, neither the fragment of about 30 amino acids at the N terminus required for import into the mitochondria nor the typical GGM repeat motif at the C terminus of HSP60 was observed.
Figure 8

Alignment of the partial amino acid sequence of Tomicus yunnanensis HSP60 with that of other insects.

(A) The ATP-binding motif. (B) The classical mt-HSP60 signature motif (underlined). Identical amino acids are shaded black, and conserved residues are shaded grey. Tca, Tribolium castaneum (XP_971630); Ame, Apis mellifera (XP_392899); Csu, Chilo suppressalis (ACT52824); Lsa, Liriomyza sativae (AAW49251).

Heat shock protein 70 (HSP70)

HSP70s are the most studied members of the HSP family. These 70 kDa proteins occur in two main forms: the heat shock cognate (HSC70), which is expressed constitutively, and an inducible form (HSP70), which is normally expressed in response to external stimuli [53]. Seventeen four unigenes produced the best sequence matches to HSP70 (Table S5). The majority of them showed high identity, above 70%, to the classic inducible form of HSP70 from other organisms. None of these sequences was complete, and only six (Unigene1, Unigene16402, Unigene20916, Unigene59995, Unigene6391, and Unigene6586) were longer than 1 kb. Alignment of these six non-redundant sequences with those of other insects (Figure 9) showed that the three signature motifs of the HSP70 family [54] were conserved. A putative ATP-GTP binding site [55] and the nonorganellar consensus motif were found in the amino acid sequences. Putative bipartite nuclear localization signals, which are needed for the selective translocation of HSC70 into the nucleus [56], were also identified. The terminal sequence motif of HSP70 family members indicates their cellular location: the EEDV motif marks cytoplasmic localisation, whereas K/HDEL and PEAEYEEAKK characterise members localised to the endoplasmic reticulum and mitochondria, respectively [53], [57]. Accordingly, the terminal sequence motifs of Unigene20916 and Unigene6391 were K/HDEL and EEDV, indicating that they encode an endoplasmic reticulum member and a cytoplasmic member, respectively.
Figure 9

Alignment of the partial amino acid sequence of Tomicus yunnanensis HSP70s with that from other insects.

(A) Three highly conserved HSP70 family signatures labeled I, II, III. (B) The ATP-GTP binding site. (C) Putative bipartite nuclear localization signals. (D) The nonorganellar consensus motif. (E) The C-terminus. Identical amino acids are shaded black, and conserved residues are shaded grey. Dash (–) indicates insertion or deletion. Mci, Macrocentrus cingulum (ACD84944); Pxy, Plutella xylostella (BAF95560); Tca, Tribolium castaneum (XP_974442); Lmi, Locusta migratoria (AAO21473).
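
The terminal-motif rule described for the HSP70 family amounts to a lookup on the last residues of a protein sequence. The sketch below is illustrative only and is not the procedure used in the study: the function name `predict_hsp70_location`, the motif table and the toy sequences are hypothetical, and K/HDEL is expanded into KDEL and HDEL as an assumption about how that motif is written. The rule is only meaningful for sequences whose C terminus is complete.

```python
from typing import Optional

# Terminal motifs and the locations they indicate, as stated in the text.
# Assumption: "K/HDEL" stands for either KDEL or HDEL.
C_TERMINAL_MOTIFS = {
    "EEDV": "cytoplasm",
    "KDEL": "endoplasmic reticulum",
    "HDEL": "endoplasmic reticulum",
    "PEAEYEEAKK": "mitochondrion",
}


def predict_hsp70_location(protein: str) -> Optional[str]:
    """Return the location suggested by the C-terminal motif, or None."""
    # Normalise case and drop a trailing stop-codon symbol if present.
    seq = protein.strip().upper().rstrip("*")
    for motif, location in C_TERMINAL_MOTIFS.items():
        if seq.endswith(motif):
            return location
    return None  # no known terminal motif, e.g. a 3'-truncated unigene


# Hypothetical toy sequences, not taken from the T. yunnanensis data.
cytoplasmic = predict_hsp70_location("MAKAPAVGIDLGTTYSGPTIEEVD" + "EEDV")
er_resident = predict_hsp70_location("MKLSLVAAMLLLLSAARAEEKDEL*")
```

Applied to the unigenes discussed above, a sequence ending in EEDV would map to the cytoplasm and one ending in KDEL or HDEL to the endoplasmic reticulum.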

Heat shock protein 90 (HSP90)

Under normal physiological conditions, HSP90 is abundant, accounting for 1% of the total soluble cytosolic protein. It is a highly conserved molecular chaperone that contributes to the folding, maintenance of structural integrity and proper regulation of a subset of cytosolic proteins [54], [58], [59]. Like HSP70, HSP90 has been extensively studied. Twenty-four unigenes, all shorter than 800 bp, were identified as coding for putative HSP90 (Table S5). Except for Unigene20269 and Unigene46492, they showed above 60% amino acid identity to HSP90s from other organisms. No further sequence analysis was carried out because only relatively short fragments were available.

Conclusions

In this work, a large-scale transcriptome database was produced for T. yunnanensis using Illumina sequencing. The results provide new insights into the genomics of this forest pest and contribute significantly to the rapid discovery of a wide diversity of candidate genes for an organism that lacks a complete genome sequence and other genetic tools and resources. This transcriptome can be used as a reference to provide new leads for comparative studies within the family. Based on this database, the predicted repertoire of genes related to insecticide resistance and environmental stress was constructed, providing valuable information for further investigation of the detailed mechanisms. However, because the sequences obtained by Illumina sequencing are short, full-length sequences for most of the putative genes of interest will have to be obtained by RACE PCR. Furthermore, the database will remain a substantial resource for genome-wide association studies addressing other important biological questions in the future.

Supporting Information

Length distribution of contigs. (TIF)

Length distribution of scaffolds. (TIF)

Top BLAST hits from NCBI nr database. BLAST results against the NCBI nr database for all the distinct sequences with a cut-off E value above 10⁻⁵ are shown. (XLS)

Gene Ontology of Tomicus yunnanensis unigenes. (XLS)

KEGG summary of Tomicus yunnanensis transcriptome. (XLS)

Putative P450 genes identified in Tomicus yunnanensis transcriptome. (XLS)

Putative heat shock protein genes identified in Tomicus yunnanensis transcriptome. (XLS)
  56 in total

1.  Blast2GO goes grid: developing a grid-enabled prototype for functional genomics analysis.

Authors:  G Aparicio; S Götz; A Conesa; D Segrelles; I Blanquer; J M García; V Hernandez; M Robles; M Talon
Journal:  Stud Health Technol Inform       Date:  2006

2.  Response of Tomicus yunnanensis (Coleoptera: Scolytinae) to infested and uninfested Pinus yunnanensis bolts.

Authors:  Hui Liu; Zhen Zhang; Hui Ye; Hongbin Wang; Stephen R Clarke; Lu Jun
Journal:  J Econ Entomol       Date:  2010-02       Impact factor: 2.381

3.  HSP70 heat shock proteins and environmental stress in Antarctic marine organisms: A mini-review.

Authors:  Melody S Clark; Lloyd S Peck
Journal:  Mar Genomics       Date:  2009-03-26       Impact factor: 1.710

4.  Characterization of an omega-class glutathione S-transferase in the stress response of the silkmoth.

Authors:  K Yamamoto; S Teshiba; Y Shigeoka; Y Aso; Y Banno; T Fujiki; Y Katakura
Journal:  Insect Mol Biol       Date:  2011-03-24       Impact factor: 3.585

Review 5.  Structure, function and evolution of DnaJ: conservation and adaptation of chaperone function.

Authors:  M E Cheetham; A J Caplan
Journal:  Cell Stress Chaperones       Date:  1998-03       Impact factor: 3.667

6.  The maternal and early embryonic transcriptome of the milkweed bug Oncopeltus fasciatus.

Authors:  Ben Ewen-Campen; Nathan Shaner; Kristen A Panfilio; Yuichiro Suzuki; Siegfried Roth; Cassandra G Extavour
Journal:  BMC Genomics       Date:  2011-01-25       Impact factor: 3.969

7.  Evolution of insect P450.

Authors:  R Feyereisen
Journal:  Biochem Soc Trans       Date:  2006-12       Impact factor: 5.407

8.  Genetic study of the forest pest Tomicus piniperda (Col., Scolytinae) in Yunnan province (China) compared to Europe: new insights for the systematics and evolution of the genus Tomicus.

Authors:  Y Duan; C Kerdelhué; H Ye; F Lieutier
Journal:  Heredity (Edinb)       Date:  2004-11       Impact factor: 3.821

9.  Molecular and functional characterisation of the heat shock protein 10 of Strongyloides ratti.

Authors:  Yasmina Tazir; Vera Steisslinger; Hanns Soblik; Abuelhassan Elshazly Younis; Svenja Beckmann; Christoph G Grevelding; Hanno Steen; Norbert W Brattig; Klaus D Erttmann
Journal:  Mol Biochem Parasitol       Date:  2009-07-28       Impact factor: 1.759

10.  Pyrosequencing the transcriptome of the greenhouse whitefly, Trialeurodes vaporariorum reveals multiple transcripts encoding insecticide targets and detoxifying enzymes.

Authors:  Nikos Karatolos; Yannick Pauchet; Paul Wilkinson; Ritika Chauhan; Ian Denholm; Kevin Gorman; David R Nelson; Chris Bass; Richard H ffrench-Constant; Martin S Williamson
Journal:  BMC Genomics       Date:  2011-01-24       Impact factor: 3.969

