Literature DB >> 30043835

De novo transcriptome assembly of the lobster cockroach Nauphoeta cinerea (Blaberidae).

Ana Lúcia Anversa Segatto1, José Francisco Diesel1, Elgion Lucio Silva Loreto1, João Batista Teixeira da Rocha1.   

Abstract

The use of Drosophila as a scientific model is well established, but the use of cockroaches as experimental organisms has been increasing, mainly in toxicology research. Nauphoeta cinerea is one of the species that has been studied, and among its advantages is its easy laboratory maintenance. However, a limited amount of genetic data about N. cinerea is available, impeding gene identification and expression analyses, genetic manipulation, and a deeper understanding of its functional biology. Here we describe the N. cinerea fat body and head transcriptome, in order to provide a database of genetic sequences to better understand the metabolic role of these tissues, and describe detoxification and stress response genes. After removing low-quality sequences, we obtained 62,121 transcripts, of which more than 50% had a length of 604 pb. The assembled sequences were annotated according to their genes ontology (GO). We identified 367 genes related to stress and detoxification; among these, the more frequent were p450 genes. The results presented here are the first large-scale sequencing of N. cinerea and will facilitate the genetic understanding of the species' biochemistry processes in future works.

Entities:  

Year:  2018        PMID: 30043835      PMCID: PMC6136372          DOI: 10.1590/1678-4685-GMB-2017-0264

Source DB:  PubMed          Journal:  Genet Mol Biol        ISSN: 1415-4757            Impact factor:   1.771


Introduction

The understanding of biological phenomena is dependent on the organisms observed, and increasing the variety of experimental organisms can provide clearer views of the targeted processes (Golstein ). Among the reasons that have motivated the search for alternative experimental organisms is also the pressure to reduce the use of mammalian species in toxicological testing (Peterson ). As an alternative experimental organism, the use of the cockroach is increasing (Walz ; Blankenburg ). Cockroaches (order Dictyoptera) are primitive winged insects, with worldwide distribution and comprising approximately 4,000 species (Bell ). Thirty of these species are associated with human household, and less than 1% is thought to be associated with human health problems (Fakoorziba ). Periplaneta americana (L.) is the most common domestic cockroach species, it shows great reproductive capability and has been used as an experimental organism in scientific studies, including neurophysiology research (Wicher ; Nishino ). At least three transcriptomes of different P. americana tissues were published (Chen ; Zhang ; Kim ). Another cockroach species frequently used as scientific experimental organism is the German cockroach Blattella germanica (L.) (Cruz ). This species is highly dependent on humans for survival, and its domestic presence is associated with the occurrence of allergic respiratory diseases (Brenner, 1995). One transcriptome of this species has been published (Zhou ). The lobster cockroach Nauphoeta cinerae (Olivier) has been used as an experimental model for toxicology (Rodrigues ; Adedara , 2016) and shown to be a valid alternative for basic toxicological studies. Compared with the other species of cockroaches used in scientific studies, N. cinerea is easy to handle and does not fly, facilitating its maintenance in the laboratory. However, until now, there are few genetic sequences available for this species, and studies using molecular biology techniques like quantitative real-time PCR have been laborious to implement. Cockroaches are found in diverse and inhospitable habitats that can have different amounts of toxic substances, such as environmental pollutants, microbial toxins, insecticides, and other xenobiotics. Thus, they may be a good experimental model to study detoxification abilities and stress response (Bell ; Zhang ). The usual response to stress conditions is the overproduction of reactive oxygen species (ROS), resulting in redox homeostasis alterations as well as oxidative stress. Overproduction of ROS have been associated to the toxicity of a wide range of xenobiotics, such as benzo[a]pyrene (Winn and Wells 1997), methamphetamine (McCallum ; Wong ), ethanol (Dong , 2010), sodium fluoride (Umarani ; Samanta ; Song ), and methylmercury (Usuki and Fujimura, 2016). However, ROS are also produced by normal cellular metabolism, and one of its beneficial effects is on the organism's defense system (Valko ). The main components of the antioxidant system are conserved along the evolutionary process, but there are different adaptations in different groups. In insects, the major change in comparison to other phylogenetic groups is the absence of selenium-dependent glutathione peroxidase (SeGPx). It has been proposed that in insect GPxs evolution, selenium was replaced by cysteine more than once (Bae ; Flohe ). Due the variations among groups, the detoxification genes being expressed should be known before starting studies of exposure to toxic compounds. Antioxidant enzymes can be divided as acting in phase I (primary) and phase II (secondary) reactions. Phase I reactions consist of oxidation, hydrolysis and reduction, and the enzymes involved are aldehyde dehydrogenases, alcohol dehydrogenases, catalases, cytochrome P450s, dehalogenases, hydroxylases, oxidoreductases, peroxidases, superoxide dismutases (SODs), thioredoxins, and glutaredoxins. Secondary antioxidant enzymes that act indirectly on ROS include acetyltransferases, acyltransferase, and glutathione S-transferases (GSTs). The fat body, together with midgut, is the primary detoxification organ in insects, but its gene expression profile is poorly addressed. In addition, the fat body plays an essential role in most intermediary metabolism reactions, as well as energy storage and utilization (Arresse and Soulages, 2010), besides being an endocrine organ producing several antimicrobial peptides (Wen ). Genetic and transcriptome information of N. cinerae cockroaches is scarce. Transcriptome sequencing is a cost-effective tool for non-model organisms and the high-throughput sequencing technologies are an efficient method for genomic characterization of a species and gene discovery (Wang ). Here, we describe the transcriptome of N. cinerae fat body and head, with a focus on the detoxification and stress response genes. We also analyzed the differential gene expression of tissues to investigate the metabolic activity of the fat body and its participation in detoxification routes.

Materials and Methods

Cockroach rearing, RNA isolation and Illumina sequencing

The N. cinerea specimens used in this work were obtained from the Laboratório de Bioquímica Toxicologica, Universidade Federal de Santa Maria, Brazil. The insects were maintained in dark plastic boxes under standard conditions (Adedara ). RNA was extracted from two dissected body parts: the head (all its tissues, including the central nervous system), and the fat body (containing epidermal, neuronal, muscle cells, and other integument constituents). Two and four individuals were used for head and fat body RNA extraction, respectively. Total RNA was prepared from the two tissues using Trizol Reagent (Life Technologies) according to the manufacturer's instructions. A second extraction procedure was done from the product of the first for RNA cleanup. Library preparation followed the recommendation of the TruSeq® Stranded mRNA Library Prep Kit (Illumina). Briefly, total RNA was purified and fragmented. The cleaved RNA fragments were then employed for first-strand cDNA synthesis, followed by second-strand cDNA synthesis. Fragments underwent a 3' adenylation process and were ligated with adapters before the PCR reaction was carried out. After library validation, the products were normalized and pooled. The MiSeq Reagent Kit v3 (150 cycles) was employed for sequence in a Illumina MiSeq system, resulting in pair-end reads of 75 pb. We used 25% of the flow cell for each sample. Sequencing was carried out using an Illumina MiSeq Plataform by Unidade de Genôminca Computacional Darcy Fontoura de Almeida/LNCC/Brazil.

Read processing, de novo transcriptome assembly, and annotation

Read quality was checked and visualized using FastQC v 0.11.4 (Andrews, 2010), and low quality reads and adapters were eliminated using Trim Galore v 0.4.1 (Wu ). Reads with a quality threshold of less than 25 on the Phred scale, empty reads, and short sequences with a length of less than 25 nt were removed. The de novo assembly was performed with Trinity v 2.2.0 software using default settings (Grabherr ). The Deconseq Standalone v 0.4.3 program (Schmieder and Edwards, 2011) was used to remove contamination from virus, bacteria, and human sequences, and contigs with such hits were removed from the assembly before further analysis. We performed three assemblies: one from head reads, one from fat body reads, and one combining all the reads. The latter was used as a reference in the transcript quantification and differential expression analysis. Functional annotation was performed using Trinotate v 3.0.1 (Haas ) with a cutoff E-value of 10-5. CateGOrizer v 3.21 (Hu ) was used to map GO terms to GO Slim file by single count in order to get a broad overview of the functional classification of the transcripts. Each GO term was manually assigned to a consensus functional class. The assembly quality was assessed by examining the percentage of input RNA-Seq reads that are represented in the transcriptome assembly. The representation of full-length reconstructed protein-coding genes was evaluated by searching the assembled transcripts against SwissProt (E-value 10-20) and calculating the values of Nx statistics. Also, in order to verify the quality of our transcriptome and the similarity among transcripts of different species of cockroaches, we used our reference transcriptome as query and performed a blastn search using P. americana (Chen ) or B. germanica (Zhou ) transcriptomes as target database. To this end, we downloaded the raw reads of P. americana and B. germanica and used Trinity v 2.2.0 (Grabherr ) for the assembly, just as done for our data. The online tool BioVenn (Hulsen ) was used to draw a Venn diagram showing the distribution of transcripts detected in the head and fat body by a blastx, with a cutoff E-value of 10-5, using the UniProt database as subject (The UniProt Consortium, 2017).

Transcript quantification and differential expression analysis

The assembly with reads from the head and fat body was used as a reference to estimate transcript abundance and build a counts matrix for each of the transcripts in each tissue using RSEM v 1.2.27 software (Li and Dewey, 2011). The expression abundance of isoforms was normalized using the TPM (transcripts per million) method (Wagner ). Differential expression analyses were carried out using the dgeR Bioconductor package (Robinson ) using a false discovery rate (FDR) cutoff of 0.001. Isoforms up-regulated in each sample were isolated and analyzed using Perl scripts and annotation was carried out using Blast2GO (Conesa ) using a cutoff E-value of 10-5. CateGOrizer v 3.21 (Hu ) was used to map GO terms to GO Slim file by single count.

Identification and analysis of genes of interest

To directly identify detoxification and stress response genes in our transcriptome, we translated our transcripts in the six frames using the EMBOSS Transeq command line (Rice ; Goujon ), and used this as a database in blast+ using blastx (Camacho ). We searched for genes related to oxidation and reduction, conjugation, hydrolysis, and other functions possibly related to stress (Xu ) in Flybase to then be used as a query.

Results

De novo assembly of an N. cinerea transcriptome

A total of 24,980,364 Illumina MiSeq reads from fat body and head tissue were generated. The SRA submission reads can be found under the accession numbers SRR3581673 and SRR3581312. After trimming, 23,137,682 reads were assembled in a reference transcriptome, combining head and fat body reads, this resulting in 62,121 transcripts with an N50 of 604 (Table 1). The separated assemblies of the head and fat body transcriptomes (Table 1) were used in transcript quantification and the differential expression analysis. To assess the quality of our assembly we also mapped the reads back to the transcripts and found that among 70-80% of the fragments were mapped as proper pairs. In each assembly, approximately 2,000 transcripts are represented by nearly full-length proteins, having > 80% alignment coverage.
Table 1

Summary statistics of transcriptome assembly.

HeadFat bodyReference
Number of reads12,935,30412,045,06024,980,364
Reads after trimming6,098,5255,470,31623,137,682
Total Trinity genes41,63127,56857,928
Total Trinity transcripts43,99129,16362,121
Percent CG38.7837.9438.38
Total assembled bases21,525,40613,939,30331,375,798
Transcript contig N50576547604

Functional annotation of the reference transcriptome

The transcriptome was annotated using Trinotate v3.0.1 (Haas ). Blast homologies were captured by searching the UniProt/Swiss-Prot protein databases for further GO assignments. Mapping Entrez Gene IDs to GO annotations identified 44,963 terms categorized in 114 functional groups. The GO terms attributed to the greatest number of genes, in descending order, were metabolism (GO:0008152), catalytic activity (GO:0003824), and development (GO:0007275) (Figure 1). Twelve GOs were classified with antioxidant activity and 558 as response to stress activity. The assembly of P. americana resulted in 158,464 transcripts with a N50 of 536 and the assembly of B. germanica resulted in 71,903 transcripts with a N50 of 904. Our assembly strategy resulted in more transcripts and a higher N50 for B. germanica than previously described (Zhou ). For P. americana, our assembly strategy resulted in more transcripts but with a smaller average size than that previously described (Chen ). A total of 533 N. cinerea transcripts showed a Blast similarity with P. americana, and 2,679 with B. germanica with an E-value of zero. The top 10 hits of blastn results, with lower E-value and higher bit score are listed in Supplementary Information Table S1. The percentage of GC was similar in all the cockroach transcriptomes assembled, being, near 37.94 in N. cinerea fat body and 40.08 in P. americana testis.
Figure 1

GO slim terms distribution associated with the N. cinerea transcriptomes.

Differential expression

Using a dispersion of 0.1 in edgeR (Robinson ) we found 3,257 isoforms as differentially expressed in the head and fat body of the lobster cockroach. The up-regulated isoforms in the head were classified in 1,749 GO terms and could be categorized into 77 functional groups. In the fat body, a total of 864 GO terms were categorized into 81functional groups (Figure 2). Regarding species distribution, the top hit species was Zootermopsis nevadensis Hagenfor. Blattella germanica, and P. americana appeared among the six top hits in both tissues. Among the classification of up-regulated genes in the fat body, 30 GOs were associated with response to biotic and abiotic factors, while only 10 up-regulated genes were found in the head in this subcategory. Differentially expressed transcripts are shown in red in the volcano plot of Figure 3A. From the total of 43,991 transcripts assembled in the head and 29,163 assembled in the fat body, 37,743 and 24,174 had blastx results respectively, using the UniProt database as subject and an E-value of 10-5 as cut-off. After removing redundant hits, 5,921 Blast results were common between the assembled transcriptomes (Figure 3B).
Figure 2

GO slim terms distribution of upregulated genes in the fat body and head of N. cinerea.

Figure 3

Comparison among head and fat body transcripts. (A) Volcano plot of differentially expressed transcripts; the X-axis displays the fold change expression differences (FC) and the Y-axis the statistical significance based on a false discovery rate (FDR) cut-off of 0.001. (B) Venn diagram of blastx results of transcripts detected in the head and fat body.

Detoxification and stress genes

We identified 367 transcripts that were classified into four main categories: 1) oxidation and reduction, 2) conjugation, 3) hydrolytic enzymes, and 4) other transcripts with function possibly related to stress. In each category, the most abundant subcategory is presented in Table 2. Among the oxidation and reduction enzymes, cytochrome p450 genes were the most abundant ones. Glycosyl transferases were the most abundant transcripts among the conjugation enzymes. Acetylcholine and carboxyl esterases were the most frequently found hydrolytic enzyme transcripts. Heat shock proteins were also found in substantial numbers.
Table 2

Detoxification genes expressed in the reference transcriptome.

Detoxification gene transcriptsNumbers found
Oxidation and reduction enzymes 132
Cytochrome P450s85
Conjugation enzymes 103
Glycosyl transferases37
Hydrolytic enzymes 69
Acetylcholine and carboxyl esterases28
Possible stress related functions 63
Heatshock proteins29

Discussion

After cleaning and quality checks, we obtained 1.7 Gb of reads. Nauphoeta cinerea has a C value (pg) of 5.15, or a genome size of approximately 5,000 Mb (Koshikawa ). Hence, our transcriptome, disregarding isoforms, represented around 30% of the N. cinerea genome. Our library was enriched for mRNA sequences, and the reads obtained were 75 bp long, sequenced as pair-end. Thus, it was necessary to reconstruct full-length transcripts by transcriptome assembly. A transcriptome assembly encounters many challenges, among them differential expression of transcripts and alternative splicing (Grabherr ). In spite of these challenges, the comparison with other cockroaches transcriptomes (Zhou , Chen ; Kim ) showed that we had obtained a good assembly with a relatively small amount of data using Trinity v2.2.0 (Table 1). An outstanding characteristic of the cockroach transcriptomes that we assembled was the low level of GC content. A high GC content is correlated to high recombination rate, and in insects genomes, the GC content is usually low, but can be heterogeneous (Kent ; Kent and Zayed, 2013). Blastn was used to find similarities between N. cinerea and the other cockroach transcriptomes, and large structural proteins gave among the best results (Table S1). The B. germanica assembly has overall better Blast results, which may be a consequence of its higher N50 provided by that the Roche 454 sequencing method, which produces longer reads that can improve the assembly in complex regions (Martin and Wang, 2011). Here, we used as a strategy a small number of individuals for RNA extraction, which on the one hand, simplifies the assembly as a result of less genetic variations, but on the other rules out the possibility of performing any population analysis or searches for SSR markers and SNPs. The transcriptome annotation showed that the most frequent GOs (Figure 1 and 2) are similar to other insect transcriptomes (Zhou , Chen ; Wadsworth and Dopman 2015; Kim ; Zhang ). The annotation also revealed a high similarity of the N. cinerea sequences with the termite Zootermopsis nevadensis (Blattodea). While this can be related to the amount of sequences available in databases, the phylogenetic relationship between termites and cockroaches is still controversial (Legendre ). Up-regulated genes were more frequent in fat body, confirming the versatility of this organ in insects (Arresse and Soulages, 2010), even when compared to a tissue set that contains sensory organs and central nervous system ganglia (Figure 2). In addition, there were only 5,921 common Blast results among these tissues, in a total of 39,553 Blast results (Figure 3B). These numbers reflect the big functional difference among head and fat body tissue. It is important to note that although the fat body had a lesser number of assembled transcripts, it had more up-regulated genes in comparison with the head tissue. The individuals used to generate the transcriptomes had not been submitted to any specific stress condition. Consequently, the elevated number of genes related to biotic and abiotic stress in the differential expression analysis confirm the role of the fat body as an active detoxification organ (Figure 2). Our interest in detoxification genes is due to the growing use of N. cinerea as a potential model for toxicological biochemistry studies. In a study aiming to identify candidate genes for insecticide resistance in insecticide susceptible and resistant strains of Anopheles gambiae, no single body part (including the fat body) emerged as the key site of overtranscription of putative insecticide resistance genes (Ingham ). In contrast, our result indicate a quite different pattern, with genes up-regulated in the fat body compared to head tissue in specimens maintained in the laboratory. It is important to highlight that the heads and fat bodies used for RNA extraction were from different individuals. A more comprehensive study design involving multiple dissected tissues and individuals exposed to different stress conditions would facilitate the comprehension of the role of the fat body in biotic and abiotic stress responses. It is well known that the fat body and hemocytes are the major components of the innate immune response in insects. Signals resulting from such stimuli can activate the synthesis and secretion of antimicrobial peptides by the fat body (Tsakas and Marmaras, 2010). However, the metabolic response of the fat body to ROS and the activation of inflammation-associated signaling pathways remains to be determined (Gloire ). In the reference assembly, we found many genes related to detoxification, in similar number to those found in transcriptomes of other insects (Xu, ), indicating that our assembly strategy was efficient. Consequently, the sequences obtained here are a valuable source for future studies of such genes in N. cinerea. Detoxification related cytochrome p450 transcripts were found in the highest number (85). The termite Zootermopsis nevadensis genome has 76 p450 genes (Terrapon ). In contrast, a search for detoxification and insecticide target genes in B. germanica, resulted in 163 p450-related genes (Zhou ). A similar search previously done on the midgut transcriptome of P. americana resulted in 31 P450 transcripts (Zhang ). It is important to note that these results were obtained in transcriptome data that can both subestimate and overestimate this diversity compared to genomic analyses. In conclusion, we obtained a total of 24,980,364 reads and 57,928 genes, constituting a public database for gene identification and expression analysis in N. cinerea. The data presented here are a starting point to understand the fat body metabolism of N. cinerea based on nucleic acid sequences. In addition, our results attest to the multifunctionality of the fat body in insects.
  51 in total

1.  Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples.

Authors:  Günter P Wagner; Koryu Kin; Vincent J Lynch
Journal:  Theory Biosci       Date:  2012-08-08       Impact factor: 1.919

2.  A new bioinformatics analysis tools framework at EMBL-EBI.

Authors:  Mickael Goujon; Hamish McWilliam; Weizhong Li; Franck Valentin; Silvano Squizzato; Juri Paern; Rodrigo Lopez
Journal:  Nucleic Acids Res       Date:  2010-05-03       Impact factor: 16.971

3.  Recombination is associated with the evolution of genome structure and worker behavior in honey bees.

Authors:  Clement F Kent; Shermineh Minaei; Brock A Harpur; Amro Zayed
Journal:  Proc Natl Acad Sci U S A       Date:  2012-10-15       Impact factor: 11.205

4.  Transcriptome profiling reveals mechanisms for the evolution of insect seasonality.

Authors:  Crista B Wadsworth; Erik B Dopman
Journal:  J Exp Biol       Date:  2015-09-28       Impact factor: 3.312

Review 5.  RNA-Seq: a revolutionary tool for transcriptomics.

Authors:  Zhong Wang; Mark Gerstein; Michael Snyder
Journal:  Nat Rev Genet       Date:  2009-01       Impact factor: 53.242

6.  Nrf2-mediated transcriptional induction of antioxidant response in mouse embryos exposed to ethanol in vivo: implications for the prevention of fetal alcohol spectrum disorders.

Authors:  Jian Dong; Kathleen K Sulik; Shao-Yu Chen
Journal:  Antioxid Redox Signal       Date:  2008-12       Impact factor: 8.401

7.  RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome.

Authors:  Bo Li; Colin N Dewey
Journal:  BMC Bioinformatics       Date:  2011-08-04       Impact factor: 3.307

8.  Fast identification and removal of sequence contamination from genomic and metagenomic datasets.

Authors:  Robert Schmieder; Robert Edwards
Journal:  PLoS One       Date:  2011-03-09       Impact factor: 3.240

9.  Detoxification and stress response genes expressed in a western North American bumble bee, Bombus huntii (Hymenoptera: Apidae).

Authors:  Junhuan Xu; James P Strange; Dennis L Welker; Rosalind R James
Journal:  BMC Genomics       Date:  2013-12-12       Impact factor: 3.969

10.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

View more
  3 in total

1.  Molecular mechanisms underlying milk production and viviparity in the cockroach, Diploptera punctata.

Authors:  Emily C Jennings; Matthew W Korthauer; Jacob M Hendershot; Samuel T Bailey; Matthew T Weirauch; Jose M C Ribeiro; Joshua B Benoit
Journal:  Insect Biochem Mol Biol       Date:  2020-02-29       Impact factor: 4.714

2.  Utility of cockroach as a model organism in the assessment of toxicological impacts of environmental pollutants.

Authors:  Isaac A Adedara; Khadija A Mohammed; Oluwatobiloba F Da-Silva; Faoziyat A Salaudeen; Falco L S Gonçalves; Denis B Rosemberg; Michael Aschner; Joao B T Rocha; Ebenezer O Farombi
Journal:  Environ Adv       Date:  2022-02-18

3.  Streptozotocin induces brain glucose metabolic changes and alters glucose transporter expression in the Lobster cockroach; Nauphoeta cinerea (Blattodea: Blaberidae).

Authors:  Olawande C Olagoke; Blessing A Afolabi; João B T Rocha
Journal:  Mol Cell Biochem       Date:  2020-11-20       Impact factor: 3.396

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.