Literature DB >> 27333791

Sequencing, de novo assembly and annotation of a pink bollworm larval midgut transcriptome.

Erica E Tassone1, Gina Zastrow-Hayes2, John Mathis2, Mark E Nelson2, Gusui Wu2, J Lindsey Flexner3, Yves Carrière4, Bruce E Tabashnik4, Jeffrey A Fabrick5.   

Abstract

BACKGROUND: The pink bollworm Pectinophora gossypiella (Saunders) (Lepidoptera: Gelechiidae) is one of the world's most important pests of cotton. Insecticide sprays and transgenic cotton producing toxins of the bacterium Bacillus thuringiensis (Bt) are currently used to manage this pest. Bt toxins kill susceptible insects by specifically binding to and destroying midgut cells, but they are not toxic to most other organisms. Pink bollworm is useful as a model for understanding insect responses to Bt toxins, yet advances in understanding at the molecular level have been limited because basic genomic information is lacking for this cosmopolitan pest. Here, we have sequenced, de novo assembled and annotated a comprehensive larval midgut transcriptome from a susceptible strain of pink bollworm.
FINDINGS: A de novo transcriptome assembly for the midgut of P. gossypiella was generated containing 46,458 transcripts (average length of 770 bp) derived from 39,874 unigenes. The size of the transcriptome is similar to published midgut transcriptomes of other Lepidoptera and includes up to 91 % annotated contigs. The dataset is publicly available in NCBI and GigaDB as a resource for researchers.
CONCLUSIONS: Foundational knowledge of protein-coding genes from the pink bollworm midgut is critical for understanding how this important insect pest functions. The transcriptome data presented here represent the first large-scale molecular resource for this species, and may be used for deciphering relevant midgut proteins critical for xenobiotic detoxification, nutrient digestion and allocation, as well as for the discovery of protein receptors important for Bt intoxication.

Entities:  

Keywords:  Bacillus thuringiensis; Midgut; Pectinophora gossypiella; Pink bollworm; RNA-Seq; Transcriptome

Mesh:

Substances:

Year:  2016        PMID: 27333791      PMCID: PMC4916539          DOI: 10.1186/s13742-016-0130-9

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524


Data description

Background

The pink bollworm Pectinophora gossypiella (Saunders) (Lepidoptera: Gelechiidae) is an important global pest of cotton. In many countries, transgenic cotton producing Bacillus thuringiensis (Bt) crystalline (Cry) proteins kills pests including the pink bollworm, thereby providing economic and environmental benefits. However, the evolution of pest resistance threatens the continued success of such Bt crops. While field populations of the pink bollworm in the USA have remained susceptible to two different Cry toxins produced simultaneously in Bt cotton, field-evolved practical resistance to Bt cotton has occurred widely in India [1-3]. Cry toxins kill susceptible pests like the pink bollworm by binding to protein receptors on the surface of midgut epithelial cells, eventually causing cell lysis [4]. In this study, we used Illumina sequencing of cDNA from the larval midgut of a Bt-susceptible strain to provide the first comprehensive view of the genes transcribed in this species. We generated over 18 million high-quality DNA sequence reads and >35 million bases that assembled into 21,715 unique transcripts. This transcriptome sequencing effort has dramatically increased the number of known genes for this insect, and provides an invaluable resource for the discovery of potential roles of proteins involved in various physiological and toxicological processes in the pink bollworm larval midgut.

Samples

Samples were derived from the APHIS-S strain of pink bollworm maintained at the US Department of Agriculture (USDA) Agricultural Research Service US Arid Land Agricultural Research Center in Maricopa, Arizona. APHIS-S is a Bt-susceptible strain that has been reared in the laboratory for more than 30 years without exposure to Bt toxins [5]. To generate samples, 120 newly emerged neonates were placed individually on ~5 g of wheat germ pink bollworm diet in 30 ml plastic cups and reared at 26 °C with ~30 % relative humidity and a photoperiod of 14 light:10 dark. After 9 days, midguts were dissected from three sets of ten female and ten male 4th instar larvae. Salivary glands, foregut, Malpighian tubules and hindguts were removed from each midgut in phosphate buffered saline buffer. Three biological replicates of 20 midguts each were pooled in 0.5 ml RNAlater (Sigma, St. Louis, USA), held overnight at 4 °C and stored at −80 °C. Frozen midguts in RNAlater were shipped to DuPont Pioneer in Johnston, Iowa, USA for RNA extraction, library preparation and DNA sequencing.

Sequencing

Total RNAs were isolated from frozen midgut pools using the Qiagen RNeasy kit (Hilden, Germany). Sequencing libraries from the resulting total RNAs were prepared using the TruSeq mRNA-Seq kit and protocol from Illumina, Inc. (San Diego, USA). Briefly, mRNAs were isolated via attachment to oligo(dT) beads, chemically fragmented and then reverse transcribed into cDNA via random hexamer priming. Resulting double stranded cDNA fragments were end-repaired to create blunt end fragments, 3′ A-tailed, ligated with Illumina indexed TruSeq adapters and PCR amplified using Illumina TruSeq primers. Purified PCR amplified libraries were assessed for quality and quantity on the Agilent Bioanalyzer DNA 7500 chip before normalization and sample pooling. Sample pools were clustered and sequenced on the Illumina HiSeq 2500 system with Illumina TruSeq SBS Rapid v1 chemistry as per vendor protocols. Samples selected for transcriptome assembly were paired-end sequenced, with 76 cycles per read to a target depth of 40 million read pairs per sample. Raw quality was assessed and filtered with a custom pipeline that uses both the program FastQC and Trimmomatic (V 0.32), using the parameters ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:10 TRAILING:20 SLIDINGWINDOW:4:25 MINLEN:36 to remove adaptor sequence and filter by quality score. After filtering, approximately 18 million reads were obtained, totaling over 5 Gb or 2 × 72 bp paired-end data. The short read archive (SRA) accessions for data used in the assembly are as follows: SRX1164974, SRX1164977 and SRX1164978.

Transcriptome assembly

Before assembly, the three datasets were concatenated and read abundance was normalized to 50× coverage, using the in silico normalization tool in Trinity to improve assembly time and minimize memory requirements. Filtering and normalization reduced the dataset to 3 Gb, comprised of approximately 9 million read pairs that were then assembled using default parameters in Trinity (v.2.0.6) with the addition of the ‘-- jaccard clip’ flag to reduce the generation of transcript fusions from non-strand-specific data. Transcript expression levels were estimated with RSEM [6] and open reading frames were predicted using Transdecoder [7]. To remove bacterial contamination from the assembly, a BLASTx analysis of the newly assembled transcriptome was performed against a custom bacterial database containing all bacterial sequences deposited in NCBI (created 18 August 2015). After contamination filtration, the transcriptome was again filtered, sorted and prepared for NCBI transcriptome shotgun assembly (TSA) submission as previously described [8]. The resulting transcriptome was analyzed using TransRate (v.1.0.1), obtaining a TransRate score of 0.21, which indicates that the assembly is better than ~50 % of 155 published de novo transcriptomes available in the NCBI TSA [9].

Annotation

Functional annotation was performed at the peptide level using a custom pipeline [8] that defines protein products and assigns transcript names. Predicted proteins/peptides were analyzed using InterProScan5, which searched all available databases including Gene Ontology (GO) [10]. BLASTp analysis of the resulting proteins was performed with the UniProt Swiss-Prot database (downloaded 11 February 2015). Annie [11], a program that cross-references Swiss-Prot BLAST and InterProScan5 results to extract qualified gene names and products, was used to generate the transcript annotation file. The resulting .gff3 and .tbl files were further annotated with functional descriptors in Transvestigator [12].

Transcriptome comparisons

The assembled pink bollworm transcriptome was compared with midgut transcriptomes from three other lepidopterans, Plutella xylostella [13], Chilo suppressalis [14] and Heliothis virescens [15], and was found to have comparable metrics (Table 1). Specifically, the number of assembled contigs per 1000 reads was 2.5 for pink bollworm, 0.94 for C. suppressalis, 0.30 for H. virescens and 5.4 for P. xylostella. The number of BLASTx sequence hits (cutoff e-value of 10−5) in the non-redundant (nr) NCBI protein database per 1000 assembled reads was 1.2 for pink bollworm, 0.39 for C. suppressalis, 0.14 for H. virescens and 0.72 for P. xylostella (Table 1).
Table 1

Comparison of assembled lepidopteran transcriptomes

Chilo suppressalis Heliothis virescens Plutella xylostella Pectinophora gossypiella
PlatformIlluminaIllumina, Roche, 454, SangerIlluminaIllumina
Assembled reads39,400,002212,987,02839,764,23018,623,508
Average read size (bp)90Variable9072
Number of contigs37,04063,648213,67446,458
Contigs per 1000 reads0.940.305.42.5
Mean contig size (bp), range497, 201–9744383, 80–2000189, nr770, 224–14,619
Sequences with e-value <10−5 15,44629,97828,76821,715
Number of e-value <10−5 hits Per 1000 reads0.390.140.721.2
GC (%)42nra nr39
N50 transcript length (bp)nr10312621153
PipelineTrinitySeqMan Ngen v2.1TrinityTrinity
Reference[14][15][13]This study

a nr not reported

Comparison of assembled lepidopteran transcriptomes a nr not reported The quality of the pink bollworm assembly was further assessed by direct comparison of core statistics with the P. xylostella midgut transcriptome [13] (Table 2). We evaluated the completeness of both the P. xylostella and P. gossypiella transcripts using the program BUSCO (benchmarking universal single-copy orthologs) using the arthropod gene set [16]. The percentages of conserved genes from the P. gossypiella and P. xylostella transcriptomes recovered by the BUSCO analysis are ~34 and ~37 %, respectively. The overall BUSCO percentages are lower than previously reported for a reference Spodoptera frugiperda transcriptome [17], but are not surprising given these are single-organ (i.e., midgut) assemblies compared with assembled RNA sequences from whole larvae and tissue samples from multiple time points. Lastly, a tBLASTx analysis of the P. gossypiella transcriptome against the P. xylostella midgut transcriptome (representing the nearest phylogenetic relative lepidopteran relative with a currently available midgut transcriptome) revealed that 29 % (12,475 out of 46,458), 37 % (17,032 out of 46,458) and 91 % (42,089 out of 46,458) had matching hits at e-values of 10−5, 10−2 and 101 [18]. These results are not unexpected for tBLASTx at the more stringent e-values given the considerable phylogenetic distance between the two species of Lepidoptera [19].
Table 2

Comparison of Pectinophora gossypiella and Plutella xylostella midgut transcriptomes by BUSCO analysisa

SpeciesComplete (%)Duplicated (%)Fragment (%)Missing (%)
P. gossypiella 348.83035
P. xylostella 37122438

aA total of 2675 total BUSCO groups were searched from the assembled P. gossypiella midgut transcriptome and the assembled P. xylostella midgut short read archive transcriptome

Comparison of Pectinophora gossypiella and Plutella xylostella midgut transcriptomes by BUSCO analysisa aA total of 2675 total BUSCO groups were searched from the assembled P. gossypiella midgut transcriptome and the assembled P. xylostella midgut short read archive transcriptome

Gene ontology

Blast2GO [20, 21] was used to assign P. gossypiella transcripts with a minimum BLASTx e-value of 10−3 into putative functional groups or GO terms. A total of 12,762 transcript sequences were assigned GO terms (Additional file 1, [18]), including 7073 with hits at the Biological Process level, 6402 at the Cellular Component level and 7747 sequences at the Molecular Function level. Within the Biological Process GO category, the most abundant transcripts were assigned to ‘single-organism metabolic process’, ‘signal transduction’ and ‘cellular protein modification’ (Fig. 1a). ‘Integral component of membrane’, ‘nucleus’ and ‘intracellular organelle part’ were the most abundant GO terms for Cellular Component (Fig. 1b). For Molecular Function, ‘zinc ion binding’, ‘ATP binding’ and ‘DNA binding’ were the most prevalent, with several different types of hydrolases also highly represented (Fig. 1c). Overall, typical gut-specific functions, such as digestion and storage, energy metabolism, ion transport and gene regulation were indicated by GO terms (Additional file 1, [18]).
Fig. 1

Classification of Pectinophora gossypiella midgut transcripts based on predicted Gene Ontology (GO) terms. a Biological Process, (b) Cellular Component and (c) Molecular Function GO terms were determined using Blast2Go [19, 20] with an e-value cutoff of 10−3, and minimum sequence filters set at 707 sequences for Biological Process, 640 for Cellular Component and 250 for Molecular Function for generating pie charts. Note that individual categories can have multiple mappings, resulting in a sum greater than the total number of transcript sequences assigned GO terms

Classification of Pectinophora gossypiella midgut transcripts based on predicted Gene Ontology (GO) terms. a Biological Process, (b) Cellular Component and (c) Molecular Function GO terms were determined using Blast2Go [19, 20] with an e-value cutoff of 10−3, and minimum sequence filters set at 707 sequences for Biological Process, 640 for Cellular Component and 250 for Molecular Function for generating pie charts. Note that individual categories can have multiple mappings, resulting in a sum greater than the total number of transcript sequences assigned GO terms

Abbreviations

Bt, Bacillus thuringiensis; BUSCO, benchmarking universal single-copy orthologs; Cry protein, crystalline protein; GO, gene ontology; NCBI, National Center for Biotechnology Information; nr, non-redundant; SRA, short read archive; TSA, transcriptome shotgun assembly; USDA, US Department of Agriculture
  18 in total

1.  BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.

Authors:  Felipe A Simão; Robert M Waterhouse; Panagiotis Ioannidis; Evgenia V Kriventseva; Evgeny M Zdobnov
Journal:  Bioinformatics       Date:  2015-06-09       Impact factor: 6.937

2.  Effects of Bt cotton and crylac toxin on survival and development of pink bollworm (Lepidoptera: Gelechiidae).

Authors:  Y B Liu; B E Tabashnik; T J Dennehy; A L Patin; M A Sims; S K Meyer; Y Carrière
Journal:  J Econ Entomol       Date:  2001-10       Impact factor: 2.381

3.  De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis.

Authors:  Brian J Haas; Alexie Papanicolaou; Moran Yassour; Manfred Grabherr; Philip D Blood; Joshua Bowden; Matthew Brian Couger; David Eccles; Bo Li; Matthias Lieber; Matthew D MacManes; Michael Ott; Joshua Orvis; Nathalie Pochet; Francesco Strozzi; Nathan Weeks; Rick Westerman; Thomas William; Colin N Dewey; Robert Henschel; Richard D LeDuc; Nir Friedman; Aviv Regev
Journal:  Nat Protoc       Date:  2013-07-11       Impact factor: 13.491

4.  Exploring the midgut transcriptome and brush border membrane vesicle proteome of the rice stem borer, Chilo suppressalis (Walker).

Authors:  Weihua Ma; Zan Zhang; Chuanhua Peng; Xiaoping Wang; Fei Li; Yongjun Lin
Journal:  PLoS One       Date:  2012-05-29       Impact factor: 3.240

5.  RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome.

Authors:  Bo Li; Colin N Dewey
Journal:  BMC Bioinformatics       Date:  2011-08-04       Impact factor: 3.307

6.  Reconstructing a comprehensive transcriptome assembly of a white-pupal translocated strain of the pest fruit fly Bactrocera cucurbitae.

Authors:  Sheina B Sim; Bernarda Calla; Brian Hall; Theodore DeRego; Scott M Geib
Journal:  Gigascience       Date:  2015-03-31       Impact factor: 6.524

7.  Generation of a Transcriptome in a Model Lepidopteran Pest, Heliothis virescens, Using Multiple Sequencing Strategies for Profiling Midgut Gene Expression.

Authors:  Omaththage P Perera; Kent S Shelby; Holly J R Popham; Fred Gould; Michael J Adang; Juan Luis Jurat-Fuentes
Journal:  PLoS One       Date:  2015-06-05       Impact factor: 3.240

8.  InterProScan 5: genome-scale protein function classification.

Authors:  Philip Jones; David Binns; Hsin-Yu Chang; Matthew Fraser; Weizhong Li; Craig McAnulla; Hamish McWilliam; John Maslen; Alex Mitchell; Gift Nuka; Sebastien Pesseat; Antony F Quinn; Amaia Sangrador-Vegas; Maxim Scheremetjew; Siew-Yit Yong; Rodrigo Lopez; Sarah Hunter
Journal:  Bioinformatics       Date:  2014-01-21       Impact factor: 6.937

9.  Alternative splicing and highly variable cadherin transcripts associated with field-evolved resistance of pink bollworm to bt cotton in India.

Authors:  Jeffrey A Fabrick; Jeyakumar Ponnuraj; Amar Singh; Raj K Tanwar; Gopalan C Unnithan; Alex J Yelich; Xianchun Li; Yves Carrière; Bruce E Tabashnik
Journal:  PLoS One       Date:  2014-05-19       Impact factor: 3.240

10.  Sequencing, de novo assembly and annotation of a pink bollworm larval midgut transcriptome.

Authors:  Erica E Tassone; Gina Zastrow-Hayes; John Mathis; Mark E Nelson; Gusui Wu; J Lindsey Flexner; Yves Carrière; Bruce E Tabashnik; Jeffrey A Fabrick
Journal:  Gigascience       Date:  2016-06-22       Impact factor: 6.524

View more
  5 in total

1.  Next-Generation Sequencing Analysis of the Tineola bisselliella Larval Gut Transcriptome Reveals Candidate Enzymes for Keratin Digestion.

Authors:  Michael Schwabe; Sven Griep; Henrike Schmidtberg; Rudy Plarre; Alexander Goesmann; Andreas Vilcinskas; Heiko Vogel; Karina Brinkrolf
Journal:  Genes (Basel)       Date:  2021-07-22       Impact factor: 4.096

2.  Midgut transcriptomal response of the rice leaffolder, Cnaphalocrocis medinalis (Guenée) to Cry1C toxin.

Authors:  Yajun Yang; Hongxing Xu; Yanhui Lu; Caiyun Wang; Zhongxian Lu
Journal:  PLoS One       Date:  2018-01-23       Impact factor: 3.240

3.  Sequencing, de novo assembly and annotation of a pink bollworm larval midgut transcriptome.

Authors:  Erica E Tassone; Gina Zastrow-Hayes; John Mathis; Mark E Nelson; Gusui Wu; J Lindsey Flexner; Yves Carrière; Bruce E Tabashnik; Jeffrey A Fabrick
Journal:  Gigascience       Date:  2016-06-22       Impact factor: 6.524

4.  The Mexican bean beetle (Epilachna varivestis) regurgitome and insights into beetle-borne virus specificity.

Authors:  Cassidy R Gedling; Charlotte M Smith; Christophe M R LeMoine; Bryan J Cassone
Journal:  PLoS One       Date:  2018-01-29       Impact factor: 3.240

5.  ABC transporter mis-splicing associated with resistance to Bt toxin Cry2Ab in laboratory- and field-selected pink bollworm.

Authors:  Lolita G Mathew; Jeyakumar Ponnuraj; Bheemanna Mallappa; Lingutla R Chowdary; Jianwei Zhang; Wee Tek Tay; Thomas K Walsh; Karl H J Gordon; David G Heckel; Sharon Downes; Yves Carrière; Xianchun Li; Bruce E Tabashnik; Jeffrey A Fabrick
Journal:  Sci Rep       Date:  2018-09-10       Impact factor: 4.379

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.