LocustDB: a relational database for the transcriptome and biology of the migratory locust (Locusta migratoria).

Zongyuan Ma, Jun Yu, Le Kang.

Abstract

BACKGROUND: The migratory locust (Locusta migratoria) is an orthopteran pest and a representative hemimetabolous insect for biological studies. Its transcriptomic data provide invaluable information for molecular entomology and pave the way for comparative research on other medically, agronomically, and ecologically relevant insects. We developed the first transcriptomic database of the locust (LocustDB), building the infrastructure needed to integrate, organize, and retrieve data that are either currently available or to be acquired in the future.
DESCRIPTION: LocustDB currently hosts 45,474 high-quality EST sequences from the locust, which were assembled into 12,161 unigenes. Through user-friendly web interfaces, it gives investigators free access to sequence data, including homologous/orthologous sequences, functional annotations, and pathway analysis, based on Clusters of Orthologous Groups (COG), Gene Ontology (GO), protein domains (InterPro), and functional pathways (KEGG). It also provides comparative analyses based on data from the migratory locust and five other invertebrate species: the silkworm, the honeybee, the fruit fly, the mosquito, and the nematode. The website address of LocustDB is http://locustdb.genomics.org.cn/.
CONCLUSION: LocustDB starts with the first transcriptome information for an orthopteran and hemimetabolous insect and will be extended to provide a framework for incorporating in-coming genomic data of relevant insect groups and a workbench for cross-species comparative studies.

Year:  2006        PMID: 16426458      PMCID: PMC1388198          DOI: 10.1186/1471-2164-7-11

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Studying the genetics and ecology of the evolutionarily diversified insects at the molecular level is the most exciting area of entomology research today [1]. The acquisition of sequence data and their comparative analysis build a foundation for understanding biological pathways, molecular processes, and gene expression patterns, all of which are relevant to the physiological and genetic mechanisms of development, behavior, immunity, and phenotypic plasticity in insects [2,3]. Efforts to acquire and integrate transcriptomic and genomic data are initial yet essential steps, and we have taken them by adding transcriptomic information from a new insect order, Orthoptera, to the existing data from three other insect orders: Lepidoptera, Diptera, and Hymenoptera.

Our data came from a study of the migratory locust (Locusta migratoria), a representative hemimetabolous insect with a unique behavioral phenotype: it changes phase from a solitary state to a gregarious one when environmental and genetic factors interact under crowding [4]. Given the importance of the locust as a major agricultural pest, we have developed a comprehensive, high-quality database, LocustDB (the Locust Database), for integrating, organizing, and retrieving sequences and related information. LocustDB provides a permanent platform for comparative studies of the biology, genetics, and evolution of the locust. It currently hosts a large collection of expressed sequence tags (ESTs), unigenes, and their annotations, together with comparative analysis results from five other invertebrate species whose genomic information has become available from large-scale genomic studies: the silkworm, the honeybee, the fruit fly, the mosquito, and the nematode. LocustDB is the first genomic database for an orthopteran, hemimetabolous insect.

Construction and content

Data acquisition

EST sequences were generated from two types of cDNA libraries: organ-specific and mixed. The organ-specific type comprises six non-normalized, unidirectionally cloned cDNA libraries made from mRNAs of the heads, hind legs, and midguts of fifth-instar locusts in each of two phenotypic phases, solitary and gregarious (three tissues in two phases). The mixed library was constructed with mRNAs from the whole body of the gregarious locust. Clones from these libraries were sequenced from the 5'-ends.

EST assembly and gene annotation

We developed a data mining pipeline that analyzes EST data from multiple resources. The Phred-Phrap-Consed software package was used for base-calling, quality assessment, and sequence assembly [5,6]. Poly(A) tails, low-quality data, and vector sequences were screened with CROSS_MATCH and removed from the dataset. Sequences shorter than 100 bp were also discarded. A total of 45,474 high-quality ESTs with an average length of 471 bp were assembled with stringent Phrap parameters, yielding 12,161 unigenes. Redundant mitochondrial RNAs, rRNAs, and E. coli contaminants were eliminated from the final assemblies.

We carried out a comprehensive annotation procedure for the locust unigenes. The clustered unigenes were annotated with a series of BLAST-based homology analyses [7]: (1) BLASTN versus NCBI's non-redundant nucleotide database, (2) BLASTX (E-value less than 1E-5) versus NCBI's non-redundant protein database, and (3) BLASTX versus the non-redundant protein database from SWISS-PROT. Unigenes were annotated with Gene Ontology (GO) terms by comparing the sequences against the GO database. Sequences with significant matches and best hits were classified according to the database's classification schemes [8]. We also compared our contigs and singlets, using RPS-BLAST (reverse position-specific BLAST) [9,10], to sequences of the COG (Clusters of Orthologous Groups) database and assigned the corresponding unigenes to COG functional classifications [11]. Functional domains of non-redundant sequences were assigned based on information from the InterPro database [12]. Pathway analysis was performed with BLAST against the KEGG database (Release 33) [13]. In addition, we compared the unigenes with genome data from the silkworm, the honeybee, the fruit fly, the mosquito, and the nematode to further define orthologous genes in Ensembl [14] and SilkDB [15].
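The two numeric cutoffs stated above (discarding sequences shorter than 100 bp, and accepting BLASTX hits with an E-value below 1E-5) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the in-memory data layout, and the choice of bit score to pick the best hit are assumptions made only for illustration.

```python
from typing import Dict, Iterable, Tuple

MIN_LENGTH = 100   # bp; shorter sequences were discarded (cutoff from the text)
MAX_EVALUE = 1e-5  # BLASTX E-value cutoff stated in the text


def drop_short_sequences(seqs: Dict[str, str],
                         min_length: int = MIN_LENGTH) -> Dict[str, str]:
    """Keep only sequences with at least `min_length` bases."""
    return {name: seq for name, seq in seqs.items() if len(seq) >= min_length}


def best_hits(rows: Iterable[Tuple[str, str, float, float]],
              max_evalue: float = MAX_EVALUE) -> Dict[str, Tuple[str, float, float]]:
    """Pick one hit per query from (query, subject, evalue, bitscore) rows.

    Hits with evalue >= max_evalue are ignored; among the rest, the hit
    with the highest bit score wins (an assumed tie-breaking rule).
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for query, subject, evalue, bitscore in rows:
        if evalue >= max_evalue:
            continue
        current = best.get(query)
        if current is None or bitscore > current[2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

In practice the hit rows would be parsed from BLAST output; here they are plain tuples so that the filtering logic stays visible.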

Implementation

LocustDB was organized with a relational model and stored in the Oracle 9i relational database management system. Its web interface was built with JSP scripts running on the Tomcat web server, through which users have supervisory access. Java Servlets and JavaBeans were used to mediate interaction between clients and the database.
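The actual LocustDB schema is not described here, so the following is only a hypothetical sketch of how ESTs, unigenes, and annotations might be related in a relational model. It uses Python's built-in sqlite3 module as a stand-in for Oracle 9i; all table names, column names, and sample rows are invented for illustration.

```python
import sqlite3
from typing import List, Optional

# Hypothetical schema: one unigene has many ESTs and many annotations.
SCHEMA = """
CREATE TABLE unigene (
    unigene_id TEXT PRIMARY KEY,
    sequence   TEXT NOT NULL
);
CREATE TABLE est (
    est_id     TEXT PRIMARY KEY,
    library    TEXT NOT NULL,
    unigene_id TEXT NOT NULL REFERENCES unigene(unigene_id)
);
CREATE TABLE annotation (
    unigene_id  TEXT NOT NULL REFERENCES unigene(unigene_id),
    source      TEXT NOT NULL,   -- e.g. GO, COG, InterPro, KEGG
    accession   TEXT NOT NULL,
    description TEXT
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO unigene VALUES (?, ?)", ("LM0001", "ATGGCC"))
    conn.executemany(
        "INSERT INTO est VALUES (?, ?, ?)",
        [("EST0001", "head_gregarious", "LM0001"),
         ("EST0002", "midgut_solitary", "LM0001")],
    )
    conn.execute(
        "INSERT INTO annotation VALUES (?, ?, ?, ?)",
        ("LM0001", "GO", "GO:0005524", "ATP binding"),
    )
    conn.commit()
    return conn


def unigene_for_est(conn: sqlite3.Connection, est_id: str) -> Optional[str]:
    """Return the unigene an EST was assembled into, or None if unknown."""
    row = conn.execute(
        "SELECT unigene_id FROM est WHERE est_id = ?", (est_id,)
    ).fetchone()
    return row[0] if row else None


def ests_for_unigene(conn: sqlite3.Connection, unigene_id: str) -> List[str]:
    """Return the IDs of all ESTs belonging to a unigene, sorted by ID."""
    rows = conn.execute(
        "SELECT est_id FROM est WHERE unigene_id = ? ORDER BY est_id",
        (unigene_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Keeping ESTs and annotations in separate tables keyed by the unigene identifier is one common way to support lookups in both directions (from an EST to its unigene, and from a unigene to its member ESTs).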

Utility

LocustDB provides an interactive and user-friendly web interface for retrieving sequences and performing sequence alignment, along with functional annotations. The main page offers the following entries: home, about, data, search, tool, and other accessory parts. After clicking the data icon, users can enter any of the data modules (the unigene model, its annotation, and orthologous genes from comparative analysis with other species), which give a comprehensive overview of the stored data.

The search engine is the entry point to the database and offers both simple and advanced search modes. LocustDB also hosts an online BLAST server for sequence-based searches, which returns the sequence alignment, score, identity, and E-value; annotations of the corresponding homologous genes can be viewed at the same time. Clicking the search icon presents the advanced search interface of the database. A query starts from the annotation and other basic analysis results of unigenes. Unigene and EST sequences from the corresponding assemblies can be obtained individually or downloaded in bulk from the data module. For an EST search, users can identify a unigene and its ESTs by entering an EST name or ID. For a unigene search, users can enter a unigene name or annotation keywords, and detailed information, such as ORF length, GC content, EST linkage, unigene alignment, and unigene annotations, is presented on a result page. Hyperlinks serve as cross-references for browsing definitions and associated components (such as the KEGG pathway map, InterPro annotation, GO annotation, phylogenetic analysis of COG, and primary BLAST results). Users may search by functional ontology keywords, such as Gene Ontology numbers or terms, to find putative genes with specific functions as well as orthologous genes in other organisms.

Alternatively, users can choose appropriate definitions from public databases, including NCBI_NR, NCBI_NT, and Swiss_Prot. Links between the best BLAST hit for each unigene and the public databases were also established. A summary of BLAST hits and the sequence alignment information from every BLAST analysis can be obtained by clicking the link button. Furthermore, users can check for homologous genes between the locust and the other invertebrate species whose genomic data are publicly available, following hyperlinks to those databases for detailed information.
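Two of the per-unigene statistics shown on the result page, GC content and ORF length, are simple to compute from a nucleotide sequence. How LocustDB derives them is not stated, so the Python sketch below is only one plausible approach: it assumes an ORF runs from an ATG to the first in-frame stop codon and scans only the three forward frames (5'-end ESTs may contain incomplete ORFs that this simple rule would miss).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf_length(seq: str) -> int:
    """Length in bases of the longest ATG...stop ORF in the forward frames.

    The returned length includes the stop codon. ORFs without an in-frame
    stop codon are ignored, and the reverse strand is not scanned.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the current open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best
```

For example, `longest_orf_length("CCATGAAATTTTAGGG")` finds the ORF `ATGAAATTTTAG` in the third frame and returns 12.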

Discussion and conclusion

The current aim of LocustDB is to provide a catalog of genes expressed in locust tissues and cells, organized by anatomic and phenotypic features, to promote molecular entomology research. It will be modified frequently to serve as a framework for incorporating new genomic and proteomic data from the locust itself as well as from other orthopteran and hemimetabolous insects. In an ongoing effort, the database will also be updated with new data and with biological information collected from the relevant literature. For future development, we plan to transform LocustDB into an integrated knowledgebase hosting information from genomic, biological, and ecological studies of the locust as well as other insects.

Availability and requirements

LocustDB is maintained at the Beijing Genomics Institute and the Institute of Zoology, Chinese Academy of Sciences. It is freely available at http://locustdb.genomics.org.cn/ through web browsers. Comments, corrections, and data submissions may be sent by e-mail to lkang@ioz.ac.cn. The database is also freely available for download from the download entry.

Authors' contributions

ZM carried out the data collection and test procedures, drafted the manuscript, and participated in the design. LK and JY coordinated and supervised the whole project, suggesting the general direction and innovative features of the database and giving final approval of the version to be published. All authors have read and approved the final manuscript.

References (10 of 15 shown)

1.  The Gene Ontology (GO) database and informatics resource.

Authors:  M A Harris; J Clark; A Ireland; J Lomax; M Ashburner; R Foulger; K Eilbeck; S Lewis; B Marshall; C Mungall; J Richter; G M Rubin; J A Blake; C Bult; M Dolan; H Drabkin; J T Eppig; D P Hill; L Ni; M Ringwald; R Balakrishnan; J M Cherry; K R Christie; M C Costanzo; S S Dwight; S Engel; D G Fisk; J E Hirschman; E L Hong; R S Nash; A Sethuraman; C L Theesfeld; D Botstein; K Dolinski; B Feierbach; T Berardini; S Mundodi; S Y Rhee; R Apweiler; D Barrell; E Camon; E Dimmer; V Lee; R Chisholm; P Gaudet; W Kibbe; R Kishore; E M Schwarz; P Sternberg; M Gwinn; L Hannick; J Wortman; M Berriman; V Wood; N de la Cruz; P Tonellato; P Jaiswal; T Seigfried; R White
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

2.  The KEGG resource for deciphering the genome.

Authors:  Minoru Kanehisa; Susumu Goto; Shuichi Kawashima; Yasushi Okuno; Masahiro Hattori
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

3.  The InterPro Database, 2003 brings increased coverage and new features.

Authors:  Nicola J Mulder; Rolf Apweiler; Teresa K Attwood; Amos Bairoch; Daniel Barrell; Alex Bateman; David Binns; Margaret Biswas; Paul Bradley; Peer Bork; Phillip Bucher; Richard R Copley; Emmanuel Courcelle; Ujjwal Das; Richard Durbin; Laurent Falquet; Wolfgang Fleischmann; Sam Griffiths-Jones; Daniel Haft; Nicola Harte; Nicolas Hulo; Daniel Kahn; Alexander Kanapin; Maria Krestyaninova; Rodrigo Lopez; Ivica Letunic; David Lonsdale; Ville Silventoinen; Sandra E Orchard; Marco Pagni; David Peyruc; Chris P Ponting; Jeremy D Selengut; Florence Servant; Christian J A Sigrist; Robert Vaughan; Evgueni M Zdobnov
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

4.  CDD: a curated Entrez database of conserved domain alignments.

Authors:  Aron Marchler-Bauer; John B Anderson; Carol DeWeese-Scott; Natalie D Fedorova; Lewis Y Geer; Siqian He; David I Hurwitz; John D Jackson; Aviva R Jacobs; Christopher J Lanczycki; Cynthia A Liebert; Chunlei Liu; Thomas Madej; Gabriele H Marchler; Raja Mazumder; Anastasia N Nikolskaya; Anna R Panchenko; Bachoti S Rao; Benjamin A Shoemaker; Vahan Simonyan; James S Song; Paul A Thiessen; Sona Vasudevan; Yanli Wang; Roxanne A Yamashita; Jodie J Yin; Stephen H Bryant
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

5.  Genomics in pure and applied entomology.

Authors:  David G Heckel
Journal:  Annu Rev Entomol       Date:  2002-06-04       Impact factor: 19.686

6.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

7.  Sociogenomics: social life in molecular terms.

Authors:  Gene E Robinson; Christina M Grozinger; Charles W Whitfield
Journal:  Nat Rev Genet       Date:  2005-04       Impact factor: 53.242

8.  Base-calling of automated sequencer traces using phred. I. Accuracy assessment.

Authors:  B Ewing; L Hillier; M C Wendl; P Green
Journal:  Genome Res       Date:  1998-03       Impact factor: 9.043

9.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Authors:  S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman
Journal:  Nucleic Acids Res       Date:  1997-09-01       Impact factor: 16.971

10.  The COG database: new developments in phylogenetic classification of proteins from complete genomes.

Authors:  R L Tatusov; D A Natale; I V Garkavtsev; T A Tatusova; U T Shankavaram; B S Rao; B Kiryutin; M Y Galperin; N D Fedorova; E V Koonin
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

