DEB: A web interface for RNA-seq digital gene expression analysis.

Ji Qiang Yao, Fahong Yu.

Abstract

Digital expression (DE) is an important application of RNA-seq technology to quantify the transcriptome. The number of reads mapped to each transcript or gene varies across conditions and replicates. Currently, three statistical algorithms (edgeR, DESeq and baySeq) are available as R packages to compare read counts and identify differentially expressed transcripts or genes. So far, users have had to install and run each R package separately by hand. Users are also interested in comparing the results of the different approaches. Here, we present DEB, a pipeline that automates all the steps of file preparation, computation and result comparison. AVAILABILITY: The database is available for free at http://www.ijbcb.org/DEB/php/onlinetool.php.

Keywords:  DEB; DESeq; RNA-seq; baySeq; digital expression; edgeR; nextGen sequencing

Year:  2011        PMID: 21904439      PMCID: PMC3163933          DOI: 10.6026/97320630007044

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

RNA-seq is a revolutionary tool for transcriptomics [1]. This high-throughput sequencing technology has quickly become valuable for many functional genomics applications, such as digital gene expression (DGE) studies. Typically, RNA-seq reads are classified by the region of the target genome they map to, such as an exon or a transcript. One of the fundamental data analysis tasks in RNA-seq studies is to determine whether there is evidence that the read counts for a transcript or gene differ significantly across experimental conditions. At present, three major algorithms address this problem. edgeR is designed for the analysis of replicated count-based expression data and implements methodology developed by Robinson and Smyth [2]. DESeq is similar to edgeR: both assume a negative binomial distribution model, and the difference between the two methods lies in how they estimate the squared coefficient of variation (SCV) [3]. More recently, Hardcastle and Kelly developed baySeq, which also assumes a negative binomial distribution for the data and derives an empirically determined prior distribution from the entire dataset [4]. DEB is a web interface (Figure 1) that brings the three algorithms together in one place. The user can select any or all of the algorithms for data analysis. If more than one algorithm is selected, the genes shared among the algorithms' results are also reported.
Figure 1

The DEB web-interface.
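The negative binomial assumption mentioned in the Background can be illustrated with a short Python sketch. It assumes the common mean-dispersion parametrization, in which the variance is the mean plus the dispersion times the squared mean; the function names are hypothetical, and the way edgeR, DESeq and baySeq actually estimate the dispersion is not shown here.

```python
def nb_variance(mean, dispersion):
    """Variance of a negative binomial count under the mean-dispersion
    parametrization: Var = mean + dispersion * mean**2."""
    if mean < 0 or dispersion < 0:
        raise ValueError("mean and dispersion must be non-negative")
    return mean + dispersion * mean ** 2


def squared_cv(mean, dispersion):
    """Total squared coefficient of variation, Var / mean**2.

    This equals 1/mean (the Poisson-like sampling part) plus the dispersion,
    so for large means it approaches the dispersion itself.
    """
    if mean <= 0:
        raise ValueError("mean must be positive")
    return nb_variance(mean, dispersion) / mean ** 2
```

With a dispersion of zero the variance reduces to the mean, as in a Poisson model; a positive dispersion adds the extra variability typically seen between replicates.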

Implementation

DEB was developed using HTML, PHP, Perl and R, with MySQL as the database backend. The pipeline records the user's input, starts the job, updates the job status, delivers the final results and deletes the results after 48 hours. DEB has been tested on the following browsers: Firefox 4.0, IE 8.0, Chrome 12.0, Safari 5.0 and Opera 11.5.
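The job bookkeeping described above can be pictured with a small Python sketch. This is not DEB's actual code (which uses PHP, Perl, R and MySQL); the `Job` class, its field names and the status strings are hypothetical, and it assumes the 48-hour retention period is counted from the time the job finishes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Retention period stated in the text; counting it from job completion
# is an assumption of this sketch.
RETENTION = timedelta(hours=48)


@dataclass
class Job:
    email: str                      # address the results are sent to
    status: str = "queued"          # hypothetical states: queued, running, done
    finished_at: Optional[datetime] = None

    def start(self):
        self.status = "running"

    def finish(self, when):
        self.status = "done"
        self.finished_at = when

    def is_expired(self, now):
        """True once a finished job's results are due for deletion."""
        return self.finished_at is not None and now - self.finished_at >= RETENTION
```

A periodic cleanup task could then call `is_expired` on each stored job and remove the result files of those that return `True`.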

Software Input

The input to DEB is a tab-delimited count data file. The format of the input file is as follows: 1) The leftmost column must contain the gene list, and the remaining columns contain the count data; 2) The first row (header) contains the sample names. The samples are divided into two groups, e.g. patient and control; 3) Sample names within one group differ only in their final suffix, e.g. T1, T2, T3. A demo input file is provided for illustration and testing. The user can select one of the following False Discovery Rate (FDR) levels: 1%, 5%, 10%, 15% or 20%. The default value is 10%. The user can also select any one, two or all three of the algorithms provided, i.e. edgeR, DESeq and baySeq. By default, all three algorithms are run. An email address is required so that the final result can be delivered to the user if the computation takes a long time to complete.
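The input format described above can be checked with a short Python sketch before a file is uploaded. It rests on two assumptions that the text does not spell out: the header row includes a label cell above the gene column, and the replicate suffix is the run of trailing digits in a sample name (so `patient1` and `patient2` belong to the group `patient`). The function names and the demo data are hypothetical.

```python
import csv
import io
import re
from collections import defaultdict


def read_count_table(handle):
    """Read a tab-delimited count file into (sample_names, {gene: counts})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]            # assumes header[0] labels the gene column
    counts = {}
    for row in reader:
        if not row:                 # skip blank lines
            continue
        gene, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(f"{gene}: expected {len(samples)} counts, got {len(values)}")
        counts[gene] = [int(v) for v in values]
    return samples, counts


def group_samples(samples):
    """Group sample names by stripping the trailing digits (assumed suffix)."""
    groups = defaultdict(list)
    for name in samples:
        groups[re.sub(r"\d+$", "", name)].append(name)
    if len(groups) != 2:
        raise ValueError(f"expected two groups, found {len(groups)}")
    return dict(groups)


# Made-up demo data, not the demo file shipped with DEB.
DEMO = (
    "gene\tpatient1\tpatient2\tcontrol1\tcontrol2\n"
    "geneA\t10\t12\t3\t4\n"
    "geneB\t0\t1\t7\t9\n"
)
samples, counts = read_count_table(io.StringIO(DEMO))
groups = group_samples(samples)
```

Failing early on a ragged row or on a header that does not split into two groups is cheaper than waiting for a submitted job to fail.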

Software Output

For each algorithm selected, the final result is a list of genes that are significantly differentially expressed at the user-selected FDR level. If more than one algorithm is selected, the gene lists shared among the results of the selected algorithms are also provided. For users' convenience, the gene list files are produced in several formats, including text, Excel and HTML. In addition to the data files, smear plots (Figure 2) of the significant genes are generated for the user to download. We tested the software with the demo file, which contains 25,668 genes, with all three algorithms selected. The total response time was about three minutes.
Figure 2

The edgeR smear plot showing significantly differentially expressed genes (red).
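The shared gene lists described in this section amount to a set intersection over the per-algorithm results. The Python sketch below assumes that each algorithm's output has already been reduced to a mapping from gene name to an FDR-type value, and that a gene counts as significant when that value is at or below the selected level; the function names and the example values are hypothetical.

```python
def significant_genes(fdr_by_gene, fdr_level=0.10):
    """Genes whose FDR-type value is at or below the chosen level (assumed rule)."""
    return {gene for gene, fdr in fdr_by_gene.items() if fdr <= fdr_level}


def shared_genes(results, fdr_level=0.10):
    """Genes called significant by every algorithm in `results`.

    `results` maps an algorithm name to its {gene: fdr} table.
    """
    if not results:
        return set()
    gene_sets = [significant_genes(table, fdr_level) for table in results.values()]
    return set.intersection(*gene_sets)


# Made-up values for illustration only.
example = {
    "edgeR": {"g1": 0.01, "g2": 0.20, "g3": 0.05},
    "DESeq": {"g1": 0.03, "g3": 0.50},
    "baySeq": {"g1": 0.09, "g2": 0.01, "g3": 0.02},
}
shared = shared_genes(example, fdr_level=0.10)
```

The default of 0.10 mirrors the 10% default FDR level mentioned under Software Input; a stricter level can only shrink the shared list.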

Conclusion

DEB is a convenient web tool for identifying significantly differentially expressed genes in RNA-seq data using the edgeR, DESeq and baySeq algorithms.

Caveat and future development

Currently, the program accepts only count data, which users must generate with other bioinformatics tools. We plan to develop a pipeline that lets the user submit raw sequencing data files to the server, which would then automate the cleaning, mapping and counting steps to generate the count file.
References

1.  RNA-Seq: a revolutionary tool for transcriptomics.

Authors:  Zhong Wang; Mark Gerstein; Michael Snyder
Journal:  Nat Rev Genet       Date:  2009-01       Impact factor: 53.242

2.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

3.  Differential expression analysis for sequence count data.

Authors:  Simon Anders; Wolfgang Huber
Journal:  Genome Biol       Date:  2010-10-27       Impact factor: 13.583

4.  baySeq: empirical Bayesian methods for identifying differential expression in sequence count data.

Authors:  Thomas J Hardcastle; Krystyna A Kelly
Journal:  BMC Bioinformatics       Date:  2010-08-10       Impact factor: 3.169
