
DEApp: an interactive web interface for differential expression analysis of next generation sequence data.

Yan Li, Jorge Andrade.

Abstract

BACKGROUND: A growing trend in the biomedical community is the use of Next Generation Sequencing (NGS) technologies in genomics research. The complexity of downstream differential expression (DE) analysis, however, remains challenging, as it requires sufficient computer programming and command-line knowledge. Furthermore, researchers often need to interactively evaluate and visualize the effect of using different statistical and error models, assess the impact of selecting different parameters and cutoffs, and finally explore the overlapping consensus of cross-validated results obtained with different methods. This represents a bottleneck that slows down or impedes the adoption of NGS technologies in many labs.
RESULTS: We developed DEApp, an interactive and dynamic web application for differential expression analysis of count-based NGS data. This application enables model selection, parameter tuning, cross-validation and visualization of results in a user-friendly interface.
CONCLUSIONS: DEApp enables labs with no access to full-time bioinformaticians to exploit the advantages of NGS applications in biomedical research. This application is freely available at https://yanli.shinyapps.io/DEApp and https://gallery.shinyapps.io/DEApp.

Keywords:  Differential Expression (DE) analysis; Genomics; Next Generation Sequence (NGS); R; Shiny; Web interface

Year:  2017        PMID: 28174599      PMCID: PMC5291987          DOI: 10.1186/s13029-017-0063-4

Source DB:  PubMed          Journal:  Source Code Biol Med        ISSN: 1751-0473


Background

Next Generation Sequencing (NGS) technologies provide significant advantages over their predecessors for the study of complex genomic features associated with human disease in the field of biomedical research [1-5]. Significant progress has been made in the analysis of NGS data, including improved accuracy of read alignment for highly repetitive genomes, precise quantification of transcripts and exons, and analysis of transcript isoforms and allele-specific expression. However, large-scale data management and the complexity of downstream differential expression (DE) analysis remain challenges that restrain the use of NGS technologies. Even though several open source tools are currently available for DE analysis of count-based sequence data, each tool implements a different algorithm, uses a specific statistical model, and is susceptible to a specific error model. Changing the model or the parameters used in a particular tool often results in dramatic changes in the detected DE features. Additionally, using and manipulating the available bioinformatics tools requires computer programming and command-line knowledge that is not always present in biomedical labs. To address these challenges, we have developed DEApp, a web-based application designed to aid with data manipulation and visualization when performing DE analysis on count-based summaries from sequencing data. DEApp can be used to perform differential gene expression analysis using read counts from RNA-Seq data, differentially methylated region analysis using read counts from ChIP-Seq data, and differential small RNA expression analysis using counts from small RNA-Seq data. DEApp is a self-guided, user-friendly, web-based graphical interface that enables users lacking sufficient programming knowledge to conduct and cross-validate DE analysis with three different methods: edgeR [6], limma-voom [7], and DESeq2 [8].

Implementation

DEApp is developed in R [9] with Shiny [10]. It has been configured and launched on the RStudio Shinyapps.io cloud server, and can be accessed from any operating system without requiring any software installation. With DEApp, users are able to upload their data, evaluate the effect of model selection, interactively visualize the effect of modifying parameter cutoffs, and finally cross-validate the analysis results obtained from different methods. DEApp runs the entire computational analysis on the background server and displays the results dynamically on the graphical web interface. All result files and figures displayed on the interface can be saved locally.

Results and discussion

DE analysis with DEApp is performed in 4 steps: ‘Data Input’, ‘Data Summarization’, ‘DE analysis’, and ‘Methods Comparison’. Figure 1 shows an example of the graphical web interface of DEApp with edgeR for DE analysis. Two files are required as input data for this application: the ‘Raw Count Data’ and the ‘Meta-data Table’. The ‘Raw Count Data’ contains the summarized count results of all samples in the experiment, and the ‘Meta-data Table’ contains the summarized experimental design information for each sample. Examples of valid input files are embedded in the ‘Data Input’ section to facilitate file formatting and preparation.
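The relationship between the two input files can be illustrated with a minimal Python sketch. It is not DEApp's own code (DEApp is written in R), and the tab-separated layout, the column names, and the helper functions `read_counts`, `read_meta` and `check_inputs` are all assumptions made for illustration; the valid formats are the ones shown in the application's ‘Data Input’ section.

```python
import csv
import io

# Hypothetical file contents: features in rows, samples in columns, tab-separated.
# This layout is an assumption for illustration, not DEApp's documented format.
RAW_COUNTS = (
    "feature\tS1\tS2\tS3\tS4\n"
    "geneA\t120\t98\t15\t22\n"
    "geneB\t0\t1\t0\t2\n"
    "geneC\t530\t610\t480\t505\n"
)
META_DATA = (
    "sample\tgroup\n"
    "S1\tcontrol\n"
    "S2\tcontrol\n"
    "S3\ttreated\n"
    "S4\ttreated\n"
)


def read_counts(text):
    """Parse a count table into (sample_names, {feature: [counts]})."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    samples = rows[0][1:]
    counts = {row[0]: [int(v) for v in row[1:]] for row in rows[1:]}
    return samples, counts


def read_meta(text):
    """Parse a meta-data table into {sample: group}."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    return {row[0]: row[1] for row in rows[1:]}


def check_inputs(samples, counts, meta):
    """Return a list of problems; an empty list means the two tables agree."""
    problems = []
    missing = [s for s in samples if s not in meta]
    extra = [s for s in meta if s not in samples]
    if missing:
        problems.append(f"samples without meta-data: {missing}")
    if extra:
        problems.append(f"meta-data rows without counts: {extra}")
    for feature, values in counts.items():
        if len(values) != len(samples):
            problems.append(f"{feature}: expected {len(samples)} counts")
        if any(v < 0 for v in values):
            problems.append(f"{feature}: negative count")
    return problems


samples, counts = read_counts(RAW_COUNTS)
meta = read_meta(META_DATA)
problems = check_inputs(samples, counts, meta)
print(problems or "inputs are consistent")
```

The point of the check is that every sample column in the count table needs exactly one row of design information, since the group labels drive every later comparison.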
Fig. 1

Illustration of DEApp web interface, edgeR analysis section. The left black dashboard sidebar illustrates the analysis workflow; the top blue box panel of each analysis section shows the input panels for various DE cutoffs; the green box panels show the analysis results and visualizations

DEApp can be used for the analysis of single-factor and multi-factor experiments. Although DEApp is used by default for DE analysis of RNA-Seq data, it can also be used for differential binding analysis using ChIP-Seq data and for differentially expressed micro RNA analysis using miRNA-Seq data. After the data is uploaded in the ‘Data Input’ section, the ‘Data Summarization’ panel allows users to set the cutoff values used to filter out genetic features with very low counts, since a genetic feature must be present at a certain minimal level to provide enough statistical power for the DE multiple comparison tests. It is usually recommended to keep genetic features that are expressed in at least one sample of each factorial group level [11], with the required number of reads expressed as a counts per million (CPM) value. By default, the application removes low-expression genetic features, that is, features that do not exceed a CPM value of 1 in at least 2 samples. A detailed explanation of how to choose the optimal cutoff values for this step is available on the ‘introduction’ page of the system. Based on the provided cutoff values, a summary of library sizes and normalization factors for each experimental sample, before and after removal of low-expression genomic features, is displayed on the web interface. The sample normalization plot and a multidimensional scaling (MDS) plot are also presented on the web interface to illustrate the distribution of and relationships between samples after the low-expression genomic features have been filtered out. Once this step is completed, the user is presented with three commonly used methods to perform DE identification.
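The low-count filter described above can be sketched in a few lines of Python. This is an illustration of the idea rather than DEApp's R implementation: the function names `cpm` and `filter_low_counts`, the toy counts, and the plain library-size scaling (without normalization factors) are assumptions, and the defaults mirror the CPM cutoff of 1 and the minimum of 2 samples mentioned in the text.

```python
def cpm(counts_by_feature):
    """Scale raw counts to counts per million using each sample's library size."""
    n_samples = len(next(iter(counts_by_feature.values())))
    lib_sizes = [
        sum(values[i] for values in counts_by_feature.values())
        for i in range(n_samples)
    ]
    return {
        feature: [1e6 * c / lib if lib else 0.0 for c, lib in zip(values, lib_sizes)]
        for feature, values in counts_by_feature.items()
    }


def filter_low_counts(counts_by_feature, min_cpm=1.0, min_samples=2):
    """Keep features whose CPM exceeds min_cpm in at least min_samples samples."""
    cpms = cpm(counts_by_feature)
    return {
        feature: values
        for feature, values in counts_by_feature.items()
        if sum(x > min_cpm for x in cpms[feature]) >= min_samples
    }


# Toy data (hypothetical): every sample has a library size of 2,000,000 reads.
toy_counts = {
    "geneA": [1_500_000, 1_400_000, 1_600_000, 1_500_000],
    "geneB": [0, 1, 0, 3],          # CPM 0, 0.5, 0, 1.5: above 1 in one sample only
    "geneC": [499_998, 599_999, 399_995, 499_990],
    "geneD": [2, 0, 5, 7],          # CPM 1.0, 0, 2.5, 3.5: above 1 in two samples
}

kept = filter_low_counts(toy_counts)
print(sorted(kept))
```

Raising `min_samples` to the size of the smallest group is one way to follow the recommendation that a feature be expressed in at least one complete group level; which value is appropriate depends on the experimental design.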
For a single-factor experiment, the DE analysis can be conducted between any 2 group levels of that factor; for a multi-factor experiment, the DE analysis can be conducted between any 2 groups selected from the combinations of all group levels. After specifying the group levels, the user then selects the parameter cutoffs that determine statistical significance: the nominal p-value, the false discovery rate (FDR) adjusted p-value, and the fold change (FC). The cutoffs for these parameters can be modified interactively on the web interface in each DE analysis section. The system then displays the dispersion plot, the overall DE analysis results, and the statistically significant DE results together with a volcano plot, all of which update interactively according to the specified parameters and cutoff values. Additionally, DEApp provides a ‘Methods Comparison’ section that enables the comparison and cross-validation of DE analysis results across the implemented analysis methods. A summary Venn diagram and a table are presented on the user interface to illustrate the DE genomic features shared by any 2 or all 3 of the selected analysis methods. DEApp represents an intuitive alternative to command-line tools and scripts, and a basic-functionality open source alternative to commercial packages such as Partek [12] and CLC Genomics Workbench (CLC bio, Aarhus, Denmark), which offer extensive analytics and sophisticated visualizations for a premium. The functionality of DEApp can be further expanded to cover complex experimental designs with nested interactions, additive blocking, etc. It will also be possible to expand the automation of further downstream analysis to cover functional annotation and enrichment analysis.
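The two ideas in this section, applying p-value, FDR and fold change cutoffs and then comparing the calls made by different methods, can be sketched as follows. The sketch is in Python and is not taken from DEApp: the function names, the cutoff defaults and the toy results are hypothetical, and the Benjamini-Hochberg procedure is used here as one common way to compute FDR-adjusted p-values.

```python
import math


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_de(results, p_cut=0.05, fdr_cut=0.05, fc_cut=1.5):
    """Return features passing all three cutoffs.

    results maps feature -> (log2 fold change, nominal p-value).
    The default cutoffs are illustrative assumptions, not DEApp's defaults.
    """
    features = list(results)
    fdr = bh_adjust([results[f][1] for f in features])
    min_abs_lfc = math.log2(fc_cut)
    return {
        f
        for f, q in zip(features, fdr)
        if results[f][1] <= p_cut and q <= fdr_cut and abs(results[f][0]) >= min_abs_lfc
    }


def overlap_regions(calls_by_method):
    """Map each combination of methods to the features called by exactly those methods."""
    regions = {}
    for feature in set().union(*calls_by_method.values()):
        key = tuple(sorted(m for m, s in calls_by_method.items() if feature in s))
        regions.setdefault(key, set()).add(feature)
    return regions


# Hypothetical results from one method: (log2 fold change, nominal p-value).
toy_results = {
    "geneA": (2.0, 0.001),
    "geneB": (0.2, 0.02),
    "geneC": (-1.3, 0.03),
    "geneD": (0.1, 0.5),
}
significant = select_de(toy_results)

# Hypothetical DE calls from the three methods, used to build the Venn regions.
calls = {
    "edgeR": significant,
    "limma-voom": {"geneA"},
    "DESeq2": {"geneA", "geneC", "geneE"},
}
regions = overlap_regions(calls)
print(sorted(significant), regions)
```

Each key of `regions` corresponds to one area of the Venn diagram; features in the three-method key are the consensus calls that a user might treat as the most robust.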

Conclusion

DEApp enables researchers without sufficient programming experience to easily perform, evaluate, cross-validate, and interactively visualize DE analysis of count-based NGS data. This application could potentially expedite the adoption of NGS applications in biomedical research labs.

Availability and requirements

Project name: DEApp
Project home page: https://yanli.shinyapps.io/DEApp and https://gallery.shinyapps.io/DEApp
Project source code: https://github.com/yan-cri/DEApp
Operating system: Platform independent
Programming language: R (>=3.2) shiny
Other requirement: Requested R packages including shiny, edgeR, limma, DESeq2 etc.
License: GPLv2
Any restrictions to use by non-academics: None
  8 in total

1.  Screening for possible miRNA-mRNA associations in a colon cancer cell line.

Authors:  Sotaro Kanematsu; Kousuke Tanimoto; Yutaka Suzuki; Sumio Sugano
Journal:  Gene       Date:  2013-08-09       Impact factor: 3.688

2.  Comprehensive molecular characterization of clear cell renal cell carcinoma.

Authors: 
Journal:  Nature       Date:  2013-06-23       Impact factor: 49.962

3.  Whole transcriptome sequencing reveals gene expression and splicing differences in brain regions affected by Alzheimer's disease.

Authors:  Natalie A Twine; Karolina Janitz; Marc R Wilkins; Michal Janitz
Journal:  PLoS One       Date:  2011-01-21       Impact factor: 3.240

4.  RNA-seq reveals novel transcriptome of genes and their isoforms in human pulmonary microvascular endothelial cells treated with thrombin.

Authors:  Li Qin Zhang; Dilyara Cheranova; Margaret Gibson; Shinghua Ding; Daniel P Heruth; Deyu Fang; Shui Qing Ye
Journal:  PLoS One       Date:  2012-02-16       Impact factor: 3.240

5.  Comparison of RNA-seq and microarray-based models for clinical endpoint prediction.

Authors:  Wenqian Zhang; Ying Yu; Falk Hertwig; Jean Thierry-Mieg; Wenwei Zhang; Danielle Thierry-Mieg; Jian Wang; Cesare Furlanello; Viswanath Devanarayan; Jie Cheng; Youping Deng; Barbara Hero; Huixiao Hong; Meiwen Jia; Li Li; Simon M Lin; Yuri Nikolsky; André Oberthuer; Tao Qing; Zhenqiang Su; Ruth Volland; Charles Wang; May D Wang; Junmei Ai; Davide Albanese; Shahab Asgharzadeh; Smadar Avigad; Wenjun Bao; Marina Bessarabova; Murray H Brilliant; Benedikt Brors; Marco Chierici; Tzu-Ming Chu; Jibin Zhang; Richard G Grundy; Min Max He; Scott Hebbring; Howard L Kaufman; Samir Lababidi; Lee J Lancashire; Yan Li; Xin X Lu; Heng Luo; Xiwen Ma; Baitang Ning; Rosa Noguera; Martin Peifer; John H Phan; Frederik Roels; Carolina Rosswog; Susan Shao; Jie Shen; Jessica Theissen; Gian Paolo Tonini; Jo Vandesompele; Po-Yen Wu; Wenzhong Xiao; Joshua Xu; Weihong Xu; Jiekun Xuan; Yong Yang; Zhan Ye; Zirui Dong; Ke K Zhang; Ye Yin; Chen Zhao; Yuanting Zheng; Russell D Wolfinger; Tieliu Shi; Linda H Malkas; Frank Berthold; Jun Wang; Weida Tong; Leming Shi; Zhiyu Peng; Matthias Fischer
Journal:  Genome Biol       Date:  2015-06-25       Impact factor: 13.583

6.  voom: Precision weights unlock linear model analysis tools for RNA-seq read counts.

Authors:  Charity W Law; Yunshun Chen; Wei Shi; Gordon K Smyth
Journal:  Genome Biol       Date:  2014-02-03       Impact factor: 13.583

7.  Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.

Authors:  Michael I Love; Wolfgang Huber; Simon Anders
Journal:  Genome Biol       Date:  2014       Impact factor: 13.583

8.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937
