| Literature DB >> 31195976 |
Federico Marini1,2, Harald Binder3.
Abstract
BACKGROUND: Principal component analysis (PCA) is frequently used in genomics applications for quality assessment and exploratory analysis in high-dimensional data, such as RNA sequencing (RNA-seq) gene expression assays. Despite the availability of many software packages developed for this purpose, an interactive and comprehensive interface for performing these operations is lacking.Entities:
Keywords: Bioconductor; Exploratory data analysis; Principal component analysis; R; RNA-Seq; Reproducible research; Shiny; User-friendly
Mesh:
Substances:
Year: 2019 PMID: 31195976 PMCID: PMC6567655 DOI: 10.1186/s12859-019-2879-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1Overview of the pcaExplorer workflow. A typical analysis with pcaExplorer starts by providing the matrix of raw counts for the sequenced samples, together with the corresponding experimental design information. Alternatively, a combination of a DESeqDataSet and a DESeqTransform objects can be given as input. Specifying a gene annotation can allow displaying of alternative IDs, mapped to the row names of the main expression matrix. Documentation is provided at multiple levels (tooltips and instructions in the app, on top of the package vignette). After launching the app, the interactive session allows detailed exploration capability, and the output can be exported (images, tables) also in form of a R Markdown/HTML report, which can be stored or shared. (Icons contained in this figure are contained in the collections released by Font Awesome under the CC BY 4.0 license)
Fig. 2Selected screenshots of the pcaExplorer application. a Principal components from the point of view of the samples, with a zoomable 2D PCA plot (3D now shown due to space) and a scree plot. Additional boxes show loadings plots for the PCs under inspection, and let users explore the effect of the removal of outlier samples. b Principal components, focused on the gene level. Genes are shown in the PCA plot, with sample labels displayed as in a biplot. A profile explorer and heatmaps (not shown due to space) can be plotted for the subset selected after user interaction. Single genes can also be inspected with boxplots. c Functional annotation of principal components, with an overview of the GO-based functions enriched in the loadings in each direction for the selected PCs. The pca2go object can be provided at launch, or also computed during the exploration. d Report Editor panel, with markdown-related and general options shown. Below, the text editor displays the content of the analysis for building the report, defaulting to a comprehensive template provided with the package