Exploring high-dimensional biological data with sparse contrastive principal component analysis.

Philippe Boileau, Nima S Hejazi, Sandrine Dudoit.

Abstract

MOTIVATION: Statistical analyses of high-throughput sequencing data have re-shaped the biological sciences. In spite of myriad advances, recovering interpretable biological signal from data corrupted by technical noise remains a prevalent open problem. Several classes of procedures, among them classical dimensionality reduction techniques and others incorporating subject-matter knowledge, have provided effective advances. However, no procedure currently satisfies the dual objectives of recovering stable and relevant features simultaneously.
RESULTS: Inspired by recent proposals for making use of control data in the removal of unwanted variation, we propose a variant of principal component analysis (PCA), sparse contrastive PCA, that extracts sparse, stable, interpretable and relevant biological signal. The new methodology is compared to competing dimensionality reduction approaches through a simulation study and via analyses of several publicly available protein expression, microarray gene expression and single-cell transcriptome sequencing datasets.
AVAILABILITY AND IMPLEMENTATION: A free and open-source software implementation of the methodology, the scPCA R package, is made available via the Bioconductor Project. Code for all analyses presented in this article is also available via GitHub.
CONTACT: philippe_boileau@berkeley.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
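
ILLUSTRATION: The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of combining control data with sparsity, not the authors' method and not the scPCA R package's API. It assumes the usual contrastive PCA construction (eigenvectors of the target covariance minus a scaled background covariance) and substitutes simple soft-thresholding of the loadings for a proper sparse PCA penalty. The function name `contrastive_sparse_pca` and the parameters `contrast` and `threshold` are hypothetical.

```python
import numpy as np


def contrastive_sparse_pca(target, background, contrast=1.0,
                           threshold=0.1, n_components=2):
    """Hypothetical sketch: leading eigenvectors of the contrastive
    covariance, soft-thresholded to induce sparsity (an assumption,
    standing in for a real sparse PCA solver)."""
    # Sample covariance matrices (np.cov centres the columns itself).
    c_target = np.cov(target, rowvar=False)
    c_background = np.cov(background, rowvar=False)

    # Variation shared with the control (background) data is down-weighted.
    contrastive_cov = c_target - contrast * c_background

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(contrastive_cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    loadings = eigvecs[:, order]

    # Soft-threshold the loadings, then rescale each column to unit norm.
    sparse = np.sign(loadings) * np.maximum(np.abs(loadings) - threshold, 0.0)
    norms = np.linalg.norm(sparse, axis=0)
    norms[norms == 0] = 1.0  # leave all-zero columns untouched
    return sparse / norms


def simulate(n=1000, p=10, seed=0):
    """Toy data: features 0-4 carry technical noise in both datasets;
    features 5-6 carry a group signal in the target data only."""
    rng = np.random.default_rng(seed)
    scale = np.full(p, 0.1)
    scale[:5] = 2.0
    background = rng.normal(size=(n, p)) * scale
    target = rng.normal(size=(n, p)) * scale
    labels = rng.choice([-1.0, 1.0], size=n)
    signal = np.zeros(p)
    signal[5:7] = 2.0
    target += np.outer(labels, signal)
    return target, background


target, background = simulate()
loadings = contrastive_sparse_pca(target, background)
print(np.flatnonzero(loadings[:, 0]))  # features kept by the first component
```

In this toy setting the technical variation is common to both datasets, so subtracting the background covariance leaves the target-specific features to dominate the first component. In practice the contrast and sparsity parameters would have to be tuned.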

Year:  2020        PMID: 32176249     DOI: 10.1093/bioinformatics/btaa176

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  2 in total

1.  A Pipeline for Natural Small Molecule Inhibitors of Endoplasmic Reticulum Stress.

Authors:  Daniela Correia da Silva; Patrícia Valentão; Paula B Andrade; David M Pereira
Journal:  Front Pharmacol       Date:  2022-07-22       Impact factor: 5.988

2.  scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling.

Authors:  Dongyuan Song; Kexin Li; Zachary Hemminger; Roy Wollman; Jingyi Jessica Li
Journal:  Bioinformatics       Date:  2021-07-12       Impact factor: 6.937
