
Partition: a surjective mapping approach for dimensionality reduction.

Joshua Millstein, Francesca Battaglin, Malcolm Barrett, Shu Cao, Wu Zhang, Sebastian Stintzing, Volker Heinemann, Heinz-Josef Lenz.

Abstract

MOTIVATION: Large amounts of information generated by genomic technologies are accompanied by statistical and computational challenges due to redundancy, badly behaved data and noise. Dimensionality reduction (DR) methods have been developed to mitigate these challenges. However, many approaches are not scalable to large dimensions or result in excessive information loss.
RESULTS: The proposed approach partitions data into subsets of related features and summarizes each into one and only one new feature, thus defining a surjective mapping. A constraint on information loss determines the size of the reduced dataset. Simulation studies demonstrate that when multiple related features are associated with a response, this approach can substantially increase the number of true associations detected as compared to principal components analysis, non-negative matrix factorization or no DR. This increase in true discoveries is explained both by a reduced multiple-testing challenge and a reduction in extraneous noise. In an application to real data collected from metastatic colorectal cancer tumors, more associations between gene expression features and progression-free survival and response to treatment were detected in the reduced dataset than in the full untransformed dataset.
AVAILABILITY AND IMPLEMENTATION: Freely available R package from CRAN, https://cran.r-project.org/package=partition.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
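
ILLUSTRATIVE SKETCH: The idea summarized under RESULTS (group related features, replace each group with a single summary feature, and let an information-loss constraint decide how far the reduction goes) can be illustrated with a small greedy example. This is not the algorithm implemented in the authors' R package. The function name `partition_sketch`, the `threshold` parameter, the use of the cluster mean as the summary, and the use of the minimum squared correlation between the summary and its member features as the "information retained" measure are all assumptions made for illustration only.

```python
import numpy as np


def partition_sketch(X, threshold=0.8):
    """Greedy feature partitioning (illustrative only, not the package's method).

    Repeatedly merges the two most correlated clusters, summarizing each
    cluster by the mean of its standardized features. A merge is accepted only
    if every member feature keeps a squared correlation >= threshold with the
    cluster summary (an assumed proxy for the information-loss constraint).
    """
    X = np.asarray(X, dtype=float)
    # Standardize columns; assumes no feature is constant.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    clusters = [[j] for j in range(Z.shape[1])]

    while len(clusters) > 1:
        summaries = np.column_stack([Z[:, c].mean(axis=1) for c in clusters])
        R = np.corrcoef(summaries, rowvar=False)
        np.fill_diagonal(R, -np.inf)  # ignore self-correlations
        a, b = np.unravel_index(np.argmax(R), R.shape)

        merged = clusters[a] + clusters[b]
        summary = Z[:, merged].mean(axis=1)
        retained = min(np.corrcoef(summary, Z[:, j])[0, 1] ** 2 for j in merged)
        if retained < threshold:
            # Simplification: stop at the first rejected merge.
            break
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)

    # Each original feature maps to exactly one reduced feature, and every
    # reduced feature has at least one member: a surjective mapping.
    labels = np.empty(Z.shape[1], dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    reduced = np.column_stack([Z[:, c].mean(axis=1) for c in clusters])
    return labels, reduced


# Synthetic data: three latent signals, each observed through three noisy
# features, plus two unrelated noise features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
related = np.repeat(latent, 3, axis=1) + 0.3 * rng.normal(size=(200, 9))
unrelated = rng.normal(size=(200, 2))
X = np.hstack([related, unrelated])

labels, reduced = partition_sketch(X, threshold=0.8)
print(labels)         # cluster index assigned to each original feature
print(reduced.shape)  # (samples, number of reduced features)
```

In this sketch a higher `threshold` tolerates less information loss and so yields a larger reduced dataset, which mirrors the role the abstract gives to the information-loss constraint. Stopping at the first rejected merge is a shortcut; a fuller implementation would continue trying other candidate pairs.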

Year:  2020        PMID: 31504178      PMCID: PMC8215926          DOI: 10.1093/bioinformatics/btz661

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  2 in total

1.  partition: A fast and flexible framework for data reduction in R.

Authors:  Malcolm Barrett; Joshua Millstein
Journal:  J Open Source Softw       Date:  2020-03-18

2.  fdrci: FDR confidence interval selection and adjustment for large-scale hypothesis testing.

Authors:  Joshua Millstein; Francesca Battaglin; Hiroyuki Arai; Wu Zhang; Priya Jayachandran; Shivani Soni; Aparna R Parikh; Christoph Mancao; Heinz-Josef Lenz
Journal:  Bioinform Adv       Date:  2022-06-13