Feature specific quantile normalization enables cross-platform classification of molecular subtypes using gene expression data.

Jennifer M Franks, Guoshuai Cai, Michael L Whitfield.

Abstract

Motivation: Molecular subtypes of cancers and autoimmune disease, defined by transcriptomic profiling, have provided insight into disease pathogenesis, molecular heterogeneity and therapeutic responses. However, technical biases inherent to different gene expression profiling platforms present a unique problem when analyzing data generated from different studies. Currently, there is a lack of effective methods designed to eliminate platform-based bias. We present a method to normalize and classify RNA-seq data using machine learning classifiers trained on DNA microarray data and molecular subtypes in two datasets: breast invasive carcinoma (BRCA) and colorectal cancer (CRC).
Results: Multiple analyses show that feature specific quantile normalization (FSQN) successfully removes platform-based bias from RNA-seq data, regardless of feature scaling or machine learning algorithm. We achieve up to 98% accuracy for BRCA data and 97% accuracy for CRC data in assigning molecular subtypes to RNA-seq data normalized using FSQN and a support vector machine trained exclusively on DNA microarray data. We find that maximum accuracy was achieved when normalizing RNA-seq datasets that contain at least 25 samples. FSQN allows comparison of RNA-seq data to existing DNA microarray datasets. Using these techniques, we can successfully leverage information from existing gene expression data in new analyses despite different platforms used for gene expression profiling.
Availability and implementation: FSQN has been submitted as an R package to CRAN. All code used for this study is available on GitHub (https://github.com/jenniferfranks/FSQN).
Contact: michael.l.whitfield@dartmouth.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
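
The per-feature normalization described in the Results can be illustrated with a minimal sketch. It assumes that "feature specific" means each gene in the RNA-seq (test) matrix is mapped, by rank, onto the distribution of the same gene in the microarray (target) matrix. The function names `fsqn_sketch` and `map_feature_to_target`, the samples-by-features layout, and the tie handling are assumptions made for illustration; this is not the API or the code of the FSQN R package.

```python
from typing import List, Sequence


def _quantile(sorted_vals: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of an already sorted sequence, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def map_feature_to_target(test_col: Sequence[float],
                          target_col: Sequence[float]) -> List[float]:
    """Replace each test value by the target quantile at the same relative rank.

    Hypothetical helper: ties are broken by sample order, a simplification.
    """
    target_sorted = sorted(target_col)
    n = len(test_col)
    order = sorted(range(n), key=lambda i: test_col[i])
    out = [0.0] * n
    for rank, idx in enumerate(order):
        # Relative rank in [0, 1]; a single sample maps to the target median.
        q = rank / (n - 1) if n > 1 else 0.5
        out[idx] = _quantile(target_sorted, q)
    return out


def fsqn_sketch(test: Sequence[Sequence[float]],
                target: Sequence[Sequence[float]]) -> List[List[float]]:
    """Normalize each feature (column) of `test` to the same column of `target`.

    Assumes rows are samples, columns are features, and both matrices list
    the features in the same order.
    """
    n_features = len(test[0])
    if any(len(row) != n_features for row in list(test) + list(target)):
        raise ValueError("test and target must share the same features")
    columns = [
        map_feature_to_target([row[j] for row in test],
                              [row[j] for row in target])
        for j in range(n_features)
    ]
    # Transpose back to samples x features.
    return [list(sample) for sample in zip(*columns)]


if __name__ == "__main__":
    microarray = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # toy target data
    rnaseq = [[100.0, 0.3], [300.0, 0.1], [200.0, 0.2]]   # toy test data
    print(fsqn_sketch(rnaseq, microarray))
```

After this step each RNA-seq feature takes values on the scale of the corresponding microarray feature, which is the property a classifier trained only on microarray data would rely on. With only a few test samples the ranks are coarse, which is one plausible reading of why the abstract reports best accuracy with at least 25 samples.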

Year: 2018   PMID: 29360996   PMCID: PMC5972664   DOI: 10.1093/bioinformatics/bty026

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937


