Using multivariate mixed-effects selection models for analyzing batch-processed proteomics data with non-ignorable missingness.

Jiebiao Wang, Pei Wang, Donald Hedeker, Lin S Chen.

Abstract

In quantitative proteomics, mass tag labeling techniques have been widely adopted in mass spectrometry experiments. These techniques allow peptides (short amino acid sequences) and proteins from multiple samples of a batch to be detected and quantified in a single experiment, and as such greatly improve the efficiency of protein profiling. However, the batch processing of samples also results in severe batch effects and non-ignorable missing data occurring at the batch level. Motivated by the breast cancer proteomic data from the Clinical Proteomic Tumor Analysis Consortium, in this work, we developed two tailored multivariate MIxed-effects SElection models (mvMISE) to jointly analyze multiple correlated peptides/proteins in labeled proteomics data, considering the batch effects and the non-ignorable missingness. By taking a multivariate approach, we can borrow information across multiple peptides of the same protein or multiple proteins from the same biological pathway, and thus achieve better statistical efficiency and biological interpretation. The two models account for different correlation structures among a group of peptides or proteins. Specifically, to model multiple peptides from the same protein, we employed a factor-analytic random effects structure to characterize the high and similar correlations among peptides. To model biological dependence among multiple proteins in a functional pathway, we introduced a graphical lasso penalty on the error precision matrix, and implemented an efficient algorithm based on the alternating direction method of multipliers. Simulations demonstrated the advantages of the proposed models. Applying the proposed methods to the motivating data set, we identified phosphoproteins and biological pathways that showed different activity patterns in triple-negative breast tumors versus other breast tumors.
The proposed methods can also be applied to other high-dimensional multivariate analyses based on clustered data with or without non-ignorable missingness.
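
As a rough illustration of the graphical lasso penalty and the alternating direction method of multipliers (ADMM) mentioned in the abstract, the sketch below estimates a sparse precision matrix from a fixed covariance matrix S. It is a generic textbook-style ADMM for the graphical lasso, not the authors' mvMISE algorithm, which applies the penalty to the error precision matrix inside a mixed-effects selection model. The function names, the choice to leave the diagonal unpenalized, and the defaults for rho, n_iter, and tol are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(a, kappa):
    """Elementwise soft-thresholding: the proximal operator of kappa * |a|."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)


def graphical_lasso_admm(S, lam, rho=1.0, n_iter=500, tol=1e-8):
    """Hypothetical helper: ADMM for
        minimize  tr(S X) - log det X + lam * sum_{i != j} |Z_ij|
        subject to X = Z.
    Returns Z, the sparse estimate of the precision matrix.
    """
    p = S.shape[0]
    Z = np.eye(p)          # sparse copy of the precision matrix
    U = np.zeros((p, p))   # scaled dual variable
    for _ in range(n_iter):
        # X-update: closed form via the eigendecomposition of rho*(Z - U) - S.
        eigval, Q = np.linalg.eigh(rho * (Z - U) - S)
        x = (eigval + np.sqrt(eigval ** 2 + 4.0 * rho)) / (2.0 * rho)
        X = (Q * x) @ Q.T  # equals Q @ diag(x) @ Q.T, positive definite

        # Z-update: soft-threshold the off-diagonal entries only
        # (assumption: the diagonal is left unpenalized).
        Z_old = Z
        A = X + U
        Z = soft_threshold(A, lam / rho)
        Z[np.diag_indices(p)] = np.diag(A)

        # Dual update.
        U = U + X - Z

        # Stop when primal and dual residuals are both small.
        if np.linalg.norm(X - Z) < tol and np.linalg.norm(Z - Z_old) < tol:
            break
    return Z


if __name__ == "__main__":
    S = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
    print(graphical_lasso_admm(S, lam=0.0))  # close to the inverse of S
    print(graphical_lasso_admm(S, lam=5.0))  # off-diagonal entries shrunk to zero
```

With lam set to zero the estimate approaches the ordinary inverse of S, and a sufficiently large lam removes all off-diagonal entries; intermediate values trade fit against sparsity of the estimated dependence graph.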
© The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Keywords:  Alternating direction method of multipliers; Expectation-maximization algorithm; Graphical lasso; Missing not at random; Multivariate mixed-effects models; Proteomics; Selection model

Year:  2019        PMID: 29939200      PMCID: PMC6797056          DOI: 10.1093/biostatistics/kxy022

Source DB:  PubMed          Journal:  Biostatistics        ISSN: 1465-4644            Impact factor:   5.899

