Another look at matrix correlations.

Ahmad Borzou, Razie Yousefi, Rovshan G Sadygov.

Abstract

MOTIVATION: High-throughput technologies are widely employed in modern biomedical research. They yield measurements of a large number of biomolecules in a single experiment, and the number of experiments is usually much smaller than the number of measurements in each experiment. The simultaneous measurements of biomolecules provide a basis for a comprehensive, systems view of the relevant biological processes. It is often necessary to determine correlations between data matrices obtained under different conditions or for different pathways. However, techniques for detecting correlations within or between conditions when the number of samples is low are still in development. Earlier correlation measures, such as the RV coefficient, use the trace of the product of data matrices as the most relevant characteristic. However, a recent study has shown that the RV coefficient consistently overestimates correlations when the number of samples is low. To correct for this bias, it was suggested to discard the diagonal elements of the outer products of each data matrix. In this work, a principled approach based on matrix decomposition generates three trace-independent parts for every matrix. These components are unique, and they are used to determine different aspects of the correlations between the original datasets.
RESULTS: Simulations show that the decomposition removes the high-correlation bias and the dependence on sample number that are intrinsic to the RV coefficient. We then use the correlations to analyze a real proteomics dataset.
AVAILABILITY AND IMPLEMENTATION: The Python code can be downloaded from http://dynamic-proteome.utmb.edu/MatrixCorrelations.aspx.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
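
The two earlier measures mentioned in the abstract can be illustrated with a short sketch. This is not the authors' code and it does not implement their three-part decomposition. It only shows the RV coefficient, built from the trace of the product of the outer-product matrices, and the diagonal-discarding variant described above. The function names, the matrix sizes (10 samples, 1000 variables) and the random seed are assumptions chosen for illustration. Rows are taken to be samples, and the data are zero-mean by construction, so no centering step is applied.

```python
import numpy as np


def rv_coefficient(x, y):
    """RV coefficient of two data matrices that share the same rows (samples)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sx = x @ x.T  # outer product, samples x samples
    sy = y @ y.T
    # For symmetric matrices, trace(sx @ sy) equals the sum of elementwise products.
    return np.sum(sx * sy) / np.sqrt(np.sum(sx * sx) * np.sum(sy * sy))


def modified_rv_coefficient(x, y):
    """Same ratio as above after discarding the diagonals of the outer products."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sx = x @ x.T
    sy = y @ y.T
    np.fill_diagonal(sx, 0.0)  # in-place: drop the diagonal elements
    np.fill_diagonal(sy, 0.0)
    return np.sum(sx * sy) / np.sqrt(np.sum(sx * sx) * np.sum(sy * sy))


# Illustrative setup (assumed sizes): few samples, many independent variables.
rng = np.random.default_rng(0)
x = rng.standard_normal((10, 1000))
y = rng.standard_normal((10, 1000))

rv_value = rv_coefficient(x, y)
modified_value = modified_rv_coefficient(x, y)
print(f"RV coefficient:          {rv_value:.3f}")
print(f"diagonal-discarded form: {modified_value:.3f}")
```

Although `x` and `y` are generated independently, the plain RV coefficient in this setup comes out close to 1, because the large diagonal entries of the outer products dominate both traces. Discarding the diagonals gives a much smaller value, which is the overestimation effect the abstract refers to.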

Year:  2019        PMID: 31081021      PMCID: PMC6853692          DOI: 10.1093/bioinformatics/btz281

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


