COPA: Constrained PARAFAC2 for Sparse & Large Datasets.

Ardavan Afshar, Ioakeim Perros, Evangelos E Papalexakis, Elizabeth Searles, Joyce Ho, Jimeng Sun.

Abstract

PARAFAC2 has demonstrated success in modeling irregular tensors, where the tensor dimensions vary across one of the modes. An example scenario is modeling treatments across a set of patients with a varying number of medical encounters over time. Despite recent improvements to unconstrained PARAFAC2, its model factors are usually dense and sensitive to noise, which limits their interpretability. As a result, the following open challenges remain: a) various modeling constraints, such as temporal smoothness, sparsity, and non-negativity, must be imposed for interpretable temporal modeling, and b) a scalable approach is required to support those constraints efficiently for large datasets. To tackle these challenges, we propose a COnstrained PARAFAC2 (COPA) method, which carefully incorporates optimization constraints such as temporal smoothness, sparsity, and non-negativity in the resulting factors. To efficiently support all those constraints, COPA adopts a hybrid optimization framework using alternating optimization and the alternating direction method of multipliers (AO-ADMM). As evaluated on large electronic health record (EHR) datasets with hundreds of thousands of patients, COPA achieves significant speedups (up to 36× faster) over prior PARAFAC2 approaches that only attempt to handle a subset of the constraints that COPA enables. Overall, our method is faster than all the baselines, each of which attempts to handle only a subset of the constraints, while achieving the same level of accuracy. Through a case study on temporal phenotyping of medically complex children, we demonstrate how the constraints imposed by COPA reveal concise phenotypes and meaningful temporal profiles of patients. The clinical interpretation of both the phenotypes and the temporal profiles was confirmed by a medical expert.
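As a rough illustration of how an ADMM-style solver can enforce constraints such as non-negativity or sparsity, the sketch below solves a toy scalar least-squares subproblem in plain Python. It is not the COPA implementation: the function names, the scalar problem, the choice of an L1 penalty for sparsity, and the default `rho` and `iterations` values are all assumptions made for illustration.

```python
def project_nonnegative(values):
    """Proximal operator of the non-negativity constraint: clip entries at zero."""
    return [max(0.0, v) for v in values]


def soft_threshold(values, threshold):
    """Proximal operator of threshold * L1 norm: shrink each entry toward zero."""
    return [
        (abs(v) - threshold) * (1.0 if v > 0 else -1.0) if abs(v) > threshold else 0.0
        for v in values
    ]


def admm_scalar_least_squares(w, y, prox, rho=1.0, iterations=200):
    """Minimize 0.5 * sum((w_i * h - y_i)^2) + r(h) over a scalar h using ADMM.

    `prox` is the proximal operator of r (scaled by 1/rho); it takes and
    returns a list, so it is called here on a one-element list.
    """
    wty = sum(wi * yi for wi, yi in zip(w, y))
    wtw = sum(wi * wi for wi in w)
    h = z = u = 0.0
    for _ in range(iterations):
        h = (wty + rho * (z - u)) / (wtw + rho)  # regularized least-squares step
        z = prox([h + u])[0]                     # constraint (proximal) step
        u += h - z                               # scaled dual update
    return z


# Hypothetical data: the unconstrained minimizer is -1, so the
# non-negative solution sits on the boundary at 0.
clipped = admm_scalar_least_squares([1.0, 2.0], [-1.0, -2.0], project_nonnegative)
# Here the unconstrained minimizer is already feasible (h = 1).
feasible = admm_scalar_least_squares([1.0, 2.0], [1.0, 2.0], project_nonnegative)
```

Swapping the `prox` argument is what lets one solver skeleton cover different constraints; for an L1 penalty of weight `lam`, the corresponding call would pass `lambda v: soft_threshold(v, lam / rho)`.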

Keywords:  Computational Phenotyping; Tensor Factorization; Unsupervised Learning

Year:  2018        PMID: 32905548      PMCID: PMC7472553          DOI: 10.1145/3269206.3271775

Source DB:  PubMed          Journal:  Proc ACM Int Conf Inf Knowl Manag        ISSN: 2155-0751


  4 in total

1.  The International Classification of Diseases: ninth revision (ICD-9)

Authors:  V N Slee
Journal:  Ann Intern Med       Date:  1978-03       Impact factor: 25.391

2.  Rubik: Knowledge Guided Tensor Factorization and Completion for Health Data Analytics.

Authors:  Yichen Wang; Robert Chen; Joydeep Ghosh; Joshua C Denny; Abel Kho; You Chen; Bradley A Malin; Jimeng Sun
Journal:  KDD       Date:  2015-08

3.  Estimating latent trends in multivariate longitudinal data via Parafac2 with functional and structural constraints.

Authors:  Nathaniel E Helwig
Journal:  Biom J       Date:  2016-12-26       Impact factor: 2.207

4.  Limestone: high-throughput candidate phenotype generation via tensor factorization.

Authors:  Joyce C Ho; Joydeep Ghosh; Steve R Steinhubl; Walter F Stewart; Joshua C Denny; Bradley A Malin; Jimeng Sun
Journal:  J Biomed Inform       Date:  2014-07-16       Impact factor: 6.317

  2 in total

1.  LogPar: Logistic PARAFAC2 Factorization for Temporal Binary Data with Missing Values.

Authors:  Kejing Yin; Ardavan Afshar; Joyce C Ho; William K Cheung; Chao Zhang; Jimeng Sun
Journal:  KDD       Date:  2020-08

2.  Tracing Evolving Networks Using Tensor Factorizations vs. ICA-Based Approaches.

Authors:  Evrim Acar; Marie Roald; Khondoker M Hossain; Vince D Calhoun; Tülay Adali
Journal:  Front Neurosci       Date:  2022-04-25       Impact factor: 5.152

