Rezzy Eko Caraka (1,2), Rung-Ching Chen (3), Su-Wen Huang (4,5), Shyue-Yow Chiou (6), Prana Ugiana Gio (7), Bens Pardamean (8,9)
1. Executive Secretariat, National Research and Innovation Agency (BRIN), DKI Jakarta, 10340, Indonesia.
2. Department of Information Management, College of Informatics, Chaoyang University of Technology, Taichung City, 41349, Taiwan.
3. Department of Information Management, College of Informatics, Chaoyang University of Technology, Taichung City, 41349, Taiwan. crching@cyut.edu.tw.
4. Department of Information Management, College of Informatics, Chaoyang University of Technology, Taichung City, 41349, Taiwan. dale33663366@vghtc.gov.tw.
5. Taichung Veterans General Hospital, Taichung City, 40705, Taiwan. dale33663366@vghtc.gov.tw.
6. Taichung Veterans General Hospital, Taichung City, 40705, Taiwan.
7. Department of Mathematics, Universitas Sumatera Utara, Medan, 20155, Indonesia.
8. Bioinformatics and Data Science Research Center, Bina Nusantara University, DKI Jakarta, 11480, Indonesia.
9. Computer Science Department, Bina Nusantara University, DKI Jakarta, 11480, Indonesia.
Abstract
BACKGROUND: In heart data mining and machine learning, dimension reduction is needed to remove multicollinearity. It has also been shown to improve the interpretability of model parameters, and it can speed up computation on high-dimensional data. METHODS: In this paper, we perform high-dimensional ordination of event counts in hospital intensive care for the Emergency Department (ED 1), First Intensive Care Unit (ICU1), Second Intensive Care Unit (ICU2), Respiratory Care Intensive Care Unit (RICU), Surgical Intensive Care Unit (SICU), Subacute Respiratory Care Unit (RCC), Trauma and Neurosurgery Intensive Care Unit (TNCU), and Neonatal Intensive Care Unit (NICU), using Generalized Linear Latent Variable Models (GLLVMs). RESULTS: We measure the performance and computing time of GLLVMs fitted by variational approximation (VA) and Laplace approximation (LA), and compare different response distributions: Negative Binomial, Poisson, Gaussian, zero-inflated Poisson (ZIP), and Tweedie. GLLVMs, which extend Generalized Linear Models (GLMs) with latent variables, have fast computing time. The major challenge in latent variable modelling is that the function [Formula: see text] is not trivial to solve, since the marginal likelihood involves integration over the latent variable u. CONCLUSIONS: GLLVMs give the best performance, reaching a variance of 98% compared with other methods. The best model combines the negative binomial distribution with variational approximation, which gives the best fit according to AIC, AICc, and BIC. Our best models are GLLVM-VA Negative Binomial with AIC 7144.07 and GLLVM-LA Negative Binomial with AIC 6955.922.
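The abstract ranks candidate models by AIC, AICc, and BIC. As a minimal sketch of how these criteria are computed from a fitted model's maximized log-likelihood, the Python function below uses the standard textbook definitions. The function name `information_criteria` and the input values in the usage lines are hypothetical and are not taken from the paper; the authors' software may also count observations differently (for example, rows times response columns), which changes AICc and BIC.

```python
import math


def information_criteria(log_lik, n_params, n_obs):
    """Return (AIC, AICc, BIC) from a maximized log-likelihood.

    Standard definitions (lower is better for all three):
      AIC  = 2k - 2*logL
      AICc = AIC + 2k(k + 1) / (n - k - 1)
      BIC  = k*ln(n) - 2*logL
    where k is the number of estimated parameters and n the sample size.
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc is undefined unless n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_lik
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    bic = n_params * math.log(n_obs) - 2 * log_lik
    return aic, aicc, bic


# Hypothetical inputs for illustration only (not values from the paper).
aic, aicc, bic = information_criteria(log_lik=-100.0, n_params=5, n_obs=50)
print(f"AIC={aic:.3f}  AICc={aicc:.3f}  BIC={bic:.3f}")
```

When several distributions are fitted to the same data, the candidate with the lowest value of a given criterion is preferred under that criterion.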