
Synthesizing statistical knowledge from incomplete mixed-mode data.

A K Wong, D K Chiu.

Abstract

The difficulties in analyzing and clustering (synthesizing) multivariate data of the mixed type (discrete and continuous) are largely due to: 1) nonuniform scaling in different coordinates, 2) the lack of order in nominal data, and 3) the lack of a suitable similarity measure. This paper presents a new approach that bypasses these difficulties and can acquire statistical knowledge from incomplete mixed-mode data. The proposed method adopts an event-covering approach, which covers a subset of statistically relevant outcomes in the outcome space of variable-pairs. Once the covered event patterns are acquired, subsequent analysis tasks can be performed, such as probabilistic inference, cluster analysis, and detection of event patterns for each cluster based on the incomplete probability scheme. The method has four phases: 1) the discretization of the continuous components based on a maximum entropy criterion, so that the data can be treated as n-tuples of discrete-valued features; 2) the estimation of the missing values using a newly developed inference procedure; 3) the initial formation of clusters by analyzing the nearest-neighbor distance on subsets of selected samples; and 4) the reclassification of the n-tuples into more reliable clusters based on the detected interdependence relationships. For performance evaluation, experiments have been conducted using both simulated and real-life data.
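
Phase 1 of the abstract can be illustrated with a small sketch. It assumes that the maximum entropy criterion is realized by equal-frequency binning, which maximizes the Shannon entropy of the discretized variable because each interval then holds about the same number of samples. The abstract does not specify the authors' exact procedure, so the function names (`max_entropy_cuts`, `discretize`, `entropy_bits`), the use of `None` for missing values, and the choice of `k` are all hypothetical.

```python
import math
from collections import Counter


def max_entropy_cuts(values, k):
    """Return k-1 cut points giving (nearly) equal-frequency intervals.

    Assumption: missing values are encoded as None and are ignored
    when the cut points are chosen.
    """
    observed = sorted(v for v in values if v is not None)
    n = len(observed)
    # The i-th cut is the sample at rank i*n/k in the sorted data.
    return [observed[(i * n) // k] for i in range(1, k)]


def discretize(value, cuts):
    """Map a continuous value to an interval index; keep None as None."""
    if value is None:
        return None
    # Count how many cut points the value reaches or exceeds.
    return sum(value >= c for c in cuts)


def entropy_bits(labels):
    """Shannon entropy (in bits) of the observed, non-missing labels."""
    counts = Counter(x for x in labels if x is not None)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

With `k` intervals the entropy of the discretized feature is at most `log2(k)` bits, and equal-frequency cuts reach that bound when the sample count divides evenly and there are no ties at the cut points. Missing entries stay as `None` here so that a later step (phase 2 in the abstract) could estimate them.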

Year:  1987        PMID: 21869441     DOI: 10.1109/tpami.1987.4767986

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0098-5589            Impact factor:   6.226


  5 in total

1.  Supervised mutual-information based feature selection for motor unit action potential classification.

Authors:  N Sheikholeslami; D Stashuk
Journal:  Med Biol Eng Comput       Date:  1997-11       Impact factor: 2.602

2.  Individualized markers optimize class prediction of microarray data.

Authors:  Pavlos Pavlidis; Panayiota Poirazi
Journal:  BMC Bioinformatics       Date:  2006-07-14       Impact factor: 3.169

3.  Characterization of System Status Signals for Multivariate Time Series Discretization Based on Frequency and Amplitude Variation.

Authors:  Woonsang Baek; Sujeong Baek; Duck Young Kim
Journal:  Sensors (Basel)       Date:  2018-01-08       Impact factor: 3.576

4.  Merging of Numerical Intervals in Entropy-Based Discretization.

Authors:  Jerzy W Grzymala-Busse; Teresa Mroczek
Journal:  Entropy (Basel)       Date:  2018-11-16       Impact factor: 2.524

5.  Applying sequential pattern mining to investigate cerebrovascular health outpatients' re-visit patterns.

Authors:  Chao Ou-Yang; Chandrawati Putri Wulandari; Rizka Aisha Rahmi Hariadi; Han-Cheng Wang; Chiehfeng Chen
Journal:  PeerJ       Date:  2018-07-09       Impact factor: 2.984

