An Extended DEIM Algorithm for Subset Selection and Class Identification.

Emily P Hendryx, Béatrice M Rivière, Craig G Rusin.

Abstract

The discrete empirical interpolation method (DEIM) has been shown to be a viable index-selection technique for identifying representative subsets in data. Although DEIM has gained some popularity for reducing the dimensionality of physical models involving differential equations, its use in subset- and pattern-identification tasks is not yet broadly known within the machine learning community. While it has much to offer as is, the DEIM algorithm is limited in that the number of selected indices cannot exceed the rank of the corresponding data matrix. Although this is not an issue for many data sets, there are cases in which the number of classes represented in a given data set is greater than the rank of the data matrix; in such cases, it is impossible for the standard DEIM algorithm to identify all classes. To overcome this issue, we present a novel extension of DEIM, called E-DEIM. With the proposed algorithm, we also provide some theoretical results for using extensions of DEIM to form the CUR matrix factorization, in which both rows and columns are identified to approximate the original data matrix. Results from applying variations of E-DEIM to two different data sets indicate that the presented extension can indeed allow for the identification of additional classes along with those selected by standard DEIM. In addition, comparing these results to those of some more familiar methods demonstrates that the proposed deterministic E-DEIM approach including coherence performs comparably to or better than the other evaluated methods and should be considered in future class-identification tasks.
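To make the rank limitation concrete, the following is a minimal sketch of standard DEIM index selection, not of E-DEIM, whose details the abstract does not give. It assumes the basis is supplied as a list of rows with linearly independent columns (in practice, typically the leading singular vectors of the data matrix, computed elsewhere). The function name `deim_indices` and the toy basis are illustrative assumptions. The residual is computed by successive elimination against earlier residuals, which should give the same result as solving the usual interpolation system at each step.

```python
def deim_indices(basis):
    """Greedy DEIM-style index selection (illustrative sketch).

    basis: list of n rows, each with k entries; the k columns are assumed
    to be linearly independent (e.g. leading singular vectors).
    Returns one selected row index per column, so at most k indices.
    """
    n = len(basis)
    k = len(basis[0])
    indices = []    # selected row indices, in selection order
    residuals = []  # residual vector produced at each earlier step
    for j in range(k):
        # Start from column j of the basis.
        r = [basis[i][j] for i in range(n)]
        # Zero out the entries at already selected indices. Each earlier
        # residual vanishes at the indices chosen before it, so the
        # entries zeroed so far stay zero.
        for p, prev in zip(indices, residuals):
            scale = r[p] / prev[p]
            r = [ri - scale * pi for ri, pi in zip(r, prev)]
        # The next index is where the residual is largest in magnitude.
        p_new = max(range(n), key=lambda i: abs(r[i]))
        indices.append(p_new)
        residuals.append(r)
    return indices


# Hypothetical 4 x 2 basis: two columns, so exactly two indices are selected.
toy_basis = [
    [1.0, 3.0],
    [4.0, 2.0],
    [2.0, -1.0],
    [2.0, 1.0],
]
print(deim_indices(toy_basis))  # [1, 0]
```

Because the loop runs once per basis column, the sketch can never return more indices than there are columns, which is the limitation that motivates the proposed extension.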

Keywords: class identification; discrete empirical interpolation method; low rank data; subset selection

Year: 2021; PMID: 34149160; PMCID: PMC8211103; DOI: 10.1007/s10994-021-05954-3

Source DB: PubMed; Journal: Mach Learn; ISSN: 0885-6125; Impact factor: 2.940


