Locally centred Mahalanobis distance: a new distance measure with salient features towards outlier detection.

Roberto Todeschini, Davide Ballabio, Viviana Consonni, Faizan Sahigara, Peter Filzmoser.

Abstract

Outlier detection is a prerequisite for identifying aberrant samples in a given set of data. Identifying such samples is particularly important in multivariate data analysis, where increasing dimensionality can easily hinder data exploration and outliers often go undetected. This paper introduces a novel Mahalanobis distance measure (strictly, a pseudo-distance) termed the locally centred Mahalanobis distance, derived by centring the covariance matrix at each data sample rather than at the data centroid as in the classical covariance matrix. Two parameters, called Remoteness and Isolation degree, were derived from the resulting pairwise distance matrix; their salient features facilitated better identification of atypical samples isolated from the rest of the data, reflecting their potential application to outlier detection. The Isolation degree proved able to detect a new kind of outlier, namely isolated samples within the data domain, making it a useful diagnostic tool for evaluating the reliability of predictions obtained by local models (e.g. k-NN models). To better understand the role of Remoteness and Isolation degree in identifying such aberrant samples, several simulated data sets and published data sets from the literature were considered as case studies, and the results were compared with those obtained using the Euclidean distance and the classical Mahalanobis distance.
Copyright © 2013 Elsevier B.V. All rights reserved.
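
The construction described in the abstract, a covariance matrix centred at each sample instead of at the centroid, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `locally_centred_mahalanobis`, the normalisation by n - 1, and the use of a pseudo-inverse are assumptions made for illustration, and the abstract does not give the formulas for Remoteness or Isolation degree, so they are not reproduced here.

```python
import numpy as np

def locally_centred_mahalanobis(X):
    """Pairwise distances with the covariance matrix centred at each sample.

    D[i, j] is the distance from sample i to sample j, computed with a
    covariance matrix built from deviations around sample i (not the centroid).
    Hypothetical helper: normalisation and pseudo-inverse are assumptions.
    """
    X = np.asarray(X, dtype=float)
    n, _ = X.shape
    D = np.zeros((n, n))
    for i in range(n):
        diff = X - X[i]                    # deviations of all samples from sample i
        S_i = diff.T @ diff / (n - 1)      # covariance centred at sample i (assumed n - 1)
        S_inv = np.linalg.pinv(S_i)        # pseudo-inverse tolerates a singular S_i
        q = np.einsum("kp,pq,kq->k", diff, S_inv, diff)  # quadratic form per sample
        D[i] = np.sqrt(np.maximum(q, 0.0)) # clip tiny negative rounding errors
    return D

# Example on synthetic data (not from the paper)
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
D = locally_centred_mahalanobis(X)
```

Because each row uses its own covariance matrix, `D[i, j]` generally differs from `D[j, i]` in this sketch, which is consistent with the abstract describing the measure as a pseudo-distance.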

Keywords:  Covariance matrix; Data mining; Isolation degree; Mahalanobis distance; Outlier detection; Remoteness; Similarity

Year:  2013        PMID: 23830416     DOI: 10.1016/j.aca.2013.04.034

Source DB:  PubMed          Journal:  Anal Chim Acta        ISSN: 0003-2670            Impact factor:   6.558


  9 in total

1.  Molecular Scaffold Hopping via Holistic Molecular Representation.

Authors:  Francesca Grisoni; Gisbert Schneider
Journal:  Methods Mol Biol       Date:  2021

2.  Locally adaptive decision in detection of clustered microcalcifications in mammograms.

Authors:  María V Sainz de Cea; Robert M Nishikawa; Yongyi Yang
Journal:  Phys Med Biol       Date:  2018-02-15       Impact factor: 3.609

3.  Increased vascular α1-adrenergic receptor sensitivity in older adults with posttraumatic stress disorder.

Authors:  Cortnie L Hartwig; Justin D Sprick; Jinhee Jeong; Yingtian Hu; Doree G Morison; C Michael Stein; Sachin Paranjape; Jeanie Park
Journal:  Am J Physiol Regul Integr Comp Physiol       Date:  2020-09-23       Impact factor: 3.619

4.  Defining a novel k-nearest neighbours approach to assess the applicability domain of a QSAR model for reliable predictions.

Authors:  Faizan Sahigara; Davide Ballabio; Roberto Todeschini; Viviana Consonni
Journal:  J Cheminform       Date:  2013-05-30       Impact factor: 5.514

5.  Supervised extensions of chemography approaches: case studies of chemical liabilities assessment.

Authors:  Svetlana I Ovchinnikova; Arseniy A Bykov; Aslan Yu Tsivadze; Evgeny P Dyachkov; Natalia V Kireeva
Journal:  J Cheminform       Date:  2014-05-07       Impact factor: 5.514

6.  High speed railway environment safety evaluation based on measurement attribute recognition model.

Authors:  Qizhou Hu; Ningbo Gao; Bing Zhang
Journal:  Comput Intell Neurosci       Date:  2014-11-09

7.  A composite robotic-based measure of upper limb proprioception.

Authors:  Jeffrey M Kenzie; Jennifer A Semrau; Michael D Hill; Stephen H Scott; Sean P Dukelow
Journal:  J Neuroeng Rehabil       Date:  2017-11-13       Impact factor: 4.262

8.  OPERA models for predicting physicochemical properties and environmental fate endpoints.

Authors:  Kamel Mansouri; Chris M Grulke; Richard S Judson; Antony J Williams
Journal:  J Cheminform       Date:  2018-03-08       Impact factor: 5.514

9.  CLOUD: a non-parametric detection test for microbiome outliers.

Authors:  Emmanuel Montassier; Gabriel A Al-Ghalith; Benjamin Hillmann; Kimberly Viskocil; Amanda J Kabage; Christopher E McKinlay; Michael J Sadowsky; Alexander Khoruts; Dan Knights
Journal:  Microbiome       Date:  2018-08-06       Impact factor: 14.650
