Differential privacy-based evaporative cooling feature selection and classification with relief-F and random forests.

Trang T Le¹, W Kyle Simmons²,³, Masaya Misaki², Jerzy Bodurka²,⁴, Bill C White⁵, Jonathan Savitz²,³, Brett A McKinney¹,⁵.

Abstract

MOTIVATION: Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations (p ≫ n), these differential privacy methods are susceptible to overfitting.
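To illustrate why feature selection has to sit inside cross-validation, the sketch below compares two accuracy estimates on pure-noise data with p ≫ n: one where features are chosen once from all samples (leaky) and one where they are re-chosen within each training fold. Everything here is an assumption made for the example, not part of the paper: the class-mean-gap ranking, the nearest-centroid classifier, the helper names, and the sample sizes.

```python
import random

def class_mean_gap(X, y, j):
    """Absolute difference between the two class means of feature j."""
    pos = [row[j] for row, label in zip(X, y) if label == 1]
    neg = [row[j] for row, label in zip(X, y) if label == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def select_top_k(X, y, k):
    """Indices of the k features with the largest class-mean gap."""
    ranked = sorted(range(len(X[0])), key=lambda j: class_mean_gap(X, y, j), reverse=True)
    return ranked[:k]

def predict(X_train, y_train, x, features):
    """Nearest-centroid prediction restricted to the selected features."""
    def centroid(label):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        return [sum(r[j] for r in rows) / len(rows) for j in features]
    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(features, c))
    return 1 if dist(centroid(1)) < dist(centroid(0)) else 0

def cv_accuracy(X, y, k=10, n_folds=5, select_inside=True):
    """k-fold accuracy; select_inside=False reuses features chosen from ALL samples."""
    n = len(X)
    leaked = select_top_k(X, y, k)  # has seen the labels of every test fold
    correct = 0
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        features = select_top_k(X_tr, y_tr, k) if select_inside else leaked
        correct += sum(predict(X_tr, y_tr, X[i], features) == y[i] for i in test_idx)
    return correct / n

# Pure noise: the labels carry no signal, so an honest estimate should sit near 0.5.
rng = random.Random(0)
n, p = 40, 500
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [i % 2 for i in range(n)]
print("selection outside CV:", cv_accuracy(X, y, select_inside=False))
print("selection inside CV: ", cv_accuracy(X, y, select_inside=True))
```

On data like this the leaky estimate tends to be optimistic, because the selected features were picked using the labels of the samples later used for testing.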
METHODS: We introduce private Evaporative Cooling, a stochastic privacy-preserving machine learning algorithm that uses Relief-F for feature selection and random forest for privacy-preserving classification, and that also prevents overfitting. We relate the privacy-preserving threshold mechanism to a thermodynamic Maxwell-Boltzmann distribution, where the temperature represents the privacy threshold. We use the thermal statistical physics concept of Evaporative Cooling of atomic gases to perform backward stepwise privacy-preserving feature selection.
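As a rough illustration of a Boltzmann-style evaporation step, the sketch below removes one feature at a time with probability proportional to exp(-score / T) and lowers T after each removal. It is not the authors' privateEC code: the function names, the fixed importance scores (a stand-in for Relief-F output), the geometric cooling schedule, and the default parameters are all assumptions made for this example.

```python
import math
import random

def evaporation_probabilities(scores, temperature):
    """Boltzmann-style removal probabilities: low-importance features are likelier to evaporate."""
    # Shift by the minimum score for numerical stability; the shift cancels on normalisation.
    low = min(scores.values())
    weights = {f: math.exp(-(s - low) / temperature) for f, s in scores.items()}
    total = sum(weights.values())
    return {f: w / total for f, w in weights.items()}

def evaporative_cooling(scores, n_keep, temperature=1.0, cooling=0.9, seed=0):
    """Backward stepwise elimination: evaporate one feature per step until n_keep remain."""
    rng = random.Random(seed)
    remaining = dict(scores)
    while len(remaining) > n_keep:
        probs = evaporation_probabilities(remaining, temperature)
        features = list(probs)
        victim = rng.choices(features, weights=[probs[f] for f in features], k=1)[0]
        del remaining[victim]
        temperature *= cooling  # assumed geometric cooling schedule
    # Return survivors ordered from most to least important.
    return sorted(remaining, key=remaining.get, reverse=True)

# Hypothetical importance scores, for illustration only.
example_scores = {"gene_a": 0.9, "gene_b": 0.5, "gene_c": 0.1}
print(evaporative_cooling(example_scores, n_keep=2))
```

This sketch keeps the scores fixed and adds no privacy noise. At high temperature the removal is close to uniform; as the temperature drops, the lowest-scoring feature is removed almost deterministically.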
RESULTS: On simulated data with main effects and statistical interactions, we compare accuracies on holdout and validation sets for three privacy-preserving methods: the reusable holdout (thresholdout), the reusable holdout with random forest, and private Evaporative Cooling, which uses Relief-F feature selection and random forest classification. In simulations where interactions exist between attributes, private Evaporative Cooling provides higher classification accuracy without overfitting based on an independent validation set. In simulations without interactions, thresholdout with random forest and private Evaporative Cooling give comparable accuracies. We also apply these privacy methods to human brain resting-state fMRI data from a study of major depressive disorder.
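The reusable holdout is usually implemented with a thresholdout-style query: the training estimate is returned unless it disagrees with the holdout estimate by more than a noisy threshold, in which case a noised holdout estimate is returned instead. The sketch below is a simplified version under stated assumptions: the function names, the default threshold and noise scale, and the use of a single Laplace scale for both noise draws are choices made for this example, and the query budget of the full mechanism is omitted.

```python
import random

def laplace(rng, scale):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    if scale == 0:
        return 0.0  # noise-free case, handy for deterministic checks
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def thresholdout(train_value, holdout_value, threshold=0.04, sigma=0.01, rng=None):
    """Answer one query (for example, an accuracy estimate) without exposing the holdout directly."""
    rng = rng or random.Random(0)
    noisy_threshold = threshold + laplace(rng, sigma)
    if abs(train_value - holdout_value) > noisy_threshold:
        # Training estimate looks overfit: fall back to a noised holdout estimate.
        return holdout_value + laplace(rng, sigma)
    # Estimates agree: report the training value and reveal nothing new about the holdout.
    return train_value

# Hypothetical accuracies, for illustration only.
print(thresholdout(train_value=0.90, holdout_value=0.60))
print(thresholdout(train_value=0.62, holdout_value=0.60))
```

With `sigma=0` the function reduces to a plain threshold comparison, which makes the two branches easy to check by hand.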
AVAILABILITY AND IMPLEMENTATION: Code available at http://insilico.utulsa.edu/software/privateEC
CONTACT: brett-mckinney@utulsa.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year: 2017 PMID: 28472232 PMCID: PMC5870708 DOI: 10.1093/bioinformatics/btx298

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

References: 24 in total (10 listed)

1. Evaporative cooling feature selection for genotypic data involving interactions.

Authors: B A McKinney; D M Reif; B C White; J E Crowe; J H Moore
Journal: Bioinformatics Date: 2007-06-22 Impact factor: 6.937

2. Revisiting default mode network function in major depression: evidence for disrupted subsystem connectivity.

Authors: F Sambataro; N D Wolf; M Pennuto; N Vasic; R C Wolf
Journal: Psychol Med Date: 2013-10-31 Impact factor: 7.723

3. Resting state networks in major depressive disorder. [Review]

Authors: Arpan Dutta; Shane McKie; J F William Deakin
Journal: Psychiatry Res Date: 2014-10-13 Impact factor: 3.222

4. Resting-state functional connectivity in major depressive disorder: A review. [Review]

Authors: Peter C Mulders; Philip F van Eijndhoven; Aart H Schene; Christian F Beckmann; Indira Tendolkar
Journal: Neurosci Biobehav Rev Date: 2015-07-30 Impact factor: 8.989

5. Fractionation of social brain circuits in autism spectrum disorders.

Authors: Stephen J Gotts; W Kyle Simmons; Lydia A Milbury; Gregory L Wallace; Robert W Cox; Alex Martin
Journal: Brain Date: 2012-07-11 Impact factor: 13.501

6. Regional homogeneity in depression and its relationship with separate depressive symptom clusters: a resting-state fMRI study.

Authors: Zhijian Yao; Li Wang; Qing Lu; Haiyan Liu; Gaojun Teng
Journal: J Affect Disord Date: 2008-11-12 Impact factor: 4.839

7. Bias in error estimation when using cross-validation for model selection.

Authors: Sudhir Varma; Richard Simon
Journal: BMC Bioinformatics Date: 2006-02-23 Impact factor: 3.169

8. ReliefSeq: a gene-wise adaptive-K nearest-neighbor feature selection tool for finding gene-gene interactions and main effects in mRNA-Seq gene expression data.

Authors: Brett A McKinney; Bill C White; Diane E Grill; Peter W Li; Richard B Kennedy; Gregory A Poland; Ann L Oberg
Journal: PLoS One Date: 2013-12-10 Impact factor: 3.240

9. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays.

Authors: Nils Homer; Szabolcs Szelinger; Margot Redman; David Duggan; Waibhav Tembe; Jill Muehling; John V Pearson; Dietrich A Stephan; Stanley F Nelson; David W Craig
Journal: PLoS Genet Date: 2008-08-29 Impact factor: 5.917

10. Identify changes of brain regional homogeneity in bipolar disorder and unipolar depression using resting-state FMRI.

Authors: Min-Jie Liang; Quan Zhou; Kan-Rong Yang; Xiao-Ling Yang; Jin Fang; Wen-Li Chen; Zheng Huang
Journal: PLoS One Date: 2013-12-04 Impact factor: 3.240

Cited by: 9 in total (5 listed)

1. EpistasisRank and EpistasisKatz: interaction network centrality methods that integrate prior knowledge networks.

Authors: Saeid Parvandeh; Brett A McKinney
Journal: Bioinformatics Date: 2019-07-01 Impact factor: 6.937

2. Consensus features nested cross-validation.

Authors: Saeid Parvandeh; Hung-Wen Yeh; Martin P Paulus; Brett A McKinney
Journal: Bioinformatics Date: 2020-05-01 Impact factor: 6.937

3. Theoretical properties of distance distributions and novel metrics for nearest-neighbor feature selection.

Authors: Bryan A Dawkins; Trang T Le; Brett A McKinney
Journal: PLoS One Date: 2021-02-08 Impact factor: 3.240

4. Relief-based feature selection: Introduction and review. [Review]

Authors: Ryan J Urbanowicz; Melissa Meeker; William La Cava; Randal S Olson; Jason H Moore
Journal: J Biomed Inform Date: 2018-07-18 Impact factor: 6.317

5. The role of systems biology approaches in determining molecular signatures for the development of more effective vaccines. [Review]

Authors: Abdulmohammad Pezeshki; Inna G Ovsyannikova; Brett A McKinney; Gregory A Poland; Richard B Kennedy
Journal: Expert Rev Vaccines Date: 2019-02-11 Impact factor: 5.217