Study on the impact of partition-induced dataset shift on k-fold cross-validation.

Jose García Moreno-Torres, José A Saez, Francisco Herrera.   

Abstract

Cross-validation is a very commonly employed technique used to evaluate classifier performance. However, it can potentially introduce dataset shift, a harmful factor that is often not taken into account and can result in inaccurate performance estimation. This paper analyzes the prevalence and impact of partition-induced covariate shift on different k-fold cross-validation schemes. From the experimental results obtained, we conclude that the degree of partition-induced covariate shift depends on the cross-validation scheme considered. In this way, worse schemes may harm the correctness of a single-classifier performance estimation and also increase the needed number of repetitions of cross-validation to reach a stable performance estimation.
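To make the idea of partition-induced covariate shift concrete, the following minimal sketch compares two ways of assigning examples to folds and measures, for each fold, the gap between the mean of a single feature in the test fold and in the remaining training data. It is an illustration under stated assumptions, not the procedure used in the paper: the abstract does not name the cross-validation schemes studied, the "snake" assignment below is a hypothetical balancing heuristic, and the mean gap is only a crude stand-in for a proper shift measure. All function names are invented for this example.

```python
from statistics import mean


def contiguous_folds(n, k):
    """Split indices 0..n-1 into k consecutive blocks (unshuffled k-fold)."""
    size, extra = divmod(n, k)
    folds, start = [], 0
    for j in range(k):
        stop = start + size + (1 if j < extra else 0)
        folds.append(list(range(start, stop)))
        start = stop
    return folds


def snake_folds(values, k):
    """Hypothetical balancing heuristic (an assumption, not from the paper):
    sort by feature value, then deal indices to folds back and forth
    (0..k-1, then k-1..0, and so on)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    folds = [[] for _ in range(k)]
    for pos, idx in enumerate(order):
        rnd, offset = divmod(pos, k)
        j = offset if rnd % 2 == 0 else k - 1 - offset
        folds[j].append(idx)
    return folds


def mean_shift_per_fold(values, folds):
    """Absolute gap between the test-fold mean and the training mean."""
    shifts = []
    for test in folds:
        held_out = set(test)
        train = [v for i, v in enumerate(values) if i not in held_out]
        shifts.append(abs(mean(values[i] for i in test) - mean(train)))
    return shifts


# Toy one-dimensional feature that happens to be stored in sorted order.
feature = list(range(20))
contiguous = mean_shift_per_fold(feature, contiguous_folds(len(feature), 2))
balanced = mean_shift_per_fold(feature, snake_folds(feature, 2))
print("contiguous:", contiguous)
print("balanced:  ", balanced)
```

On this toy input the contiguous split puts all small values in one fold and all large values in the other, so each fold shows a train/test mean gap of 10, while the balanced assignment gives a gap of 0 in both folds. Real data would need a multivariate shift measure and a scheme that also respects class labels.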

Year:  2012        PMID: 24807526     DOI: 10.1109/TNNLS.2012.2199516

Source DB:  PubMed          Journal:  IEEE Trans Neural Netw Learn Syst        ISSN: 2162-237X            Impact factor:   10.451


  14 in total

1.  Development of a knowledge mining approach to uncover heterogeneous risk predictors of acute kidney injury across age groups.

Authors:  Lijuan Wu; Yong Hu; Xiangzhou Zhang; Jia Zhang; Mei Liu
Journal:  Int J Med Inform       Date:  2021-12-09       Impact factor: 4.730

2.  Highly polygenic architecture of antidepressant treatment response: Comparative analysis of SSRI and NRI treatment in an animal model of depression.

Authors:  Karim Malki; Maria Grazia Tosto; Héctor Mouriño-Talín; Sabela Rodríguez-Lorenzo; Oliver Pain; Irfan Jumhaboy; Tina Liu; Panos Parpas; Stuart Newman; Artem Malykh; Lucia Carboni; Rudolf Uher; Peter McGuffin; Leonard C Schalkwyk; Kevin Bryson; Mark Herbster
Journal:  Am J Med Genet B Neuropsychiatr Genet       Date:  2016-10-01       Impact factor: 3.568

3.  Machine Learning Classification of Inflammatory Bowel Disease in Children Based on a Large Real-World Pediatric Cohort CEDATA-GPGE® Registry.

Authors:  Nicolas Schneider; Keywan Sohrabi; Henning Schneider; Klaus-Peter Zimmer; Patrick Fischer; Jan de Laffolie
Journal:  Front Med (Lausanne)       Date:  2021-05-24

4.  Mobile App to Streamline the Development of Wearable Sensor-Based Exercise Biofeedback Systems: System Development and Evaluation.

Authors:  Martin O'Reilly; Joe Duffin; Tomas Ward; Brian Caulfield
Journal:  JMIR Rehabil Assist Technol       Date:  2017-08-21

5.  Feature-Free Activity Classification of Inertial Sensor Data With Machine Vision Techniques: Method, Development, and Evaluation.

Authors:  Jose Juan Dominguez Veiga; Martin O'Reilly; Darragh Whelan; Brian Caulfield; Tomas E Ward
Journal:  JMIR Mhealth Uhealth       Date:  2017-08-04       Impact factor: 4.773

6.  RGIFE: a ranked guided iterative feature elimination heuristic for the identification of biomarkers.

Authors:  Nicola Lazzarini; Jaume Bacardit
Journal:  BMC Bioinformatics       Date:  2017-06-30       Impact factor: 3.169

7.  Assessment of autoregressive integrated moving average (ARIMA), generalized linear autoregressive moving average (GLARMA), and random forest (RF) time series regression models for predicting influenza A virus frequency in swine in Ontario, Canada.

Authors:  Tatiana Petukhova; Davor Ojkic; Beverly McEwen; Rob Deardon; Zvonimir Poljak
Journal:  PLoS One       Date:  2018-06-01       Impact factor: 3.240

8.  Locally Oriented Scene Complexity Analysis Real-Time Ocean Ship Detection from Optical Remote Sensing Images.

Authors:  Yin Zhuang; Baogui Qi; He Chen; Fukun Bi; Lianlin Li; Yizhuang Xie
Journal:  Sensors (Basel)       Date:  2018-11-06       Impact factor: 3.576

9.  Predicting antifreeze proteins with weighted generalized dipeptide composition and multi-regression feature selection ensemble.

Authors:  Shunfang Wang; Lin Deng; Xinnan Xia; Zicheng Cao; Yu Fei
Journal:  BMC Bioinformatics       Date:  2021-06-23       Impact factor: 3.169

10.  A Distributed Parallel Algorithm Based on Low-Rank and Sparse Representation for Anomaly Detection in Hyperspectral Images.

Authors:  Yi Zhang; Zebin Wu; Jin Sun; Yan Zhang; Yaoqin Zhu; Jun Liu; Qitao Zang; Antonio Plaza
Journal:  Sensors (Basel)       Date:  2018-10-25       Impact factor: 3.576
