Effect of separate sampling on classification accuracy.

Mohammad Shahrokh Esfahani, Edward R Dougherty.

Abstract

MOTIVATION: Measurements are commonly taken from two phenotypes to build a classifier, where the number of data points from each class is predetermined, not random. In this 'separate sampling' scenario, the data cannot be used to estimate the class prior probabilities. Moreover, predetermined class sizes can severely degrade classifier performance, even for large samples.
RESULTS: We employ simulations using both synthetic and real data to show the detrimental effect of separate sampling on a variety of classification rules. We establish propositions related to the effect on the expected classifier error owing to a sampling ratio different from the population class ratio. From these we derive a sample-based minimax sampling ratio and provide an algorithm for approximating it from the data. We also extend to arbitrary distributions the classical population-based Anderson linear discriminant analysis minimax sampling ratio derived from the discriminant form of the Bayes classifier.
AVAILABILITY: All the codes for synthetic data and real data examples are written in MATLAB. A function called mmratio, whose output is an approximation of the minimax sampling ratio of a given dataset, is also written in MATLAB. All the codes are available at: http://gsp.tamu.edu/Publications/supplementary/shahrokh13b.
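
The dependence of the error on the sampling ratio can be illustrated with a small population-level sketch. This is not the authors' mmratio function or their sample-based algorithm. It is a toy example that assumes two univariate Gaussian classes with known means and a common variance, and a linear discriminant whose class-0 prior is set to the sampling ratio r (as happens when a separately sampled dataset is treated as if it were randomly sampled). All function names and parameter values are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def class_errors(r, mu0=0.0, mu1=2.0, sigma=1.0):
    """Class-conditional errors of a 1-D linear discriminant.

    The discriminant uses r as the prior of class 0 (assumed: mu1 > mu0,
    0 < r < 1). Points below the threshold t are assigned to class 0.
    """
    t = 0.5 * (mu0 + mu1) + sigma**2 / (mu1 - mu0) * math.log(r / (1.0 - r))
    e0 = 1.0 - normal_cdf((t - mu0) / sigma)  # class 0 labelled as class 1
    e1 = normal_cdf((t - mu1) / sigma)        # class 1 labelled as class 0
    return e0, e1


def true_error(r, c, **params):
    """Error under the true class-0 prior c when the rule was built with r."""
    e0, e1 = class_errors(r, **params)
    return c * e0 + (1.0 - c) * e1


def worst_case_error(r, **params):
    """Maximum of true_error over all c in [0, 1].

    The error is linear in c, so the maximum is the larger of e0 and e1.
    """
    return max(class_errors(r, **params))


def minimax_ratio(grid_size=999, **params):
    """Grid search for the ratio r that minimises the worst-case error."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda r: worst_case_error(r, **params))


if __name__ == "__main__":
    for r in (0.5, 0.9):
        print(r, true_error(r, c=0.5), true_error(r, c=0.9))
    print("minimax ratio:", minimax_ratio())
```

In this symmetric toy setting the two class-conditional errors are equal at r = 0.5, so the grid search settles there. A ratio that matches the true prior gives a lower error for that prior, but a higher error if the true prior is different, which is the trade-off a minimax ratio guards against when the prior is unknown.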

Year:  2013        PMID: 24257187     DOI: 10.1093/bioinformatics/btt662

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

