W Thomson, S Jabbari, A E Taylor, W Arlt, D J Smith.
Abstract
We introduce a Bayesian prior distribution, the logit-normal continuous analogue of the spike-and-slab (LN-CASS), which enables flexible parameter estimation and variable/model selection in a variety of settings. We demonstrate its use and efficacy in three case studies: a simulation study and two studies on real biological data from the fields of metabolomics and genomics. The prior allows the use of classical statistical models, which are easily interpretable and well known to applied scientists, but performs comparably to common machine learning methods in terms of generalizability to previously unseen data.
Keywords: Bayesian; shrinkage; spike-and-slab; variable selection
Year: 2019 PMID: 30958174 PMCID: PMC6364637 DOI: 10.1098/rsif.2018.0572
Source DB: PubMed Journal: J R Soc Interface ISSN: 1742-5662 Impact factor: 4.118
Figure 1. (a) The logit-normal distribution with μ = 0 and σ given by (blue, orange, green) = (2.5, 5, 50). (b) The LN-CASS priors induced by the logit-normal distributions of (a).
Figure 2. Agreement between ground truth and estimated parameters for the simulation study in the (a) p = 120 case, (b) p = 70 case; (c) performance measures for each method. HS, horseshoe; OLS, ordinary least squares; SGL, sparse group LASSO.
Figure 3. Metabolomics case study. (a) Boxplots of AUCs for each method computed via 16 × 10-fold cross-validation; (b) estimated mean functions f from the LN-CASS hierarchical GAM. Functions have been smoothed for presentation purposes with a LOESS smoother using a small span.
Figure 4. Mean predictions and observed outcomes from the LN-CASS model for the microarray data. Circled points have been identified as potentially mislabelled by [12,18].
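The logit-normal distributions of Figure 1(a) can be illustrated with a short simulation. The following is a minimal sketch, not the authors' code: it assumes the standard definition of a logit-normal variable (the logistic transform of a normal draw), and the function names `sample_logit_normal` and `fraction_near_edges` are hypothetical helpers introduced here. How the paper maps this variable onto the LN-CASS prior for a model parameter is not stated in this record, so the sketch stops at the logit-normal draw itself.

```python
import math
import random


def sample_logit_normal(mu, sigma, n, seed=0):
    """Draw n samples of lam = logistic(z), where z ~ Normal(mu, sigma^2).

    Hypothetical helper for illustration; uses the textbook definition
    of the logit-normal distribution.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z = rng.gauss(mu, sigma)
        # Numerically stable logistic function (avoids overflow in exp)
        if z >= 0:
            lam = 1.0 / (1.0 + math.exp(-z))
        else:
            ez = math.exp(z)
            lam = ez / (1.0 + ez)
        samples.append(lam)
    return samples


def fraction_near_edges(samples, eps=0.05):
    """Share of samples lying within eps of 0 or of 1."""
    return sum(1 for s in samples if s < eps or s > 1.0 - eps) / len(samples)


# The three sigma values are those quoted in the Figure 1 caption (mu = 0).
for sigma in (2.5, 5.0, 50.0):
    draws = sample_logit_normal(mu=0.0, sigma=sigma, n=10000)
    print(sigma, round(fraction_near_edges(draws), 3))
```

As σ grows, a larger share of the draws falls close to 0 or 1, so the distribution becomes increasingly bimodal. This is the sense in which a logit-normal variable with large σ can act as a continuous stand-in for the binary inclusion indicator of a spike-and-slab prior.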