An improved approximation to the estimation of the critical F values in best subset regression.

David W Salt, Subhash Ajmani, Ray Crichton, David J Livingstone.

Abstract

Variable selection methods are routinely applied in regression modeling to identify a small number of descriptors that "best" explain the variation in the response variable. Most statistical packages that perform regression have some form of stepping algorithm that can be used in this identification process. Unfortunately, when a subset of p variables measured on a sample of n objects is selected from a set of k (>p) to maximize the squared sample multiple regression coefficient, the significance of the resulting regression is upwardly biased. The extent of this bias is investigated by using Monte Carlo simulation and is presented as an inflation factor which, when multiplied by the usual tabulated F ratio, gives an estimate of the true 5% critical value. The results show that selection bias can be very high even for moderate-size data sets. Selecting three variables from 50 generated at random with 20 observations will almost certainly provide a significant result if the usual tabulated F values are used. An interpolation formula is provided for the calculation of the inflation factor for different combinations of (n, p, k). Four real data sets are examined to illustrate the effect of correlated descriptor variables on the degree of inflation.
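
The Monte Carlo idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or their interpolation formula. It assumes the simplest case, p = 1 (the single best predictor chosen from k), independent standard normal noise for both the response and the descriptors, and a tabulated 5% critical value supplied by the caller. The function names `max_f_single` and `inflation_factor` are hypothetical.

```python
import random


def max_f_single(n, k, rng):
    """Largest F ratio among k pure-noise predictors, each fitted alone (p = 1)."""
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    syy = sum(v * v for v in yc)
    best_r2 = 0.0
    for _ in range(k):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x_mean = sum(x) / n
        xc = [v - x_mean for v in x]
        sxx = sum(v * v for v in xc)
        sxy = sum(a * b for a, b in zip(xc, yc))
        r2 = sxy * sxy / (sxx * syy)  # squared correlation = R^2 when p = 1
        best_r2 = max(best_r2, r2)
    # F = (R^2 / p) / ((1 - R^2) / (n - p - 1)) with p = 1
    return best_r2 * (n - 2) / (1.0 - best_r2)


def inflation_factor(n, k, f_tab, n_sim=1000, seed=0):
    """Return (empirical 5% critical F / tabulated F, false-positive rate at f_tab)."""
    rng = random.Random(seed)
    f_max = sorted(max_f_single(n, k, rng) for _ in range(n_sim))
    f_crit = f_max[int(0.95 * n_sim)]  # empirical 95th percentile of the selected F
    false_pos = sum(f > f_tab for f in f_max) / n_sim
    return f_crit / f_tab, false_pos


if __name__ == "__main__":
    # 4.41 is the usual tabulated 5% value of F(1, 18), i.e. n = 20 and p = 1.
    factor, rate = inflation_factor(n=20, k=50, f_tab=4.41)
    print(f"inflation factor ~ {factor:.2f}, false-positive rate ~ {rate:.2f}")
```

With k = 1 there is no selection, so the factor should be close to 1 and the false-positive rate close to 5%. Increasing k raises both, which is the bias the abstract describes. Extending the sketch to p > 1 would require an exhaustive or stepwise subset search and a multiple regression fit.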

Year: 2007    PMID: 17238259    DOI: 10.1021/ci060113n

Source DB: PubMed    Journal: J Chem Inf Model    ISSN: 1549-9596    Impact factor: 4.956


  2 in total

1.  DFT-based QSAR study of alkanols and alkanthiols using the conductor-like polarizable continuum model (CPCM).

Authors:  Khaled Azizi; Mohammad Ali Safarpour; Maryam Keykhaee; Ahmad Reza Mehdipour
Journal:  J Mol Model       Date:  2009-05-22       Impact factor: 1.810

2.  Towards Higher Oil Yield and Quality of Essential Oil Extracted from Aquilaria malaccensis Wood via the Subcritical Technique.

Authors:  M Samadi; Z Zainal Abidin; H Yoshida; R Yunus; D R Awang Biak
Journal:  Molecules       Date:  2020-08-26       Impact factor: 4.411
