Trevor Hastie, Robert Tibshirani.
Abstract
Gene expression arrays typically have 50 to 100 samples and 1000 to 20,000 variables (genes). There have been many attempts to adapt statistical models for regression and classification to these data, and in many cases these attempts have challenged the computational resources. In this article we expose a class of techniques based on quadratic regularization of linear models, including regularized (ridge) regression, logistic and multinomial regression, linear and mixture discriminant analysis, the Cox model and neural networks. For all of these models, we show that dramatic computational savings are possible over naive implementations, using standard transformations in numerical linear algebra.
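As an illustration of the kind of saving the abstract describes, the following is a minimal sketch for the ridge regression case only. It assumes a model with no intercept, synthetic Gaussian data, and a thin singular value decomposition as the "standard transformation"; the function names `ridge_naive` and `ridge_svd` are hypothetical and are not taken from the article. With n samples and p genes (p much larger than n), writing X = R V^T with R of size n x n allows the penalized fit to be computed on R and then mapped back, so the linear system solved is n x n rather than p x p.

```python
import numpy as np


def ridge_naive(X, y, lam):
    """Ridge coefficients from the p x p system (X^T X + lam I) beta = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def ridge_svd(X, y, lam):
    """Same coefficients via a reduced n x n problem (illustrative sketch)."""
    # Thin SVD: X = U diag(d) V^T; for p > n, U is n x n and V^T is n x p.
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    R = U * d  # equals U @ diag(d), so that X = R @ Vt
    k = R.shape[1]
    # Ridge fit on the reduced matrix R: a k x k solve instead of p x p.
    theta = np.linalg.solve(R.T @ R + lam * np.eye(k), R.T @ y)
    # Map the reduced solution back to the original p coordinates.
    return Vt.T @ theta


# Synthetic data with the shape typical of expression arrays (assumed values).
rng = np.random.default_rng(0)
n, p = 50, 2000
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
lam = 1.0

beta_fast = ridge_svd(X, y, lam)
beta_slow = ridge_naive(X, y, lam)
max_abs_diff = float(np.max(np.abs(beta_fast - beta_slow)))
```

Under these assumptions the two functions should agree up to floating-point error, while `ridge_svd` only factors and solves matrices whose smaller dimension is n. The other models listed in the abstract are not covered by this sketch.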
Year: 2004 PMID: 15208198 DOI: 10.1093/biostatistics/5.3.329
Source DB: PubMed Journal: Biostatistics ISSN: 1465-4644 Impact factor: 5.899