Marta Avalos, Hélène Pouyes, Yves Grandvalet, Ludivine Orriols, Emmanuel Lagarde.
Abstract
This paper considers the problem of estimation and variable selection for large high-dimensional data (a high number of predictors p and a large sample size N, without excluding the possibility that N < p) resulting from an individually matched case-control study. We develop a simple algorithm for adapting the Lasso and related methods to the conditional logistic regression model. Our proposal relies on simplifying the calculations involved in the likelihood function. The proposed algorithm then iteratively solves reweighted Lasso problems using cyclical coordinate descent, computed along a regularization path. This method can handle large problems and deal with sparse features efficiently. We discuss benefits and drawbacks with respect to the existing available implementations. We also illustrate the interest and use of these techniques on a pharmacoepidemiological study of medication use and traffic safety.
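As an illustration of the ideas in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes 1:1 matching, in which case the conditional likelihood reduces to an intercept-free logistic likelihood on the case-minus-control covariate differences with every response equal to 1. Each outer step builds a reweighted least-squares problem, which is solved with cyclical coordinate descent and soft-thresholding, and solutions are warm-started along a decreasing grid of λ values. The function names `fit_pairs` and `lasso_path`, the λ grid, and the tolerances are hypothetical choices made for this example.

```python
import numpy as np


def soft_threshold(z, gamma):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(z) * max(abs(z) - gamma, 0.0)


def fit_pairs(d, lam, beta0=None, n_outer=50, n_inner=200, tol=1e-8):
    """Lasso-penalized conditional logistic fit for 1:1 matched pairs.

    d    : (n, p) array of case-minus-control covariate differences
    lam  : Lasso penalty weight on the averaged negative log-likelihood
    """
    n, p = d.shape
    beta = np.zeros(p) if beta0 is None else beta0.copy()
    for _ in range(n_outer):
        # Quadratic approximation of the log-likelihood at the current beta.
        eta = d @ beta
        prob = 1.0 / (1.0 + np.exp(-eta))
        w = np.clip(prob * (1.0 - prob), 1e-5, None)  # working weights
        z = eta + (1.0 - prob) / w                    # working response (y = 1)
        beta_old = beta.copy()
        r = z - d @ beta                              # current residuals
        # Cyclical coordinate descent on the reweighted Lasso problem.
        for _ in range(n_inner):
            max_change = 0.0
            for j in range(p):
                dj = d[:, j]
                denom = np.sum(w * dj * dj) / n
                if denom == 0.0:
                    continue  # feature never differs within a pair
                rho = np.sum(w * dj * (r + dj * beta[j])) / n
                bj = soft_threshold(rho, lam) / denom
                if bj != beta[j]:
                    r -= dj * (bj - beta[j])
                    max_change = max(max_change, abs(bj - beta[j]))
                    beta[j] = bj
            if max_change < tol:
                break
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta


def lasso_path(x_case, x_ctrl, n_lambda=20, ratio=0.01):
    """Fit along a decreasing lambda grid, warm-starting each fit."""
    d = x_case - x_ctrl
    # At beta = 0 the gradient is 0.5 * mean(d), which gives the smallest
    # lambda for which all coefficients are zero.
    lam_max = 0.5 * np.max(np.abs(d.mean(axis=0)))
    lambdas = lam_max * np.logspace(0, np.log10(ratio), n_lambda)
    beta = np.zeros(d.shape[1])
    path = []
    for lam in lambdas:
        beta = fit_pairs(d, lam, beta0=beta)
        path.append(beta.copy())
    return lambdas, np.array(path)
```

Calling `lasso_path(x_case, x_ctrl)` with two arrays of shape (number of pairs, p) returns the λ grid and one coefficient vector per λ. Exponentiating a coefficient gives the corresponding odds ratio. In this sketch, 1:M matching would require the general conditional likelihood rather than the pair-difference shortcut.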
Year: 2015 PMID: 25916593 PMCID: PMC4416185 DOI: 10.1186/1471-2105-16-S6-S1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Main publicly available R packages that solve the Lasso and other sparse penalties for the Cox, logistic or conditional logistic models (surveyed October 1st, 2014). The columns indicate what each package is amenable to processing.
| Package | 1:1 matching? | 1:M matching? | with grouping penalties | with large | K:M matching? |
|---|---|---|---|---|---|
| glmpath | NO | NO | NO | NO | NO |
| penalized | YES | NO | NO | NO | NO |
| glmnet | NO | NO | NO | NO | NO |
| glmpath | NO | NO | NO | NO | NO |
| penalized | YES | YES | NO | NO | NO |
| glmnet | NO | NO | NO | NO | NO |
| pclogit | YES | YES | YES | NO | NO |
| clogitL1 | YES | YES | NO | NO | YES |
| clogitLasso | YES | YES | NO | YES | NO |
Figure 1. Flowchart of the inclusion procedure.
Figure 2. Individually matched case-control study. The Lasso regularization path as a function of λ for the paired case-control design. The black vertical line indicates the λ value that optimizes the cross-validation criterion. Coefficient values of the selected drugs are shown in black; the others are in gray. Only drugs estimated to be risk factors (positive coefficient estimates) are displayed. Potential confounders (sex, age, chronic disease), which are forced into the model, are omitted.
Odds ratio (OR) by study design.
| ATC class second level | ATC class fourth level | Case-crossover | Matched case-control |
|---|---|---|---|
| Drugs for acid related disorders | A02BA | 1.88 | |
| | A02BX | 1.19 | |
| Drugs for functional gastrointestinal disorders | A03FA | 1.24 | |
| Laxatives | A06AD | 1.37 | |
| Mineral supplements | A12CC | 1.10 | |
| | A12AX | 1.57 | |
| Antianemic preparations | B03AA | 1.20 | |
| | B03BB | 1.24 | |
| Peripheral vasodilators | C04AX | 1.15 | |
| Antifungals for dermatological use | D01AE | 1.13 | |
| Corticosteroids | D07AB | 1.16 | |
| Sex hormones and modulators of the genital system | G03CA | 1.20 | |
| Muscle relaxants | M03BX | 1.23 | |
| Analgesics | N02BG | 1.09 | |
| Antiepileptics | N03AA | 2.93 | |
| | N03AF | 1.34 | 2.11 |
| | N03AX | 1.19 | |
| Psycholeptics | N05BA | 1.11 | |
| | N05CD | 1.37 | 1.09 |
| | N05CX | 1.01 | 1.46 |
| Psychoanaleptics | N06AB | 1.06 | |
| | N06AX | 1.05 | 1.11 |
| Drugs for obstructive airway diseases | R03BB | 1.23 | |
| Cough and cold preparations | R05DA | 1.08 | |
| Antihistamines for systemic use | R06AX | 1.06 | |
Odds ratio estimates are displayed only for selected risk factors.