Christophe Lalanne, Bruno Falissard, Bernard Golse, Laurence Vaivre-Douret.
Abstract
BACKGROUND: With a large number of potentially relevant clinical indicators, penalization and ensemble learning methods are thought to provide better predictive performance than usual linear predictors. However, little is known about how they perform in clinical studies where few cases are available. We used Random Forests and Partial Least Squares Discriminant Analysis to select the most salient impairments in Developmental Coordination Disorder (DCD) and to assess patient similarity.
Year: 2012 PMID: 22834855 PMCID: PMC3464628 DOI: 10.1186/1471-2288-12-107
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Descriptive statistics for the training and validation samples
| Variable | N | Training (n = 46) | Validation (n = 17) | Combined (n = 63) |
|---|---|---|---|---|
| Diagnosis: IM | 63 | 9% ( 4) | 6% ( 1) | 8% ( 5) |
| VSC | | 52% (24) | 53% ( 9) | 52% (33) |
| MX | | 39% (18) | 41% ( 7) | 40% (25) |
| Gender: Male | 63 | 78% (36) | 94% (16) | 83% (52) |
| Age (years) | 63 | 6.8 8.0 9.7 | 6.6 8.7 12.3 | 6.8 8.1 10.4 |
| Term: Yes | 63 | 96% (44) | 88% (15) | 94% (59) |
| FIQ | 62 | 85 98 114 | 92 108 121 | 86 100 115 |
| PIQ | 62 | 73 87 102 | 75 93 107 | 74 90 105 |
| VIQ | 62 | 92 107 122 | 100 119 130 | 92 110 124 |
Three-number summaries are lower quartile, median, and upper quartile.
N is the number of non-missing values.
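The three-number summaries and the N column described in the footnotes above can be reproduced along the following lines. This is a minimal sketch, not the authors' code: the function name `three_number_summary` and the sample ages are hypothetical, and the "inclusive" quantile definition is an assumption, since other quantile conventions give slightly different quartiles.

```python
import statistics


def three_number_summary(values):
    """Return (N, lower quartile, median, upper quartile).

    Missing values are encoded as None and dropped, so N counts
    only the non-missing observations.
    """
    observed = [v for v in values if v is not None]
    # Assumption: "inclusive" quantiles; the source does not say
    # which quantile definition was used.
    q1, q2, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    return len(observed), q1, q2, q3


# Hypothetical ages in years, with one missing value.
ages = [6.5, 7.0, 8.0, 9.0, 10.5, None]
summary = three_number_summary(ages)
```

Here `summary` holds the count of non-missing values followed by the three quartiles, which is the layout used in the rows for Age, FIQ, PIQ and VIQ.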
Figure 1. Conditional frequencies of impairment on all items.
Figure 2. Scree plot of the measures of variable importance in RF.
Class-specific measures of variable importance for RF, PLS-DA and sPLS-DA
| Item | RF | RF | RF | PLS-DA | PLS-DA | PLS-DA | sPLS-DA | sPLS-DA | sPLS-DA |
|---|---|---|---|---|---|---|---|---|---|
| SITA | 0.17 | 0.14 | 0.15 | 0.01 | 0.10 | 0.16 | — | — | — |
| CRAW | 0.17 | 0.12 | 0.18 | 0.04 | 0.06 | 0.19 | — | — | — |
| WALK | 0.11 | 0.18 | 0.10 | 0.03 | 0.06 | 0.18 | — | — | — |
| FISE | 0.12 | 0.20 | 0.12 | 0.04 | 0.18 | 0.17<sup>4,6</sup> | — | — | — |
| OTRH | 0.15 | 0.13 | 0.27 | 0.07 | 0.06 | 0.17<sup>3,6</sup> | — | — | — |
| VISR | 0.31 | 0.13 | 0.09 | 0.20 | 0.24 | 0.19 | — | — | — |
| LEBL | 0.72 | 0.23 | 0.26⋆ | 0.35 | 0.16 | 0.20 | 0.96 | 0.03 | 0.03 — |
| PUZL | 0.64 | 0.16 | 0.10 | 0.36 | 0.21 | 0.18<sup>2</sup> | 0.96 | 0.09 | 0.09 — |
| ARTH | 0.18 | 0.14 | 0.34 | 0.11 | 0.20 | 0.24 | — | — | — |
| READ | 0.16 | 0.16 | 0.09 | 0.13 | 0.22 | 0.21<sup>6</sup> | — | — | — |
| HAWR | 0.31 | 0.20 | 0.16 | 0.02 | 0.04 | 0.12<sup>5</sup> | — | — | — |
| DYGR | 0.17 | 0.20 | 0.12 | 0.02 | 0.02 | 0.04<sup>4</sup> | — | — | — |
| HYPT | 0.08 | 0.24 | 0.14 | 0.01 | 0.31 | 0.32 | — | — | — |
| MOPA | 0.17 | 0.12 | 0.14 | 0.32 | 0.32 | 0.40 | — | — | — |
| SYNK | 0.54 | 0.13 | 0.28 | 0.25 | 0.18 | 0.23 | 0.81 | 0.16 | 0.16 — |
| DYSD | 0.30 | 0.22 | 0.22 | 0.30 | 0.37 | 0.39 | — | — | — |
| STDT | 0.06 | 0.15 | 0.15 | 0.16 | 0.04 | 0.15<sup>3</sup> | — | — | — |
| DIPR | 0.75 | 0.58 | 0.67⋆ | 0.45 | 0.98 | 0.96 | 0.91 | 0.00 | 0.91 — |
| BIDX | 0.09 | 0.16 | 0.25 | 0.07 | 0.42 | 0.39 | — | — | — |
| PRSL | 0.11 | 0.18 | 0.16 | 0.18 | 0.18 | 0.28 | — | — | — |
| IMOG | 0.79 | 0.60 | 0.65⋆ | 0.55 | 0.99 | 1.00 | 1.00 | 0.88 | 0.88 — |
| OROP | 0.36 | 0.22 | 0.28 | 0.03 | 0.27 | 0.33 | — | — | — |
| DRES | 0.28 | 0.23 | 0.19 | 0.21 | 0.09 | 0.06 | — | — | — |
| DIPE | 0.69 | 0.48 | 0.39 | 0.35 | 0.72 | 0.59 | 0.96 | 0.67 | 0.67 — |
| VISP | 0.14 | 0.20 | 0.16 | 0.00 | 0.13 | 0.16 | — | — | — |
| STAB | 0.30 | 0.17 | 0.16 | 0.14 | 0.25 | 0.20 | — | — | — |
| DYNB | 0.18 | 0.20 | 0.20 | 0.11 | 0.25 | 0.47<sup>3</sup> | — | — | — |
| CULL | 0.50 | 0.29 | 0.27⋆ | 0.11 | 0.53 | 0.61 | 0.81 | 0.59 | 0.59 — |
| POSC | 0.08 | 0.12 | 0.07 | 0.11 | 0.04 | 0.21<sup>3,5</sup> | — | — | — |
| HLUL | 0.00 | 0.17 | 0.16 | 0.14 | 0.06 | 0.09 | — | — | — |
| HMLS | 0.09 | 0.15 | 0.19 | 0.08 | 0.10 | 0.06 | — | — | — |
| HULU | 0.11 | 0.19 | 0.17 | 0.09 | 0.04 | 0.04 | — | — | — |
| MAND | 0.56 | 0.31 | 0.41⋆ | 0.10 | 0.57 | 0.66 | 0.81 | 0.64 | 0.64 — |
| BSPI | 0.17 | 0.22 | 0.22 | 0.10 | 0.08 | 0.11 | — | — | — |
| RHYA | 0.13 | 0.10 | 0.17 | 0.13 | 0.26 | 0.30 | — | — | — |
| VIMI | 1.00 | 0.30 | 0.42⋆ | 0.39 | 0.15 | 0.27<sup>2</sup> | 0.00 | 0.00 | 0.00 — |
| VISS | 0.94 | 0.34 | 0.17⋆ | 0.39 | 0.20 | 0.27<sup>2</sup> | 0.99 | 0.06 | 0.00 — |
| VISC | 0.52 | 0.22 | 0.17⋆ | 0.31 | 0.14 | 0.14<sup>2</sup> | 0.87 | 0.09 | 0.09 — |
| EXEF | 0.24 | 0.25 | 0.16 | 0.07 | 0.27 | 0.26 | — | — | — |
| AUDM | 0.10 | 0.15 | 0.18 | 0.19 | 0.05 | 0.18<sup>4</sup> | — | — | — |
| WRKM | 0.24 | 0.14 | 0.17 | 0.23 | 0.10 | 0.12<sup>4</sup> | — | — | — |
| KINM | 0.17 | 0.12 | 0.15 | 0.10 | 0.03 | 0.08<sup>3,6</sup> | — | — | — |
| VISM | 0.38 | 0.24 | 0.18 | 0.34 | 0.14 | 0.08<sup>2</sup> | — | — | — |
| AUDA | 0.17 | 0.24 | 0.15 | 0.17 | 0.27 | 0.34 | — | — | — |
| VISA | 0.24 | 0.16 | 0.19 | 0.16 | 0.21 | 0.23 | — | — | — |
| HYPK | 0.17 | 0.14 | 0.10 | 0.13 | 0.10 | 0.14<sup>5</sup> | — | — | — |
| HORP | 0.08 | 0.26 | 0.14 | 0.09 | 0.23 | 0.29 | — | — | — |
| VERP | 0.11 | 0.20 | 0.19 | 0.09 | 0.16 | 0.21 | — | — | — |
| VEPN | 0.17 | 0.23 | 0.17 | 0.00 | 0.14 | 0.21 | — | — | — |
⋆ denotes a significant measure of variable importance in the PLS case; superscript numbers indicate the PLS component(s) on which a variable was found significant at the 5% level.
Figure 3. Sparse PLS-DA regression coefficients with associated 95% confidence intervals computed using B = 1,000 bootstrap samples.
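As an illustration of how a 95% bootstrap interval with B = 1,000 resamples can be obtained, here is a minimal percentile-bootstrap sketch. It is not the authors' procedure: it bootstraps a simple mean rather than sPLS-DA regression coefficients, the percentile method is an assumption (the caption does not say which interval type was used), and the function name `bootstrap_percentile_ci` and the data are hypothetical.

```python
import random
import statistics


def bootstrap_percentile_ci(sample, statistic=statistics.mean,
                            n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `statistic`.

    Draws `n_boot` resamples with replacement, recomputes the statistic
    on each, and returns the empirical alpha/2 and 1 - alpha/2 quantiles.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    lower = estimates[round(alpha / 2 * n_boot)]
    upper = estimates[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical coefficient-like values, for illustration only.
scores = [0.8, 1.1, 0.4, 1.6, 0.9, 1.3, 0.7, 1.0, 1.2, 0.5]
lower, upper = bootstrap_percentile_ci(scores)
```

For a regression coefficient, the same loop would refit the model on each resample and collect the coefficient instead of the mean.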
Figure 4. Plots of individual factor scores computed from (a) the RF proximities matrix and (b) PLS loadings.
Measures of predictive accuracy and clustering fitness
| | |||||||||
|---|---|---|---|---|---|---|---|---|---|
| PLS-DA | IM | 1.00 | 1.00 | 0.625 | L* | 0.462 | 2.784 | 1.206 | |
| | VSC | 0.89 | 1.00 | 0.369 | No | 0.665 | 3.106 | 2.315 | |
| | MX | 1.00 | 0.90 | 0.270 | No | 0.605 | 2.811 | 2.416 | |
| sPLS-DA | IM | 1.00 | 1.00 | 1.000 | L* | 0.792 | 0.437 | 0.386 | |
| | VSC | 0.89 | 1.00 | 0.712 | No | 0.928 | 0.468 | 0.282 | |
| | MX | 1.00 | 0.90 | 0.479 | No | 0.854 | 0.487 | 0.330 | |
L or L* denotes an isolated cluster (see text for details).
Figure 5. Individual coordinates from sPLS-DA for the training (open circles) and validation (filled circles) samples. The medoid of each cluster is shown as a cross.
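A medoid is the member of a cluster with the smallest total distance to the other members. The sketch below shows that definition for a single cluster of two-dimensional coordinates. It assumes Euclidean distance and hypothetical points; the source does not state which distance or clustering routine produced the medoids in the figure.

```python
import math


def medoid(points):
    """Return the point whose summed Euclidean distance to all
    other points in the cluster is smallest."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


# Hypothetical individual coordinates for one cluster.
cluster = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (0.0, 2.0), (2.0, 2.0)]
center = medoid(cluster)
```

Unlike a centroid, the medoid is always one of the observed individuals, which is why it can be marked directly on a plot of individual coordinates.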
Figure 6. Association between clinical group and variables ranked or selected as most important in RF and sPLS-DA.