Erika L Moen, Catherine J Fricano-Kugler, Bryan W Luikart, A James O'Malley.
Abstract
A conventional study design among medical and biological experimentalists involves collecting multiple measurements from a study subject. For example, experiments utilizing mouse models in neuroscience often involve collecting multiple neuron measurements per mouse to increase the number of observations without requiring a large number of mice. This leads to a form of statistical dependence referred to as clustering. Inappropriate analyses of clustered data have resulted in several recent critiques of neuroscience research that suggest the bar for statistical analyses within the field is set too low. We compare naïve analytical approaches to marginal, fixed-effect, and mixed-effect models and provide guidelines for when each of these models is most appropriate based on study design. We demonstrate the influence of clustering on a between-mouse treatment effect, a within-mouse treatment effect, and an interaction effect between the two. Our analyses demonstrate that these statistical approaches can give substantially different results, primarily when the analyses include a between-mouse treatment effect. In a novel analysis from a neuroscience perspective, we also refine the mixed-effect approach through the inclusion of an aggregate mouse-level counterpart to a within-mouse (neuron level) treatment as an additional predictor by adapting an advanced modeling technique that has been used in social science research and show that this yields more informative results. Based on these findings, we emphasize the importance of appropriate analyses of clustered data, and we aim for this work to serve as a resource for when one is deciding which approach will work best for a given study.
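To make the cost of ignoring clustering concrete, here is a minimal sketch based on the standard design-effect approximation for equal-sized clusters, 1 + (m - 1) × ICC. This formula is a general textbook result, not one taken from this article, and the number of mice, the neurons per mouse, and the intraclass correlation (ICC) are all hypothetical.

```python
import math


def design_effect(neurons_per_mouse: int, icc: float) -> float:
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (neurons_per_mouse - 1) * icc


def effective_sample_size(n_mice: int, neurons_per_mouse: int, icc: float) -> float:
    """Number of independent observations the clustered sample is worth
    when estimating a between-mouse (mouse-level) effect."""
    total_neurons = n_mice * neurons_per_mouse
    return total_neurons / design_effect(neurons_per_mouse, icc)


# Hypothetical study: 10 mice, 30 neurons each, ICC = 0.3 (all assumed values).
n_mice, neurons_per_mouse, icc = 10, 30, 0.3

deff = design_effect(neurons_per_mouse, icc)
n_eff = effective_sample_size(n_mice, neurons_per_mouse, icc)
se_understatement = math.sqrt(deff)  # factor by which a naive SE is too small

print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_eff:.1f} of {n_mice * neurons_per_mouse} neurons")
print(f"naive SE too small by a factor of about {se_understatement:.2f}")
```

Under these assumed values, 300 neurons carry roughly the information of 31 independent observations for a mouse-level treatment, which is why a neuron-level regression that ignores the mouse can look far more precise than it is.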
Year: 2016 PMID: 26766425 PMCID: PMC4713068 DOI: 10.1371/journal.pone.0146721
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Experimental design underlying the neuroscience dataset.
The mouse-level treatment was fatty acid delivery, vehicle control, or no treatment. The neuron-level treatment was Pten or control shRNA. The two levels of treatment resulted in a hierarchical study design with a between-mouse and within-mouse treatment factor.
Potential research questions testable by Pten knockdown and fatty acid environment study data.
| Research question | Relevant neuron/mouse population in a dedicated study of the research question |
|---|---|
| 1. Is there an effect of fatty acid on soma size? | Neurons not exposed to Pten shRNA in mice exposed to fatty acid or vehicle control |
| 2. Is there an effect of Pten knockdown on soma size? | Mice not exposed to fatty acid or vehicle control |
| 3. Does the proportion of neurons with Pten shRNA affect soma size? | Mice not exposed to fatty acid or vehicle control |
| 4. Is there an interaction effect of Pten knockdown and fatty acid on soma size? | Mice exposed to fatty acid or vehicle control |
Note: These examples are expanded upon in later sections of the text.
Characteristics of marginal, fixed-effect, and mixed-effect models.
| Characteristic | Marginal | Fixed-effect | Mixed-effect |
|---|---|---|---|
| Distinguishes observations belonging to the same or different subjects | Yes | Yes | Yes |
| Reliant on distribution of subject-specific effects | No | No | Yes |
| Subjects considered a sample from a population larger than the sample itself | Yes | No | Yes |
| Computation handles few subjects well | No | Yes | No |
| Computation handles a very large number of subjects well | Yes | No | Yes |
| Noisy for few observations per subject | No | Yes | No |
| Computation handles a large number of observations per subject | Depends | Yes | Yes |
| Accommodates variable observations per subject | Yes | Yes | Yes |
Note: (a) Only for calculation of standard errors.
(b) Problems can arise under some specifications of the working covariance structure and depending on the estimation method used.
Fig 2. Visualization of clustered data.
(A) Visualization of a between-mouse factor. Each point represents the mean soma size of a mouse ± standard error (SE). [SE = standard deviation (mean soma size)]. (B) Visualization of a within-mouse factor. Each point represents the soma size of an individual neuron within a mouse. The colors correspond to the mouse to which the neurons belonged. Each mouse has neurons with control and Pten shRNA. (C) Visualization of an interaction effect between the within-mouse and between-mouse factor. The red dotted line represents the vehicle control mice and the blue solid line represents the fatty acid delivery mice. Pten knockdown status 0 = control shRNA and 1 = Pten shRNA. The black dotted line depicts the expected result if there were no interaction effect, and the space between the black dotted line and the blue line, denoted by the curly bracket, represents the size of the interaction effect.
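The interaction effect described for panel (C) can be read as a difference of differences between the four cell means. The sketch below illustrates that arithmetic; the soma sizes are hypothetical values chosen for the example, not data from the study, and the function names are illustrative.

```python
def interaction_effect(means: dict) -> float:
    """Difference-in-differences of cell means.

    `means` maps (fatty_acid, pten) -> mean soma size, with 0 = control
    (vehicle or control shRNA) and 1 = treated (fatty acid or Pten shRNA).
    """
    pten_effect_vehicle = means[(0, 1)] - means[(0, 0)]
    pten_effect_fatty_acid = means[(1, 1)] - means[(1, 0)]
    return pten_effect_fatty_acid - pten_effect_vehicle


def expected_without_interaction(means: dict) -> float:
    """Additive prediction for the fatty acid + Pten shRNA cell,
    i.e. the end point of the 'no interaction' reference line."""
    return means[(1, 0)] + (means[(0, 1)] - means[(0, 0)])


# Hypothetical cell means (arbitrary units), assumed for illustration only.
cell_means = {
    (0, 0): 100.0,  # vehicle, control shRNA
    (0, 1): 110.0,  # vehicle, Pten shRNA
    (1, 0): 102.0,  # fatty acid, control shRNA
    (1, 1): 122.0,  # fatty acid, Pten shRNA
}

gap = cell_means[(1, 1)] - expected_without_interaction(cell_means)
print(interaction_effect(cell_means), gap)
```

Both quantities coincide: the gap between the observed fatty acid line and the additive reference line at Pten status 1 is exactly the interaction effect.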
Results of fatty acid exposure on soma size from different regression models (Research Q1).
| Model | Coefficient | Std Err | p-value | 95% CI |
|---|---|---|---|---|
| Neuron-level linear regression | 3.15 | 1.90 | 0.099 | -0.60, 6.90 |
| Mouse-level regression (mean); no weighting | 0.61 | 6.37 | 0.926 | -14.47, 15.69 |
| Mouse-level regression (mean); analytic weights | 3.15 | 4.73 | 0.527 | -8.04, 14.33 |
| Marginal regression | 2.31 | 4.86 | 0.635 | -7.22, 11.83 |
| Fixed-effect regression | n/a | | | |
| Mixed-effect regression | 1.56 | 5.04 | 0.756 | -8.32, 11.44 |
Note: The coefficient captures the effect of the fatty acid environment compared to the vehicle control.
(a) These are neuron-level regressions.
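As a rough consistency check on the table above, a 95% interval can be recomputed from a coefficient and its standard error. The sketch assumes a normal (Wald) interval, coefficient ± z × SE; the source does not state how its intervals were computed, so this is an approximation rather than the authors' procedure.

```python
from statistics import NormalDist


def wald_interval(coefficient: float, std_err: float, level: float = 0.95):
    """Normal-approximation confidence interval: coefficient +/- z * SE."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    half_width = z * std_err
    return coefficient - half_width, coefficient + half_width


# Coefficient and standard error taken from the Research Q1 table.
reported = {
    "Marginal regression": (2.31, 4.86),
    "Mixed-effect regression": (1.56, 5.04),
}

for model, (coef, se) in reported.items():
    low, high = wald_interval(coef, se)
    print(f"{model}: {low:.2f}, {high:.2f}")
```

The recomputed limits agree with the tabulated ones up to rounding of the inputs. The mouse-level rows have wider intervals than this approximation gives, presumably because a t critical value with few degrees of freedom was used there.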
Results of Pten knockdown effect on soma size from different regression models (Research Q2).
| Model | Coefficient | Std Err | p-value | 95% CI |
|---|---|---|---|---|
| Neuron-level linear regression | 11.04 | 1.98 | <0.001 | 7.15, 14.92 |
| Marginal regression | 11.51 | 2.46 | <0.001 | 6.70, 16.32 |
| Fixed-effect regression | 11.54 | 1.86 | <0.001 | 7.88, 15.20 |
| Mixed-effect regression | 11.50 | 2.45 | <0.001 | 6.70, 16.30 |
| Mixed effect regression with | 11.54 | 2.49 | <0.001 | 6.66, 16.41 |
Note: The coefficient represents the effect of Pten knockdown on soma size.
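For a within-mouse treatment such as Pten knockdown, the fixed-effect approach amounts to comparing neurons within the same mouse. A minimal sketch of that idea is the within (demeaning) estimator below. The data are hypothetical and noise-free, built so that each mouse has its own baseline soma size and a true knockdown effect of 11, and the function names are illustrative rather than the authors' code.

```python
from collections import defaultdict


def within_mouse_estimate(rows):
    """Fixed-effect (within) slope: demean treatment and outcome per mouse,
    then regress the demeaned outcome on the demeaned treatment."""
    groups = defaultdict(list)
    for mouse, pten, soma in rows:
        groups[mouse].append((pten, soma))
    sxy = sxx = 0.0
    for obs in groups.values():
        x_mean = sum(x for x, _ in obs) / len(obs)
        y_mean = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_mean) * (y - y_mean)
            sxx += (x - x_mean) ** 2
    return sxy / sxx


def naive_difference(rows):
    """Pooled difference in means that ignores which mouse a neuron came from."""
    treated = [soma for _, pten, soma in rows if pten == 1]
    control = [soma for _, pten, soma in rows if pten == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Hypothetical (mouse, pten, soma size) records; baselines differ by mouse.
neurons = [
    ("A", 0, 100.0), ("A", 0, 100.0), ("A", 1, 111.0),
    ("B", 0, 120.0), ("B", 1, 131.0), ("B", 1, 131.0), ("B", 1, 131.0),
    ("C", 0, 90.0), ("C", 1, 101.0),
]

print(within_mouse_estimate(neurons))  # recovers the assumed effect of 11
print(naive_difference(neurons))       # distorted by unequal mouse baselines
```

Because demeaning removes everything that is constant within a mouse, the same step would also remove a mouse-level treatment, which is consistent with the fixed-effect row being reported as n/a for the fatty acid effect in Research Q1.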
Results of the effect of the proportion of neurons with Pten shRNA on soma size from different regression models (Research Q3).
| Model | Coefficient | Std Err | p-value | 95% CI |
|---|---|---|---|---|
| Mouse-level regression ( | -0.24 | 0.67 | 0.744 | -2.36, 1.88 |
| Marginal regression | -0.11 | 0.20 | 0.581 | -0.50, 0.28 |
| Mixed-effect regression | -0.004 | 0.19 | 0.981 | -0.37, 0.37 |
| Mixed-effect regression with Pten as an additional predictor | -0.11 | 0.20 | 0.562 | -0.51, 0.28 |
Note: The coefficient represents the effect of the proportion of neurons with Pten shRNA on soma size.
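The abstract describes adding an aggregate mouse-level counterpart of the neuron-level treatment as a predictor. One plausible way to construct such a predictor, assuming the aggregate is the per-mouse proportion of neurons carrying Pten shRNA, is sketched below. The field names `pten_bar` and `pten_dev` and the records are hypothetical, and fitting the mixed-effect model itself is omitted because it needs a statistics library.

```python
from collections import defaultdict


def add_mouse_level_proportion(rows):
    """Return copies of `rows` with two extra fields per neuron:
    pten_bar: proportion of that mouse's neurons with Pten shRNA
    pten_dev: the neuron's own status minus its mouse's proportion
    """
    by_mouse = defaultdict(list)
    for row in rows:
        by_mouse[row["mouse"]].append(row["pten"])
    proportion = {m: sum(v) / len(v) for m, v in by_mouse.items()}
    return [
        {
            **row,
            "pten_bar": proportion[row["mouse"]],
            "pten_dev": row["pten"] - proportion[row["mouse"]],
        }
        for row in rows
    ]


# Hypothetical neuron records (mouse label, Pten shRNA status).
records = [
    {"mouse": "A", "pten": 0}, {"mouse": "A", "pten": 0}, {"mouse": "A", "pten": 1},
    {"mouse": "B", "pten": 1}, {"mouse": "B", "pten": 1},
    {"mouse": "B", "pten": 1}, {"mouse": "B", "pten": 0},
]

augmented = add_mouse_level_proportion(records)
for row in augmented:
    print(row)
```

Entering the mouse-level proportion alongside the neuron-level indicator lets a model separate a between-mouse association from the within-mouse knockdown effect.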
Results for the interaction between Pten knockdown and fatty acid exposure from different regression models (Research Q4).
| Model | Coefficient | Std Err | p-value | 95% CI |
|---|---|---|---|---|
| Marginal regression | 10.38 | 5.12 | 0.043 | 0.34, 20.42 |
| Fixed-effect regression | 8.18 | 2.75 | 0.003 | 2.77, 13.58 |
| Mixed-effect regression | 10.18 | 5.08 | 0.045 | 0.23, 20.13 |
Note: The coefficient captures the effect of the interaction (fattyacid×pten) on soma size.
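The p-values in the interaction table can be approximately reproduced from each coefficient and its standard error. The sketch assumes a two-sided test against a standard normal reference distribution; the source does not say which reference distribution each model used, so small discrepancies are to be expected.

```python
import math


def two_sided_p(coefficient: float, std_err: float) -> float:
    """Two-sided p-value for z = coefficient / SE under a standard normal."""
    z = coefficient / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Coefficient and standard error taken from the Research Q4 table.
interaction_rows = {
    "Marginal regression": (10.38, 5.12),
    "Fixed-effect regression": (8.18, 2.75),
    "Mixed-effect regression": (10.18, 5.08),
}

for model, (coef, se) in interaction_rows.items():
    print(f"{model}: z = {coef / se:.2f}, p = {two_sided_p(coef, se):.3f}")
```

The fixed-effect model reports a smaller standard error for the interaction than the marginal and mixed-effect models, which is what drives its smaller p-value.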
Fig 3. Main decision points for statistical analysis of clustered data.
The flow chart outlines the primary questions researchers should address when weighing statistical analysis options for a study with clustered data. *Readers should refer to Table 2 for subtler differences between the marginal and mixed-effect models.