Bernd Genser, Philip J Cooper, Maria Yazdanbakhsh, Mauricio L Barreto, Laura C Rodrigues.
Abstract
BACKGROUND: The number of subjects that can be recruited in immunological studies and the number of immunological parameters that can be measured have increased rapidly over the past decade and are likely to continue to expand. Large and complex immunological datasets can now be used to investigate complex scientific questions, but sophisticated statistical approaches are necessary to realize the potential of such data and to reach valid conclusions. Such approaches are used in many other scientific disciplines, but immunological studies on the whole still use simple statistical techniques for data analysis.
Year: 2007 PMID: 17963513 PMCID: PMC2234437 DOI: 10.1186/1471-2172-8-27
Source DB: PubMed Journal: BMC Immunol ISSN: 1471-2172 Impact factor: 3.615
Results of a literature review conducted in the medical database MEDLINE (1980–2006) about statistical methods found in immunological studies investigating cytokine expressions.
| Statistical method | Number of studies |
| Analysis of variance | 2908 |
| T-test | 420 |
| Mann-Whitney U-test | 316 |
| Wilcoxon/McNemar test | 193 |
| Univariate linear regression | 163 |
| Bivariate correlation analysis | 157 |
| Kruskal-Wallis H-test | 95 |
| Repeated measures analysis of variance | 31 |
| Friedman test | 7 |
| Nonlinear regression | 5 |
| Logistic regression | 629 |
| Cluster analysis | 192 |
| Multivariate analysis of variance | 144 |
| Multiple linear regression | 91 |
| Factor analysis/Principal components analysis | 80 |
| Analysis of covariance | 56 |
| Linear discriminant analysis | 51 |
| Partial correlation coefficient | 24 |
| Multinomial logistic regression | 9 |
| Multivariate analysis of covariance | 7 |
| Path analysis/Structural equation modelling | 4 |
Selection of important statistical methods suitable for the analysis of immunological data.
| Purpose of analysis | Type of data | Assumptions | Statistical method |
| Compare expression of a cytokine between two independent groups (e.g. treatment vs. control) | D: continuous | Normal distribution, homogeneity of variances | t-test |
| | D: continuous or ordinal | | Mann-Whitney U test |
| Compare expression of a cytokine between two related groups (e.g. before and after treatment) | D: continuous | Normal distribution, homogeneity of variances | Paired t-test |
| | D: continuous or ordinal | | Wilcoxon signed-rank test |
| Compare expression of a cytokine between three or more independent groups defined by one factor (e.g. treatments A, B, C) | D: continuous | Normal distribution, homogeneity of variances | One-way analysis of variance |
| | D: continuous or ordinal | | Kruskal-Wallis H test |
| Compare expression of a cytokine between three or more related groups (e.g. measurements 1, 2, and 3 weeks after treatment) | D: continuous | Multivariate normal distribution, assumptions about covariance | Repeated measurements analysis of variance |
| | D: continuous or ordinal | | Friedman's ANOVA |
| Quantify association between two cytokines or a cytokine and another continuous variable | D: continuous | Linear relationship, normality | Pearson correlation coefficient |
| | D: continuous or ordinal | Monotonic relationship | Spearman rank correlation coefficient |
| Predicting expression of a cytokine by a continuous independent variable | D: continuous | Specified relationship (e.g. linearity for linear regression), normal distribution (for parametric regression) | Univariate regression |
| Quantify associations between two cytokines adjusted for the effect of a third continuous variable | All variables: continuous | Linear relationship, normality | Partial correlation coefficient |
| Predicting a continuous outcome (e.g. a cytokine) by several continuous or categorical independent variables | D: continuous | Specified relationship (e.g. linearity for linear regression), normal distribution for parametric regression, No multi-collinearity | Multiple regression |
| | | Specified relationship, multi-collinearity | Partial least squares regression |
| Quantifying the magnitude of correlation between two groups of continuous variables (e.g. Th1 and Th2 related cytokines) | All variables: continuous | | Canonical correlation analysis |
| Compare cytokine expressions between three or more independent groups defined by two or more factors (e.g. treatment and gender) | D: continuous | Normal distribution, homogeneity of variances | Multi-way analysis of variance (ANOVA) |
| Simultaneously compare expressions of two or more cytokines between three or more independent groups defined by two or more factors | D: continuous | Multivariate normal distribution, homogeneity of covariance matrices | Multivariate analysis of variance (MANOVA) |
| Compare cytokine expressions between three or more related groups defined by two or more factors (e.g. measurements at different time points during a study and treatment) | D: continuous | Multivariate normality, homogeneity of covariance matrices | Multi-way repeated measurements analysis of variance |
| Grouping set of correlated cytokines to summary variables ("principal components") | All variables: continuous | High degree of multicollinearity | Factor analysis/Principal components analysis |
| Grouping subjects in homogenous subgroups according to similar expression levels of two or more cytokines | All variables: continuous | Low degree of multicollinearity | Cluster analysis |
| Explaining or predicting group membership of two or more independent groups by cytokine levels | D: categorical | Multivariate normal distribution, equal covariance matrices, low degree of multicollinearity | Linear discriminant analysis |
| Explaining or predicting group membership of two independent groups by cytokine levels | D: categorical | | Logistic regression |
| Explaining or predicting group membership of three or more groups by cytokine levels | D: categorical | | Multinomial logistic regression |
| Modelling multiple relationships between several immunological parameters and one or more outcome variables | All variables: categorical, ordinal or continuous data | Conceptual framework specifying the multiple relationships among the study variables | Path analysis/Structural equation modelling |
All univariate and multivariate statistical approaches listed above can be implemented in general-purpose statistical packages such as S-PLUS® (Insightful Corporation, Seattle, WA, USA), SAS® (SAS Institute, Cary, NC, USA), SPSS® (SPSS Inc., Chicago, IL, USA) or STATA® (StataCorp LP, College Station, TX, USA). Path analysis/structural equation modelling can be implemented in STATA and SPSS, which provide the extension modules GLLAMM and AMOS, respectively, as well as in several special-purpose software packages such as LISREL® (Scientific Software International, Inc., IL, USA) or MX® (MCV, Department of Psychiatry, Richmond, VA, USA).
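As an illustration of the univariate rows of the table, the following is a minimal sketch in Python using SciPy, which is not one of the packages named above and is assumed here only for convenience. All data are synthetic values drawn from log-normal distributions; the variable names (`control`, `treated`, `group_c`, `before`, `after`) are hypothetical and do not come from any study. Each parametric test is paired with its rank-based alternative.

```python
import numpy as np
from scipy import stats

# Synthetic "cytokine" concentrations (illustrative only, not real data).
rng = np.random.default_rng(0)
control = rng.lognormal(mean=1.0, sigma=0.5, size=30)
treated = rng.lognormal(mean=1.4, sigma=0.5, size=30)
group_c = rng.lognormal(mean=1.2, sigma=0.5, size=30)

# Related measurements: the same 30 subjects before and after a treatment.
before = control
after = before * rng.lognormal(mean=0.2, sigma=0.2, size=30)

results = {}

# Two independent groups: t-test (normality, equal variances) vs. Mann-Whitney U.
results["t-test"] = stats.ttest_ind(control, treated).pvalue
results["Mann-Whitney U"] = stats.mannwhitneyu(
    control, treated, alternative="two-sided"
).pvalue

# Two related groups: paired t-test vs. Wilcoxon signed-rank test.
results["paired t-test"] = stats.ttest_rel(before, after).pvalue
results["Wilcoxon signed-rank"] = stats.wilcoxon(before, after).pvalue

# Three or more independent groups: one-way ANOVA vs. Kruskal-Wallis H.
results["one-way ANOVA"] = stats.f_oneway(control, treated, group_c).pvalue
results["Kruskal-Wallis H"] = stats.kruskal(control, treated, group_c).pvalue

# Association between two continuous variables: Pearson vs. Spearman.
_, results["Pearson"] = stats.pearsonr(before, after)
_, results["Spearman"] = stats.spearmanr(before, after)

for name, p_value in results.items():
    print(f"{name}: p = {p_value:.4g}")
```

Because cytokine concentrations are often right-skewed, the rank-based tests, or a log transformation before the parametric tests, may be the more defensible choice in practice; the sketch does not check any of the assumptions listed in the table.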
Figure 1. Selecting the appropriate statistical technique for analysis of immunological data.
Figure 2. Conceptual framework that specifies multiple associations between potential risk factors, immunological parameters and outcomes (atopy and asthma).
Figure 3. Example path diagram that specifies a structural equation model with two latent variables.
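The selection step captioned in Figure 1 can be approximated in code for the simplest case. The figure itself is not reproduced in this record, so the sketch below encodes only the group-comparison rows of the methods table (one factor, one cytokine). The function name `choose_test` and its parameters are hypothetical, and `parametric=True` stands for "the normality and variance assumptions in the table are judged to hold".

```python
def choose_test(n_groups: int, related: bool, parametric: bool) -> str:
    """Return the group-comparison method suggested by the methods table.

    n_groups   -- number of groups (or repeated measurements) compared
    related    -- True if the same subjects are measured in every group
    parametric -- True if the parametric assumptions are judged to hold
    """
    if n_groups < 2:
        raise ValueError("at least two groups are needed for a comparison")

    if n_groups == 2:
        if related:
            return "Paired t-test" if parametric else "Wilcoxon signed-rank test"
        return "t-test" if parametric else "Mann-Whitney U test"

    # Three or more groups defined by a single factor.
    if related:
        if parametric:
            return "Repeated measurements analysis of variance"
        return "Friedman's ANOVA"
    return "One-way analysis of variance" if parametric else "Kruskal-Wallis H test"


print(choose_test(2, related=True, parametric=False))   # Wilcoxon signed-rank test
print(choose_test(3, related=False, parametric=True))   # One-way analysis of variance
```

Designs with two or more factors, several outcomes, or latent variables (as in Figures 2 and 3) fall outside this helper and call for the multivariate methods in the table.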