Tian Chen, Wan Tang, Ying Lu, Xin Tu.
Abstract
Linear regression models are widely used in mental health and related health services research. However, classic linear regression analysis assumes that the data are normally distributed, an assumption that is not met by the data obtained in many studies. One way of dealing with this problem is to use semi-parametric models, which do not require that the data be normally distributed. But semi-parametric models are quite sensitive to outlying observations, so the estimates they generate are unreliable when the study data include outliers. In this situation, some researchers trim the extreme values before conducting the analysis, but the ad hoc rules used for data trimming are based on subjective criteria, so different methods of adjustment can yield different results. Rank regression provides a more objective approach to dealing with non-normal data that include outliers. This paper uses simulated and real data to illustrate this regression approach for dealing with outliers and compares its results with those generated using classical regression models and semi-parametric regression models.
Keywords: linear regression; non-normal distribution; normal distribution; rank regression; semi-parametric regression models; sexual health
Year: 2014 PMID: 25903082 PMCID: PMC4248265 DOI: 10.11919/j.issn.1002-0829.214148
Source DB: PubMed Journal: Shanghai Arch Psychiatry ISSN: 1002-0829
Estimates (mean), asymptotic and empirical standard errors, and empirical type I error rates from fitting the classic linear, semi-parametric, and rank regression models to data simulated from normally distributed errors
| Models | mean | standard error (asymptotic) | standard error (empirical) | type I error | mean | standard error (asymptotic) | standard error (empirical) | type I error |
|---|---|---|---|---|---|---|---|---|
| classic linear | 1.00 | 0.16 | 0.16 | 0.06 | 1.00 | 0.06 | 0.06 | 0.04 |
| semi-parametric | 1.00 | 0.16 | 0.17 | 0.05 | 1.00 | 0.07 | 0.06 | 0.04 |
| rank regression | 1.00 | 0.16 | 0.16 | 0.07 | 1.00 | 0.06 | 0.06 | 0.04 |
| classic linear | >10^5 | >10^4 | >10^4 | 0.09 | >10^5 | >10^4 | >10^4 | 1 |
| semi-parametric | >10^5 | >10^4 | >10^4 | 0.09 | >10^5 | >10^4 | >10^4 | 1 |
| rank regression | 1.11 | 0.18 | 0.18 | 0.09 | 1.06 | 0.07 | 0.07 | 0.11 |
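
The contrast in the table above, where the classic linear and semi-parametric estimates are driven far from the true value while the rank regression estimate stays near 1, can be illustrated with a small sketch. The code below is not the authors' estimator; it assumes one common Wilcoxon-type formulation of rank regression for a single predictor, in which the slope minimizes the sum of absolute pairwise residual differences. For one predictor that minimizer is a weighted median of pairwise slopes. The function names `ols_slope` and `wilcoxon_rank_slope` are hypothetical and chosen for illustration only.

```python
import numpy as np


def ols_slope(x, y):
    """Least-squares slope of y on x (classic linear regression)."""
    x_c = x - x.mean()
    return float(np.dot(x_c, y - y.mean()) / np.dot(x_c, x_c))


def wilcoxon_rank_slope(x, y):
    """Wilcoxon-type rank estimate of the slope for one predictor.

    Assumption: the slope b minimizes sum_{i<j} |(y_i - y_j) - b (x_i - x_j)|.
    Each term equals |x_i - x_j| * |s_ij - b| with s_ij the pairwise slope,
    so the minimizer is a weighted median of the s_ij with weights |x_i - x_j|.
    """
    i, j = np.triu_indices(len(x), k=1)   # all pairs with i < j
    dx = x[i] - x[j]
    dy = y[i] - y[j]
    keep = dx != 0                        # tied x values carry no slope information
    slopes = dy[keep] / dx[keep]
    weights = np.abs(dx[keep])

    order = np.argsort(slopes)
    slopes, weights = slopes[order], weights[order]
    cum = np.cumsum(weights)
    # First sorted slope at which the cumulative weight reaches half the total.
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return float(slopes[idx])
```

As a usage example, take twenty points scattered around a line with slope 1 and replace a single response with 10^6. `ols_slope` is then pulled far away from 1, whereas `wilcoxon_rank_slope` remains close to 1, because the pairs involving the outlier carry only a minority of the total weight.
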
Estimates (mean), asymptotic and empirical standard errors, and empirical type I error rates from fitting the classic linear, semi-parametric, and rank regression models to data simulated from t-distributed errors
| Models | mean | standard error (asymptotic) | standard error (empirical) | type I error | mean | standard error (asymptotic) | standard error (empirical) | type I error |
|---|---|---|---|---|---|---|---|---|
| classic linear | 0.98 | 0.16 | 0.35 | 0.05 | 1.00 | 0.07 | 0.11 | 0.05 |
| semi-parametric | 0.98 | 0.16 | 0.35 | 0.05 | 1.00 | 0.06 | 0.11 | 0.05 |
| rank regression | 1.00 | 0.12 | 0.11 | 0.05 | 1.00 | 0.05 | 0.05 | 0.06 |
| classic linear | >10^4 | >10^4 | >10^4 | 0.25 | >10^4 | >10^4 | >10^4 | 0.80 |
| semi-parametric | >10^4 | >10^4 | >10^4 | 0.25 | >10^4 | >10^4 | >10^4 | 0.80 |
| rank regression | 1.05 | 0.30 | 0.29 | 0.06 | 0.99 | 0.31 | 0.30 | 0.07 |
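
The columns reported in these simulation tables (mean estimate, asymptotic standard error, empirical standard error, empirical type I error) can be computed from a Monte Carlo loop along the following lines. This is a minimal sketch for the classic linear model only, under assumptions that are not taken from the paper: a single standard-normal predictor, a true slope of 1, a Wald test at the nominal 5% level using the normal critical value 1.96, and the hypothetical function name `monte_carlo_ols`. The sample size, number of replications, and degrees of freedom are placeholders.

```python
import numpy as np


def monte_carlo_ols(n=50, reps=2000, beta=1.0, df=None, seed=0):
    """Simulate y = beta * x + error repeatedly and summarise the OLS slope.

    df=None draws standard normal errors; otherwise the errors follow a
    Student t distribution with `df` degrees of freedom.
    """
    rng = np.random.default_rng(seed)
    estimates = np.empty(reps)
    model_se = np.empty(reps)
    for r in range(reps):
        x = rng.normal(size=n)
        err = rng.normal(size=n) if df is None else rng.standard_t(df, size=n)
        y = beta * x + err

        x_c = x - x.mean()
        sxx = np.dot(x_c, x_c)
        b = np.dot(x_c, y - y.mean()) / sxx
        resid = (y - y.mean()) - b * x_c
        sigma2 = np.dot(resid, resid) / (n - 2)   # residual variance estimate

        estimates[r] = b
        model_se[r] = np.sqrt(sigma2 / sxx)       # model-based standard error

    # Wald statistic for H0: slope equals the true value used to simulate.
    z = (estimates - beta) / model_se
    return {
        "mean": float(estimates.mean()),
        "asymptotic_se": float(model_se.mean()),          # average model-based SE
        "empirical_se": float(estimates.std(ddof=1)),     # SD of estimates across runs
        "type_I": float(np.mean(np.abs(z) > 1.96)),       # rejection rate under H0
    }
```

When the model is well specified, the asymptotic and empirical standard errors should roughly agree and the type I error should sit near 0.05. A large gap between the two standard errors, as in some rows of the tables, signals that the model-based standard error is not trustworthy for that data-generating process.
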
Estimates, standard errors, and p-values from fitting the classic linear, semi-parametric, rank regression, classic linear with trimmed outliers, and semi-parametric with trimmed outliers models to the risk-reduction intervention study
| Models | estimate | standard error | p-value |
|---|---|---|---|
| classic linear | -6707.0 | 6667.7 | 0.315 |
| semi-parametric | -6707.0 | 6667.7 | 0.315 |
| rank regression | -0.4286 | 0.4630 | 0.355 |
| classic linear with trimmed outliers | -0.6738 | 0.9818 | 0.493 |
| semi-parametric with trimmed outliers | -0.6738 | 0.9775 | 0.491 |