Abstract
The maximum likelihood estimation (MLE) method, typically used for polytomous logistic regression, is prone to bias due to both misclassification in the outcome and contamination in the design matrix. Hence, robust estimators are needed. In this study, we propose such a method for nominal response data with continuous covariates. A generalized method of weighted moments (GMWM) approach is developed for dealing with contaminated polytomous response data. In this approach, distances are calculated based on individual sample moments, and Huber weights are applied to observations with large distances. Mallows-type weights are also used to downweight leverage points. We describe theoretical properties of the proposed approach. Simulations suggest that the GMWM performs very well in correcting contamination-caused biases. An empirical application of the GMWM estimator to survey data demonstrates its usefulness.
Keywords: Generalized method of weighted moments; Polytomous logistic model; Robust statistics
Year: 2014 PMID: 25071990 PMCID: PMC4103096 DOI: 10.7717/peerj.467
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
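The weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weight functions or tuning constants, so everything below is an assumption: `huber_weight` uses the textbook Huber form with the conventional constant 1.345, `leverage_weight` is one simple Mallows-type form, and the distances, leverages, and `cutoff` are hypothetical numbers chosen only for illustration.

```python
def huber_weight(distance, c=1.345):
    """Textbook Huber weight: 1 when |distance| <= c, otherwise c / |distance|.

    The constant c is an assumed tuning value, not taken from the paper.
    """
    d = abs(distance)
    return 1.0 if d <= c else c / d


def leverage_weight(leverage, cutoff):
    """One simple Mallows-type weight (assumed form): only points whose
    leverage exceeds the cutoff are shrunk, in proportion to the excess."""
    return 1.0 if leverage <= cutoff else cutoff / leverage


def combined_weights(distances, leverages, c=1.345, cutoff=0.2):
    """Multiply the distance weight and the leverage weight per observation."""
    return [
        huber_weight(d, c) * leverage_weight(h, cutoff)
        for d, h in zip(distances, leverages)
    ]


# Hypothetical distances and leverages for four observations.
weights = combined_weights([0.5, 1.0, 2.69, 5.38], [0.05, 0.4, 0.1, 0.1])
print(weights)
```

Observations with small distance and small leverage keep full weight, while a large distance or a high leverage each reduce an observation's contribution to the weighted moment conditions.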
Summary statistics for surveyed subjects, by hypertension category.
| Covariate | Statistic | Normal | Pre-hypertension | Stage 1 | Stage 2 |
|---|---|---|---|---|---|
| Gender | Male | 138 | 104 | 29 | 8 |
| | Female | 87 | 114 | 31 | 9 |
| Age | Mean | 43.2 | 48.8 | 54.3 | 60.3 |
| | Std. Dev. | 13.7 | 13.8 | 12.2 | 13.4 |
| BMI | Mean | 43.2 | 48.8 | 54.3 | 60.3 |
| | Std. Dev. | 13.7 | 13.8 | 12.2 | 13.4 |
| Sodium intake | Mean | 3.7 | 3.7 | 4.6 | 2.7 |
| | Std. Dev. | 3.0 | 2.4 | 5.0 | 2.1 |
Polytomous logistic regression of the hypertension data: coefficient estimates and standard errors from GMWM and MLE.
| Variable | Coefficient | MLE Estimate | MLE Std. Err | MLE p-value | GMWM Estimate | GMWM Std. Err | GMWM p-value |
|---|---|---|---|---|---|---|---|
| Sex | | 0.7062 | 0.2022 | 0.0002 | 1.3339 | 0.2269 | <0.0001 |
| | | 0.9789 | 0.3235 | 0.0012 | 1.0368 | 0.3013 | 0.0003 |
| | | 1.4193 | 0.5746 | 0.0068 | 0.6753 | 0.2195 | 0.0010 |
| Age | | 0.0350 | 0.0075 | <0.0001 | 0.0671 | 0.0086 | <0.0001 |
| | | 0.0715 | 0.0121 | <0.0001 | 0.1139 | 0.0133 | <0.0001 |
| | | 0.1096 | 0.0216 | <0.0001 | 0.0753 | 0.0103 | <0.0001 |
| BMI | | 0.1147 | 0.0316 | 0.0001 | 0.1681 | 0.0360 | <0.0001 |
| | | 0.2422 | 0.0474 | <0.0001 | 0.4382 | 0.0538 | <0.0001 |
| | | 0.4351 | 0.0884 | <0.0001 | 0.2279 | 0.0388 | <0.0001 |
| Sodium | | 0.0104 | 0.0349 | 0.3829 | 0.1831 | 0.0355 | <0.0001 |
| | | 0.0919 | 0.0426 | 0.0155 | 0.2315 | 0.0486 | <0.0001 |
| | | −0.2699 | 0.1580 | 0.9562 | 0.2294 | 0.0353 | <0.0001 |
Notes.
Std. Err, standard error.
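The value reported alongside each estimate and standard error in the table above is consistent with a one-sided (upper-tail) Wald p-value, i.e. 1 − Φ(estimate / standard error). The source does not state which test was used, so this reading is an inference from the numbers. The sketch below recomputes a few MLE entries from the table under that assumption; `one_sided_wald_p` is a helper name introduced here for illustration.

```python
import math


def one_sided_wald_p(estimate, std_err):
    """Upper-tail normal probability of z = estimate / std_err.

    Assumes a one-sided Wald test; the source does not name the test.
    """
    z = estimate / std_err
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# (estimate, standard error, value reported in the table) for MLE rows.
rows = [
    (0.7062, 0.2022, 0.0002),   # Sex, first coefficient
    (0.9789, 0.3235, 0.0012),   # Sex, second coefficient
    (1.4193, 0.5746, 0.0068),   # Sex, third coefficient
    (-0.2699, 0.1580, 0.9562),  # Sodium, third coefficient
]

for est, se, reported in rows:
    p = one_sided_wald_p(est, se)
    print(f"estimate={est:+.4f}  recomputed p={p:.4f}  reported={reported}")
```

Under this reading, a value close to 1, such as the one for the third MLE sodium coefficient, reflects a negative estimate rather than strong evidence of a positive effect.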
Figure 2. Comparison of odds plots for sodium intake between the MLE and GMWM estimates, for a female population aged 40 with a BMI of 23.
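The odds plots in Figure 2 depend on how the odds change with sodium intake. Assuming the usual baseline-category logit parameterization (the source does not spell it out), a change of `delta` units in a covariate multiplies the odds of a category relative to the reference category by exp(coefficient × delta). The sketch below applies this to the three sodium coefficients reported for each method, in the order they are listed. The intercepts are not given in the source, so only the multiplicative change can be computed, not the odds themselves.

```python
import math

# Sodium coefficients as listed in the regression table, in table order.
sodium_coefs = {
    "MLE": [0.0104, 0.0919, -0.2699],
    "GMWM": [0.1831, 0.2315, 0.2294],
}


def odds_multiplier(coef, delta=1.0):
    """Factor by which the odds (vs. the reference category) change when
    the covariate increases by `delta`, under a baseline-category logit."""
    return math.exp(coef * delta)


for method, coefs in sodium_coefs.items():
    factors = [round(odds_multiplier(b), 3) for b in coefs]
    print(method, factors)
```

A factor above 1 means the odds rise with sodium intake and a factor below 1 means they fall, which is where the two estimators disagree for the third coefficient.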
Figure 1. Scatter plot of distance vs. leverage, both based on the MLE.
The cutoff criteria for distance and for leverage are shown.
Bias, MSE, and coverage of parameter estimates from randomly generated data without outliers.
| Size | Parameter | True | MLE Bias | MLE MSE | MLE Coverage | GMWM Bias | GMWM MSE | GMWM Coverage |
|---|---|---|---|---|---|---|---|---|
| 100 | | 1.0 | 0.0666 | 0.1030 | 0.945 | 0.0488 | 0.1986 | 0.949 |
| | | −0.3 | −0.0059 | 0.1206 | 0.957 | −0.1440 | 0.5578 | 0.952 |
| | | −0.8 | −0.0654 | 0.1190 | 0.938 | −0.0513 | 0.2550 | 0.961 |
| | | 0.7 | 0.0566 | 0.1892 | 0.963 | 0.2318 | 0.5468 | 0.923 |
| | | −1.0 | −0.0853 | 0.1764 | 0.969 | −0.0691 | 0.2380 | 0.950 |
| | | −0.5 | −0.0624 | 0.1453 | 0.945 | 0.0203 | 0.3195 | 0.964 |
| 1000 | | 1.0 | 0.0050 | 0.0087 | 0.956 | 0.0043 | 0.0181 | 0.962 |
| | | −0.3 | −0.0055 | 0.0105 | 0.984 | −0.0106 | 0.0333 | 0.950 |
| | | −0.8 | −0.0039 | 0.0099 | 0.943 | −0.0013 | 0.0251 | 0.956 |
| | | 0.7 | 0.0081 | 0.0160 | 0.968 | 0.0162 | 0.0401 | 0.954 |
| | | −1.0 | −0.0071 | 0.0145 | 0.987 | −0.0025 | 0.0258 | 0.948 |
| | | −0.5 | −0.0047 | 0.0122 | 0.948 | 0.0041 | 0.0361 | 0.947 |
Comparison between GMWM and MLE estimation from randomly generated data with outliers added.
| Size | Parameter | GMWM Bias (5%) | GMWM MSE (5%) | GMWM Coverage (5%) | MLE Bias (5%) | MLE MSE (5%) | MLE Coverage (5%) | GMWM Bias (10%) | GMWM MSE (10%) | GMWM Coverage (10%) | MLE Bias (10%) | MLE MSE (10%) | MLE Coverage (10%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | | 0.0568 | 0.1102 | 0.956 | 0.0860 | 0.0884 | 0.957 | 0.0489 | 0.0999 | 0.971 | 0.0868 | 0.0819 | 0.970 |
| | | −0.0038 | 0.1427 | 0.954 | −0.0055 | 0.1528 | 0.949 | −0.0057 | 0.1510 | 0.945 | −0.0431 | 0.1461 | 0.814 |
| | | −0.0392 | 0.1464 | 0.949 | 0.2377 | 0.1360 | 0.785 | 0.0319 | 0.1227 | 0.946 | 0.3607 | 0.1933 | 0.579 |
| | | 0.0175 | 0.2020 | 0.944 | −0.1072 | 0.1270 | 0.921 | −0.0235 | 0.1770 | 0.943 | −0.1631 | 0.1283 | 0.949 |
| | | 0.0374 | 0.1207 | 0.949 | 0.3848 | 0.2115 | 0.578 | 0.0207 | 0.0968 | 0.945 | 0.6088 | 0.4151 | 0.526 |
| | | −0.0548 | 0.1572 | 0.956 | −0.0964 | 0.0904 | 0.964 | −0.0817 | 0.1349 | 0.977 | −0.1069 | 0.0803 | 0.967 |
| 1000 | | 0.0172 | 0.0189 | 0.939 | 0.0490 | 0.0102 | 0.932 | 0.0451 | 0.0202 | 0.944 | 0.0657 | 0.0120 | 0.900 |
| | | 0.0012 | 0.0340 | 0.945 | 0.0124 | 0.0075 | 0.952 | −0.0071 | 0.0336 | 0.952 | −0.0111 | 0.0063 | 0.822 |
| | | 0.0260 | 0.0242 | 0.937 | 0.2874 | 0.0885 | 0.101 | 0.0164 | 0.0207 | 0.936 | 0.3876 | 0.1545 | 0.002 |
| | | −0.0058 | 0.0356 | 0.950 | −0.1423 | 0.0345 | 0.697 | −0.0497 | 0.0346 | 0.917 | −0.2269 | 0.0658 | 0.521 |
| | | 0.0366 | 0.0237 | 0.936 | 0.4390 | 0.2032 | 0.000 | 0.0238 | 0.0182 | 0.938 | 0.6500 | 0.4322 | 0.000 |
| | | −0.0106 | 0.0292 | 0.951 | −0.0538 | 0.0103 | 0.940 | −0.0434 | 0.0250 | 0.953 | −0.0629 | 0.0106 | 0.902 |
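The Bias, MSE, and Coverage columns in the simulation tables are standard Monte Carlo summaries. The source does not describe how they were computed, so the sketch below assumes the conventional definitions: bias is the mean estimation error, MSE is the mean squared error, and coverage is the share of replications whose nominal 95% Wald interval (estimate ± 1.96 × standard error) contains the true value. The function `summarize` and the three replications are hypothetical and serve only as illustration.

```python
def summarize(estimates, std_errs, true_value, z=1.96):
    """Return (bias, mse, coverage) across simulation replications.

    Coverage assumes Wald intervals estimate +/- z * std_err; the interval
    type used in the source is not stated.
    """
    n = len(estimates)
    errors = [est - true_value for est in estimates]
    bias = sum(errors) / n
    mse = sum(err * err for err in errors) / n
    covered = sum(
        1 for err, se in zip(errors, std_errs) if abs(err) <= z * se
    )
    return bias, mse, covered / n


# Three hypothetical replications of one parameter with true value 1.0.
bias, mse, coverage = summarize(
    estimates=[1.1, 0.9, 1.3],
    std_errs=[0.1, 0.1, 0.1],
    true_value=1.0,
)
print(bias, mse, coverage)
```

Read this way, coverage far below 0.95, as for several MLE entries under contamination, indicates that the intervals are centred on a biased estimate, while coverage near 0.95 indicates the nominal level is roughly maintained.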