S Devika, L Jeyaseelan¹, G Sebastian. ¹Department of Biostatistics, Christian Medical College, Vellore, Tamil Nadu, India.
Abstract
BACKGROUND AND OBJECTIVE: Logistic regression is the usual method for analyzing a dichotomous response variable. However, its performance in the presence of sparse data is questionable. A common problem in this situation is a very high odds ratio (OR) with a very wide 95% confidence interval (CI) (OR: >999.999; 95% CI: <0.001, >999.999). In this paper, we address this issue using the penalized logistic regression (PLR) method.
MATERIALS AND METHODS: Data from a case-control study on hyponatremia and hiccups conducted at Christian Medical College, Vellore, Tamil Nadu, India were used. The outcome variable was the presence or absence of hiccups, and the main exposure variable was hyponatremia status. Simulated datasets were created with different sample sizes and different numbers of covariates.
RESULTS: A total of 23 cases and 50 controls were analyzed with both ordinary logistic regression and PLR. The main exposure, hyponatremia, was present in nine (39.13%) of the cases and in four (8.0%) of the controls. All 23 hiccup cases were male, as were 46 (92.0%) of the controls. This complete separation between gender and disease group led to an infinite OR estimate for gender under ordinary logistic regression (OR: >999.999; 95% CI: <0.001, >999.999), whereas PLR gave a finite and consistent regression coefficient for gender (OR: 5.35; 95% CI: 0.42, 816.48). After adjusting for all the confounding variables, hyponatremia was associated with 7.9 times (95% CI: 2.06, 38.86) higher odds of developing hiccups under PLR, whereas the conventional method overestimated the association (OR: 10.76; 95% CI: 2.17, 53.41). The simulation experiment shows that the estimated coverage probability of PLR is near the nominal level of 95% even for small sample sizes and for a large number of covariates.
CONCLUSIONS: PLR gives results almost equal to those of ordinary logistic regression when the sample size is large and is superior when cell counts are small.
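The separation problem described above can be illustrated with a minimal sketch. It assumes the penalty is Firth's (Jeffreys-prior) penalty, which the abstract does not state, and it fits an unadjusted model with gender as the only covariate, using the counts given in RESULTS (23 male cases, 46 male controls, 4 female controls). The function name `firth_logistic` and all variable names are hypothetical; this is not the authors' code.

```python
import numpy as np

def firth_logistic(X, y, max_iter=100, tol=1e-8):
    """Firth-type penalized logistic regression via a modified score (sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))    # fitted probabilities
        w = p * (1.0 - p)                      # logistic variance weights
        info = X.T @ (w[:, None] * X)          # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Leverages h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Modified score: the ordinary score plus the penalty correction
        score = X.T @ (y - p + h * (0.5 - p))
        step = info_inv @ score
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Gender-by-hiccups counts from the abstract (assumed layout):
# 23 male cases, 46 male controls, 4 female controls, 0 female cases.
male = np.array([1] * 23 + [1] * 46 + [0] * 4)
hiccups = np.array([1] * 23 + [0] * 46 + [0] * 4)
X = np.column_stack([np.ones(len(male)), male])  # intercept + gender

beta = firth_logistic(X, hiccups)
or_male = np.exp(beta[1])  # finite OR for male vs female
print(f"Penalized OR for gender (unadjusted): {or_male:.2f}")
```

Because no female subject is a case, the ordinary maximum likelihood estimate for gender diverges, while the penalized estimate stays finite. For a single binary covariate, this penalty is equivalent to adding 0.5 to each cell of the 2x2 table. The unadjusted value from this sketch need not match the OR of 5.35 reported in the abstract, which may come from a different model specification.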