Abstract
Family medicine has traditionally prioritised patient care over research. However, recent recommendations to strengthen family medicine include calls to focus more on research, including improving the research methods used in the field. Binary logistic regression is one method frequently used in family medicine research to classify, explain or predict the values of some characteristic, behaviour or outcome. The binary logistic regression model relies on assumptions including independent observations, no perfect multicollinearity and linearity. The model produces ORs, which suggest increased, decreased or no change in odds of being in one category of the outcome with an increase in the value of the predictor. Model significance quantifies whether the model is better than the baseline value (ie, the percentage of people with the outcome) at explaining or predicting whether the observed cases in the data set have the outcome. One model fit measure is the count-R², which is the percentage of observations where the model correctly predicted the outcome variable value. Related to the count-R² are model sensitivity (the percentage of those with the outcome who were correctly predicted to have the outcome) and specificity (the percentage of those without the outcome who were correctly predicted to not have the outcome). Complete model reporting for binary logistic regression includes descriptive statistics, a statement on whether assumptions were checked and met, ORs and CIs for each predictor, overall model significance and overall model fit.
© Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
Keywords: education; epidemiology; public health
Year: 2021 PMID: 34952854 PMCID: PMC8710907 DOI: 10.1136/fmch-2021-001290
Source DB: PubMed Journal: Fam Med Community Health ISSN: 2305-6983
Figure 1. Histogram showing the distribution of years smoking for a sample of 32 smokers.
Example table showing characteristics of people in a small data set (n=32)
| Characteristic | Category | n (%) |
| Ever diagnosed with lung cancer | No lung cancer diagnosis | 18 (56.2) |
| | Yes lung cancer diagnosis | 14 (43.8) |
| Body mass index category | Underweight or normal | 19 (59.4) |
| | Overweight or obese | 13 (40.6) |
| Years spent smoking | Median (IQR) | 19.2 (15.4–22.8) |
Example of a stratified table showing characteristics of people by lung cancer status in a small data set (n=32)
| Characteristic | Category | Lung cancer: No | Lung cancer: Yes |
| Years spent smoking | Median (IQR) | 15.7 (14.8–19.1) | 22.8 (21.4–29.6) |
| Body mass index category | Underweight or normal | 12 (66.7) | 7 (50.0) |
| | Overweight or obese | 6 (33.3) | 7 (50.0) |
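As a rough illustration of what an OR expresses, the body mass index rows of the stratified table can be turned into a crude (unadjusted) OR by hand. This is only a sketch: the variable and function names are hypothetical, the result is not the OR the article's model would report (which would account for years spent smoking), and no CI is computed here.

```python
# Counts copied from the body mass index rows of the stratified table.
# Names below are illustrative, not taken from the article.
overweight_obese = {"cancer": 7, "no_cancer": 6}
under_normal = {"cancer": 7, "no_cancer": 12}


def odds(group):
    """Odds of a lung cancer diagnosis within one BMI group."""
    return group["cancer"] / group["no_cancer"]


# Crude OR: odds in the overweight/obese group divided by the
# odds in the underweight/normal (reference) group.
crude_or = odds(overweight_obese) / odds(under_normal)
print(f"crude OR (overweight or obese vs underweight or normal): {crude_or:.2f}")
```

An OR above 1 suggests higher odds of the outcome in the comparison group, an OR below 1 suggests lower odds, and an OR of 1 suggests no difference; with only 32 observations, any such estimate would be very imprecise.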
Figure 2. Checking the linearity assumption graphically.
Figure 3. The logistic function with example data.
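For reference, the logistic function for a single predictor is conventionally written as below. The symbols $b_0$, $b_1$ and $x$ are generic notation assumed here, not taken from the article; the second expression is the standard link between a coefficient and the OR for a one-unit increase in the predictor.

$$
p(y = 1 \mid x) = \frac{1}{1 + e^{-(b_0 + b_1 x)}}, \qquad \mathrm{OR} = e^{b_1}
$$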
Contingency table showing observed and predicted values of the outcome for the lung cancer model
| Number predicted | Number observed: 1 | Number observed: 0 | Sum |
| 1 | 10 | 3 | 13 |
| 0 | 4 | 15 | 19 |
| Sum | 14 | 18 | 32 |
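The fit measures defined in the abstract can be worked out directly from this contingency table. The sketch below assumes that 1 denotes a lung cancer diagnosis and 0 denotes no diagnosis (consistent with the column sums of 14 and 18); the function and variable names are hypothetical.

```python
# Cell counts copied from the contingency table above.
# Assumption: 1 = lung cancer diagnosis, 0 = no diagnosis.
true_pos = 10   # predicted 1, observed 1
false_pos = 3   # predicted 1, observed 0
false_neg = 4   # predicted 0, observed 1
true_neg = 15   # predicted 0, observed 0


def fit_measures(tp, fp, fn, tn):
    """Return count-R2, sensitivity and specificity as proportions."""
    total = tp + fp + fn + tn
    return {
        # Share of all observations predicted correctly.
        "count_r2": (tp + tn) / total,
        # Share of those with the outcome predicted to have it.
        "sensitivity": tp / (tp + fn),
        # Share of those without the outcome predicted not to have it.
        "specificity": tn / (tn + fp),
    }


measures = fit_measures(true_pos, false_pos, false_neg, true_neg)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these counts, the model correctly predicts 25 of 32 observations (about 78%), 10 of the 14 people with the outcome (about 71%) and 15 of the 18 people without it (about 83%).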