Rhoda K Moise, Raymond Balise, Camille Ragin, Erin Kobetz.
Abstract
Although decreasing rates of cervical cancer in the U.S. are attributable to health policy, immigrant women, particularly Haitians, experience a disproportionate disease burden related to delayed detection and treatment. However, risk prediction and dynamics of access remain largely underexplored and unresolved in this population. This study seeks to assess cervical cancer risk and access among unscreened Haitian women. The sample, extracted and merged from two studies, includes n = 346 at-risk Haitian women (ages 30-65 and unscreened in the previous three years) in South Florida, the largest U.S. enclave of Haitians. Three approaches (logistic regression [LR]; classification and regression trees [CART]; and random forest [RF]) were employed to assess the association between screening history and sociodemographic variables. LR results indicated that women who reported U.S. citizenship (OR = 3.22, 95% CI = 1.52-6.84), had access to routine care (OR = 2.11, 95% CI = 1.04-4.30), and had spent more years in the U.S. (OR = 1.01, 95% CI = 1.00-1.03) were significantly more likely to report previous screening. CART results returned an accuracy of 0.75, with a tree splitting first on women who were not citizens, then on 43 or fewer years in the U.S., and then on lack of access to routine care. The RF model identified U.S. years, citizenship, and access to routine care as the variables of highest importance, as indicated by the greatest mean decreases in the Gini index. The model's accuracy was 0.79 (95% CI = 0.74-0.84). This multi-pronged analysis identifies previously undocumented barriers to health screening for Haitian women. Recent U.S. immigrants without citizenship or perceived access to routine care may be at higher risk for disease due to barriers in accessing U.S. health systems.
Year: 2021 PMID: 34228766 PMCID: PMC8259954 DOI: 10.1371/journal.pone.0254089
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The PEN-3 Cultural Model.
The PEN-3 domain of relationships and expectations serves as the key theoretical framework element for this study.
Fig 2. The PEN-3 Cultural Model relationships and expectations domain.
Relationships and expectations of the PEN-3 Cultural Model were applied to theoretically organize the study variables.
Sample descriptive statistics categorical variables (n = 346).
| Variable | n (%) |
|---|---|
| Previous Pap | 226 (65.3) |
| U.S. Citizens | 80 (23.1) |
| Insured | 67 (19.4) |
| Routine Care | 76 (22.0) |
| Married | 194 (56.1) |
| Employed | 98 (28.3) |
| HS Educated | 152 (43.9) |
| Dataset (HIYA) | 148 (42.8) |
Sample descriptive statistics continuous variables (n = 346).
| Variable | Average (Interquartile Range) |
|---|---|
| Age | 46 (39, 53) |
| Length in U.S. (years) | 30.5 (8, 47) |
Bivariate and multivariate logistic regression statistics.
| Variables | Bivariate OR (95% CI) | Multivariate OR (95% CI) |
|---|---|---|
| Age | | |
| A (30–40 years) | 1 (ref) | 1 (ref) |
| B (41–50 years) | 2.10 (1.19, 3.72) * | 3.10 (1.19, 8.10) * |
| C (51–65 years) | 2.29 (1.27, 4.11) ** | 2.56 (0.95, 6.89) |
| Length in U.S. | | |
| A (≤5 years) | 1 (ref) | 1 (ref) |
| B (6–25 years) | 3.92 (1.81, 8.50) *** | 1.85 (0.76, 4.54) |
| C (26–40 years) | 2.45 (1.23, 4.86) * | 2.89 (1.14, 7.35) * |
| D (41–50 years) | 3.19 (1.59, 6.38) ** | 1.52 (0.64, 3.63) |
| E (51–65 years) | 3.83 (1.79, 8.19) *** | 1.72 (0.63, 4.67) |
| U.S. Citizenship | 4.94 (2.44, 10.00) *** | 3.19 (1.36, 7.49) ** |
| Insured | 2.09 (1.12, 3.90) * | 1.12 (0.51, 2.46) |
| Routine Care | 3.18 (1.67, 6.06) *** | 2.60 (1.16, 5.81) * |
| Married | 1.01 (0.65, 1.59) | 0.94 (0.55, 1.59) |
| Employed | 2.09 (1.23, 3.57) ** | 1.06 (0.55, 2.02) |
| HS Educated | 1.10 (0.07, 1.74) | 1.58 (0.89, 2.82) |
| Dataset (HIYA) | 1.41 (0.90, 2.21) | N/A |
Logistic regression statistics for the bivariate and multivariate analyses.
Note: Significance codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1.
OR = Odds ratio; CI = confidence interval; Ref = Referent group; N/A = Not applicable.
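The odds ratios and intervals in this table can be related back to the underlying logistic regression coefficients. The sketch below assumes the reported intervals are Wald-type 95% intervals (the record does not say how they were computed), so that OR = exp(β) and the limits are exp(β ± z·SE). The function names `coef_from_or` and `or_with_ci` are illustrative, not taken from the study's code.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def coef_from_or(odds_ratio, ci_low, ci_high):
    """Recover the log-odds coefficient and its standard error
    from an odds ratio and an (assumed) Wald 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def or_with_ci(beta, se):
    """Convert a coefficient and standard error back to
    (odds ratio, lower limit, upper limit)."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


# Multivariate estimate for U.S. citizenship as reported: 3.19 (1.36, 7.49).
beta, se = coef_from_or(3.19, 1.36, 7.49)
odds_ratio, low, high = or_with_ci(beta, se)
print(f"beta={beta:.3f}, se={se:.3f}")
print(f"OR={odds_ratio:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

If a reported interval is close to symmetric around the odds ratio on the log scale, the round trip reproduces the published limits to within rounding, which is a quick consistency check on a table row.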
Stepwise regression model (n = 333).
| Variables | Stepwise OR (95% CI) |
|---|---|
| Age | |
| A (30–40 years) | 1 (ref) |
| B (41–50 years) | 2.21 (1.17, 4.17) * |
| C (51–65 years) | 1.96 (0.98, 3.94) |
| U.S. Citizenship | 4.11 (1.83, 9.26) *** |
| Routine Care | 2.69 (1.26, 5.71) * |
| HS Educated | 1.60 (0.91, 2.80) |
Stepwise regression statistics for the multivariate analysis.
Note: Significance codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1.
OR = Odds ratio; CI = confidence interval; Ref = Referent group; N/A = Not applicable.
Fig 3. Decision tree–CART model, fancy rpart plot.
Each node carries numeric and text output to aid interpretation of its split. The value at the top of a node is the predicted class for that split. The number at the lower left is the probability of the predicted class, and the number at the lower right is the probability of the opposite class. The percentage at the bottom is the share of the sample passing through the node. For example, at node 2 the group is classified as “yes” regarding screening history. The probability of “yes” in this group is 0.41, while the probability of the opposite response, “no,” is 0.59. Of the 346 study subjects, 74% passed through this node. Overall, the decision tree selected citizenship, length of time in the U.S., and age as important predictors of a woman’s screening history. For example, at node 1 the model asks whether the participant is not a citizen. If the answer is “yes” (she is not a citizen), she is classified into the node on the left. If the answer is “no” (she is a citizen), she is classified into the node on the right.
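To make the node annotations concrete, the sketch below takes the two class probabilities and the sample share quoted for node 2 and computes the node's Gini impurity and approximate size. It assumes the tree was grown with a Gini splitting criterion, which the record states only for the random forest, and the variable names are illustrative.

```python
def gini_impurity(class_probs):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_probs)


# Values quoted in the Fig 3 caption for node 2.
node2_probs = (0.41, 0.59)  # class probabilities shown in the node
node2_share = 0.74          # share of the sample passing through the node
total_n = 346               # full study sample

impurity = gini_impurity(node2_probs)
approx_n = round(node2_share * total_n)

print(f"Gini impurity at node 2: {impurity:.4f}")
print(f"Approximate number of women at node 2: {approx_n}")
```

An impurity near 0.5 is the maximum for two classes, so a node with probabilities this close to even still mixes screened and unscreened women and is a natural candidate for further splitting.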
Confusion matrix statistics for random forest (RF) training (n = 241) and test sets (n = 105).
| Statistic | Training Set | Test Set |
|---|---|---|
| Accuracy | 0.78 | 0.72 |
| 95% CI | (0.72, 0.83) | (0.61, 0.81) |
| No Information Rate | 0.65 | 0.72 |
| P-Value [Acc > NIR] | 3.01e-05 | 0.55 |
| Pos Pred Value | 0.82 | 0.50 |
| Neg Pred Value | 0.77 | 0.81 |
| Specificity | 0.94 | 0.80 |
| Sensitivity | 0.47 | 0.52 |
| McNemar’s Test P-Value | 4.84e-06 | 1.00 |
| Kappa | 0.46 | 0.31 |
| Prevalence | 0.35 | 0.28 |
| Detection Prevalence | 0.20 | 0.29 |
| Balanced Accuracy | 0.71 | 0.66 |
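Several rows of this table are functions of just three quantities: sensitivity, specificity, and prevalence. The sketch below recomputes them from the rounded values in the table as a consistency check. It uses the standard definitions of these statistics; the function name `derived_stats` is illustrative, and small discrepancies from the published figures are expected because the inputs are already rounded to two decimals.

```python
def derived_stats(sensitivity, specificity, prevalence):
    """Derive confusion-matrix summaries from sensitivity, specificity,
    and prevalence, all expressed as proportions of the sample."""
    tp = sensitivity * prevalence              # true positives
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    return {
        "accuracy": tp + tn,
        "no_information_rate": max(prevalence, 1 - prevalence),
        "pos_pred_value": tp / (tp + fp),
        "neg_pred_value": tn / (tn + fn),
        "detection_prevalence": tp + fp,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Rounded inputs taken from the table above.
train = derived_stats(sensitivity=0.47, specificity=0.94, prevalence=0.35)
test = derived_stats(sensitivity=0.52, specificity=0.80, prevalence=0.28)

for name in train:
    print(f"{name:22s} train={train[name]:.2f} test={test[name]:.2f}")
```

The comparison of accuracy with the no-information rate is the useful one here: a model that always predicts the majority class already reaches the no-information rate, so accuracy is only informative to the extent that it exceeds that baseline.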
Fig 4. Random Forest (RF) important variables.
The RF variables of importance are shown in order of ranked importance.