Lisa Lu, Benjamin Anderson, Raymond Ha, Alexis D'Agostino, Sarah L Rudman, Derek Ouyang, Daniel E Ho.
Abstract
Contact tracing is a pillar of COVID-19 response, but language access and equity have posed major obstacles. COVID-19 has disproportionately affected minority communities with many non-English-speaking members. Language discordance can increase processing times and hamper the trust building necessary for effective contact tracing. We demonstrate how matching predicted patient language with contact tracer language can enhance contact tracing. First, we show how to use machine learning to combine information from sparse laboratory reports with richer census data to predict the language of an incoming case. Second, we embed this method in the highly demanding environment of actual contact tracing with high volumes of cases in Santa Clara County, CA. Third, we evaluate this language-matching intervention in a randomized controlled trial. We show that this low-touch intervention results in 1) significant time savings, shortening the time from opening of cases to completion of the initial interview by nearly 14 h and increasing same-day completion by 12%, and 2) improved engagement, reducing the refusal to interview by 4%. These findings have important implications for reducing social disparities in COVID-19; improving equity in healthcare access; and, more broadly, leveling language differences in public services.
Keywords: COVID-19; contact tracing; health equity; language access; machine learning
Year: 2021 PMID: 34686604 PMCID: PMC8639369 DOI: 10.1073/pnas.2109443118
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. Distribution of Spanish speakers across all age, address score, first-name score, and last-name score bins in the train split of the L2 dataset. Each tile represents a bin that an individual may fall into based on the individual’s age, address score, first-name score, and last-name score. The size of the point in the bin corresponds to the total count of individuals in that bin. The color of the point corresponds to the percentage of individuals in that bin that are Spanish speakers. The gray shading represents the risk score cutoffs we use in our algorithm. Any individuals belonging to bins in the darker gray shade are flagged as Spanish speakers. Individuals in bins in the lighter gray shade are not flagged. A description of how the cutoffs are determined is in .
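The bin-and-flag logic described in the caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the bin edges, the field names (`age`, `address_score`, `first_name_score`, `last_name_score`, `speaks_spanish`), the assumption that scores lie in [0, 1], and the rule of flagging any bin whose training share of Spanish speakers reaches a `cutoff` are all hypothetical choices made for illustration. The source does not state how the actual cutoffs are set.

```python
from collections import defaultdict

# Hypothetical bin edges; the source does not state the actual ones.
AGE_EDGES = (30, 50, 70)          # years
SCORE_EDGES = (0.25, 0.5, 0.75)   # scores are assumed to lie in [0, 1]


def to_bin(value, edges):
    """Return the index of the bin that value falls into."""
    return sum(value >= edge for edge in edges)


def bin_key(person):
    """Map a person to a 4-tuple of bin indices (one tile in the figure)."""
    return (
        to_bin(person["age"], AGE_EDGES),
        to_bin(person["address_score"], SCORE_EDGES),
        to_bin(person["first_name_score"], SCORE_EDGES),
        to_bin(person["last_name_score"], SCORE_EDGES),
    )


def fit_flagged_bins(train, cutoff=0.5):
    """Return the bins whose share of Spanish speakers is >= cutoff.

    The share-based cutoff is an assumption of this sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [spanish, total]
    for person in train:
        tally = counts[bin_key(person)]
        tally[0] += int(person["speaks_spanish"])
        tally[1] += 1
    return {key for key, (spanish, total) in counts.items()
            if spanish / total >= cutoff}


def flag(person, flagged_bins):
    """Flag a new case as a likely Spanish speaker if its bin is flagged."""
    return bin_key(person) in flagged_bins
```

In this sketch, a new case is flagged purely by looking up its bin, which keeps the prediction step cheap enough to run on every incoming laboratory report.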
Fig. 2. Performance of the heuristic model visualized as receiver operating characteristic and precision-recall curves along with AUCs. These curves are generated by evaluating the model at every classification threshold (depicted by the blue palette) on the test split from the L2 dataset. The high AUCs show the model’s ability to perform well in the context it was trained on.
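Sweeping every classification threshold to trace a receiver operating characteristic curve, as the caption describes, can be sketched in a few lines of plain Python. The function names `roc_points` and `auc` are illustrative, and the sketch assumes binary labels (1 for Spanish speaker, 0 otherwise) with at least one case of each class. It shows the general technique only, not the authors' evaluation code.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping every distinct threshold.

    A case is predicted positive when its score is >= the threshold.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a curve given as (x, y) points sorted by x (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A precision-recall curve can be traced with the same threshold sweep by recording precision and recall instead of the false and true positive rates.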
Balance on subset of covariates between cases randomly assigned to the LST and cases not selected between 3 December 2020 and 6 February 2021
| Variable | Control (as randomized) | Treatment (as randomized) | P value |
| Individual-level variables | |||
| Spanish propensity score | 0.86 | 0.86 | 0.26 |
| Male | 0.53 | 0.52 | 0.31 |
| CBG-level variables | |||
| Social vulnerability index | 74.55 | 75.44 | 0.41 |
| Below poverty, % | 7.42 | 7.54 | 0.76 |
| Unemployed, % | 3.51 | 3.49 | 0.41 |
| Per capita income, $1,000 | 35.14 | 35.46 | 0.33 |
| No high school diploma, % | 24.93 | 24.66 | 0.55 |
| Aged 65 y or older, % | 11.35 | 11.32 | 0.39 |
| Aged 17 y or younger, % | 24.43 | 24.13 | 0.22 |
| Civilian with a disability, % | 11.09 | 11.05 | 0.97 |
| Single-parent households, % | 13.17 | 13.32 | 0.91 |
| Minority, % | 81.90 | 82.12 | 0.83 |
| Sample size | 1,601 | 1,424 | |
Shown is the balance check on a subset of individual-level and CBG-level variables to ensure that the treatment group (cases randomly assigned to the LST) and control group (cases not assigned to the LST) are comparable. The as-randomized data contain units with nonmissing values for all balance covariates of interest. There are no units with missing data for the individual-level variables. There are 2 units with incomplete data for the CBG-level variables, and these are omitted. The sample used for the balance check consists of 64 batches.
Analyses on the effect of language matching on outcomes
| Outcome | Control | LST | ITT effect | ITT SE | ITT P value | CACE effect | CACE SE | CACE P value |
| Interpretation service used | 0.23 | 0.21 | –0.03* | 0.02 | 0.07 | –0.11 | 0.07 | 0.11 |
| Interview completed | 0.74 | 0.75 | 0.00 | 0.02 | 0.93 | 0.01 | 0.08 | 0.87 |
| Refused to interview | 0.02 | 0.01 | –0.01* | 0.00 | 0.05 | –0.04** | 0.02 | <0.05 |
| Refused to provide contacts | 0.02 | 0.02 | 0.00 | 0.01 | 0.76 | 0.00 | 0.03 | 0.86 |
| Time to interview completion, d | 1.25 | 1.04 | –0.14*** | 0.06 | <0.01 | –0.57** | 0.27 | 0.04 |
| Time case open, d | 4.14 | 4.01 | 0.14 | 0.35 | 0.68 | 1.01 | 1.48 | 0.50 |
| Interview completed within 24 h | 0.65 | 0.68 | 0.04** | 0.02 | 0.03 | 0.14 | 0.09 | 0.10 |
| Interview completed on the same day | 0.14 | 0.28 | 0.05*** | 0.01 | <0.01 | 0.12** | 0.06 | 0.03 |
| Average number of contacts provided | 0.19 | 0.20 | 0.03 | 0.04 | 0.44 | 0.08 | 0.15 | 0.60 |
| At least one contact provided | 0.08 | 0.08 | 0.00 | 0.01 | 0.74 | 0.00 | 0.04 | 0.94 |
| Sample size | 1,601 | 1,424 | ||||||
*P < 0.10, **P < 0.05, ***P < 0.01. Shown is the effect of the intervention on cases assigned to the LST and cases not assigned to the LST (control). Batches with treatment or control groups smaller than two are dropped from the analyses. There were 62 batches in the ITT sample. For the ITT estimates, the average treatment effect (ATE) and SEs are estimated as shown in ref. 15. The CACE was estimated with the iv_robust() command in the estimatr package. First-stage results and diagnostic statistics are reported in Materials and Methods.
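The relationship between the two effect columns can be made concrete with a minimal Python sketch of the Wald estimator: the intention-to-treat effect on the outcome divided by the effect of assignment on actually receiving the treatment. With a single binary instrument and no covariates, two-stage least squares reduces to this ratio. The sketch is an assumption-laden simplification and not the authors' code: it ignores batches, omits standard errors, and uses the hypothetical names `wald_cace`, `assigned`, `received`, and `outcome`.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def wald_cace(assigned, received, outcome):
    """Complier average causal effect via the Wald ratio.

    assigned: 1 if the case was randomly assigned to treatment, else 0
    received: 1 if the case actually received the treatment, else 0
    outcome:  observed outcome for the case
    """
    y_treat = mean([y for z, y in zip(assigned, outcome) if z == 1])
    y_ctrl = mean([y for z, y in zip(assigned, outcome) if z == 0])
    d_treat = mean([d for z, d in zip(assigned, received) if z == 1])
    d_ctrl = mean([d for z, d in zip(assigned, received) if z == 0])

    itt = y_treat - y_ctrl          # intention-to-treat effect on the outcome
    first_stage = d_treat - d_ctrl  # effect of assignment on treatment receipt
    if first_stage == 0:
        raise ValueError("assignment did not change treatment receipt")
    return itt / first_stage
```

Because the first-stage difference is at most 1 in absolute value, a CACE computed this way is never smaller in magnitude than the corresponding intention-to-treat effect, which matches the pattern of the two effect columns in the table.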