John Schneider, L Philip Schumm, Maya Fraser, Vijay Yeldandi, Chuanhong Liao.
Abstract
Contact tracing for venereal disease control has been widespread since 1936 and relies on reported information about contacts' attributes to determine whether two contacts may represent the same individual. We developed and implemented a gold standard for determining overlap between contacts reported by different individuals using cell phone numbers as unique identifiers. This method was then used to evaluate the performance of using reported names and demographic characteristics to infer overlap. Cell-phone numbers, names and demographic data for a sample of high-risk men in India and their contacts were collected using a novel, hybrid instrument involving both cell-phone data extraction and Computer-Assisted Personal Interviewing (CAPI). Logistic regression was used to model the probability that a pair of contacts reported by different respondents were identical, based on the correspondence between their reported names and attributes. A discrete mixture model is proposed which provides predictions nearly as good as the logistic model but may be used in a new population without re-calibration. Despite achieving AUCs of 0.83-0.86, the low rate of true overlap among a very large number of contact pairs still results in a high rate of false positives. Next-generation contact tracing calls for more archived or digital matching processes.
Year: 2018 PMID: 29884882 PMCID: PMC5993735 DOI: 10.1038/s41598-018-26794-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Example contact tracing data with phone number as the gold standard.
| Entry | Name | Age | Race | Neighborhood | Marital Status | Phone # |
|---|---|---|---|---|---|---|
| 1 | Pat | 35 | White | West End | Married | 555–1111 |
| 2 | Patrick | 35 | White | Lakeview | Married | 555–2222 |
| 3 | Mark | 25 | Latino | Woodlawn | Married | 555–3333 |
| 4 | PJ | 33 | White | Lincoln Park | Single | 555–2222 |
| 5 | Fred | 20 | Black | Lawndale | Single | 555–4444 |
Contact tracing information is subject not only to standard sources of reporting error, but also to intentional error due to sensitivities surrounding sex partner information and disclosure. In this fictitious example, we see a potential equivalence between Entries 1 and 2; however, according to the gold standard these are not the same individual because the phone numbers do not match. In contrast, Entries 2 and 4 are identical (i.e., the same person) as determined by the identical phone numbers, a match that may have been missed when following traditional matching algorithms.
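The gold-standard rule in this example (two entries are the same person exactly when their phone numbers agree) can be sketched in a few lines of Python. This is only an illustration built on the fictitious table above; the function names `normalize_phone` and `identical_pairs` are hypothetical and are not taken from the study's software.

```python
from collections import defaultdict
from itertools import combinations

# Rows copied from the fictitious example table: (entry, name, phone).
entries = [
    (1, "Pat", "555–1111"),
    (2, "Patrick", "555–2222"),
    (3, "Mark", "555–3333"),
    (4, "PJ", "555–2222"),
    (5, "Fred", "555–4444"),
]


def normalize_phone(raw):
    """Keep digits only, so dashes or spaces cannot hide a match."""
    return "".join(ch for ch in raw if ch.isdigit())


def identical_pairs(rows):
    """Return every pair of entry numbers that share a phone number."""
    by_phone = defaultdict(list)
    for entry_id, _name, phone in rows:
        by_phone[normalize_phone(phone)].append(entry_id)
    pairs = []
    for ids in by_phone.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


print(identical_pairs(entries))  # [(2, 4)]
```

Entries 1 and 2 are not paired even though "Pat" and "Patrick" look alike, while Entries 2 and 4 are paired despite their dissimilar names.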
Figure 1. Subscriber Identity Module (SIM) card reader. The SIM card reader was assembled using a kit from Adafruit Industries (New York, NY). The card reader is operated by means of a program written using PySIM, a free open-source SIM card-reading software package.
Characteristics of MSM respondents and their MSM cell phone contacts.
| Characteristic | MSM respondents (n = 229) | Total MSM contacts (n = 6,718) |
|---|---|---|
| Age, mean (SD) | 26.7 (6.8) | 28.2 (6.8) |
| Marital status | | |
| Never married | 147 (63.9%) | 4332 (64.5%) |
| Married | 58 (25.2%) | 2197 (32.7%) |
| Separated/Divorced/Widowed | 25 (10.9%) | 179 (2.7%) |
| Don’t know | 0 (0.0%) | 10 (0.1%) |
| Sex role | | |
| Receptive | 104 (45.2%) | 3368 (50.1%) |
| Versatile | 63 (27.4%) | 2434 (36.2%) |
| Insertive | 63 (27.4%) | 861 (12.8%) |
| Don’t know/Missing | 0 (0.0%) | 55 (0.8%) |
| Religion | | |
| Hindu | 190 (82.6%) | 5480 (81.6%) |
| Muslim | 25 (10.9%) | 881 (13.1%) |
| Christian | 15 (6.5%) | 295 (4.4%) |
| Other/Don’t know | 0 (0.0%) | 62 (0.9%) |
Logistic regression models predicting gold-standard verified identical pairs.
| Client characteristic | Odds ratio | 95% CI | Odds ratio | 95% CI |
|---|---|---|---|---|
| First name match | 180.7*** | (135.6, 240.8) | 131.7*** | (68.7, 252.6) |
| Sex role | 3.5*** | (3.3, 3.8) | 1.4 | (1.0, 2.0) |
| Religion | 3.1*** | (2.8, 3.4) | 3.6*** | (2.2, 6.1) |
| Neighborhood | 3.0*** | (2.7, 3.3) | 2.5*** | (1.6, 4.0) |
| Marital status | 1.9*** | (1.7, 2.0) | 1.7** | (1.2, 2.5) |
| Age +/−5 yrs | 1.6*** | (1.5, 1.7) | 1.7** | (1.2, 2.6) |
| Part of a triad | 3.0*** | (2.7, 3.3) | 4.0*** | (2.6, 6.3) |
*p < 0.05; **p < 0.01; ***p < 0.001.
¹ All identical pairs together with a 2:1 random subsample of non-identical pairs.
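For readers less familiar with how odds ratios relate to the fitted model, the generic form of a logistic regression on pairwise agreement indicators is sketched below. The symbols ($m_{ijk}$, $\beta_k$) are notation assumed here for illustration and do not come from the paper; the exact covariate set and any interaction terms in the authors' models are not given in this record.

```latex
% m_{ijk} = 1 if contacts i and j agree on characteristic k, else 0 (assumed notation)
\operatorname{logit} \Pr(\text{identical}_{ij})
  = \log \frac{\Pr(\text{identical}_{ij})}{1 - \Pr(\text{identical}_{ij})}
  = \beta_0 + \sum_{k} \beta_k \, m_{ijk},
\qquad
\mathrm{OR}_k = e^{\beta_k}
```

Read this way, an odds ratio of 3.1 for religion means that agreement on religion multiplies the odds that a pair is identical by 3.1, holding any other terms in that model fixed.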
Figure 2. Heat map and plot demonstrating the probability of a true match given a set of client characteristics among randomly selected pairs of individuals from a large male sex network in South India.
Estimates from latent class model of characteristic matching fit to sample of identical and non-identical pairs (n = 23,459)¹.
| | Identical pair | Non-identical pair |
|---|---|---|
| Estimated class proportion | 0.31 | 0.69 |
| Probability of matching on: | | |
| First name | 0.35 | 0.03 |
| Sex role | 0.73 | 0.40 |
| Religion | 0.93 | 0.68 |
| Neighborhood | 0.27 | 0.05 |
| Marital status | 0.83 | 0.49 |
| Age +/−5 yrs | 0.78 | 0.48 |
¹ All identical pairs together with a 2:1 random subsample of non-identical pairs.
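To show how estimates like these could turn a pattern of agreements into a match probability, the sketch below applies Bayes' rule to the table values. It assumes the characteristics are conditionally independent within each class, which is the usual latent class assumption but is not confirmed by this record; the function name `posterior_identical` and the dictionary keys are hypothetical.

```python
# P(match on characteristic | class), copied from the latent class table:
# (identical pair, non-identical pair).
MATCH_PROB = {
    "first_name": (0.35, 0.03),
    "sex_role": (0.73, 0.40),
    "religion": (0.93, 0.68),
    "neighborhood": (0.27, 0.05),
    "marital_status": (0.83, 0.49),
    "age_within_5": (0.78, 0.48),
}
# Estimated class proportion in the 2:1 subsample (from the table).
PRIOR_IDENTICAL = 0.31


def posterior_identical(matches, prior=PRIOR_IDENTICAL):
    """Posterior probability that a pair is identical, given a dict mapping
    each characteristic to True (match) or False (no match).
    Assumes conditional independence within each class."""
    like_identical = prior
    like_non_identical = 1.0 - prior
    for name, (p_identical, p_non_identical) in MATCH_PROB.items():
        if matches[name]:
            like_identical *= p_identical
            like_non_identical *= p_non_identical
        else:
            like_identical *= 1.0 - p_identical
            like_non_identical *= 1.0 - p_non_identical
    return like_identical / (like_identical + like_non_identical)


all_match = {name: True for name in MATCH_PROB}
no_match = {name: False for name in MATCH_PROB}

print(posterior_identical(all_match))
print(posterior_identical(no_match))
# Hypothetical, much lower prior, as when screening all possible pairs.
print(posterior_identical(all_match, prior=0.001))
```

With the subsample prior of 0.31, a pair agreeing on every characteristic gets a posterior above 0.99. With an illustrative prior of 0.001 (a made-up value, not an estimate from the study), the same pair gets only about 0.30, which is consistent with the abstract's point that a low rate of true overlap produces many false positives.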
Figure 3. Sensitivities and specificities of contact matches. Panel A depicts sensitivity against specificity for the likelihood of a contact match, with the area under the curve (AUC) in parentheses. Panel B highlights the number of pairs that match on given characteristics and the resulting probability of having identical contacts.
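As a reminder of what the AUC summarizes, it can be read as the probability that a randomly chosen identical pair receives a higher predicted score than a randomly chosen non-identical pair. The sketch below computes it directly from that definition; the scores are toy values invented for illustration, not data from the study, and the function name `auc` is hypothetical.

```python
def auc(scores_identical, scores_non_identical):
    """Share of (identical, non-identical) score comparisons won by the
    identical pair, counting ties as half a win."""
    wins = 0.0
    for s_pos in scores_identical:
        for s_neg in scores_non_identical:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / (len(scores_identical) * len(scores_non_identical))


# Toy predicted probabilities (hypothetical).
toy_identical = [0.9, 0.6, 0.4]
toy_non_identical = [0.7, 0.3, 0.2, 0.1]

print(round(auc(toy_identical, toy_non_identical), 3))  # 0.833
```

The pairwise loop is quadratic in the number of scores, so it is suitable only for small illustrations; rank-based formulas give the same value more efficiently on large samples.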