Kristin N Nelson, Sarah Talarico, Shameer Poonja, Clinton J McDaniel, Martin Cilnis, Alicia H Chang, Kala Raz, Wendy S Noboa, Lauren Cowan, Tambi Shaw, James Posey, Benjamin J Silk.
Abstract
Tuberculosis (TB) control programs use whole-genome sequencing (WGS) of Mycobacterium tuberculosis (Mtb) for detecting and investigating TB case clusters. Existence of few genomic differences between Mtb isolates might indicate TB cases are the result of recent transmission. However, the variable and sometimes long duration of latent infection, combined with uncertainty in the Mtb mutation rate during latency, can complicate interpretation of WGS results. To estimate the association between infection duration and single nucleotide polymorphism (SNP) accumulation in the Mtb genome, we first analyzed pairwise SNP differences among TB cases from Los Angeles County, California, with strong epidemiologic links. We found that SNP distance alone was insufficient for concluding that cases are linked through recent transmission. Second, we describe a well-characterized cluster of TB cases in California to illustrate the role of genomic data in conclusions regarding recent transmission. Longer presumed latent periods were inconsistently associated with larger SNP differences. Our analyses suggest that WGS alone cannot be used to definitively determine that a case is attributable to recent transmission. Methods for integrating clinical, epidemiologic, and genomic data can guide conclusions regarding the likelihood of recent transmission, providing local public health practitioners with better tools for monitoring and investigating TB transmission.
Keywords: genomic sequencing; prevention and control; public health practice; tuberculosis transmission; tuberculosis—epidemiology
Year: 2022 PMID: 35096744 PMCID: PMC8793027 DOI: 10.3389/fpubh.2021.790544
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Figure 1. A hypothetical neighbor-joining tree (phylogenetic analysis) representing the genetic distances in single nucleotide polymorphisms (SNPs) among 15 isolates of Mycobacterium tuberculosis (Mtb) from culture-confirmed tuberculosis cases reported during 2010–2020. Isolates are displayed as circles called nodes; isolates with the same genome sequence are displayed together in one node. Lines between nodes are labeled with the number of SNPs (mutations at a single position in the DNA sequence), and these lines are proportional in length to the number of SNPs. The most recent common ancestor (MRCA) is a hypothetical genome (not an actual isolate) from which all isolates in the phylogenetic analysis are descended. The MRCA serves as a reference point for examining the direction of genetic change. In this hypothetical scenario, a TB control program is using these phylogenetic analysis results to help determine whether isolates from two patients reported during 2020 (shaded in gray) are likely attributable to recent transmission. If so, those cases are a priority for further investigation. Interpretation of these results depends on whether the Mtb mutation rate is assumed to be similar during latency or slower, as follows. Case 1 is unlikely to be involved in recent transmission under either assumption because the patient's isolate is genetically distant from that of the patient in Case 2 and from all other cases in the analysis (≥19 SNPs). This interpretation might change as new cases are reported and added to the phylogenetic analysis. Case 2 is more challenging to interpret. Under an assumption that Mtb mutates at a similar rate during latent infection and disease, Case 2 is likely to be involved in recent transmission because the patient's isolate is genetically close to those of other cases in the analysis (0–2 SNPs); if Case 2 were attributable to reactivation after Mtb infection during the remote past, more SNPs would be expected. Under an assumption that Mtb mutates at a slower rate during latency than during disease, Case 2 might be involved in recent transmission or attributable to reactivation after Mtb infection during the remote past because relatively few SNPs are expected to accumulate during latency. Other clinical and epidemiologic data are needed for making a determination.
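The SNP counts on the branches of a tree like the one in Figure 1 come from comparing isolate genomes position by position. The following is a minimal sketch of that idea, assuming the isolates are already available as equal-length aligned sequences; the function names, the isolate labels, and the choice to skip ambiguous bases are illustrative assumptions and do not describe the standardized WGS pipeline used in the study.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences differ.

    Positions where either sequence has an ambiguous or missing base
    (anything other than A, C, G, T) are skipped. That handling is an
    assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_snp_distances(isolates: dict) -> dict:
    """Return {(name_i, name_j): SNP distance} for every pair of isolates."""
    return {
        (i, j): snp_distance(isolates[i], isolates[j])
        for i, j in combinations(sorted(isolates), 2)
    }


# Hypothetical toy alignment (real Mtb genomes are about 4.4 million bases).
toy_isolates = {
    "case_1": "ACGTACGTAC",
    "case_2": "ACGTACGTAT",
    "case_3": "ACGNACGTAT",
}
distances = pairwise_snp_distances(toy_isolates)
```

A small distance between two isolates is consistent with recent transmission but, as the caption explains, does not establish it on its own.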
Figure 2. Schematic timeline of infection and transmission of Mycobacterium tuberculosis (Mtb) infection in a hypothetical case pair to illustrate timing of case-pair interval calculations. The case-pair intervals estimate the duration of latent Mtb infection and disease of the secondary patient or the period during which SNPs are expected to accumulate in the sampled Mtb genomes for differentiating source and secondary patients' isolates. We calculated the case-pair interval as the observed time difference between specimen collection dates for the source–secondary case pair (T1 to T2). However, this does not include a portion of the secondary patient's Mtb infection before specimen collection for the first patient. To account for this portion, we also calculated a modified case-pair interval, which we defined as the time between the midpoint of the source patient's estimated TB infectious period and the sample collection date of the secondary patient (T0 to T2).
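The two interval definitions in Figure 2 can be written as simple date arithmetic. The sketch below is illustrative only: the function names and the example dates are hypothetical, and how the infectious period itself is estimated is outside its scope.

```python
from datetime import date


def case_pair_interval(source_collection: date, secondary_collection: date) -> int:
    """Days between the two specimen collection dates (T1 to T2)."""
    return (secondary_collection - source_collection).days


def infectious_period_midpoint(period_start: date, period_end: date) -> date:
    """Midpoint (T0) of the source patient's estimated infectious period.

    Uses whole days; an odd-length period rounds down to the earlier day.
    """
    return period_start + (period_end - period_start) // 2


def modified_case_pair_interval(
    period_start: date, period_end: date, secondary_collection: date
) -> int:
    """Days from the infectious-period midpoint to the secondary
    patient's specimen collection date (T0 to T2)."""
    t0 = infectious_period_midpoint(period_start, period_end)
    return (secondary_collection - t0).days


# Hypothetical dates, for illustration only.
source_period = (date(2015, 1, 1), date(2015, 7, 20))
t1 = date(2015, 6, 1)   # source patient's specimen collection
t2 = date(2016, 6, 1)   # secondary patient's specimen collection

observed = case_pair_interval(t1, t2)
modified = modified_case_pair_interval(*source_period, t2)
```

Because T0 usually precedes T1, the modified interval is typically the longer of the two, which reflects the portion of the secondary patient's infection that the observed interval leaves out.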
Figure 3. Inclusion criteria for the analytic data set of genotype-matched TB source–secondary case pairs in Los Angeles (LA) County, California, 2015–2018. Genotype-matched clusters are defined on the basis of a combination of spacer oligonucleotide typing (spoligotyping) and 24-locus mycobacterial interspersed repetitive unit–variable number tandem repeat (MIRU-VNTR) genotyping results. Standardized methods for whole-genome sequencing (WGS) and phylogenetic analysis were applied, as described in the text.
Demographic, molecular, and clinical characteristics of patients in the analytic data set of genotype-matched TB source–secondary case pairs vs. other investigated cases in Los Angeles County, California, 2015–2018.
| Characteristic | Analytic data set, no. (%) | Other investigated cases, no. (%) |
|---|---|---|
| Demographic characteristics | ||
| Sex | ||
| Male | 66 (62) | 581 (66) |
| Female | 39 (37) | 287 (33) |
| Unknown/missing | 1 (1) | 10 (1) |
| Median age (yrs) (IQR) | 39 (22–47) | 50 (34–63) |
| Race/ethnicity | ||
| Asian | 21 (20) | 292 (33) |
| Black/African American | 5 (5) | 110 (13) |
| Hispanic/Latino | 76 (72) | 428 (49) |
| White | 3 (3) | 35 (4) |
| Unknown/missing | 1 (1) | 10 (1) |
| Birthplace | ||
| US-born | 35 (33) | 250 (28) |
| Non-US–born | 70 (66) | 613 (70) |
| Unknown/missing | 1 (1) | 15 (2) |
| TB diagnosis location | ||
| In Los Angeles County | 103 (97) | 852 (97) |
| Outside Los Angeles County | 3 (3) | 26 (3) |
| Molecular characteristics | ||
| Culture status | ||
| Positive | 105 (99) | 843 (96) |
| Negative | 0 | 14 (2) |
| No results reported | 1 (1) | 21 (2) |
| Number of genotype-matched clusters | 44 | 178 |
| Median number of cases (IQR) | 5 (3–8) | 7 (4–12) |
| Minimum number of cases | 2 | 2 |
| Maximum number of cases | 14 | 38 |
| Social characteristics | ||
| Any substance use | 28 (26) | 182 (21) |
| Excess alcohol use during previous year | 16 (57) | 137 (75) |
| Injection drug use during previous year | 3 (11) | 15 (8) |
| Non-injection drug use during previous year | 17 (61) | 86 (47) |
| No reported history of substance use during previous year | 75 (71) | 650 (74) |
| Unknown/missing | 3 (3) | 46 (5) |
| Homelessness | ||
| Homeless during the year before diagnosis | 7 (7) | 120 (14) |
| No known history of homelessness during the year before diagnosis | 98 (92) | 744 (85) |
| Unknown/missing | 1 (1) | 14 (2) |
| Incarceration | ||
| TB diagnosed while patient was incarcerated | 0 | 24 (3) |
| TB diagnosed while patient was not incarcerated | 105 (99) | 844 (96) |
| Unknown/missing | 1 (1) | 10 (1) |
| Clinical characteristics | ||
| HIV testing | ||
| Positive | 1 (1) | 30 (3) |
| Negative | 88 (83) | 656 (75) |
| Not offered | 7 (7) | 66 (8) |
| Refused | 0 (0) | 7 (1) |
| Unknown/missing | 10 (9) | 119 (14) |
| Other | ||
| Immunosuppression other than HIV | 0 | 45 (5) |
| Tumor necrosis factor-α antagonist therapy | 0 | 8 (1) |
| Post-organ transplantation | 0 | 7 (1) |
| Diabetes mellitus | 20 (19) | 231 (26) |
| End-stage renal disease | 1 (1) | 26 (3) |
| Any first-line drug resistance | 4 (4) | 107 (12) |
IQR, interquartile range.
Hispanic/Latino includes all persons with Hispanic/Latino ethnicity. Other categories include persons with non-Hispanic/Latino ethnicity and the respective race.
US-born is based on eligibility for citizenship at birth and includes people born overseas to parents who are US citizens.
Frequencies of patients with excess alcohol use and injection and non-injection drug use reported during the previous year are tabulated among the subset of patients reporting any substance use. These categories of substance use are not mutually exclusive (i.e., a patient with multiple substance use may be counted multiple times).
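The denominator change described in the substance-use footnote can be checked with a few lines of arithmetic. This is a sketch for the analytic data set column only; the overall denominator of 106 is an assumption inferred from summing the sex rows (66 + 39 + 1) and is not stated in the table itself.

```python
def pct(count: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / denominator)


# Assumed overall denominator, inferred from the sex rows (66 + 39 + 1).
ANALYTIC_TOTAL = 106

any_use = 28  # patients reporting any substance use
subset_counts = {
    "excess alcohol use": 16,
    "injection drug use": 3,
    "non-injection drug use": 17,
}

# "Any substance use" is a share of all patients in the column ...
overall = pct(any_use, ANALYTIC_TOTAL)

# ... while the three categories are shares of the 28 patients who
# reported any substance use.
within_subset = {name: pct(n, any_use) for name, n in subset_counts.items()}

# The categories are not mutually exclusive, so their counts can sum
# to more than the subset size.
overlap_possible = sum(subset_counts.values()) > any_use
```

Under these assumptions the computed values reproduce the percentages shown in the table rows for that column.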
Epidemiologic and infectiousness characteristics of genotype-matched TB source–secondary case pairs in Los Angeles County, California, 2015–2018, included in the analytic data set.
| Characteristic | No. (%) |
|---|---|
| Strength of epidemiologic link | |
| Definite | 55 (93) |
| Probable | 4 (7) |
| Possible | 0 (0) |
| Type of epidemiologic link | |
| Named contact | 37 (63) |
| Shared location | 15 (25) |
| Family member | 6 (10) |
| Shared contact(s) or social network | 1 (2) |
| Infectiousness of presumed source case | |
| Pulmonary, smear-positive or cavitary | 46 (96) |
| Pulmonary, smear-negative and non-cavitary | 2 (4) |
| Median TB infectious period of presumed source patients (days) (IQR) | 202 (132–287) |
| Median TB infectious period of secondary patients (days) (IQR) | 144 (121–186) |
IQR, Interquartile range.
Figure 4. Association between modified case-pair interval and pairwise single nucleotide polymorphism (SNP) difference between genotype-matched source–secondary case pairs in Los Angeles County, California, 2015–2018. The modified case-pair interval is defined as the time between the estimated midpoint of the source patient's TB infectious period and the sample collection date of the secondary patient. Each dot corresponds to a source–secondary case pair. (A) Scatter plot of modified case-pair interval, in years, against pairwise SNP distance. (B) Pairwise SNP differences of case pairs with modified case-pair intervals defined as recent transmission (during the previous 2 years) or reactivation (transmission occurred >2 years ago).
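The grouping used in panel B can be sketched as a threshold on the modified case-pair interval followed by a per-group summary. The example below is a hedged illustration: the case-pair values are invented, the 365.25-day year and the use of the median as the summary statistic are assumptions, and none of it reproduces the study's data or analysis code.

```python
from statistics import median

DAYS_PER_YEAR = 365.25        # assumption for converting days to years
RECENT_CUTOFF_YEARS = 2.0     # cutoff stated in the Figure 4 caption


def classify(interval_days: float) -> str:
    """Label a modified case-pair interval as recent transmission
    (2 years or less) or reactivation (more than 2 years)."""
    years = interval_days / DAYS_PER_YEAR
    return "recent" if years <= RECENT_CUTOFF_YEARS else "reactivation"


def median_snps_by_group(pairs):
    """pairs: iterable of (modified interval in days, pairwise SNP difference)."""
    groups = {}
    for interval_days, snps in pairs:
        groups.setdefault(classify(interval_days), []).append(snps)
    return {group: median(values) for group, values in groups.items()}


# Hypothetical case pairs, for illustration only.
example_pairs = [(200, 0), (500, 2), (700, 1), (1200, 1), (2500, 4)]
summary = median_snps_by_group(example_pairs)
```

Overlap between the two groups' SNP differences, as in the invented values here, is the pattern that makes SNP distance alone a weak discriminator.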
Figure 5. Neighbor-joining tree (phylogenetic analysis) representing the genetic distances in single nucleotide polymorphisms (SNPs) among 10 genotype-matched isolates of Mycobacterium tuberculosis (Mtb), including isolates from a school-based cluster in California, 2001–2017. Each node (circle) represents a patient's Mtb isolate in the cluster. The year in each node corresponds to the year of TB diagnosis. SNP distances between isolates are displayed in boxes on each branch. Bootstrap values obtained from 500 replicates were 100% for all branches. The isolates within the dotted box are the most closely related in the cluster and belong to patients for whom epidemiologic links were identified. Branch lengths were scaled proportionally among these isolates to improve visualization. Case X was diagnosed in 2001 and identified as the most likely source for 4 cases and a possible source for the fifth (2017) case in the dotted box; diagnosis dates for these cases ranged from 2008 to 2017. Solid black lines between Case X and other cases indicate that a definite or probable epidemiologic link was identified; the dotted line indicates that a possible epidemiologic link (see Appendix Table 2) was identified. Three genotype-matched isolates not shown are from non-US–born patients whose TB was diagnosed elsewhere in the United States, indicating their cases are unlikely to be related to the cluster of interest.