Ben Pascoe, Guillaume Méric, Koji Yahara, Helen Wimalarathna, Susan Murray, Matthew D Hitchings, Emma L Sproston, Catherine D Carrillo, Eduardo N Taboada, Kerry K Cooper, Steven Huynh, Alison J Cody, Keith A Jolley, Martin C J Maiden, Noel D McCarthy, Xavier Didelot, Craig T Parker, Samuel K Sheppard.
Abstract
The genetic structure of bacterial populations can be related to geographical locations of isolation. In some species, there is a strong correlation between geographical distance and genetic distance, which can be caused by different evolutionary mechanisms. Patterns of ancient admixture in Helicobacter pylori can be reconstructed in concordance with past human migration, whereas in Mycobacterium tuberculosis it is the lack of recombination that causes allopatric clusters. In Campylobacter, analyses of genomic data and molecular typing have been successful in determining the reservoir host species, but not geographical origin. We investigated biogeographical variation in highly recombining genes to determine the extent of clustering between genomes from geographically distinct Campylobacter populations. Whole-genome sequences from 294 Campylobacter isolates from North America and the UK were analysed. Isolates from within the same country shared more recently recombined DNA than isolates from different countries. Using 15 UK/American closely matched pairs of isolates that shared ancestors, we identified regions that have frequently and recently recombined to test their correlation with geographical origin. The seven genes that demonstrated the greatest clustering by geography were used in an attribution model to infer geographical origin, which was tested using a further 383 UK clinical isolates to detect signatures of recent foreign travel. Patient records indicated that in 46 cases, travel abroad had occurred <2 weeks prior to sampling, and genomic analysis identified that 34 (74%) of these isolates were of a non-UK origin. Identification of biogeographical markers in Campylobacter genomes will contribute to improved source attribution of clinical Campylobacter infection and inform intervention strategies to reduce campylobacteriosis.
Keywords: Campylobacter; allopatry; genomics; phylogeny; recombination; source attribution
Year: 2017 PMID: 28493321 PMCID: PMC5600125 DOI: 10.1111/mec.14176
Source DB: PubMed Journal: Mol Ecol ISSN: 0962-1083 Impact factor: 6.185
Figure 1. Population structure of Campylobacter isolates used in this study. Phylogenetic trees were constructed from a whole‐genome alignment of (a) C. jejuni (n = 229) and (b) C. coli (n = 55) isolates based on 103,878 and 806,657 variable sites, respectively, using an approximation of the maximum‐likelihood algorithm (Kumar et al., 2016; Tamura et al., 2013). Leaves on the tree are coloured by source country: the UK (green circles), Canada (red) and the United States (blue). Ancestral C. coli clades (1, 2 and 3) (Sheppard, Dallas et al., 2010) are annotated and common clonal complexes (CC) based on four or more shared alleles in seven MLST housekeeping genes (Dingle et al., 2005)
Isolate pairs matched by clonal complex and host. Columns aspA to uncA give the allele at each of the seven MLST genes
| Pair | Isolate | Origin | Host | aspA | glnA | gltA | glyA | pgm | tkt | uncA | Clonal complex |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2,256 | Canada | Cattle | 2 | 1 | 1 | 3 | 2 | 1 | 5 | ST‐21 |
| 1 | 47 | UK | Cattle | 2 | 1 | 1 | 3 | 2 | 1 | 5 | ST‐21 |
| 2 | 2,280 | Canada | Human | 2 | 1 | 1 | 3 | 2 | 1 | 5 | ST‐21 |
| 2 | 117 | UK | Human | 2 | 1 | 1 | 3 | 2 | 1 | 5 | ST‐21 |
| 3 | 2,271 | Canada | Chicken | 9 | 2 | 4 | 62 | 4 | 5 | 17 | ST‐257 |
| 3 | 22 | UK | Chicken | 9 | 2 | 4 | 62 | 4 | 5 | 6 | ST‐257 |
| 4 | 2,274 | Canada | Duck | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 4 | 131 | UK | Duck | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 5 | 2,258 | Canada | Chicken | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 5 | 112 | UK | Chicken | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 6 | 2,306 | Canada | Human | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 6 | 33 | UK | Human | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 7 | 2,255 | Canada | Cattle | 1 | 4 | 2 | 2 | 6 | 3 | 17 | ST‐61 |
| 7 | 13 | UK | Cattle | 1 | 4 | 2 | 2 | 6 | 3 | 17 | ST‐61 |
| 8 | 2,264 | Canada | Chicken | 33 | 39 | 30 | 203 | 113 | 47 | 17 | ST‐828 |
| 8 | 21 | UK | Chicken | 33 | 39 | 30 | 82 | 104 | 43 | 17 | ST‐828 |
| 9 | 2,257 | Canada | Cattle | 2 | 1 | 1 | 3 | 2 | 1 | 5 | ST‐21 |
| 9 | 59 | UK | Cattle | 2 | 1 | 1 | 3 | 2 | 1 | 5 | ST‐21 |
| 10 | 2,275 | Canada | Human | 2 | 1 | 1 | 3 | 2 | 1 | 5 | ST‐21 |
| 10 | 120 | UK | Human | 2 | 1 | 1 | 3 | 2 | 1 | 5 | ST‐21 |
| 11 | 2,270 | Canada | Chicken | 9 | 2 | 4 | 62 | 4 | 5 | 17 | ST‐257 |
| 11 | 105 | UK | Chicken | 9 | 2 | 4 | 62 | 4 | 5 | 6 | ST‐257 |
| 12 | 2,265 | Canada | Chicken | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 12 | 111 | UK | Chicken | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 13 | 2,266 | Canada | Chicken | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 13 | 70 | UK | Chicken | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 14 | 2,307 | Canada | Human | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 14 | 118 | UK | Human | 4 | 7 | 10 | 4 | 1 | 7 | 1 | ST‐45 |
| 15 | 155 | Canada | Cattle | 33 | 39 | 30 | 82 | 104 | 85 | 68 | ST‐828 |
| 15 | 98 | UK | Cattle | 33 | 39 | 30 | 82 | 104 | 56 | 17 | ST‐828 |
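The allele profiles in the table can be compared programmatically. The following is a minimal Python sketch that counts how many of the seven MLST loci two isolates of a pair share. The function name `shared_alleles` is hypothetical, and comparing the two isolates of a pair directly (rather than each against a clonal complex's central genotype) is a simplification made here for illustration only. The profiles are copied from pairs 3, 8 and 15.

```python
# MLST gene order as used in the isolate-pair table.
MLST_GENES = ("aspA", "glnA", "gltA", "glyA", "pgm", "tkt", "uncA")


def shared_alleles(profile_a, profile_b):
    """Return the MLST genes at which two seven-locus profiles carry the same allele."""
    if len(profile_a) != len(MLST_GENES) or len(profile_b) != len(MLST_GENES):
        raise ValueError("each profile needs one allele per MLST gene")
    return [g for g, a, b in zip(MLST_GENES, profile_a, profile_b) if a == b]


# Allele profiles copied from the table: (Canadian isolate, UK isolate).
pairs = {
    "pair 3": ((9, 2, 4, 62, 4, 5, 17), (9, 2, 4, 62, 4, 5, 6)),
    "pair 8": ((33, 39, 30, 203, 113, 47, 17), (33, 39, 30, 82, 104, 43, 17)),
    "pair 15": ((33, 39, 30, 82, 104, 85, 68), (33, 39, 30, 82, 104, 56, 17)),
}

for name, (canada, uk) in pairs.items():
    shared = shared_alleles(canada, uk)
    print(f"{name}: {len(shared)}/7 alleles shared ({', '.join(shared)})")
```

For these three pairs the counts are 6, 4 and 5 shared alleles respectively; the remaining pairs in the table have identical profiles at all seven loci.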
Figure 2. Co‐ancestry matrix with population structure and genetic flux. (a) The colour of each cell of the matrix indicates the proportion of DNA chunks in a recipient genome (row) from a donor genome (column). The colour ranges from little (yellow) to a large amount of DNA from the donor strain (blue). Diagonal white cells indicate chunks of DNA that are shared between the pairs of isolates and masked in the comparison in (b). The trees above and to the left show clustering of the paired isolates with leaves coloured by source country (the UK in green, Canada in red). (b) Boxplot comparing total proportion of chunks of DNA inherited by a recipient from donors either within or between countries. The total proportion is significantly higher for chunks of DNA from donor strains of the same country compared to those from different countries (p < 10⁻⁹, Wilcoxon rank‐sum test)
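The within-country versus between-country comparison in panel (b) relies on a Wilcoxon rank‐sum test. As a rough illustration of how such a test works, the sketch below implements the two-sided test with the normal approximation in plain Python. The function name `rank_sum_test` and the input values are hypothetical (they are not data from the study), and no tie correction is applied to the variance, so this is not a substitute for a statistics library.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Tied values receive average ranks; the variance
    is not corrected for ties.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p


# Hypothetical chunk proportions, for illustration only.
within_country = [0.031, 0.029, 0.035, 0.033, 0.030, 0.034]
between_country = [0.021, 0.024, 0.019, 0.022, 0.025, 0.020]
z, p = rank_sum_test(within_country, between_country)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A positive z indicates that the first sample tends to have larger values than the second.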
Figure 3. Pairwise comparison of nucleotide diversity in the core genome. Above: Estimated values of the per‐nucleotide statistic reflecting relative intensity of recombination at each site plotted along the NCTC11168 reference genome. Left: Core‐genome phylogeny of selected paired isolates (matched by CC and source host), with clonal complex indicated. Centre: Matrix of gene‐by‐gene pairwise comparison along the NCTC11168 reference genome of our selected pairs. Each row represents a pairwise comparison of a selected pair of isolates. Each column is a gene from the NCTC11168 reference genome. Panels of the matrix are coloured based on nucleotide divergence for that gene in each pair: from no nucleotide diversity (0%, white), through some nucleotide diversity (~1%, red) to high levels of nucleotide diversity (up to 2%, blue). The per‐nucleotide scan of relative intensity of recombination is aligned with our gene‐by‐gene pairwise comparison of nucleotide diversity, and the location of seven putative epidemiological markers for geographical segregation is indicated
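The gene‐by‐gene matrix is built from per-gene nucleotide divergence between the two isolates of a pair. A minimal sketch of that calculation follows, assuming each gene is available as a pair of pre-aligned sequences of equal length. The function name `nucleotide_divergence`, the handling of gaps and ambiguous bases, and the toy sequences are assumptions for illustration, not the authors' pipeline.

```python
def nucleotide_divergence(seq_a, seq_b):
    """Percentage of compared sites at which two aligned sequences differ.

    Sites with a gap ('-') or ambiguous base ('N') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # uninformative site
        compared += 1
        differing += a != b
    return 100.0 * differing / compared if compared else 0.0


# Hypothetical aligned gene sequences for one isolate pair (toy data).
gene_alignments = {
    "geneA": ("ATGGCTAAAG", "ATGGCTAAAG"),
    "geneB": ("ATGACCGTTA", "ATGACTGTTA"),
}
for gene, (canada_seq, uk_seq) in gene_alignments.items():
    print(gene, f"{nucleotide_divergence(canada_seq, uk_seq):.1f}%")
```

Each value would correspond to one cell of the matrix, coloured on the 0% to 2% scale described in the caption.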
Shared ancestry analysis and estimation of pairwise recombination rates. The time to the most recent common ancestor (TMRCA) for each selected pair was estimated with 95% confidence intervals (TMRCA‐CI). The ratio of rates at which recombination and mutation introduce polymorphism (r/m) was also calculated with 95% confidence intervals (r/m‐CI). The number of recombined genes (probability >95%) is also shown. The two C. coli pairs are shown in bold
| Isolate pair | TMRCA | TMRCA‐CI | r/m | r/m‐CI | Definitely recombined genes (probability >95%) |
|---|---|---|---|---|---|
| 2,256 vs. 47 | 2.8 | [2.5;3.2] | 23.1 | [20.2;26.3] | 210 |
| 2,280 vs. 117 | 3.9 | [3.2;4.5] | 23.1 | [19.0;28.3] | 273 |
| 2,271 vs. 22 | 1.9 | [1.6;2.3] | 34.5 | [28.8;39.6] | 194 |
| 2,274 vs. 131 | 3.3 | [2.9;3.9] | 38.8 | [32.0;43.6] | 385 |
| 2,258 vs. 112 | 3.4 | [3.0;3.8] | 32.1 | [28.4;37.0] | 336 |
| 2,306 vs. 33 | 3.7 | [3.2;4.2] | 24.5 | [21.0;27.9] | 280 |
| 2,255 vs. 13 | 1.2 | [1.0;1.5] | 25.2 | [20.1;30.2] | 99 |
| 2,257 vs. 59 | 3 | [2.5;3.5] | 23.5 | [19.3;27.8] | 219 |
| 2,275 vs. 120 | 2.7 | [2.3;3.1] | 24.1 | [20.4;27.9] | 194 |
| 2,270 vs. 105 | 2.2 | [1.9;2.5] | 30.5 | [26.6;34.8] | 224 |
| 2,265 vs. 111 | 3.7 | [3.3;4.2] | 32.8 | [28.6;36.7] | 372 |
| 2,266 vs. 70 | 1.3 | [1.1;1.5] | 38 | [33.4;41.4] | 147 |
| 2,307 vs. 118 | 3.9 | [3.4;4.6] | 31.9 | [26.2;37.4] | 379 |
Figure 4. Assignment of human clinical cases of campylobacteriosis to origin country, including patients with history of recent foreign travel. (a) Assignment of human clinical cases of campylobacteriosis to origin country using epidemiological markers of biogeography and the Bayesian clustering algorithm Structure. Each isolate is represented by a vertical bar, showing the estimated probability that it comes from each of the putative source countries, including the UK (green), the United States (blue) and Canada (red). Isolates are ordered by attributed source. (b) Boxplots of predicted attribution probabilities for the three locations. (c) Isolates from the Oxford clinical data set with declared history of recent foreign travel. The model correctly assigned 34 of 46 (73.9%) isolates to a non‐UK origin. (d) Attribution of Oxford clinical isolates between UK, US and Canadian source populations. Isolates with declared recent foreign travel are shown in blue
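The step from per-isolate attribution probabilities to a country assignment can be sketched in a few lines of Python. The rule used here (assign each isolate to the source with the highest probability) is an assumption, since the caption does not state how a non‐UK call was made. The function names and the probabilities are hypothetical. The last line reproduces the reported proportion from the counts in the caption.

```python
def assign_origin(probabilities):
    """Return the source country with the highest attribution probability."""
    return max(probabilities, key=probabilities.get)


def non_uk_fraction(isolates):
    """Fraction of isolates whose most probable origin is not the UK."""
    assigned = [assign_origin(p) for p in isolates]
    return sum(country != "UK" for country in assigned) / len(assigned)


# Hypothetical per-isolate attribution probabilities, for illustration only.
travel_cases = [
    {"UK": 0.10, "US": 0.70, "Canada": 0.20},
    {"UK": 0.55, "US": 0.25, "Canada": 0.20},
    {"UK": 0.05, "US": 0.15, "Canada": 0.80},
]
print(assign_origin(travel_cases[0]))           # US
print(round(non_uk_fraction(travel_cases), 3))  # 0.667

# Proportion reported in the caption: 34 of 46 travel-associated isolates.
print(f"{34 / 46:.1%}")  # 73.9%
```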