A Sarah Walker, Nicole Stoesser, Liam P Shaw, William Matlock, Kevin K Chau, Manal AbuOun, Emma Stubberfield, Leanne Barker, James Kavanagh, Hayleah Pickford, Daniel Gilson, Richard P Smith, H Soon Gweon, Sarah J Hoosdally, Jeremy Swann, Robert Sebra, Mark J Bailey, Timothy E A Peto, Derrick W Crook, Muna F Anjum, Daniel S Read.
Abstract
F-type plasmids are diverse and of great clinical significance, often carrying genes conferring antimicrobial resistance (AMR) such as extended-spectrum β-lactamases, particularly in Enterobacterales. Organising this plasmid diversity is challenging, and current knowledge is largely based on plasmids from clinical settings. Here, we present a network community analysis of a large survey of F-type plasmids from environmental (influent, effluent and upstream/downstream waterways surrounding wastewater treatment works) and livestock settings. We use a tractable and scalable methodology to examine the relationship between plasmid metadata and network communities. This reveals how niche (sampling compartment and host genera) partitions and shapes plasmid diversity. We also perform pangenome-style analyses on network communities. We show that such communities define unique combinations of core genes, with limited overlap. Building plasmid phylogenies based on alignments of these core genes, we demonstrate that plasmid accessory function is closely linked to core gene content. Taken together, our results suggest that stable F-type plasmid backbone structures can persist in environmental settings while allowing dramatic variation in accessory gene content that may be linked to niche adaptation. The association of F-type plasmids with AMR may reflect their suitability for rapid niche adaptation.
Year: 2021 PMID: 33649550 PMCID: PMC8319146 DOI: 10.1038/s41396-021-00926-w
Source DB: PubMed Journal: ISME J ISSN: 1751-7362 Impact factor: 11.217
Fig. 1 Overview of plasmid population.
a Plasmid host genera distribution by compartment. b Distribution of plasmid sequence lengths with predicted mobilities. c Graph representing the association between replicon alleles. F-type nodes are coloured pink. Line weight is proportional to frequency of association in the sample. d Plasmid GC-content subtracted from host chromosome GC-content. A value greater than zero indicates the plasmid is AT-richer than the host. Only plasmids with circularised host chromosomes were used (565/726).
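The GC comparison in panel d can be illustrated with a minimal sketch. It assumes GC-content is computed over unambiguous bases only, and the function names and toy sequences are hypothetical, not taken from the study's pipeline.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence has no unambiguous bases")
    return gc / (gc + at)


def gc_difference(host_chromosome: str, plasmid: str) -> float:
    """Host chromosome GC minus plasmid GC.

    A positive value means the plasmid is AT-richer than its host,
    matching the sign convention in the Fig. 1d caption.
    """
    return gc_fraction(host_chromosome) - gc_fraction(plasmid)


# Toy sequences (illustrative only, not real data).
host = "GCGCGCATAT"      # 6 of 10 bases are G or C
plasmid = "ATATATGCGA"   # 3 of 10 bases are G or C
diff = gc_difference(host, plasmid)
print(f"GC difference: {diff:.2f}")
```

In practice the two sequences would be read from the circularised host chromosome and plasmid assemblies, which is why the caption restricts the comparison to 565 of the 726 plasmids.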
Fig. 2 Thresholding the plasmid network.
a Number of communities (at least 10 nodes) detected over a varying Mash similarity threshold. Median and IQR bar shown. b Cumulative proportion of nodes recruited in a detected community of at least ten nodes over a varying Mash similarity threshold. Median and IQR bars shown. c Gaussian kernel density estimates of Mash similarities stratified by compartment. Bandwidth = 0.00864 calculated by Silverman’s ‘rule of thumb’. Density medians are indicated with vertical lines. d Evolution of the largest connected component and number of components over a varying Mash similarity threshold.
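The threshold sweep summarised in panel d can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes an edge is kept when its Mash similarity is greater than or equal to the threshold (the caption does not state whether the comparison is strict), and the plasmid names and similarity values are invented.

```python
from collections import defaultdict


def components_at_threshold(nodes, similarities, threshold):
    """Connected components of the graph that keeps edges with
    similarity >= threshold (the >= is an assumption)."""
    adjacency = defaultdict(set)
    for (u, v), s in similarities.items():
        if s >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search from each unvisited node.
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def sweep(nodes, similarities, thresholds):
    """Return (threshold, number of components, largest component size)."""
    rows = []
    for t in thresholds:
        comps = components_at_threshold(nodes, similarities, t)
        rows.append((t, len(comps), max(len(c) for c in comps)))
    return rows


# Hypothetical pairwise similarities between four plasmids.
nodes = ["p1", "p2", "p3", "p4"]
similarities = {
    ("p1", "p2"): 0.99,
    ("p2", "p3"): 0.96,
    ("p1", "p3"): 0.94,
    ("p3", "p4"): 0.90,
}
sweep_rows = sweep(nodes, similarities, [0.90, 0.95, 0.98])
for row in sweep_rows:
    print(row)
```

Raising the threshold removes edges, so the number of components can only grow and the largest component can only shrink, which is the trade-off the figure is used to balance.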
Fig. 3 Plasmid network communities.
The plasmid network at threshold = 0.95. Each community with at least ten members has a unique colour. Communities are labelled from 1 to 13, which correspond to Figs. 5, S3–4 and S5–15. Unassigned plasmids and those in smaller communities are left white.
Fig. 4 Plasmid network coloured by metadata.
All nodes are coloured, not just those in our detected 13 communities of at least 10 members. a Partition by livestock or WwTW sampling compartment. b Partition by plasmid host genera.
Fig. 5 Community core gene phylogeny.
A neighbour-joining tree based on alignments of the 68 core genes. A heatmap of the 316 accessory genes is also shown. Node colour represents a host sequence type and node shape represents the farm. Unknown STs are labelled by ‘-’. Branch lengths have been corrected for homologous recombination.
Community metadata homogeneity (mean ± sd for each metadata partition).
| Communities with at least 10 plasmids (median ± IQR) | Livestock, WwTW | Pig, Cattle, Sheep, WwTW | 14 Livestock Farms, WwTW | Livestock, 5 WwTWs | Livestock, Upstream/Influent, Downstream/Effluent | Host Genera | Time-point |
|---|---|---|---|---|---|---|---|
| 13 ± 0 | 0.713 ± 0.014 | 0.592 ± 0.006 | 0.406 ± 0.000 | 0.468 ± 0.032 | 0.553 ± 0.009 | 0.888 ± 0.000 | 0.050 ± 0.000 |
Homogeneity score averages over 100 runs of the Louvain algorithm for all 13 communities.
Community metadata completeness (mean ± sd for each metadata partition).
| Communities with at least 10 plasmids (median ± IQR) | Livestock, WwTW | Pig, Cattle, Sheep, WwTW | 14 Livestock Farms, WwTW | Livestock, 5 WwTWs | Livestock, Upstream/Influent, Downstream/Effluent | Host Genera | Time-point |
|---|---|---|---|---|---|---|---|
| 13 ± 0 | 0.200 ± 0.001 | 0.332 ± 0.000 | 0.400 ± 0.000 | 0.238 ± 0.002 | 0.211 ± 0.003 | 0.309 ± 0.000 | 0.023 ± 0.000 |
Completeness score averages over 100 runs of the Louvain algorithm for all 13 communities.
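To make the two scores concrete, here is a minimal sketch that assumes they follow the usual conditional-entropy definitions (homogeneity = 1 − H(class | community) / H(class), completeness = 1 − H(community | class) / H(community)), as implemented for example by `homogeneity_score` and `completeness_score` in scikit-learn. The page does not state the exact implementation used or how unassigned plasmids were handled, so both points are assumptions, and the labels below are invented.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, given):
    """H(targets | given) for two paired label lists."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    marginal = Counter(given)
    return -sum((c / n) * log(c / marginal[g]) for (g, _), c in joint.items())


def homogeneity(classes, clusters):
    """1 when every cluster contains members of a single class."""
    h_classes = entropy(classes)
    if h_classes == 0:
        return 1.0
    return 1.0 - conditional_entropy(classes, clusters) / h_classes


def completeness(classes, clusters):
    """1 when every class is contained in a single cluster.
    This is homogeneity with the two labelings swapped."""
    return homogeneity(clusters, classes)


# Toy example: three pure communities, but livestock is split across two.
communities = [1, 1, 2, 2, 3, 3]
compartment = ["Livestock"] * 4 + ["WwTW"] * 2

h = homogeneity(compartment, communities)
c = completeness(compartment, communities)
print(f"homogeneity={h:.3f} completeness={c:.3f}")
```

The toy example shows the same qualitative pattern as the tables: communities can be pure with respect to a metadata label (high homogeneity) while a single label is still spread over several communities (low completeness).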
Community pangenomes.
| Community | Nodes | Edges | Mash similarity mean | Core genes | Soft core genes | Shell genes | Cloud genes | Total genes |
|---|---|---|---|---|---|---|---|---|
| 1 | 52 | 1151 | 0.973 | 13 | 12 | 155 | 153 | 333 |
| 2 | 85 | 1935 | 0.968 | 4 | 17 | 140 | 383 | 544 |
| 3 | 46 | 325 | 0.965 | 35 | 8 | 86 | 369 | 498 |
| 4 | 12 | 21 | 0.962 | 2 | 0 | 290 | 129 | 421 |
| 5 | 14 | 23 | 0.962 | 2 | 0 | 225 | 260 | 487 |
| 6 | 21 | 111 | 0.963 | 13 | 6 | 354 | 430 | 803 |
| 7 | 34 | 263 | 0.966 | 2 | 1 | 278 | 359 | 640 |
| 8 | 23 | 135 | 0.978 | 27 | 1 | 142 | 362 | 532 |
| 9 | 12 | 34 | 0.966 | 18 | 0 | 364 | 324 | 706 |
| 10 | 13 | 37 | 0.977 | 0 | 0 | 309 | 38 | 347 |
| 11 | 15 | 55 | 0.981 | 62 | 0 | 116 | 35 | 213 |
| 12 | 30 | 391 | 0.976 | 68 | 3 | 126 | 187 | 384 |
| 13 | 12 | 45 | 0.978 | 88 | 0 | 195 | 48 | 331 |
Characteristics of each of the 13 communities, including the number of nodes and edges, the mean Mash similarity (mean weight of all edges), and gene counts at each level of the pangenome. Core, soft core, shell and cloud genes are those found in [99, 100], [95, 99), [15, 95) and [0, 15) per cent of plasmids, respectively.
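The pangenome levels in this table can be reproduced from a gene presence/absence mapping with a short sketch. The cut-offs follow the caption (core at 99% or more, soft core from 95% to below 99%, shell from 15% to below 95%, cloud below 15%); the function names and the toy presence data are hypothetical.

```python
def classify_gene(prevalence_pct: float) -> str:
    """Assign a pangenome level from the percentage of plasmids carrying a gene."""
    if prevalence_pct >= 99:
        return "core"
    if prevalence_pct >= 95:
        return "soft core"
    if prevalence_pct >= 15:
        return "shell"
    return "cloud"


def pangenome_counts(presence, n_plasmids):
    """Count genes per level.

    presence maps each gene name to the set of plasmid ids that carry it.
    """
    counts = {"core": 0, "soft core": 0, "shell": 0, "cloud": 0}
    for carriers in presence.values():
        counts[classify_gene(100 * len(carriers) / n_plasmids)] += 1
    return counts


# Toy community of 20 plasmids (illustrative only).
plasmids = [f"p{i}" for i in range(20)]
presence = {
    "geneA": set(plasmids),        # 20/20 = 100% -> core
    "geneB": set(plasmids[:19]),   # 19/20 = 95%  -> soft core
    "geneC": set(plasmids[:3]),    # 3/20  = 15%  -> shell
    "geneD": set(plasmids[:2]),    # 2/20  = 10%  -> cloud
}
counts = pangenome_counts(presence, len(plasmids))
print(counts)
```

Under these cut-offs a community of fewer than 20 plasmids cannot contain soft core genes, because a gene missing from a single plasmid already falls below 95%. This is consistent with the zero soft core counts for the communities of 12 to 15 plasmids in the table.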