Michelle Qiu Carter, Nicole Laniohan, Chien-Chi Lo, Patrick S G Chain.
Abstract
Shiga toxin-producing Escherichia coli (STEC) O145:H28 can cause severe disease in humans and is a predominant serotype in STEC O145 environmental isolates. Here, comparative genomics was applied to a set of clinical and environmental strains to systematically evaluate the pathogenicity potential of environmental strains. While the core genes-based tree separated all O145:H28 strains from the non-O145:H28 reference strains, it failed to segregate environmental strains from clinical strains. In contrast, the accessory genes-based tree placed all clinical strains in the same clade regardless of their genotypes or serotypes, apart from the environmental strains. Loss-of-function mutations were common in the virulence genes examined, with a high frequency in genes related to adherence, autotransporters, and the type three secretion system. Distinct differences in pathogenicity islands LEE, OI-122, and OI-57, the acid fitness island, and the tellurite resistance island were detected between the O145:H28 and reference strains. Extensive genetic variation was detected in O145:H28, which was mainly attributed to deletions, insertions, and gene acquisition at several chromosomal "hot spots". Our study demonstrated a distinct virulence gene repertoire among the STEC O145:H28 strains originating from the same geographical region and revealed unforeseen contributions of loss-of-function mutations to virulence evolution and genetic diversification in STEC.
Keywords: Shiga toxin-producing Escherichia coli (STEC); pangenome; pathogenicity islands; virulence genes
Year: 2022 PMID: 35630311 PMCID: PMC9144400 DOI: 10.3390/microorganisms10050866
Source DB: PubMed Journal: Microorganisms ISSN: 2076-2607
Genomes used in this study.
| Strains | a Sources | b Serotype | Phylogroup/Genotype | | Chromosome (bp)/GenBank Accession | c pEHEC Plasmid (bp)/GenBank Accession | c Other Plasmids (bp)/GenBank Accession | References |
|---|---|---|---|---|---|---|---|---|
| MG1655 | Stool | O16:H48 | A/ST10 | N/A | 4,641,652/U00096.3 | N/A | N/A | [ |
| EDL933 | Ground beef | O157:H7 | E/ST11 | | 5,528,445/AE005174.2 | 92,077/AF074613.1 | N/A | [ |
| CFSAN004176 | Clinical | O145:H25 | B1/ST5309 | | 5,193,734/CP014583.1 | 52,297/CP012493.1 | 95,721/CP012491.1; | [ |
| CFSAN004177 | Clinical | O145:H25 | B1/ST5309 | | 5,191,331/CP014670.1 | 52,297/CP012495.1 | 96,228/CP012494.1; 34,714/CP012496.1 | [ |
| RM13514 | Clinical | O145:H28 | E/ST32 | | 5,585,613/CP006027.1 | 87,120/CP006028.1 | 64,561/CP006029.1 | [ |
| RM13516 | Clinical | O145:H28 | E/ST6130 | | 5,402,276/CP006262.1 | 98,066/CP006263.1 | 58,666/CP006264.1 | [ |
| 10942 | Clinical | O145:H28 | E/ST32 | | 5,374,674/AP019703.1 | 92,337/AP019704.1 | 71,161/AP019705.1 | [ |
| 112648 | Clinical | O145:H28 | E/ST32 | | 5,488,534/AP019706.1 | 91,036/AP019707.1 | N/A | [ |
| 122715 | Clinical | O145:H28 | E/ST32 | | 5,418,961/AP019708.1 | 86,874/AP019709.1 | 48,572/AP019710.1 | [ |
| 95-3192 | Clinical | O145:H28 | E/ST32 | | 5,385,516/CP027362.1 | N/A | N/A | [ |
| 2015C-3125 | Clinical | O145:H28 | E/ST32 | | 5,471,132/CP027763.1 | 66,944/CP027764.1 | 66,388/CP027765.1 | [ |
| RM8843-C1 | Cattle | O145:H28 | E/ST32 | | 5,458,415/CP035772.1 | 88,752/CP035773.1 | N/A | [ |
| RM8988-C1 | Cattle | O145:H28 | E/ST32 | | 5,458,186/CP035770.1 | 88,752/CP035771.1 | N/A | [ |
| RM8995-C1 | Sediment | O145:H28 | E/ST32 | | 5,457,980/CP031355.1 | 88,747/CP031354.1 | N/A | [ |
| RM9154-C1 | Cattle | O145:H28 | E/ST32 | | 5,205,721/CP031353.1 | 86,711/CP031352.1 | 187,274/CP031350.1; 96,355/CP031351.1 | [ |
| RM9467-C1 | Cattle | O145:H28 | E/ST32 | | 5,385,895/CP031349.1 | 89,518/CP031348.1 | N/A | [ |
| RM9872-C1 | Cattle | O145:H28 | E/ST32 | | 5,385,904/CP024659.1 | 89,518/CP024660.1 | N/A | [ |
| RM9873-C1 | Cattle | O145:H28 | E/ST32 | | 5,385,819/CP031347.1 | 89,515/CP031346.1 | N/A | [ |
| RM10425-C1 | Cattle | O145:H28 | E/ST32 | | 5,343,037/CP031343.1 | 89,519/CP031342.1 | N/A | [ |
| RM11626-C1 | Cattle | O145:H28 | E/ST32 | | 5,419,044/CP035768.1 | 86,875/CP035769.1 | N/A | [ |
| RM12275-C1 | Cattle | O145:H28 | E/ST32 | | 5,408,598/CP031341.1 | 88,744/CP031340.1 | N/A | [ |
| RM12367-C1 | Water | O145:H28 | E/ST32 | | 5,472,396/CP031345.1 | 90,831/CP031344.1 | N/A | [ |
| RM12522-C8 | Cattle | O145:H28 | E/ST32 | | 5,418,923/CP035767.1 | 86,874/CP035766.1 | N/A | [ |
a The source and isolation year for strain MG1655 are based on the information available for its parental strain K-12 as described previously [43].
b The serotype of strain MG1655 was determined by BLAST searches of E. coli O-antigen and H-antigen databases described previously [44,45].
c pEHEC refers to the plasmid carrying genes encoding enterohemolysin.
Characteristics of Pathogenicity Islands (PAIs) and Genomic Islands (GIs) examined in this study.
| PAIs/GIs | Length (bp)/GC (%) | Sources of Query Sequences | GenBank Accession #/Positions | References |
|---|---|---|---|---|
| Locus of Enterocyte Effacement (LEE) | 43,418/40.9 | STEC O157:H7 str. EDL933 | AE005174.2/4,649,862–4,693,279 | [ |
| Pathogenicity Island OI-122 | 23,455/46.3 | STEC O157:H7 str. EDL933 | AE005174.2/3,919,348–3,942,802 | [ |
| Pathogenicity Island OI-57 | 80,502/51.4 | STEC O157:H7 str. EDL933 | AE005174.2/1,849,324–1,929,825 | [ |
| Tellurite Resistance Island (TRI) | 87,548/48.0 | STEC O157:H7 str. EDL933 | AE005174.2/1,454,242–1,541,789 | [ |
| Locus of Adhesion and Autoaggregation (LAA) | 86,353/48.6 | STEC O91:H21 str. B2F1 | AFDQ01000026.1/385,984–472,336 | [ |
| Locus of Proteolysis Activity (LPA) | 37,710/47.4 | STEC O91:H- str. 4797/97 | AJ278144.1/1–37,710 | [ |
| High-Pathogenicity Island (HPI) | 36,448/56.4 | | AL031866.1/78,113–114,560 | [ |
| Subtilase Encoding Pathogenicity Island (SE-PAI) | 8058/46.2 | | JQ994271.1/1–8058 | [ |
| Acid Fitness Island (AFI) | 13,620/46.0 | | U00096.3/3,653,961–3,667,580 | [ |
| Locus of Heat Resistance (LHR) | 14,981/62.2 | | CP002291.1/319,821–304,841 | [ |
Figure 1. Comparative genomic analyses of STEC O145:H28. (A): Core and accessory genes of STEC O145:H28. The numbers of core and accessory genes were calculated in Roary as detailed in the Materials and Methods section. The core genes are those shared by all input genomes (n = 19); the soft-core genes are those present in any 18 of the input genomes (n = 18); the shell genes are those present in at least three but fewer than 18 input genomes (3 ≤ n < 18); the cloud genes are those present in fewer than three input genomes (n < 3). (B): Number of strain-specific genes in STEC O145:H28. The number of strain-specific genes was calculated in Roary as detailed in the Materials and Methods section. Red bars represent the clinical isolates whereas blue bars represent environmental isolates. * indicates that no plasmids were present in the published genome (GenBank Accession numbers are presented in Table 1).
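The category thresholds in the Figure 1 caption can be written as a small classifier. The following is a minimal sketch, not Roary's own implementation: the function names `classify_gene` and `summarize` and the gene-to-strains mapping are hypothetical, and the soft-core rule (one genome fewer than the total) is only stated in the caption for the 19-genome case.

```python
from collections import Counter

N_GENOMES = 19  # number of O145:H28 input genomes stated in the caption


def classify_gene(n_present: int, n_genomes: int = N_GENOMES) -> str:
    """Return the pangenome category for a gene found in n_present genomes."""
    if not 1 <= n_present <= n_genomes:
        raise ValueError("n_present must be between 1 and n_genomes")
    if n_present == n_genomes:
        return "core"        # shared by all input genomes
    if n_present == n_genomes - 1:
        return "soft-core"   # 18 of the 19 genomes in the caption's setting
    if n_present >= 3:
        return "shell"       # 3 <= n < 18 in the caption's setting
    return "cloud"           # fewer than three genomes


def summarize(presence: dict[str, set[str]], n_genomes: int = N_GENOMES) -> Counter:
    """Count genes per category from a hypothetical gene -> strains mapping."""
    return Counter(
        classify_gene(len(strains), n_genomes) for strains in presence.values()
    )
```

Calling `summarize` on a mapping built from a gene presence/absence table would give the per-category counts of the kind plotted in panel (A).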
Figure 2. Relatedness of the STEC strains. (A): Core genes-based phylogenetic tree. The maximum likelihood-based phylogenetic tree was constructed from the alignment of concatenated nucleotide sequences of 2998 homologous CDSs from each of the strains using iqtree2 with the best-fit model GTR + F + I, as selected by ModelFinder, and was assessed by bootstrapping with 1000 pseudoreplicates. (B): Accessory genes-based relatedness tree. The accessory genes of the STEC genomes were used in Roary to construct a FASTA file encoding the binary presence and absence of each gene, followed by the construction of an approximately maximum-likelihood tree using FastTree as detailed in the Materials and Methods section. Thus, strains within the same cluster share more accessory genes with each other than with strains in a different cluster.
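To make the binary presence/absence FASTA in panel (B) concrete, here is a minimal sketch of how such a file could be assembled. It is an illustration under assumptions, not Roary's code: the function name `binary_fasta` and the input mapping are hypothetical, and the choice of "A" for present and "C" for absent is an arbitrary two-letter encoding picked for this example.

```python
def binary_fasta(presence: dict[str, set[str]], strains: list[str]) -> str:
    """Build a FASTA string with one presence/absence 'sequence' per strain.

    presence maps a gene name to the set of strains carrying it (hypothetical
    input). Genes found in every strain or in none are skipped, so only
    accessory genes contribute columns.
    """
    strain_set = set(strains)
    accessory = sorted(
        gene
        for gene, carriers in presence.items()
        if 0 < len(carriers & strain_set) < len(strains)
    )
    records = []
    for strain in strains:
        # "A" = gene present, "C" = gene absent (assumed encoding)
        seq = "".join("A" if strain in presence[g] else "C" for g in accessory)
        records.append(f">{strain}\n{seq}")
    return "\n".join(records) + "\n"
```

Strains with similar accessory gene content end up with similar pseudo-sequences, which is why a tree built from this file groups strains by shared accessory genes rather than by sequence divergence.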
Figure 3. Detection of the E. coli virulence genes in STEC. (A): The functional categories of the E. coli virulence genes according to the classification in the VFDB [33]. A total of 333 virulence genes that contribute to pathogenicity in various E. coli pathotypes were downloaded from the VFDB and grouped into 10 functional categories as detailed in the Materials and Methods. The list of genes and their association with the E. coli pathotypes are presented in Supplementary Materials Table S1. (B): The number of E. coli virulence genes and their functional categories detected in each STEC strain. The presence of each virulence gene was verified by BLASTn search of a database containing all STEC strains in Geneious with thresholds of 80% for coverage and 75% for sequence identity. Genes carrying a loss-of-function mutation are marked as absent. * indicates that no plasmids were present in the published genome (GenBank Accession numbers are presented in Table 1).
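The presence call described in panel (B) can be sketched as a simple filter. This is an illustration only: the searches in the study were run in Geneious, whereas the `Hit` record, its field names, and the inclusive treatment of the 80% and 75% thresholds below are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """A hypothetical best BLASTn hit for one virulence gene in one strain."""
    gene: str
    query_len: int            # length of the query gene (bp)
    qstart: int               # first aligned query position (1-based)
    qend: int                 # last aligned query position (1-based)
    pident: float             # percent sequence identity of the alignment
    loss_of_function: bool = False  # e.g. a frameshift or premature stop


def query_coverage(hit: Hit) -> float:
    """Percent of the query gene covered by the alignment."""
    return 100.0 * (abs(hit.qend - hit.qstart) + 1) / hit.query_len


def is_present(hit: Hit, min_cov: float = 80.0, min_ident: float = 75.0) -> bool:
    """Call a gene present only if it passes both thresholds and is intact."""
    return (
        not hit.loss_of_function
        and query_coverage(hit) >= min_cov
        and hit.pident >= min_ident
    )
```

Treating loss-of-function alleles as absent mirrors the caption: a gene that aligns well but carries a disrupting mutation does not count toward a strain's virulence gene total.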
Genetic features of Pathogenicity Islands LEE, OI-122, and OI-57.
| Strains | a LEE Integration Site | LEE Chromosomal Location | LEE Size (bp)/CDS | b OI-122 Integration Site | OI-122 Chromosomal Location | OI-122 Size (bp)/CDS | c OI-57 Integration Site | OI-57 Chromosomal Location | OI-57 Size (bp)/CDS |
|---|---|---|---|---|---|---|---|---|---|
| EDL933 | | 4,649,862–4,693,279 | 43,418/52 | | 3,919,348–3,942,802 | 23,455/27 | | 1,849,324–1,929,825 | 80,502/107 |
| CFSAN004176 | | 2,363,880–2,425,764 | 61,885/60 | | 3,818,649–3,780,408 | 38,242/40 | NA | NA | NA |
| CFSAN004177 | | 2,825,461–2,763,585 | 61,877/61 | | 1,370,795–1,409,031 | 38,237/40 | NA | NA | NA |
| RM13514 | | 4,558,936–4,605,536 | 46,601/55 | | 5,269,766–5,222,118 | 47,649/47 | | 1,591,159–1,633,674 | 42,516/51 |
| RM13516 | | 4,410,578–4,458,362 | 47,785/57 | | 5,188,896–5,242,157 | 53,262/53 | | 1,569,959–1,615,548 | 45,590/55 |
| 10942 | | 4,428,998–4,475,832 | 46,835/56 | | 5,060,248–5,047,349 | 12,900/10 | | 1,553,610–1,597,236 | 43,627/50 |
| 112648 | | 4,538,738–4,585,714 | 46,977/56 | | 5,203,887–5,157,474 | 46,414/37 | | 1,631,815–1,715,209 | 83,395/102 |
| 122715 | | 4,436,958–4,483,934 | 46,977/56 | | 5,103,184–5,055,469 | 47,716/39 | | 1,575,808–1,619,718 | 43,911/50 |
| 95-3192 | | 2,903,392–2,950,270 | 46,879/55 | | 3,567,963–3,521,794 | 46,170/33 | | 506,141–491,757 | 14,385/24 |
| 2015C-3125 | | 4,369,007–4,416,023 | 47,017/55 | | 5,029,590–4,987,764 | 41,827/34 | | 1,444,864–1,459,247 | 14,384/24 |
| RM8843-C1 | | 4,473,390–4,520,271 | 46,882/56 | | 5,142,570–5,091,970 | 50,601/55 | | 2,204,394–2,248,382 | 43,989/59 |
| RM8988-C1 | | 4,473,157–4,520,039 | 46,883/56 | | 5,141,855–5,091,255 | 50,601/55 | | 1,628,125–1,613,741 | 14,385/26 |
| RM8995-C1 | | 4,472,986–4,519,862 | 46,877/56 | | 5,142,141–5,091,544 | 50,598/55 | | 3,201,502–3,157,514 | 43,989/59 |
| RM9154-C1 | | 4,237,174–4,284,056 | 46,883/56 | | 4,891,189–4,855,660 | 35,530/36 | | 159,456–186,929 | 27,474/44 |
| RM9467-C1 | | 4,411,115–4,458,137 | 47,023/56 | | 5,071,448–5,029,449 | 42,000/42 | | 1,574,704–1,618,656 | 43,953/58 |
| RM9872-C1 | | 4,411,123–4,458,145 | 47,023/58 | | 5,071,457–5,029,079 | 42,379/44 | | 2,099,371–2,084,987 | 14,385/24 |
| RM9873-C1 | | 4,411,053–4,458,073 | 47,021/57 | | 5,071,377–5,029,379 | 41,999/42 | | 1,574,682–1,618,634 | 43,953/58 |
| RM10425-C1 | | 4,368,259–4,415,280 | 47,022/56 | | 5,028,591–4,986,592 | 42,000/42 | | 2,099,365–2,084,981 | 14,385/26 |
| RM11626-C1 | | 4,439,648–4,486,530 | 46,883/56 | | 5,104,582–5,058,414 | 46,169/41 | | 3,172,643–3,158,259 | 14,385/26 |
| RM12275-C1 | | 4,472,938–4,519,814 | 46,877/56 | NA | NA | NA | | 1,622,111–1,666,070 | 43,960/59 |
| RM12367-C1 | | 4,497,622–4,544,644 | 47,023/56 | | 5,157,953–5,115,955 | 41,999/42 | | 1,575,949–1,713,896 | 137,948/191 |
| RM12522-C8 | | 4,439,541–4,486,422 | 46,882/56 | | 5,104,468–5,058,301 | 46,168/41 | | 1,625,877–1,652,995 | 27,119/40 |
a The LEE island in strain EDL933 corresponds to O-island #148 as reported previously [17]. The LEEs in strains RM13514 and RM13516 were reported previously [6]. The LEEs in strains 10942, 112648, and 122715 were based on the genome annotations reported previously [41]. The LEE regions in O145:H25 strains were defined by IslandViewer4 in this study. The LEE regions in environmental STEC O145:H28 strains were defined by BLASTn search of a database containing all STEC genomes examined in this study using the RM13514 LEE as a query.
b The OI-122 PAIs in strains RM13514 and RM13516 were reported previously [6]. The OI-122 PAIs in strains 10942, 112648, and 122715 were based on the genome annotations reported previously [41]. The OI-122 regions in environmental STEC O145:H28 strains and in O145:H25 strains were defined by BLASTn search of a database containing all STEC genomes examined in this study using the OI-122 of strains EDL933 and RM13514 as queries. The tRNA genes pheV and pheU are identical; in some genomes this tRNA gene was annotated as pheV, whereas in others it was annotated as pheU. In strain RM12275-C1, this integration site is unoccupied.
c The OI-57 regions in O145 strains were defined by BLASTn search of a database containing all STEC genomes examined in this study using the EDL933 OI-57 as a query.
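The island sizes in the table follow from the chromosomal coordinates as an inclusive span. The sketch below shows the arithmetic; the helper names are hypothetical, and reading a start coordinate larger than the end coordinate as reverse orientation is an assumption of this example rather than something the table states.

```python
def parse_location(location: str) -> tuple[int, int]:
    """Parse a 'start–end' string with thousands separators, as in the table."""
    start, end = location.replace(",", "").split("–")
    return int(start), int(end)


def island_span(location: str) -> tuple[int, str]:
    """Return the inclusive length in bp and an assumed orientation."""
    start, end = parse_location(location)
    length = abs(end - start) + 1          # both end coordinates are counted
    orientation = "+" if start <= end else "-"  # assumption: start > end is reverse
    return length, orientation
```

For example, the EDL933 LEE at 4,649,862–4,693,279 spans 4,693,279 − 4,649,862 + 1 = 43,418 bp, and the CFSAN004176 OI-122 at 3,818,649–3,780,408 spans 38,242 bp, matching the sizes listed for those rows.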
Genetic features of the Tellurite Resistance Island (TRI), Locus of Adhesion and Autoaggregation (LAA), and Acid Fitness Island (AFI).
| Strains | a TRI Related tRNA Genes | TRI Chromosomal Locations | TRI Size (bp)/ | LAA Module I Location/Size (bp) | LAA Module II Location/Size (bp) | LAA Module III Location/Size (bp) | LAA Module IV Location/Size (bp) | AFI Chromosomal Locations | AFI Size (bp)/ |
|---|---|---|---|---|---|---|---|---|---|
| EDL933 | OI-43/ | OI-43:1,058,620–1,146,182 | 87,563/106 | OI-43/864 | OI-43/ | OI-43/ | OI-43/15,455 | 4,454,268–4,476,943 | 22,676/21 |
| | | OI-48:1,454,242–1,541,789 | 87,548/105 | OI-48/864 | OI-48/ | OI-48/ | OI-48/15,455 | | |
| CFSAN004176 | | 3,679,064–3,626,942 | 52,123/61 | NA | b TRI/ | NA | TRI/2486 | 3,171,287–3,157,668 | 13,620/13 |
| CFSAN004177 | | 1,510,368–1,562,482 | 52,115/60 | NA | b TRI/ | NA | TRI/2486 | 2,018,117–2,031,735 | 13,619/13 |
| RM13514 | TRI-1/ | TRI-1:1,241,544–1,322,994 | 81,451/103 | TRI-1/863 | TRI-1/ | TRI-1/ | TRI-1/15,593; | 4,360,251–4,382,890 | 22,640/24 |
| | | TRI-2:3,926,463–3,861,137 | 65,327/67 | | | | | | |
| RM13516 | | 1,206,887–1,303,036 | 96,150/111 | TRI/863 | TRI/5776 | TRI/7083 | TRI/14,084; | 4,224,613–4,247,252 | 22,640/24 |
| 10942 | | 1,202,851–1,284,087 | 81,237/97 | TRI/863 | TRI/5777 | TRI/9945 | TRI/14,142 | 4,243,061–4,265,700 | 22,640/20 |
| 112648 | | 1,280,897–1,362,131 | 81,235/97 | TRI/863 | TRI/5777 | TRI/9696 | TRI/14,140; | 4,352,804–4,375,443 | 22,640/20 |
| 122715 | | 1,223,578–1,303,501 | 79,924/94 | TRI/863 | TRI/ | TRI/ | TRI/14,142; | 4,251,163–4,273,802 | 22,640/20 |
| 95-3192 | | 856,722–775,651 | 81,072/86 | TRI/862 | TRI/ | TRI/ | TRI/14,141; | 2,701,442–2,724,081 | 22,640/21 |
| 2015C-3125 | | 1,095,410–1,175,328 | 79,919/86 | TRI/862 | TRI/ | TRI/ | TRI/14,142; | 4,183,005–4,205,643 | 22,639/21 |
| RM8843-C1 | | 1,852,148–1,933,384 | 81,237/99 | TRI/863 | TRI/ | TRI/ | TRI/14,142; | 4,287,369–4,310,008 | 22,640/22 |
| RM8988-C1 | | 1,980,371–1,899,135 | 81,237/99 | TRI/863 | TRI/ | TRI/ | TRI/14,142; | 4,287,136–4,309,775 | 22,640/22 |
| RM8995-C1 | | 2,961,815–2,880,586 | 81,230/99 | TRI/863 | TRI/ | TRI/ | TRI/14,140; | 4,286,971–4,309,610 | 22,640/22 |
| RM9154-C1 | | 1,222,582–1,291,597 | 69,016/90 | TRI/863 | TRI/ | TRI/ | TRI/14,142; | 4,036,160–4,058,799 | 22,640/22 |
| RM9467-C1 | | 1,223,947–1,305,183 | 81,237/99 | TRI/863 | TRI/ | TRI/ | TRI/14,141; | 4,225,103–4,247,742 | 22,640/22 |
| RM9872-C1 | | 2,450,129–2,368,893 | 81,237/103 | TRI/863 | TRI/ | TRI/ | TRI/14,142; | 4,225,111–4,247,750 | 22,640/24 |
| RM9873-C1 | | 1,223,939–1,305,171 | 81,233/99 | TRI/863 | TRI/ | TRI/ | TRI/14,140; | 4,225,047–4,247,685 | 22,639/22 |
| RM10425-C1 | | 2,407,270–2,326,034 | 81,237/99 | TRI/863 | TRI/ | TRI/ | TRI/14,142; | 4,182,248–4,204,887 | 22,640/22 |
| RM11626-C1 | | 2,879,078–2,796,529 | 82,550/101 | TRI/863 | TRI/ | TRI/ | TRI/14,142; | 4,253,636–4,276,275 | 22,640/22 |
| RM12275-C1 | | 1,269,895–1,351,122 | 81,228/99 | TRI/863 | TRI/ | TRI/ | TRI/14,141 | 4,286,927–4,309,564 | 22,638/22 |
| RM12367-C1 | | 1,223,885–1,305,121 | 81,237/99 | TRI/863 | TRI/ | TRI/ | TRI/14,142; | 4,311,611–4,334,250 | 22,640/22 |
| RM12522-C8 | | 1,274,000–1,356,546 | 82,547/101 | TRI/863 | TRI/ | TRI/ | TRI/14,142; | 4,253,530–4,276,169 | 22,640/22 |
a The TRI islands in strain EDL933 are OI-43 and OI-48 as reported previously [17]. The TRI regions in all O145:H28 strains were defined by BLASTn search of a database containing all STEC genomes examined in this study using the EDL933 OI-48 as a query. The TRI regions in O145:H25 strains were defined by BLAST search of each genome using TRI-2 (inserted downstream of the ileX gene in strain RM13514) as a query, as no homologs were detected when OI-48 was used as a query.
b This 5 kb LAA Module II segment in both O145:H25 strains mainly carries the virulence gene lesP (espC). A homolog of this segment is present on the virulence plasmid pEHEC.
c IE06 is based on the annotation reported previously [6].
Figure 4. Sequence analyses of TRI insertion sites in STEC. In all panels, yellow blocks mark the direct repeats (DRs) bordering each TRI. (A): STEC O157:H7 strain EDL933. The insertion site in tRNA gene ileX is unoccupied. (B): STEC O145:H28 strain RM13514. The insertion site in tRNA gene serW is unoccupied. (C): STEC O145:H28 strain RM13516. The insertion sites in tRNA genes serW and ileX are unoccupied. (D): STEC O145:H25 strain CFSAN004176. The insertion sites in tRNA genes serW and serX are unoccupied.