Recent Genetic Changes Affecting Enterohemorrhagic Escherichia coli Causing Recurrent Outbreaks.

Joshua L Cherry

Abstract

Enterohemorrhagic E. coli (EHEC) is responsible for significant human illness, death, and economic loss. The main reservoir for EHEC is cattle, but plant-based foods are common vectors for human infection. Several outbreaks have been attributed to lettuce and leafy green vegetables grown in the Salinas and Santa Maria regions of California. Bacteria causing different outbreaks are mostly not close relatives, but one group of closely-related O157:H7 has caused several of them. This unusual pattern of recurrence may have some genetic basis. Here I use whole-genome sequences to reconstruct the genetic changes that occurred in the recent ancestry of this EHEC. In a short period of time corresponding to little genetic change, there were several changes to adhesion-related sequences, mainly adhesins. These changes may have greatly altered the adhesive properties of the bacteria. Possible consequences include increased persistence of cattle infections, more bacteria shed in cattle feces, and greater virulence in humans. Similar constellations of genetic change, which are detectable by current sequencing-based surveillance, may identify other bacteria that are particular threats to human health. In addition, the Santa Maria subclade carries a nonsense mutation affecting ArsR, a repressor of genes that confer resistance to arsenic and antimony. This suggests that the persistent source of Santa Maria contamination is located in an area with arsenic-contaminated groundwater, a problem in many parts of California. This inference may aid identification of the reservoir of EHEC, which would greatly aid mitigation efforts.

IMPORTANCE Food-borne bacterial infections cause substantial illness and death. Understanding how bacteria contaminate food and cause disease is important for combating the problem. Closely-related E. coli, likely originating in cattle, have repeatedly caused outbreaks spread by vegetables grown in California. Such recurrence is atypical, and might have a genetic basis. The genetic changes that occurred in the recent ancestry of these E. coli can be reconstructed from their DNA sequences. Several mutations affect genes involved in bacterial adhesion. These might affect persistence of infection in cattle, quantity of bacteria in their feces, and human disease. They also suggest a way of detecting dangerous bacteria from their genome sequences. Furthermore, a subgroup carries a mutation affecting the regulation of genes conferring arsenic resistance. This suggests that the reservoir for contamination utilizes groundwater contaminated with arsenic, a problem in parts of California. This observation may be an aid to locating the persistent reservoir of contamination.

Keywords:  Escherichia coli; adhesins; adhesion molecules; arsenic resistance; food-borne pathogens

Year:  2022        PMID: 35467376      PMCID: PMC9241674          DOI: 10.1128/spectrum.00501-22

Source DB:  PubMed          Journal:  Microbiol Spectr        ISSN: 2165-0497


OBSERVATION

Enterohemorrhagic E. coli (EHEC) cause substantial human morbidity and mortality. They produce Shiga toxin and are most commonly of serotype O157:H7 (1–3). Their reservoir is mainly cattle, which, unlike humans, do not experience severe symptoms as a result of infection. Some strains can colonize the rectoanal junction (RAJ) of cattle (4), leading to a persistent infection and in some cases a “supershedder” phenomenon (5, 6), in which orders of magnitude more bacteria are excreted in the feces.

Lettuce and leafy green vegetables grown in the Salinas and Santa Maria Valleys of California have been vectors for several outbreaks of EHEC in recent years. Though most have been caused by O157:H7 strains, in most cases strains causing different outbreaks have not been otherwise very closely related. Some of them, however, were recurrent outbreaks of very closely related O157:H7 (7, 8), differentiated by just a few single-nucleotide polymorphisms (SNPs) in the entire ~5 Mb genome. Although the E. coli causing the recurrent outbreaks from Salinas are distinguishable from those from Santa Maria, only a few SNPs separate them. This suggests that genetic traits of the bacteria contribute to the recurrence of contamination or to the human infection rate or symptom severity.

The NCBI Pathogens database contains clusters of closely-related isolates of foodborne pathogen species, including E. coli. It also provides a phylogenetic tree for each cluster, and information about single-nucleotide polymorphisms (SNPs) within the cluster. The history of nucleotide changes along the branches of the tree can be inferred by maximum parsimony reconstruction of ancestral states (9, 10). The effects of nucleotide changes on proteins can be determined because their location in the genome is known, and the genome is annotated with the coding sequences. Details of these procedures are given in Text S1 in the supplemental material.

Fig. 1 shows the phylogenetic tree for the cluster of E. coli containing the recurrent outbreak isolates (PDS000035073.159). The clade of isolates attributed to Salinas is shown in blue, and those attributed to Santa Maria are shown in red. Together with one other isolate, these form a larger clade. The isolates in the tree are all very closely related, differing by at most 69 SNPs.
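
The maximum parsimony step mentioned above can be illustrated for a single SNP site. The following is a minimal sketch of the bottom-up pass of Fitch's algorithm on a rooted binary tree. It is an assumption that this resembles the procedure detailed in Text S1; the function name, isolate names, tree shape, and alleles are all hypothetical.

```python
def fitch(tree, leaf_states):
    """Bottom-up Fitch pass for one site on a rooted binary tree.

    tree: a leaf name (str) or a (left, right) tuple of subtrees.
    leaf_states: dict mapping leaf name to its observed nucleotide.
    Returns (set of candidate states at this node, minimum number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch(tree[0], leaf_states)
    right_set, right_cost = fitch(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: keep both possibilities and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-isolate tree and alleles at one site (not from the paper).
tree = ((("iso_A", "iso_B"), "iso_C"), "iso_D")
alleles = {"iso_A": "T", "iso_B": "T", "iso_C": "C", "iso_D": "C"}

root_states, changes = fitch(tree, alleles)
print(root_states, changes)
```

For this toy input the root set is {'C'} and one change suffices. A top-down pass (not shown) would then assign that change to the branch above iso_A and iso_B, which is how a mutation gets placed on a particular branch of a tree such as the one in Fig. 1.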
FIG 1

Phylogenetic tree of the cluster containing the Salinas and Santa Maria recurrent outbreak strains. Triangles represent collapsed clades, and the number within each indicates the number of isolates in the clade. The total number of isolates is 494. Mutations of interest are indicated to the right of the branches on which they occurred. The scale bar corresponds to four SNPs. The height of the tree from the root to the most distant tip is 36 SNPs. The tree obtained from the NCBI Pathogens data was rerooted based on outgroup sequences. The uncollapsed tree is shown in Fig. S1 in the supplemental material. Information about the isolates can be found at the NCBI Pathogen Detection website (PDT000639468.2).

Adhesion-related mutations.

The line of descent from the root to the last common ancestor of the Salinas and Santa Maria clades is colored green in Fig. 1. Eighteen single-nucleotide changes that alter protein sequences are inferred to have occurred along this path: 2 nonsense mutations, and 16 that change one amino acid to another. Of these 18 mutations, 4 affect adhesins (Fig. 1, Table 1). Two are in the same gene and occur on the same branch of the tree. The gene encodes an immunoglobulin-like domain-containing protein, which contains several domains of types associated with adhesins. The others are a nonsense mutation in fdeC and a nonsynonymous mutation in bigA.
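
The tally above (2 nonsense mutations and 16 amino acid replacements) rests on classifying each single-nucleotide change by its effect on the affected codon. The sketch below shows one way to do that, assuming the standard genetic code and a coding sequence read on the forward strand. The function name and the example codons are hypothetical, since the source does not list the codons involved.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_change(codon, offset, new_base):
    """Classify a single-nucleotide change within one codon.

    codon: ancestral codon, e.g. "TGG"; offset: 0, 1, or 2; new_base: derived base.
    Returns "synonymous", "nonsense", or "missense".
    """
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    old_aa, new_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    if old_aa == new_aa:
        return "synonymous"
    if new_aa == "*":
        return "nonsense"
    return "missense"


# Illustrative codons only (assumed, not taken from the genome sequence).
print(classify_change("TGG", 1, "A"))  # Trp codon, G to A at the second position
print(classify_change("GCT", 1, "T"))  # Ala codon, C to T at the second position
print(classify_change("GCT", 2, "C"))  # Ala codon, T to C at the third position
```

Only changes classified as nonsense or missense would count toward the 18 protein-altering mutations; synonymous changes on the same branches are not included in that number.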
TABLE 1

Mutations discussed in the text

| Gene name | Protein description | Nucleotide sequence identifier | Nucleotide position | Nucleotide change | Protein sequence identifier | Protein position | Protein change | Protein length (ancestral) |
|---|---|---|---|---|---|---|---|---|
| - | Ig-like domain-containing protein | AAQXIR010000008.1 | 234093 | C→T | EEQ5358295.1 | 212 | A→V | 1461 |
| - | Ig-like domain-containing protein | AAQXIR010000008.1 | 234346 | T→G | EEQ5358295.1 | 296 | F→L | 1461 |
| fdeC | Intimin-like adhesin | AAQXIR010000008.1 | 9658 | C→A | EEV2727796.1 | 1177 | S→* | 1417 |
| bigA | Surface-exposed virulence protein | AAQXIR010000021.1 | 61345 | C→A | EEQ5359752.1 | 150 | P→Q | 1011 |
| yeeJ | Inverse autotransporter adhesin | AARLPB010000083.1 | 5978-7195 | Deletion | EEV2728103.1 | 1036-1441 | Deletion | 2660 |
| arpA | ShET2/EspL2 family type III secretion system effector toxin | AAQXIR010000007.1 | 279847 | G→T | EEV2726747.1 | 133 | E→* | 728 |
| arsR | Arsenic/antimony resistance repressor | AAQXIR010000005.1 | 121773 | G→A | EEQ5357400.1 | 89 | W→* | 117 |

The locations of the mutations on the nucleotide and protein sequences and their nature are described. Most of the sequence identifiers are for the reference sequence for the cluster, that of PDT000639468.2, a member of the Salinas clade. The exceptions are the protein sequences for fdeC and arpA (because protein sequences do not exist for the reference, which has premature stop codons) and the sequences for yeeJ (so that the full-length ancestral coding sequence and protein can be listed). These are taken from isolate PDT000430557.2, which is from the other side of the bifurcation at the root of the tree. The reference isolate has the derived allele for all genes but arsR. The protein lengths reported are for the ancestral protein sequences, which are longer than the derived sequences in the cases of nonsense mutation and deletion.

“-” indicates that no gene name applies. An asterisk (“*”) indicates a stop codon at the corresponding nucleotide positions.

In addition to these point mutations, a deletion of 1,218 bp (406 amino acids) occurred in the adhesin yeeJ, which mediates adhesion to abiotic surfaces and promotes biofilm formation (11, 12). The deletion is apparently the result of recombination between two copies of a 14 bp sequence in the gene. It eliminates 4 of the 17 immunoglobulin-like domains. The ancestral gene is similar in length to the longer variant previously described (12), but the deletion allele is distinct from the shorter variant and therefore might differ functionally.

Adhesins are found on the surface of bacteria and mediate adhesion to host cells, abiotic surfaces, and/or other bacterial cells, and can promote biofilm formation. Adhesins are involved in EHEC adherence to the cells of the RAJ (13). The adhesin mutations described above may alter their adhesive properties, enabling colonization of the RAJ or otherwise accounting for the recurrence of outbreaks. They might also contribute to human pathogenesis, though this effect would not be subject to selection because human infections do not usually spread far beyond the individual first infected.

The nonsense mutation in fdeC likely does not inactivate it. It occurs at position 1,177 of a 1,417 amino acid protein with many domains. An even shorter version of this protein binds to mammalian cells (14). Truncation may even be necessary for this binding (15). In any case, inactivation of adhesin genes can promote adhesion to cells of the RAJ (13).

A nonsense mutation also occurred in arpA, which encodes an ankyrin repeat protein (erroneously annotated in the MG1655 genome [16]). Deletion of arpA is associated with neonatal meningitis (17). The protein is distantly related to EspL2, an enterotoxin that promotes adhesion (18, 19), which suggests that this nonsense mutation affects adhesion.
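
The sizes quoted in this section can be checked against the coordinates listed in Table 1. The short sketch below does that arithmetic. The helper name is hypothetical, and the retained length computed for fdeC is a derived figure (residues preceding the stop codon), not a value stated in the source.

```python
def span_length(first, last):
    """Length of an inclusive coordinate range."""
    return last - first + 1


# yeeJ deletion, using the nucleotide and protein ranges listed in Table 1.
deleted_nt = span_length(5978, 7195)
deleted_aa = span_length(1036, 1441)
in_frame = deleted_nt == 3 * deleted_aa

# fdeC: a stop codon at position 1177 leaves the residues before it.
fdec_retained = 1177 - 1
fdec_fraction = fdec_retained / 1417

print(deleted_nt, deleted_aa, in_frame)
print(fdec_retained, round(fdec_fraction, 2))
```

The deletion comes out as a whole number of codons, consistent with the remaining yeeJ domains staying in frame, and the truncated fdeC product keeps most of the ancestral protein, which fits the argument that the nonsense mutation need not inactivate it.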

Constitutive expression of arsenic resistance genes.

A nonsense mutation in arsR occurred along the branch leading to the last common ancestor of the Santa Maria clade. This gene encodes the repressor of arsB and arsC, the expression of which confers resistance to arsenic and antimony. This mutation truncates the 117 amino acid protein to 88 residues. Truncation of a similar ArsR (95% identical) to 89 residues causes increased expression of a reporter gene in the absence of inducer, corresponding to about 20% of the induced level (20). Truncation to 87 residues abolishes repressor dimerization, which is apparently necessary for repression (20). The nonsense mutation is therefore expected to result in at least significant partial induction, and perhaps full induction, of the arsenic resistance genes in the absence of inducer. This mutation would likely be deleterious in the absence of arsenic or antimony, if only because of wasteful gene expression; this is the presumptive reason for the existence of the repressor. In the presence of high concentrations of arsenic or antimony, the mutation would not be deleterious, since the resistance genes would be fully expressed with or without it. It might, in fact, confer an advantage in an environment in which arsenic or antimony is present intermittently, as it would diminish or eliminate phenotypic lag for resistance.

Arsenic in groundwater is a problem in many regions, including parts of California. The problem is greatest in portions of the Central Valley (21–23), to the east and north of Santa Maria. This area is the location of extensive cattle operations, and might contain the persistent source of E. coli responsible for recurrent Santa Maria outbreaks. The E. coli could make its way to Santa Maria in manure used for fertilizer or through movement of cattle. Some sources report high arsenic levels in a smaller fraction of wells in the Santa Maria Valley (23). A search for a local reservoir could focus on cattle operations using water from wells with high arsenic levels.
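
The truncation argument made earlier in this section compares the observed ArsR length with two experimentally characterized truncations. The sketch below lays out that comparison. The helper and variable names are hypothetical, and the bracketing is only suggestive, because no measurement for a product of exactly 88 residues is cited.

```python
def residues_before_stop(stop_position):
    """Number of residues retained when a stop codon replaces this position."""
    return stop_position - 1


ARSR_LENGTH = 117
observed = residues_before_stop(89)  # W→* at protein position 89 (Table 1)

# Truncation lengths reported for a similar ArsR (reference 20), as summarized in the text.
reported = {
    89: "reporter expression about 20% of the induced level without inducer",
    87: "repressor dimerization abolished",
}

lower, upper = min(reported), max(reported)
is_bracketed = lower < observed < upper
residues_lost = ARSR_LENGTH - observed

print(observed, residues_lost, is_bracketed)
```

Because the observed length falls between the two characterized truncations, the expected effect lies somewhere between partial and full derepression, which is the range the text gives.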

Conclusion.

Changes to adhesion-related proteins likely play a role in the recurring EHEC outbreaks vectored by vegetables grown in the Salinas and Santa Maria Valleys. The combination of several such mutations may have radically altered the adhesive properties of the bacteria and the nature of host colonization. In addition to contributing to persistence and shedding in cattle, they might increase pathogenicity in humans, leading to a greater probability of outbreak detection and the reporting of a higher fraction of cases. Altered adhesive properties may play a more general role in foodborne outbreaks. The occurrence of multiple changes to adhesion-related sequences in a short time may indicate that the affected bacteria pose a particular threat to human health. The arsR nonsense mutation that affects the Santa Maria clade may be a clue to the location of the source of repeated contamination. The use of sequences for tracing sources of infection is usually based on the inferred relationships between isolates. The case of arsR exemplifies a different kind of inference, in which the expected phenotypic effects of mutations have implications for the environment in which a source of infections is growing.
  20 in total

Review 1.  The acetate switch.

Authors:  Alan J Wolfe
Journal:  Microbiol Mol Biol Rev       Date:  2005-03       Impact factor: 11.056

Review 2.  Super shedding of Escherichia coli O157:H7 by cattle and the impact on beef carcass contamination.

Authors:  Terrance M Arthur; Dayna M Brichta-Harhay; Joseph M Bosilevac; Norasak Kalchayanand; Steven D Shackelford; Tommy L Wheeler; Mohammad Koohmaraie
Journal:  Meat Sci       Date:  2010-09       Impact factor: 5.209

3.  Lymphoid follicle-dense mucosa at the terminal rectum is the principal site of colonization of enterohemorrhagic Escherichia coli O157:H7 in the bovine host.

Authors:  Stuart W Naylor; J Christopher Low; Thomas E Besser; Arvind Mahajan; George J Gunn; Michael C Pearce; Iain J McKendrick; David G E Smith; David L Gally
Journal:  Infect Immun       Date:  2003-03       Impact factor: 3.441

4.  Methylation-Induced Hypermutation in Natural Populations of Bacteria.

Authors:  Joshua L Cherry
Journal:  J Bacteriol       Date:  2018-11-26       Impact factor: 3.490

Review 5.  Molecular mechanisms of Escherichia coli pathogenicity.

Authors:  Matthew A Croxen; B Brett Finlay
Journal:  Nat Rev Microbiol       Date:  2010-01       Impact factor: 60.633

6.  Characterization of an anonymous molecular marker strongly linked to Escherichia coli strains causing neonatal meningitis.

Authors:  Olivier Clermont; Stéphane Bonacorsi; Edouard Bingen
Journal:  J Clin Microbiol       Date:  2004-04       Impact factor: 5.948

7.  FdeC, a novel broadly conserved Escherichia coli adhesin eliciting protection against urinary tract infections.

Authors:  Barbara Nesta; Glen Spraggon; Christopher Alteri; Danilo Gomes Moriel; Roberto Rosini; Daniele Veggi; Sara Smith; Isabella Bertoldi; Ilaria Pastorello; Ilaria Ferlenghi; Maria Rita Fontana; Gad Frankel; Harry L T Mobley; Rino Rappuoli; Mariagrazia Pizza; Laura Serino; Marco Soriani
Journal:  MBio       Date:  2012-04-10       Impact factor: 7.867

8.  Nonfimbrial Adhesin Mutants Reveal Divergent Escherichia coli O157:H7 Adherence Mechanisms on Human and Cattle Epithelial Cells.

Authors:  Matthew R Moreau; Indira T Kudva; Robab Katani; Rebecca Cote; Lingling Li; Terrance M Arthur; Vivek Kapur
Journal:  Int J Microbiol       Date:  2021-01-29

Review 9.  Enterohemorrhagic E. coli (EHEC) pathogenesis.

Authors:  Y Nguyen; Vanessa Sperandio
Journal:  Front Cell Infect Microbiol       Date:  2012-07-12       Impact factor: 5.293

10.  YeeJ is an inverse autotransporter from Escherichia coli that binds to peptidoglycan and promotes biofilm formation.

Authors:  Marta Martinez-Gil; Kelvin G K Goh; Elze Rackaityte; Chizuko Sakamoto; Bianca Audrain; Danilo G Moriel; Makrina Totsika; Jean-Marc Ghigo; Mark A Schembri; Christophe Beloin
Journal:  Sci Rep       Date:  2017-09-12       Impact factor: 4.379
