Joshua L Cherry.
Abstract
Enterohemorrhagic E. coli (EHEC) is responsible for significant human illness, death, and economic loss. The main reservoir for EHEC is cattle, but plant-based foods are common vectors for human infection. Several outbreaks have been attributed to lettuce and leafy green vegetables grown in the Salinas and Santa Maria regions of California. Bacteria causing different outbreaks are mostly not close relatives, but one group of closely-related O157:H7 has caused several of them. This unusual pattern of recurrence may have some genetic basis. Here I use whole-genome sequences to reconstruct the genetic changes that occurred in the recent ancestry of this EHEC. In a short period of time corresponding to little genetic change, there were several changes to adhesion-related sequences, mainly adhesins. These changes may have greatly altered the adhesive properties of the bacteria. Possible consequences include increased persistence of cattle infections, more bacteria shed in cattle feces, and greater virulence in humans. Similar constellations of genetic change, which are detectable by current sequencing-based surveillance, may identify other bacteria that are particular threats to human health. In addition, the Santa Maria subclade carries a nonsense mutation affecting ArsR, a repressor of genes that confer resistance to arsenic and antimony. This suggests that the persistent source of Santa Maria contamination is located in an area with arsenic-contaminated groundwater, a problem in many parts of California. This inference may aid identification of the reservoir of EHEC, which would greatly aid mitigation efforts.

IMPORTANCE Food-borne bacterial infections cause substantial illness and death. Understanding how bacteria contaminate food and cause disease is important for combating the problem. Closely-related E. coli, likely originating in cattle, have repeatedly caused outbreaks spread by vegetables grown in California. Such recurrence is atypical, and might have a genetic basis. The genetic changes that occurred in the recent ancestry of these E. coli can be reconstructed from their DNA sequences. Several mutations affect genes involved in bacterial adhesion. These might affect persistence of infection in cattle, quantity of bacteria in their feces, and human disease. They also suggest a way of detecting dangerous bacteria from their genome sequences. Furthermore, a subgroup carries a mutation affecting the regulation of genes conferring arsenic resistance. This suggests that the reservoir for contamination utilizes groundwater contaminated with arsenic, a problem in parts of California. This observation may be an aid to locating the persistent reservoir of contamination.
Keywords: Escherichia coli; adhesins; adhesion molecules; arsenic resistance; food-borne pathogens
Year: 2022 PMID: 35467376 PMCID: PMC9241674 DOI: 10.1128/spectrum.00501-22
Source DB: PubMed Journal: Microbiol Spectr ISSN: 2165-0497
FIG 1. Phylogenetic tree of the cluster containing the Salinas and Santa Maria recurrent outbreak strains. Triangles represent collapsed clades, and the number within each indicates the number of isolates in the clade. The total number of isolates is 494. Mutations of interest are indicated to the right of the branches on which they occurred. The scale bar corresponds to four SNPs. The height of the tree from the root to the most distant tip is 36 SNPs. The tree obtained from the NCBI Pathogens data was rerooted based on outgroup sequences.
Mutations discussed in the text
| Gene name | Protein description | Nucleotide sequence identifier | Nucleotide position | Nucleotide change | Protein sequence identifier | Protein position | Protein change | Protein length (ancestral) |
|---|---|---|---|---|---|---|---|---|
| - | Ig-like domain-containing protein | AAQXIR010000008.1 | 234093 | C→T | EEQ5358295.1 | 212 | A→V | 1461 |
| - | Ig-like domain-containing protein | AAQXIR010000008.1 | 234346 | T→G | EEQ5358295.1 | 296 | F→L | 1461 |
| fdeC | Intimin-like adhesin | AAQXIR010000008.1 | 9658 | C→A | EEV2727796.1 | 1177 | S→* | 1417 |
| | Surface-exposed virulence protein | AAQXIR010000021.1 | 61345 | C→A | EEQ5359752.1 | 150 | P→Q | 1011 |
| yeeJ | Inverse autotransporter adhesin | AARLPB010000083.1 | 5978-7195 | Deletion | EEV2728103.1 | 1036-1441 | Deletion | 2660 |
| arpA | ShET2/EspL2 family type III secretion system effector toxin | AAQXIR010000007.1 | 279847 | G→T | EEV2726747.1 | 133 | E→* | 728 |
| arsR | Arsenic/antimony resistance repressor | AAQXIR010000005.1 | 121773 | G→A | EEQ5357400.1 | 89 | W→* | 117 |
The locations of the mutations on the nucleotide and protein sequences and their nature are described. Most of the sequence identifiers are for the reference sequence for the cluster, that of PDT000639468.2, a member of the Salinas clade. The exceptions are the protein sequences for fdeC and arpA (because protein sequences do not exist for the reference, which has premature stop codons) and the sequences for yeeJ (so that the full-length ancestral coding sequence and protein can be listed). These are taken from isolate PDT000430557.2, which is from the other side of the bifurcation at the root of the tree. The reference isolate has the derived allele for all genes but arsR. The protein lengths reported are for the ancestral protein sequences, which are longer than the derived sequences in the cases of nonsense mutation and deletion.
“-” indicates that no gene name applies. An asterisk (“*”) indicates a stop codon at the corresponding nucleotide positions.
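The table notes state that the derived proteins are shorter than the ancestral ones for the nonsense mutations and the deletion. The following minimal Python sketch works through that arithmetic using only the positions and lengths listed in the table. It rests on two assumptions that the source does not state: a premature stop at protein position p leaves p − 1 residues (no read-through or downstream reinitiation), and the deletion removes exactly the listed residues and changes nothing else. The function names `truncated_length` and `deletion_summary` are hypothetical and chosen for illustration.

```python
def truncated_length(stop_position: int) -> int:
    """Residues remaining when a stop codon replaces the codon at stop_position.

    Assumption: translation ends at the new stop, so positions
    1 .. stop_position - 1 are retained.
    """
    return stop_position - 1


def deletion_summary(nt_start: int, nt_end: int,
                     aa_start: int, aa_end: int,
                     ancestral_len: int) -> dict:
    """Summarise a deletion given inclusive nucleotide and protein ranges."""
    nt_deleted = nt_end - nt_start + 1
    aa_deleted = aa_end - aa_start + 1
    return {
        "nt_deleted": nt_deleted,
        "aa_deleted": aa_deleted,
        # In frame if the nucleotide span is a whole number of codons
        # that matches the number of residues removed.
        "in_frame": nt_deleted % 3 == 0 and nt_deleted // 3 == aa_deleted,
        # Assumption: the rest of the protein is unchanged.
        "derived_len": ancestral_len - aa_deleted,
    }


# (protein description, ancestral length, position of the new stop), from the table
nonsense_rows = [
    ("Intimin-like adhesin", 1417, 1177),
    ("ShET2/EspL2 family type III secretion system effector toxin", 728, 133),
    ("Arsenic/antimony resistance repressor", 117, 89),
]

for description, ancestral_len, stop_position in nonsense_rows:
    kept = truncated_length(stop_position)
    print(f"{description}: {kept} of {ancestral_len} residues retained")

# Inverse autotransporter adhesin deletion row from the table
print(deletion_summary(5978, 7195, 1036, 1441, 2660))
```

Under these assumptions, the arsenic/antimony resistance repressor retains 88 of 117 residues, the intimin-like adhesin 1176 of 1417, and the effector toxin 132 of 728. The listed deletion spans 1218 nucleotides, which corresponds to the 406 residues listed and is therefore consistent with an in-frame deletion that leaves 2254 residues.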