David A Rasko1,2, Felipe Del Canto3, Qingwei Luo4, James M Fleckenstein4,5, Roberto Vidal3,6, Tracy H Hazen1,2.
Abstract
Enterotoxigenic Escherichia coli (ETEC) is one of the most common diarrheal pathogens in the low- and middle-income regions of the world; however, a systematic examination of the genomic content of isolates from Chile has not yet been undertaken. Whole genome sequencing and comparative analysis of a collection of 125 ETEC isolates from three geographic locations in Chile allowed the interrogation of phylogenomic groups, sequence types and genes specific to isolates from the different geographic locations. A total of 80.8% (101/125) of the ETEC isolates were identified in E. coli phylogroup A, 15.2% (19/125) in phylogroup B, and 4.0% (5/125) in phylogroup E. The over-representation of genomes in phylogroup A was significantly different from other global ETEC genomic studies. The Chilean ETEC isolates could be further subdivided into sub-clades similar to previously defined global ETEC reference lineages that had conserved multi-locus sequence types and toxin profiles. Comparison of the gene content of the Chilean ETEC identified genes that were unique based on geographic location within Chile, phylogenomic classification or sequence type. Completion of a limited number of genomes provided insight into the ETEC plasmid content, which is conserved in some phylogenomic groups and not conserved in others. These findings suggest that the Chilean ETEC isolates contain unique virulence factor combinations and genomic content compared to global reference ETEC isolates.
Year: 2019 PMID: 31747410 PMCID: PMC6901236 DOI: 10.1371/journal.pntd.0007828
Source DB: PubMed Journal: PLoS Negl Trop Dis ISSN: 1935-2727
In silico determined characteristics of the ETEC genomes selected for complete genome sequencing.
| Strain | Contig Description | Sequence Length (bp) | GC-content (%) | Completion Level | Contig Name | Plasmid Replicon | Virulence Genes | Accession Nos. |
|---|---|---|---|---|---|---|---|---|
| 10754_a_1 | chromosome | 4,897,493 | 50.72 | not circular | 10754a1_chromosome | NA | T6SS, | CP025976 |
| | plasmid 1 | 92,477 | 46.77 | circular | 10754a1_p10754a1_92 | IncFII | STh, | CP025977 |
| | plasmid 2 | 46,623 | 45.9 | circular | 10754a1_p10754a1_46 | IncFIB(AP001918) | CS21-like | CP025978 |
| 10802_a | chromosome | 4,872,344 | 50.71 | circular | 10802a_chromosome | NA | T2SS, T6SS, | CP025973 |
| | plasmid 1 | 92,479 | 46.77 | circular | 10802a_p10802a_92 | IncFII | STh, CFA/I, | CP025974 |
| | plasmid 2 | 46,623 | 45.9 | circular | 10802a_p10802a_46 | IncFIB(AP001918) | CS21-like | CP025975 |
| 11573_a_1 | chromosome | 4,902,738 | 50.72 | not circular | 11573a1_chromosome | NA | T2SS, | CP025970 |
| | plasmid 1 | 92,481 | 46.77 | circular | 11573a1_p11573a1_92 | IncFII | STh, | CP025971 |
| | plasmid 2 | 46,623 | 45.9 | circular | 11573a1_p11573a1_46 | IncFIB(AP001918) | CS21-like | CP025972 |
| 2407_a | chromosome | 4,941,120 | 50.89 | circular | 2407a_chromosome | NA | T2SS, T6SS | CP025967 |
| | plasmid 1 | 135,437 | 48.57 | circular | 2407a_p2407a_135 | IncFII | STh, CS6, CS5, | CP025968 |
| | plasmid 2 | 9,863 | 49.15 | circular | 2407a_p2407a_9 | unknown | none | CP025969 |
Fig 1. Phylogenomic analysis of the Chilean ETEC isolates.
The whole-genome sequences of the Chilean ETEC isolates were compared with previously sequenced E. coli and Shigella genomes listed in S2 Table using a single nucleotide polymorphism (SNP)-based approach as previously described [46, 47]. SNPs were detected relative to the completed genome sequence of the laboratory isolate E. coli IAI39 using the In Silico Genotyper (ISG) [47]. A total of 221,978 conserved SNP sites, which were present in all of the genomes analyzed, were concatenated into a representative sequence for each genome. A maximum-likelihood phylogeny with 100 bootstrap replicates was inferred using RAxML v.7.2.8 [72]. Isolates designated in red are from Chile, isolates designated in blue are the ETEC lineage references identified in von Mentzer et al. [15], and isolates designated in black are reference E. coli and Shigella isolates representing other pathotypes and phylogenomic groups. The letters (A, B1, B2, D, E, and F) designate the E. coli and Shigella phylogroups that were previously defined [73, 74]. Colored circles indicate the country of origin on the inner ring and the heat-labile toxin (LT) and heat-stable toxin (ST) status on the middle and outer rings, respectively. The green arrows indicate the genomes that were completed using Pacific Biosciences sequencing. The scale bar represents a distance of 0.04 nucleotide substitutions per site.
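The concatenation of conserved SNP sites described in this caption can be illustrated with a minimal sketch. It assumes a toy dictionary of per-genome base calls keyed by reference position; the data, the function name `concatenate_conserved_snps`, and the filtering rules are hypothetical and are not taken from ISG itself. The sketch keeps only positions with an unambiguous call in every genome and at least two distinct bases, then joins them into one sequence per genome.

```python
VALID_BASES = {"A", "C", "G", "T"}


def concatenate_conserved_snps(calls):
    """Build one concatenated SNP string per genome.

    calls: dict mapping genome name -> dict of reference position -> base.
    A position missing from a genome, or called as anything other than
    A/C/G/T, is treated as "not present" in that genome.
    """
    genomes = sorted(calls)
    # Positions with an unambiguous base call in every genome.
    shared = set.intersection(
        *({pos for pos, base in calls[g].items() if base in VALID_BASES}
          for g in genomes)
    )
    # Of those, keep positions where at least two different bases occur.
    snp_sites = sorted(
        pos for pos in shared if len({calls[g][pos] for g in genomes}) > 1
    )
    sequences = {g: "".join(calls[g][pos] for pos in snp_sites) for g in genomes}
    return sequences, snp_sites


# Hypothetical toy data: three genomes, five candidate positions.
toy_calls = {
    "ref":  {10: "A", 20: "C", 30: "G", 40: "G", 50: "T"},
    "iso1": {10: "T", 20: "C", 30: "G", 40: "G", 50: "N"},
    "iso2": {10: "A", 20: "C", 40: "A", 50: "C"},
}
sequences, snp_sites = concatenate_conserved_snps(toy_calls)
```

In this toy input, position 20 is invariant, position 30 is missing from one genome, and position 50 has an ambiguous call, so only positions 10 and 40 contribute to the concatenated sequences. Sequences of this kind are what a phylogeny program would take as its alignment.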
Fig 2. Distribution of virulence plasmids among the Chilean ETEC isolates.
Heat maps indicate the presence of the virulence plasmids A) p10802a_92 and B) p10802a_46 among the Chilean ETEC and reference ETEC isolates analyzed in this study. The predicted protein-coding genes of each plasmid were identified using BLASTN LS-BSR [48] as previously described [43]. Each row represents an individual genome that is labeled on the left side by its ETEC toxin content as having the heat labile toxin (LT), heat stable toxin (ST), both LT and ST, or neither LT nor ST. Each column represents a different protein-coding gene of the reference plasmid being compared. Virulence factors of interest are indicated by a red box. A dashed line box indicates a group of genomes that contain all or nearly all of the plasmid genes.
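As a rough illustration of how a BLAST score ratio (BSR) can be turned into the presence/absence calls shown in such heat maps, the sketch below divides a gene's bit score in a query genome by the bit score of the gene aligned against itself. The function names, the toy bit scores, and the 0.8 cutoff are assumptions made for illustration; the caption does not state which thresholds were applied, and this is not the LS-BSR code.

```python
def blast_score_ratio(query_bits, reference_bits):
    """Query bit score normalized by the gene's self-alignment bit score."""
    if reference_bits <= 0:
        raise ValueError("reference bit score must be positive")
    return query_bits / reference_bits


def gene_presence(genome_bits, reference_bits, threshold=0.8):
    """Return {gene: True/False} for one genome.

    threshold is an assumed cutoff, not a value taken from the text.
    Genes with no hit in the genome are scored as 0.0.
    """
    return {
        gene: blast_score_ratio(genome_bits.get(gene, 0.0), ref) >= threshold
        for gene, ref in reference_bits.items()
    }


# Hypothetical bit scores for three plasmid genes in one genome.
reference_bits = {"gene_1": 1000.0, "gene_2": 500.0, "gene_3": 800.0}
genome_bits = {"gene_1": 950.0, "gene_2": 100.0}  # gene_3 has no hit
presence = gene_presence(genome_bits, reference_bits)
```

Each row of a heat map would correspond to one such `presence` dictionary, with one column per gene of the reference plasmid.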
Virulence factor prevalence.
| Gene | Traditional PCR (number) | Traditional PCR (%) | Genomics (number) | Genomics (%) |
|---|---|---|---|---|
| LT-I_ | NT | NT | 71 | 56.8 |
| LT-I_ | 71 | 55.5 | 69 | 55.2 |
| STIa_STp_H10407 | 23 | 18.0 | 19 | 15.2 |
| STIb_STh | 98 | 76.6 | 99 | 79.2 |
| | 97 | 75.2 | 85 | 68.0 |
| | 105 | 81.4 | 2 | 1.6 |
| | 92 | 71.3 | 92 | 73.6 |
| | 18 | 14.0 | 9 | 7.2 |
a Prevalence as calculated by LS-BSR in genome data
NT = not tested
c These samples were also tested by western blot for EatA/EtpA; 86 and 91 of the isolates were positive, respectively
Colonization factor prevalence.
| Colonization factor | Traditional PCR (number) | Traditional PCR (%) | Genomics (number) | Genomics (%) |
|---|---|---|---|---|
| CFAI | 30 | 23.4 | 32 | 25.6 |
| CS1 | 23 | 18.0 | 23 | 18.4 |
| CS2 | 19 | 14.8 | 19 | 15.2 |
| CS3 | 38 | 29.7 | 40 | 32.0 |
| CS5 | 2 | 1.6 | 4 | 3.2 |
| CS6 | 14 | 10.9 | 13 | 10.4 |
| CS7 | 0 | 0.0 | 4 | 3.2 |
| CS8 | 6 | 4.7 | 60 | 48.0 |
| CS12 | 5 | 3.9 | 7 | 5.6 |
| CS15 | 5 | 3.9 | ND | 0.0 |
| CS17 | 2 | 1.6 | 3 | 2.4 |
| CS19 | 1 | 0.8 | 3 | 2.4 |
| CS20 | 17 | 13.3 | 4 | 3.2 |
| CS21 | 106 | 82.8 | 81 | 64.8 |
| CS23 | 1 | 0.8 | 6 | 4.8 |
| CS27b | 0 | 0.0 | 2 | 1.6 |
| NT | 5 | 3.9 | 6 | 4.8 |
| Novel_CF_TW10509 | NT | NT | 4 | 3.2 |
| Novel_CF_TW11786 | NT | NT | 1 | 0.8 |
| Novel_CF_PCFO71 | NT | NT | 23 | 18.4 |
| CFAI_variant | NT | NT | 14 | 11.2 |
a Prevalence as calculated by LS-BSR in genome data
b Protein ID EMV36291.1
ND = not detected
NT = not tested
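The genomics percentages in this table are consistent with simple proportions of the 125 sequenced isolates. A minimal sketch of that arithmetic follows; the helper name `prevalence_percent` is hypothetical, and the denominator for the traditional PCR columns is not stated in the text, so only genomics counts are used.

```python
def prevalence_percent(count, total):
    """Percentage of isolates carrying a factor, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


# Genomics counts copied from the table; 125 is the number of sequenced isolates.
TOTAL_GENOMES = 125
genomics_counts = {"CFAI": 32, "CS8": 60, "CS21": 81}
genomics_percent = {
    factor: prevalence_percent(n, TOTAL_GENOMES)
    for factor, n in genomics_counts.items()
}
```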