Tracy H Hazen, Sean C Daugherty, Amol C Shetty, James P Nataro, David A Rasko.
Abstract
Enteropathogenic Escherichia coli (EPEC) bacteria are a diverse group of pathogens that cause moderate to severe diarrhea in young children in developing countries. EPEC isolates can be further subclassified as typical EPEC (tEPEC) isolates that contain the bundle-forming pilus (BFP) or as atypical EPEC (aEPEC) isolates that do not contain BFP. Comparative genomics studies have recently highlighted the considerable genomic diversity among EPEC isolates. In the current study, we used RNA sequencing (RNA-Seq) to characterize the global transcriptomes of eight tEPEC isolates representing the identified genomic diversity, as well as one aEPEC isolate. The global transcriptomes were determined for the EPEC isolates under conditions of laboratory growth that are known to induce expression of virulence-associated genes. The findings demonstrate that unique genes of EPEC isolates from diverse phylogenomic lineages contribute to variation in their global transcriptomes. There were also phylogroup-specific differences in the global transcriptomes, including genes involved in iron acquisition, which had significant differential expression in the EPEC isolates belonging to phylogroup B2. Also, three EPEC isolates from the same phylogenomic lineage (EPEC8) had greater levels of similarity in their genomic content and exhibited greater similarities in their global transcriptomes than EPEC from other lineages; however, even among closely related isolates there were isolate-specific differences among their transcriptomes. These findings highlight the transcriptional variability that correlates with the previously unappreciated genomic diversity of EPEC.
IMPORTANCE Recent studies have demonstrated that there is considerable genomic diversity among EPEC isolates; however, it is unknown if this genomic diversity leads to differences in their global transcription. This study used RNA-Seq to compare the global transcriptomes of EPEC isolates from diverse phylogenomic lineages. We demonstrate that there are lineage- and isolate-specific differences in the transcriptomes of genomically diverse EPEC isolates during growth under in vitro virulence-inducing conditions. This study addressed biological variation among isolates of a single pathovar in an effort to demonstrate that while each of these isolates is considered an EPEC isolate, there is significant transcriptional diversity among members of this pathovar. Future studies should consider whether this previously undescribed transcriptional variation may play a significant role in isolate-specific variability of EPEC clinical presentations.
Keywords: EPEC; Escherichia coli; diversity; transcriptome
Year: 2017 PMID: 28766584 PMCID: PMC5527300 DOI: 10.1128/mSystems.00024-17
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1 Phylogenomic analysis of representative EPEC isolates. The genome sequences of representative EPEC isolates were compared with those of a reference collection of diverse E. coli and Shigella isolates that had been sequenced previously and are available in the public domain. The genomes were aligned using Mugsy (95) as previously described (31, 94). A 1.9-Mb aligned region was used to generate a maximum-likelihood phylogeny with 100 bootstrap replicates using RAxML v.7.2.8 (97), and the results were visualized using FigTree v.1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/). The representative EPEC isolates that were analyzed using RNA-Seq are indicated in bold.
Number of shared or unique genes identified using LS-BSR analysis
| Isolate ID | Location of isolation | Date of isolation | Phylogenomic lineage | Phylogroup | No. of isolate-specific genes |
|---|---|---|---|---|---|
| E2348/69 | England | 1969 | EPEC1 | B2 | 206 |
| B171 | United States | 1983 | EPEC2 | B1 | 131 |
| C581-05 | Africa | NK | EPEC4 | B2 | 141 |
| 401140 | Kenya | 2008 | EPEC5 | A | 90 |
| 402290 | Kenya | 2009 | EPEC7 | B1 | 169 |
| 401588 | Kenya | 2008 | EPEC8 | B2 | 212 |
| 302053 | Mozambique | 2009 | EPEC9 | B2 | 53 |
| 100329 | The Gambia | 2008 | EPEC10 | A | 62 |
| E110019 | Finland | 1987 | None | B1 | 164 |
ID, identifier; NK, date of isolation not known.
The total number of core gene clusters (LS-BSR ≥ 0.8) in all EPEC isolates.
The numbers of gene clusters with significant similarity (LS-BSR ≥ 0.8) in all genomes of one phylogroup that were divergent (LS-BSR < 0.8, ≥ 0.4) or absent (LS-BSR < 0.4) from genomes of other phylogroups were 44 (phylogroup A), 62 (phylogroup B1), and 128 (phylogroup B2).
The isolate-specific genes are those identified in one genome with an LS-BSR ≥ 0.8 and in the other genomes with an LS-BSR < 0.4.
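The LS-BSR thresholds in these footnotes (≥ 0.8 conserved, < 0.8 and ≥ 0.4 divergent, < 0.4 absent) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the dictionary layout, and the example scores are hypothetical and only show how the stated cutoffs partition gene clusters.

```python
from typing import Dict, List

CONSERVED_MIN = 0.8  # LS-BSR >= 0.8: present with significant similarity
ABSENT_MAX = 0.4     # LS-BSR < 0.4: absent


def classify_bsr(score: float) -> str:
    """Bin one LS-BSR value using the thresholds quoted in the footnotes."""
    if score >= CONSERVED_MIN:
        return "conserved"
    if score >= ABSENT_MAX:
        return "divergent"
    return "absent"


def core_clusters(matrix: Dict[str, Dict[str, float]]) -> List[str]:
    """Clusters that are conserved (LS-BSR >= 0.8) in every genome."""
    return [
        cluster
        for cluster, row in matrix.items()
        if all(classify_bsr(s) == "conserved" for s in row.values())
    ]


def isolate_specific(matrix: Dict[str, Dict[str, float]]) -> Dict[str, List[str]]:
    """Clusters conserved in exactly one genome and absent (< 0.4) from all others."""
    out: Dict[str, List[str]] = {}
    for cluster, row in matrix.items():
        labels = {iso: classify_bsr(s) for iso, s in row.items()}
        hits = [iso for iso, lab in labels.items() if lab == "conserved"]
        others_absent = all(
            lab == "absent" for iso, lab in labels.items() if iso not in hits
        )
        if len(hits) == 1 and others_absent:
            out.setdefault(hits[0], []).append(cluster)
    return out


# Hypothetical LS-BSR matrix (made-up scores): cluster ID -> {isolate: score}
example = {
    "cluster_1": {"E2348/69": 1.00, "B171": 0.95, "401140": 0.88},
    "cluster_2": {"E2348/69": 0.92, "B171": 0.10, "401140": 0.00},
    "cluster_3": {"E2348/69": 0.90, "B171": 0.55, "401140": 0.20},
}
print(core_clusters(example))
print(isolate_specific(example))
```

In this toy matrix, `cluster_3` is neither core nor isolate-specific, because it is divergent rather than absent in one of the other genomes.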
FIG 2 Principal-component analysis and hierarchical cluster analysis of the RNA-Seq samples examined in this study. (A) Principal-component analysis of the expression of gene clusters identified in the EPEC isolates, comparing all RNA-Seq samples analyzed. The first (PC1) and second (PC2) principal components were visualized in a scatterplot to demonstrate the clustering of the strains by gene content and gene expression. The comparative matrix is composed of the maximum normalized expression values for each gene cluster in each of the isolates and under each of the conditions examined for the gene clusters that were present among all isolates. The first component contains 36.75% of the variation, and the second component is responsible for 12.7% of the variation. The samples are colored by isolate, and symbols represent LB or DMEM as indicated in the legend. (B) A heatmap with clustering analysis of the expression values was constructed for the 674 LS-BSR gene clusters that were present in all of the EPEC isolates and had the greatest standard deviations of expression values. The normalized gene expression values were used to compute the standard deviation for each LS-BSR gene cluster across all samples. The heatmap was constructed using the R package gplots v2.11.0. A red label at the top of the heatmap designates the DMEM samples, while a blue label designates the LB samples.
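The selection step described for panel B (ranking gene clusters by the standard deviation of their normalized expression values across all samples and keeping the most variable ones) can be sketched as follows. The caption says the heatmap itself was drawn with the R package gplots; this Python snippet only illustrates the ranking. The function name and example values are hypothetical, and the use of the population standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from statistics import pstdev
from typing import Dict, List


def top_variable_clusters(expr: Dict[str, List[float]], n: int) -> List[str]:
    """Return the n cluster IDs with the largest standard deviation across samples.

    expr maps a gene-cluster ID to its normalized expression value in each sample.
    Assumption: population standard deviation (pstdev) is used for the ranking.
    """
    ranked = sorted(expr, key=lambda cluster: pstdev(expr[cluster]), reverse=True)
    return ranked[:n]


# Hypothetical normalized expression values for three clusters in four samples
example = {
    "cluster_a": [5.0, 5.1, 4.9, 5.0],
    "cluster_b": [1.0, 9.0, 1.5, 8.5],
    "cluster_c": [3.0, 4.0, 3.5, 4.5],
}
print(top_variable_clusters(example, 2))
```

With the same number of samples per cluster, the sample and population standard deviations give the same ordering, so the choice only matters if the values themselves are reported.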
Number of genes that were differentially expressed in the EPEC isolates examined in this study
| Isolate ID | Phylogenomic lineage | Phylogroup | LFC ≥ 2 | LFC ≤ −2 | Total DE genes | No. of DE core gene clusters | No. of DE phylogroup-specific gene clusters | No. of DE isolate-specific genes | Total DE sRNAs |
|---|---|---|---|---|---|---|---|---|---|
| E2348/69 | EPEC1 | B2 | 180 | 251 | 431 | 253 | 18 | 10 | 6 |
| B171 | EPEC2 | B1 | 220 | 255 | 475 | 249 | 2 | 4 | 10 |
| C581-05 | EPEC4 | B2 | 162 | 235 | 397 | 291 | 12 | 1 | 17 |
| 401140 | EPEC5 | A | 145 | 189 | 334 | 242 | 1 | 3 | 22 |
| 402290 | EPEC7 | B1 | 243 | 268 | 511 | 354 | 1 | 2 | 7 |
| 401588 | EPEC8 | B2 | 267 | 134 | 401 | 253 | 11 | 4 | 12 |
| 302053 | EPEC9 | B2 | 228 | 315 | 543 | 392 | 15 | 0 | 21 |
| 100329 | EPEC10 | A | 278 | 294 | 572 | 394 | 1 | 2 | 30 |
| E110019 | None | B1 | 172 | 246 | 418 | 280 | 0 | 7 | 9 |
The phylogenomic lineage and phylogroup are those that have been previously described (Hazen et al. [32], Jaureguy et al. [41], Tenaillon et al. [42]).
LFC, log2-fold change of the genes that exhibited significant (LFC ≥ 2 or ≤ −2 and FDR ≤ 0.05) differential expression (DE).
The total number of genes that exhibited significant (LFC ≥ 2 or ≤ −2 and FDR ≤ 0.05) DE.
The total number of core gene clusters (LS-BSR ≥ 0.8 in all genomes) was 2,989.
The numbers of clusters in all EPEC genomes of one phylogroup that were divergent or absent (LS-BSR < 0.8) from EPEC genomes of the other phylogroups were 44 (phylogroup A), 62 (phylogroup B1), and 128 (phylogroup B2).
The isolate-specific genes are those that were in one genome with an LS-BSR ≥ 0.8 and in the other genomes with an LS-BSR < 0.4.
The total number of sRNAs that were previously investigated in E. coli by Raghavan et al. (60) and exhibited significant (LFC ≥ 2 or ≤ −2 and FDR ≤ 0.05) DE in each of the nine EPEC isolates.
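The significance criterion used throughout the table (LFC ≥ 2 or ≤ −2 and FDR ≤ 0.05) amounts to a simple filter, and each isolate's total is the sum of its two LFC columns (for example, 180 + 251 = 431 for E2348/69). A minimal sketch is given below. The record type, function names, and example values are hypothetical, and the snippet does not reproduce the statistical method that produced the LFC and FDR values.

```python
from typing import List, NamedTuple, Tuple

LFC_CUTOFF = 2.0   # |log2-fold change| threshold stated in the footnotes
FDR_CUTOFF = 0.05  # false-discovery-rate threshold stated in the footnotes


class GeneResult(NamedTuple):
    gene: str
    lfc: float  # log2-fold change, DMEM compared to LB
    fdr: float  # adjusted P value


def is_de(result: GeneResult) -> bool:
    """True when a gene meets both the LFC and the FDR cutoffs."""
    return abs(result.lfc) >= LFC_CUTOFF and result.fdr <= FDR_CUTOFF


def count_de(results: List[GeneResult]) -> Tuple[int, int, int]:
    """Return (increased, decreased, total) counts of significant genes."""
    up = sum(1 for r in results if is_de(r) and r.lfc > 0)
    down = sum(1 for r in results if is_de(r) and r.lfc < 0)
    return up, down, up + down


# Hypothetical results for five genes (made-up values)
example = [
    GeneResult("gene_a", 3.1, 0.001),   # increased
    GeneResult("gene_b", -2.0, 0.05),   # decreased (both cutoffs are inclusive)
    GeneResult("gene_c", 2.5, 0.20),    # fails the FDR cutoff
    GeneResult("gene_d", 1.9, 0.001),   # fails the LFC cutoff
    GeneResult("gene_e", 4.0, 0.01),    # increased
]
print(count_de(example))
```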
FIG 3 Comparison of the global transcriptomes of nine EPEC isolates. A circular plot of the log2-fold-change (LFC) values for genes that exhibited significant differential expression during exponential growth in DMEM compared to LB is shown. The outermost track contains all of the significant LFC values for each of the indicated nine EPEC isolates. The inner tracks are numbered to correspond to the same number of each EPEC isolate in the outermost track. The phylogroup that each EPEC isolate belongs to is indicated in parentheses following the isolate designation. For example, 5-B171 (B1) indicates that data tracks containing B171 data are labeled track 5 in all of the comparisons, while B171 is the isolate designation and B1 the phylogroup designation. The inner tracks contain the LFC values of genes of another of the EPEC isolates belonging to the same LS-BSR gene cluster as the genes in the outermost reference track. The genes that were not identified in the other EPEC isolates or did not exhibit significant differential expression are absent from the inner tracks.
FIG 4 Comparison of the global transcriptomes of multiple EPEC isolates of the EPEC8 phylogenomic lineage. (A) Circular plot of the log2-fold-change (LFC) values for genes that exhibited significant differential expression during exponential growth in DMEM compared to LB. The outermost track contains all of the significant LFC values for one of the three EPEC8 isolates. The inner tracks contain the LFC values of genes of another of the EPEC8 isolates belonging to the same LS-BSR gene cluster as the genes in the outermost reference track. The genes that were not identified in the other EPEC8 isolates and/or did not exhibit significant differential expression are absent from the inner tracks. The number of genes that had significant differential expression is indicated in parentheses next to each isolate name. (B) The number of genes that were highly conserved (LS-BSR ≥ 0.8) in all four of the EPEC isolates is indicated in the center. The number of genes that were identified with significant similarity (LS-BSR ≥ 0.8) that also exhibited significant differential expression in two or three EPEC isolates is also designated. The number of isolate-specific genes indicates the genes that were identified with significant similarity in one EPEC isolate and that were divergent or absent from the other two EPEC8 isolates. (C) Venn diagram showing the number of genes differentially expressed for each of the EPEC8 isolates analyzed in this study grown to an OD600 of 0.5 in DMEM compared to LB. The number of core genes that were highly conserved (LS-BSR ≥ 0.8) in all three of the EPEC8 isolates that also exhibited significant differential expression in all of the EPEC8 isolates is indicated in the center. There were no genes that were present with significant similarity and also differentially expressed in only two of the three EPEC8 isolates. 
The number of isolate-specific genes indicates those genes that exhibited significant similarity (LS-BSR ≥ 0.8) that were divergent or absent (LS-BSR < 0.8) from the other two EPEC8 isolates and also exhibited significant differential expression during growth to an OD600 of 0.5 in DMEM compared to LB.
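The Venn-diagram counts in panels B and C correspond to partitioning gene clusters by the exact combination of isolates in which they are found (or differentially expressed). One way to compute such regions is sketched below. Only 401588 is an isolate name taken from this record; `isolate_B`, `isolate_C`, the cluster IDs, and the function name are hypothetical placeholders.

```python
from typing import Dict, FrozenSet, Set


def venn_regions(sets: Dict[str, Set[str]]) -> Dict[FrozenSet[str], Set[str]]:
    """Map each combination of isolates to the items found in exactly those isolates."""
    regions: Dict[FrozenSet[str], Set[str]] = {}
    for item in set().union(*sets.values()):
        members = frozenset(name for name, s in sets.items() if item in s)
        regions.setdefault(members, set()).add(item)
    return regions


# Hypothetical sets of differentially expressed gene clusters per isolate
de_clusters = {
    "401588": {"c1", "c2", "c3"},
    "isolate_B": {"c1", "c4"},
    "isolate_C": {"c1", "c5"},
}
regions = venn_regions(de_clusters)
for members, items in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(members), len(items))
```

In this toy input no cluster falls in a two-isolate region, which mirrors the situation the caption describes for panel C.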
FIG 5 Differential expression analysis of genes carried by the LEE and BFP regions in each of the representative EPEC isolates analyzed. (A) Diagram of the genetic structure and results of differential expression analysis of LEE genes for the nine representative EPEC isolates examined (E2348/69, B171, C581-05, 401588, 401140, 402290, 302053, 100329, and E110019) during exponential growth (OD600 = 0.5) in DMEM compared to LB. The values in the heatmap are the significant log2-fold-change (LFC) values for LEE genes. (B) Diagram of the gene organization and the RNA-Seq LFC values of genes carried by the BFP operon of the eight EPEC isolates that contained BFP (E2348/69, B171, C581-05, 401588, 401140, 402290, 302053, and 100329). The expression values in the heatmap are significant LFC values for BFP genes.
FIG 6 Differential expression analysis of known virulence genes of EPEC. The phylogroup that each EPEC isolate belongs to is indicated in parentheses. (A) Heatmap of LS-BSR values indicating the presence or absence of known EPEC virulence genes in the genomes of each of the EPEC isolates analyzed. Genes present with significant similarity are indicated by yellow, genes with divergent similarity are indicated by black, and genes that are absent are indicated by blue. (B) Heatmap of the log2-fold-change (LFC) values for known virulence genes of EPEC that exhibited significant differential expression during exponential growth in DMEM compared to LB. The color gradient indicates decreased expression (green) or increased expression (red) of the virulence genes, while white indicates a gene that either was not present in the isolate or did not exhibit significant differential expression.