Genome placement of alpha-haemolysin cluster is associated with alpha-haemolysin sequence variation, adhesin and iron acquisition factor profile of Escherichia coli.

Rafał Kolenda1, Katarzyna Sidorczuk2, Mateusz Noszka3, Adrianna Aleksandrowicz1, Muhammad Moman Khan4, Michał Burdukiewicz5, Derek Pickard6, Peter Schierack4,7.   

Abstract

Since the discovery of haemolysis, many studies focused on a deeper understanding of this phenotype in Escherichia coli and its association with other virulence genes, diseases and pathogenic attributes/functions in the host. Our virulence-associated factor profiling and genome-wide association analysis of genomes of haemolytic and nonhaemolytic E. coli unveiled high prevalence of adhesins, iron acquisition genes and toxins in haemolytic bacteria. In the case of fimbriae with high prevalence, we analysed sequence variation of FimH, EcpD and CsgA, and showed that different adhesin variants were present in the analysed groups, indicating altered adhesive capabilities of haemolytic and nonhaemolytic E. coli. Analysis of over 1000 haemolytic E. coli genomes revealed that they are pathotypically, genetically and antigenically diverse, but their adhesin and iron acquisition repertoire is associated with genome placement of hlyCABD cluster. Haemolytic E. coli with chromosome-encoded alpha-haemolysin had high frequency of P, S, Auf fimbriae and multiple iron acquisition systems such as aerobactin, yersiniabactin, salmochelin, Fec, Sit, Bfd and hemin uptake systems. Haemolytic E. coli with plasmid-encoded alpha-haemolysin had similar adhesin profile to nonpathogenic E. coli, with high prevalence of Stg, Yra, Ygi, Ycb, Ybg, Ycf, Sfm, F9 fimbriae, Paa, Lda, intimin and type 3 secretion system encoding genes. Analysis of HlyCABD sequence variation revealed presence of variants associated with genome placement and pathotype.

Keywords:  Escherichia coli; adhesins; alpha-haemolysin; genomics; haemolysin; toxins; virulence-associated genes

Year:  2021        PMID: 34939560      PMCID: PMC8767327          DOI: 10.1099/mgen.0.000743

Source DB:  PubMed          Journal:  Microb Genom        ISSN: 2057-5858


Data Summary

Sequences of the 220 E. coli isolates sequenced for this study are freely available from the NCBI BioProject database under accession number PRJNA725000. Since the discovery of haemolysis, many studies have been undertaken to analyse the role of this phenotype in E. coli, along with its association with other virulence genes, diseases and pathogenic actions in the host, but the results have often been inconclusive. Analysis of over 1000 haemolytic E. coli genomes allowed us to show that isolates with chromosome- and plasmid-encoded haemolysin differ in virulence factor prevalence and alpha-haemolysin sequence. Taking into consideration that all investigations concerning the role of alpha-haemolysin were conducted with isolates carrying chromosome-encoded haemolysin, this is the first study to examine E. coli genomes with plasmid-encoded haemolysin, which have fewer iron acquisition systems and possess a different set of adhesion factors than bacteria with chromosome-encoded haemolysin.

Introduction

Escherichia coli is a versatile bacterium that colonizes the intestines of clinically healthy mammals and birds [1]. Some E. coli are pathogenic, causing various intestinal and extra-intestinal diseases affecting humans and animals worldwide [2]. Most investigations of pathogenic E. coli over the last decades focused on the characterization of genes/operons associated with virulence [virulence-associated genes/factors (VAGs/VAFs)] and on the molecular virulence mechanisms of pathogenic E. coli [3]. On the basis of their characteristic VAGs and infection phenotypes, E. coli are divided into several pathotypes (reviewed in detail in [4]). Adhesins, iron acquisition systems and toxins constitute the majority of VAGs determining the ability of a particular E. coli to colonize the host [1]. Adhesins are responsible not only for binding and invasion of various cell types but also for interactions between bacteria, and between bacteria and abiotic surfaces [5, 6]. These interactions enable bacteria to form microcolonies and biofilms, colonize surfaces, attach to receptors expressed at the cell surface and subsequently invade cells [7-9]. Another group of factors that mediate adhesion and/or invasion of cells includes type 3 and 6 secretion systems (T3SS, T6SS) [10, 11]. Iron acquisition and storage genes are important for the survival of bacteria in an environment with limited access to this element [12]. Many iron acquisition and storage systems have been identified in E. coli and can be divided into siderophores, iron transporters, haem and hemin uptake systems [13]. Toxins – haemolysins and proteases – are hypothesized to damage host cells and tissues, which leads to haemoglobin and haem release, increasing iron availability [14]. Five haemolysins have been described so far in the literature for E. coli [15]. Two of them – bacteriophage-associated enterohaemolysin (Ehly) and haemolysin F (HlyF) – turned out to be recombinase RecT [16] and SDR (short-chain dehydrogenase/reductase) family oxidoreductase [17], respectively.
Therefore, these will not be considered as haemolysins in this study. The other three haemolysins are alpha-haemolysin, enterohaemolysin (Ehx/EHEC-hly) and silent haemolysin (HlyE) [18, 19]. Enterohaemolysin is a pore-forming toxin, mainly associated with Shiga-toxigenic and enterohaemorrhagic E. coli (STEC and EHEC) [20]. It is encoded on a plasmid as part of an operon consisting of four genes: ehxA (or EHEC-hlyA, encoding functional toxin), ehxC (or EHEC-hlyC, responsible for post-translational modification of EhxA), ehxB and ehxD (also known as EHEC-hlyB and EHEC-hlyD, responsible for the transport of EhxA through the inner membrane) [21]. Silent haemolysin is also a pore-forming toxin and may be referred to as haemolysin E, cytolysin A (ClyA) or silent haemolysin locus A (SheA). HlyE-expressing E. coli cause lysis of mammalian erythrocytes, cytotoxicity to cultured mammalian cells and apoptosis induction in macrophages, and reduce intracellular Ca2+ oscillations in epithelial cells [22]. Alpha-haemolysin can be encoded on plasmids or chromosomal pathogenicity islands [23]. Similar to Ehx, alpha-haemolysin is part of an operon consisting of four genes: hlyA (encoding functional toxin HlyA), hlyC (responsible for post-translational modification of HlyA), hlyB and hlyD (responsible for the transport of HlyA through the inner membrane). The role and importance of alpha-haemolysin in pathogenesis remain unclear so far. Alpha-haemolysin has been detected in various pathotypes and commensal E. coli but is mainly associated with extraintestinal pathogenic E. coli (ExPEC), and there are indications implicating alpha-haemolysin in the pathogenesis of ExPEC pathotypes. Possible roles of alpha-haemolysin in pathogenesis include exfoliation of epithelial cells to grant access to underlying tissue sites, modulation of host immune response or cell signalling subversion and induction of apoptosis [21].
Different modes of pathogenic action assigned to alpha-haemolysin depend on cell type, infection site environment and concentration of secreted haemolysin. In vivo studies with uropathogenic E. coli (UPEC) strains CFT073 and UTI89 showed that other genetic factors influence the outcome of alpha-haemolysin loss-of-function models, ultimately undermining the ‘importance’ of the investigated VAG [24]. Since the discovery of haemolysis, many studies have been undertaken to analyse the role of this phenotype in E. coli, along with its association with other virulence genes, diseases and pathogenic actions in the host, but the results have often been inconclusive. These inconsistent results may stem from the limited availability of virulence-associated factors to test or from inconsistent selection of the virulence factors used in testing. Our previous study focused on the association of VAGs with the presence of haemolysin in E. coli isolated from healthy pigs [25]. Taking advantage of developments in sequencing technologies and the presence of publicly available genome sequences, we decided to expand the scope of our study and provide a global picture of the haemolytic property and the associated VAGs/VAFs that differ between haemolytic and nonhaemolytic E. coli. The objective of this study was to analyse and characterize a diverse collection of 220 haemolytic and nonhaemolytic E. coli based on their genomes. The analysis was expanded with genomes already available in the GenBank database to acquire a global overview, allowing us to further explore the genetic and sequence variation involved in the haemolytic ability of E. coli and to reveal novel virulence-associated genes/factors linked with alpha-haemolysin.

Methods

Bacteria

Isolates (n=220) from diseased and clinically healthy humans, wild and domestic animals are listed in Table 1. To identify E. coli isolates, rectal (mammals) or cloacal (birds) swabs and urine samples were plated onto CHROMagar orientation agar [26]. After incubation at 37 °C, E. coli were initially identified as pink colonies. For further confirmation, these colonies were transferred onto Gassner Agar. Colonies appearing pink on CHROMagar and blue/yellow/green on Gassner Agar were assumed to be E. coli [27-29]. Haemolysis of each isolate was tested on 5 % sheep blood agar and scored by the presence of a clear (transparent) zone surrounding the colonies. All isolates from urine samples were considered as UPEC. A single colony (pink on CHROMagar orientation and haemolytic or nonhaemolytic on blood agar plates) was subcultured twice on CHROMagar orientation plates and stored in 15 % glycerol at −80 °C.
Table 1.

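Origin of E. coli used in this study

| Host category | Species | Group name | No. of haemolytic isolates | No. of nonhaemolytic isolates |
|---|---|---|---|---|
| Human | Homo sapiens | Human, healthy*, † | 12 | 19 |
| Human | Homo sapiens | Human, urinary tract infection‡ | 13 | 20 |
| Domestic mammals | Sus scrofa domestica | Domestic pig, healthy§ | 13 | 19 |
| Wild mammals | Capreolus capreolus | Roe deer¶, ** | 8 | 23 |
| Wild mammals | Martes sp. | Marten¶ | 6 | 15 |
| Wild mammals | Procyon lotor | Raccoon¶ | 10 | 17 |
| Wild mammals | Vulpes vulpes | Red fox¶ | 5 | 20 |
| Wild birds | Anas platyrhynchos | Mallard†† | 13 | 6 |
| Total | | | 81 | 139 |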

*, †Sampled by Thomas Wex (Magdeburg, Germany) and Peter Schierack (Senftenberg, Germany).

‡Urine samples from patients with urinary tract infections collected by Steffen Vogel (Hoyerswerda, Germany) in the hospital in 2009.

§Samples from 18 different pig production units in eastern Germany collected by Peter Schierack from 2009 to 2010.

¶Samples collected from Lausitz (Lusatia), a region in south-eastern Germany, taken by Peter Schierack from the rectum (mammals) or cloaca (birds) of dead animals, which were collected directly as accident victims or delivered to Hermann Ansorge (Görlitz, Germany) and Olaf Zinke (Kamenz, Germany) between 2007 and 2011.

**Rectal samples were taken during several hunts by Peter Schierack between 2007 and 2010.

††Cloacal samples of wild birds were taken by Peter Schierack in the winter between 2007–2008 and 2010–2011.

Genome sequencing, assembly and annotation

The Wizard Genomic DNA Purification Kit (Promega) was used to extract genomic DNA from the 220 isolates. Extracted DNA was sequenced on the HiSeq X platform at the Sanger Institute Sequencing Facility. FastQ Screen was used to determine the quality of sequences [30]. Genome assemblies were generated with the Shovill pipeline and the assembly_improvement pipeline, followed by annotation using Prokka version 1.13.5 [31, 32]. Assembled genome sequences have been deposited at the NCBI under BioProject ID PRJNA725000.

Analysis of 220 E. coli genomes

Pangenomes were determined using Roary version 3.12 with a minimum BLAST percentage identity of 95 % and with splitting of paralogues [33]. The core-genome alignment provided by Roary was used to obtain a maximum-likelihood (ML) core-genome phylogeny with RAxML version 8.2.12, employing a GTRGAMMA substitution model and 100 bootstrap replications [34]. In silico serotyping was performed with SRST2 [35]. E. coli phylotypes were assigned by ClermonTyping [36]. Bayesian analysis of population structure (BAPS) was carried out with RhierBAPS [37]. Genome placement of the hlyCABD operon was analysed with BLAST by comparing reference sequences located 5′-upstream of hlyC and 3′-downstream of hlyD (Table S1, available in the online version of this article) [38]. Prevalence of VAGs and multilocus sequence typing (MLST) were determined with ARIBA version 2.13 and the Virulence Factor Database Core Collection (VFDB) [39, 40]. Multiple genes encoding the same VAF were reduced to a single factor as shown in Table S2. Additionally, PCRs were carried out for detection of ten VAGs (sitchr, traT, hra, sitep, malX, cvi/cva, iss, tia, ireA, csgA) [41]. Genome-wide association of gene presence between haemolytic and nonhaemolytic E. coli was conducted with Scoary software [42]. Among the genes with a Bonferroni-adjusted P-value lower than 0.001 in the Scoary analysis, those overrepresented in one bacterial group and belonging to the following groups were counted: iron acquisition, outer membrane, LPS, secretion systems, multidrug efflux transporter, flagella, toxins, adhesion, biofilm, stress and transcriptional regulators. Read mapping and SNP calling for the fimH, ecpD and csgA genes were achieved using bwa, samtools and bcftools [43].
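The Bonferroni threshold used for the Scoary output can be illustrated with a minimal sketch. It assumes the standard Bonferroni correction (raw P-value multiplied by the number of tests, capped at 1); the gene names and raw P-values are hypothetical and are not taken from the study.

```python
def bonferroni(p_values):
    """Standard Bonferroni correction: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw P-values for four genes (illustrative only, not study data).
raw = {"geneA": 1e-9, "geneB": 2e-5, "geneC": 0.0004, "geneD": 0.3}

adjusted = dict(zip(raw, bonferroni(list(raw.values()))))

# Keep genes that pass the cut-off described in the text (adjusted P < 0.001).
significant = [gene for gene, p in adjusted.items() if p < 0.001]
print(significant)
```

With four tests, a raw P-value of 0.0004 becomes 0.0016 after correction and no longer passes the 0.001 cut-off, which is why the number of genes tested matters for the final count.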

Analysis of GenBank genome collection

In order to select genomes containing alpha-haemolysin, a collection of E. coli genomes (n=22,752) was downloaded from the GenBank database and searched with BLAST against the hlyA, hlyB, hlyC and hlyD sequences of E. coli isolate UTI89 (GenBank accession number: CP000243.1). Genomes that contained all four genes were selected for further analysis. Additionally, genomes with no coverage information (with the exception of complete genome sequences), coverage below 50×, more than 400 contigs, or sequenced on third-generation platforms, i.e. ‘Oxford Nanopore’ and ‘Pacific Biosciences’, were excluded from analysis. In order to select genomes of nonpathogenic E. coli, the GenBank collection of E. coli was searched with BLAST against VAGs and pathotyped according to the presence of the VAGs listed in Table S3. Genomes with none of the VAGs or only one of fimH, fyuA, iucC, neuC, sitA and yfcV were considered nonpathogenic E. coli. The nonpathogenic genomes were additionally filtered using the same criteria as described above for hlyCABD-positive genomes. Genomes were annotated with Prokka version 1.13.5 [32]. BAPS was performed with fastBAPS [44]. E. coli phylogroups and hlyCABD genome placement were determined as described in the previous section. MLST was examined with mlst and the PubMLST database [45, 46]. Serotypes were assigned with ABRicate software and the EcOH database [35]. Adhesin, toxin and iron acquisition genes were selected by literature review and sequences were manually collected from GenBank (Table S4). The prevalence of the adhesin, toxin and iron acquisition genes listed in Table S4 was tested with ABRicate. Multiple genes encoding one adhesin, iron acquisition or toxin system were reduced to one VAF as shown in Table S4. Nucleotide sequences of hlyA, hlyB, hlyC and hlyD from haemolytic E. coli were extracted from genomes with BLAST, using the hlyA, hlyB, hlyC and hlyD sequences of E. coli isolate UTI89 (GenBank accession number: CP000243.1) as reference.
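The assembly exclusion criteria above can be written as a small predicate. This is a sketch of one reading of those criteria, not the authors' script: the `Assembly` record, its field names and the example accessions are hypothetical, and the exemption for complete genomes is assumed to apply only to missing coverage information.

```python
from dataclasses import dataclass
from typing import Optional

# Platform names taken from the exclusion list in the text.
THIRD_GENERATION = ("oxford nanopore", "pacific biosciences")


@dataclass
class Assembly:
    """Hypothetical metadata record for one GenBank assembly."""
    accession: str
    coverage: Optional[float]  # None when no coverage is reported
    contigs: int
    platform: str
    complete: bool = False


def passes_filter(a: Assembly) -> bool:
    # Missing coverage is tolerated only for complete genome sequences.
    if a.coverage is None and not a.complete:
        return False
    # Reported coverage must be at least 50x.
    if a.coverage is not None and a.coverage < 50:
        return False
    # Assemblies with more than 400 contigs are excluded.
    if a.contigs > 400:
        return False
    # Third-generation sequencing platforms are excluded.
    if any(name in a.platform.lower() for name in THIRD_GENERATION):
        return False
    return True


# Made-up example records.
examples = [
    Assembly("ASM_A", 120.0, 85, "Illumina HiSeq"),
    Assembly("ASM_B", None, 1, "Illumina MiSeq", complete=True),
    Assembly("ASM_C", 30.0, 90, "Illumina HiSeq"),
    Assembly("ASM_D", 200.0, 3, "Oxford Nanopore MinION"),
    Assembly("ASM_E", None, 150, "Illumina HiSeq"),
]
kept = [a.accession for a in examples if passes_filter(a)]
print(kept)
```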
Next, nucleotide sequences were translated into amino acid sequences with UGENE [47]. Amino acid sequences with 100 % identity and 100 % coverage were clustered together with CD-HIT [48]. A protein sequence was considered a variant if it differed from other proteins by at least one amino acid. Variant numbers were assigned in order of variant prevalence (the lowest number, ‘0’, was given to the variant with the highest prevalence). The HlyA phylogenetic tree was constructed from an alignment of variants that were at least 99 % of the length of the reference UTI89 HlyA protein (which is 1024 amino acids long), using RAxML version 8.2.12 with the HIVW substitution model and 500 bootstrap replications. HlyA protein features were downloaded from the InterPro protein families and domains database [49]. The average number of variable sites was calculated by dividing the number of variable sites in a feature by the total number of sites in that feature.
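The variant numbering and the variable-site calculation described above can be sketched as follows. The sketch assumes that clustering at 100 % identity and coverage amounts to grouping identical strings, and that the variants are already aligned to equal length; the toy sequences, function names and the alphabetical tie-break are illustrative assumptions, not part of the original workflow.

```python
from collections import Counter


def number_variants(sequences):
    """Assign 0 to the most prevalent sequence, 1 to the next, and so on.

    Ties are broken alphabetically here; the tie-break rule is an assumption.
    """
    counts = Counter(sequences)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {seq: number for number, (seq, _) in enumerate(ranked)}


def variable_site_fraction(aligned_variants, start, end):
    """Variable sites in the feature [start, end) divided by all sites in it."""
    variable = sum(
        1 for i in range(start, end)
        if len({seq[i] for seq in aligned_variants}) > 1
    )
    return variable / (end - start)


# Toy protein fragments from five hypothetical isolates.
isolates = ["MKT", "MKT", "MRT", "MKT", "MRA"]
variants = number_variants(isolates)
fraction = variable_site_fraction(list(variants), 0, 3)
print(variants, fraction)
```

In this toy case the first position is conserved and the other two vary, so two of the three sites in the feature are variable.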

Figures and statistical analysis

All figures were generated with use of ggplot2 and ggalluvial packages implemented in R software [50-52]. Gene prevalence was compared with Chi-squared test of independence implemented in R stats package [53]. Phylogenetic trees were annotated with iTOL software [54].

Results

Characterization of 220 haemolytic and nonhaemolytic E. coli genomes

In total, 81 haemolytic and 139 nonhaemolytic E. coli were isolated to generate a collection of bacteria covering various hosts and different clinical status (Table 1). Phylogenetic analysis revealed that all isolates could be categorized into two different lineages (Fig. 1). Lineage 1 comprised 68 haemolytic (84%) and 38 nonhaemolytic isolates (27%), whereas lineage 2 had only 13 haemolytic (16%) and 101 nonhaemolytic (73%) E. coli. Nearly all isolates from lineage 1 belonged to phylogroup B2, whereas lineage 2 included isolates from all phylogroups except B2. Haemolytic isolates belonged to groups A, B1 and B2, while nonhaemolytic E. coli had representatives in each phylogroup (Table S5). Investigation of the genetic structure with BAPS revealed eight clusters, three of which belonged to lineage 1 and phylogroup B2. In the case of isolates from lineage 2, BAPS clusters corresponded well with phylogroups.
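As a worked illustration of the Chi-squared test of independence named in the Methods, the lineage counts above can be arranged as a 2×2 table (lineage 1: 68 haemolytic, 38 nonhaemolytic; lineage 2: 13 haemolytic, 101 nonhaemolytic). This particular test is not reported by the authors; the sketch only shows the arithmetic, using the fact that for one degree of freedom the upper-tail probability equals erfc(sqrt(chi2/2)). R's chisq.test applies a continuity correction to 2×2 tables by default, so the optional `yates` flag is included for comparison.

```python
import math


def chi2_2x2(a, b, c, d, yates=False):
    """Chi-squared test of independence for the table [[a, b], [c, d]].

    Returns (statistic, P-value) for one degree of freedom.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                # Continuity correction: shrink each deviation by 0.5.
                diff = max(0.0, diff - 0.5)
            chi2 += diff ** 2 / expected
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: lineage 1, lineage 2. Columns: haemolytic, nonhaemolytic.
chi2, p_value = chi2_2x2(68, 38, 13, 101)
share_lineage1 = round(100 * 68 / 81)  # share of haemolytic isolates in lineage 1
print(chi2, p_value, share_lineage1)
```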
Fig. 1.

Phylogenetic relationship between 220 haemolytic and nonhaemolytic E. coli. Core-genome phylogenetic tree for 81 haemolytic and 139 nonhaemolytic E. coli based on 2628 genes. E. coli isolates diverge into two lineages, 1 and 2, marked with orange and green background, respectively. Lineage, phylogroup, BAPS group, ST, serotype, host/disease, haemolysis and alpha-haemolysin genomic placement were annotated on the tree with the use of iTOL.

Sequence types (STs) and serotypes were assigned to each isolate. Haemolytic E. coli were represented by nearly four times fewer STs and serotypes than nonhaemolytic isolates (Figs 1 and S1). Both groups shared only eight STs and ten serotypes. Haemolytic E. coli had 17 unique STs and serotypes, while nonhaemolytic E. coli had 85 and 94 unique STs and serotypes, respectively. The hlyCABD operon encoding alpha-haemolysin was detected in 78 out of 81 haemolytic E. coli. None of the haemolytic isolates possessed the genes encoding enterohaemolysin or silent haemolysin. All nonhaemolytic isolates were devoid of genes encoding alpha-haemolysin, but 92 isolates (66.2%) were positive for hlyE encoding silent haemolysin and eight were positive for the ehxCABD operon encoding enterohaemolysin. Analysis of hlyCABD operon placement revealed that in all isolates from lineage 1 these genes were carried on the chromosome, whereas in all isolates from lineage 2, with the exception of one isolate, they were located on a plasmid. All isolates with chromosome-encoded hlyCABD belonged to phylogroups B1 and B2, while plasmid-encoded alpha-haemolysin was found in isolates from phylogroup A, isolated nearly exclusively from pigs.

Virulence-associated factor profiling revealed high prevalence of adhesins, iron acquisition genes and toxins in haemolytic E. coli

In order to investigate the association of virulence-associated factors with haemolytic and nonhaemolytic E. coli, the prevalence of VAGs was examined. In total, 255 VAGs coding for 66 different VAFs were detected in at least one isolate, but prevalence differed between haemolytic and nonhaemolytic E. coli for only 24 VAFs (Figs 2 and S2). The majority (i.e. 19 of 24) of these VAFs were detected significantly more often in haemolytic E. coli (P<0.05), and their functions were adhesion, invasion, iron acquisition or toxicity. Four VAFs with similar functions were detected more frequently in nonhaemolytic E. coli. Surprisingly, one further VAF associated with this group contained VAGs encoding the type 3 secretion system. Eight VAFs were present in nearly all isolates, of which four were associated with iron acquisition and another three with adhesion.
Fig. 2.

Virulence factor prevalence in 220 haemolytic and nonhaemolytic E. coli. Heatmap with virulence factor sequence prevalence in genomes of haemolytic and nonhaemolytic E. coli. Groups ‘Haemolytic’ and ‘Nonhaemolytic’ are shown on the y-axis and refer to haemolytic and nonhaemolytic isolates, respectively. Names of virulence factors detected in at least one genome are shown on the x-axis. The colour gradient is proportional to the prevalence of each virulence factor in each group. Results of statistical analysis with Chi-squared test of independence are shown as symbols: ‘o’ - not statistically significant, ‘+’ - 0.01

Genome-wide comparison of 220 haemolytic and nonhaemolytic E. coli genomes showed differences in the prevalence of adhesin, outer membrane and iron acquisition genes

Analysis of VAG prevalence was initially limited to genes present in the VFDB_core database. To find other genetic traits associated with the haemolytic and nonhaemolytic groups, the pangenome was inferred, and differences in gene content between haemolytic and nonhaemolytic E. coli were investigated (the definition of a ‘gene’ here depends on the Roary settings). GWAS analysis revealed 1080 genes overrepresented in one of the groups (P<0.001), the majority of which were associated with metabolism. Overall, 184 genes were identified as VAGs and assigned to one of the ten functional groups (Table 2). Adhesion was the VAG group with the highest number of genes associated with one of the groups, and 13 additional adhesion VAGs were identified in haemolytic E. coli (P<0.1). Genes encoding P, Yfc, Yde (F9), Ecp, Yeh and Yad fimbriae and the autotransporters YeeJ and YfaL were more often found in haemolytic E. coli (P<0.001) (Table S5). Another large functional group covers genes associated with the outer membrane, in which genes responsible for LPS and flagella synthesis, secretion systems and multidrug efflux transporters are categorized. The presence of secretion system and LPS genes was associated with the haemolytic phenotype (P<0.001 and P<0.05, respectively). The majority of genes assigned to the secretion systems group coded for the type II secretion system. The highest difference in gene presence between haemolytic and nonhaemolytic E. coli was found in the group containing iron acquisition genes (P<0.000001). The hlyCABD genes were part of this VAG group and were the four genes associated with the haemolytic phenotype with the lowest P-values (P<10⁻⁴⁸).
Table 2.

Comparison of virulence-associated gene presence between 81 haemolytic and 139 nonhaemolytic E. coli

| VAG group | Haemolytic | Nonhaemolytic | All | P-value |
|---|---|---|---|---|
| Iron acquisition | 25 | 2 | 27 | 0.000001 |
| Outer membrane: | 36 | 14 | 50 | 0.002 |
| - LPS | 4 | 0 | 4 | 0.05 |
| - Secretion systems | 14 | 1 | 15 | 0.001 |
| - Multidrug efflux transporter | 7 | 4 | 11 | 0.37 |
| - Flagella | 2 | 4 | 6 | 0.42 |
| Toxins | 16 | 8 | 24 | 0.1 |
| Adhesion | 39 | 26 | 65 | 0.11 |
| Biofilm | 3 | 6 | 9 | 0.32 |
| Stress | 7 | 4 | 11 | 0.37 |
| Transcriptional regulators | 8 | 8 | 16 | 1 |
| Total | 161 | 77 | 238 | |

Sequence variation shows differences between haemolytic and nonhaemolytic E. coli in adhesins with a very high prevalence

When we analysed the prevalence of different VAFs, we observed that three adhesins, i.e. type 1 fimbriae, common pili and curli fimbriae, were present in nearly all haemolytic and nonhaemolytic E. coli. On the other hand, some of the genes encoding these adhesins were found to be differentially represented in the GWAS analysis. These observations, together with the large body of research showing the contribution of fimH sequence polymorphism to the adhesive properties of type 1 fimbriae in E. coli, led to the conclusion that the analysis of these adhesins should include not only the presence of genes coding for adhesin clusters but also a comparison of variant prevalence. It was found that FimH, EcpD and CsgA variants differed in prevalence between haemolytic and nonhaemolytic E. coli (Fig. 3). In the case of FimH, four variants (no. 1, 3, 6 and 13) were found statistically more often in one of the groups (P<0.05), and 15 variants (out of 17) were present in only one of these groups. Nonhaemolytic E. coli had 2.5 times more FimH variants present in less than 2.5 % of isolates. When sequence variation in the EcpD adhesin was tested, six variants were present in only one group (i.e. no. 2, 4, 5, 6, 7 and 8), and there were 5.8 times more variants present in only one isolate in nonhaemolytic than in haemolytic E. coli (P<0.05). One EcpD variant was found statistically more often in haemolytic E. coli (P<0.001, no. 1) and two in nonhaemolytic E. coli (P<0.01, no. 2, 4). The lowest number of variants was found during analysis of CsgA. One dominant CsgA variant (i.e. no. 1) with 75 % prevalence was found in haemolytic E. coli and was detected 3.3 times more often than in nonhaemolytic E. coli (P<0.001). The most prevalent CsgA variant (i.e. no. 3) in nonhaemolytic E. coli (50%) was found 3.7 times less often in haemolytic E. coli (P<0.001, no. 1). To summarize, there were significant differences between haemolytic and nonhaemolytic E. coli in the prevalence of diverse adhesin variants.
Fig. 3.

Comparison of adhesin frequency between 81 haemolytic and 139 nonhaemolytic E. coli. Barplot showing FimH, EcpD and CsgA adhesin frequency in genomes of haemolytic and nonhaemolytic E. coli. The frequency of different adhesin variants is shown on the y-axis. Names of adhesins and group (‘Haemo’ refers to haemolytic E. coli, ‘Non’ refers to nonhaemolytic E. coli) are shown on the x-axis. Various colours represent different adhesin variants and are described in the legend. Each number represents one adhesin variant, ‘No Group’ contains the frequency of adhesin variants present in only one isolate in both groups, ‘Other’ contains the frequency of adhesin variants present in less than 2.5 % of isolates. ‘No gene’ refers to genomes without the investigated gene.

Global diversity of haemolytic E. coli

To get a global overview of haemolytic E. coli population structure, 1041 genomes were downloaded from the GenBank database. The downloaded genomes were analysed along with the 81 genomes sequenced in this study. Phylogenetic analysis revealed that the isolates clustered according to their respective phylogroup placement (Fig. 4). The majority of isolates belonged to phylogroups B2 (65%), B1 (17%) and A (9.5%). Altogether, 2 and 5 % of isolates belonged to phylogroups C and D, respectively, and only a few isolates clustered with phylogroups E and F. BAPS revealed 19 clusters that aligned well within phylogroups, further dividing them into subgroups (Figs 4 and S3A). To test the genetic and antigenic diversity in the collection of haemolytic E. coli genomes, all isolates were analysed for STs and serotypes. The 1122 isolates represented 149 STs and 189 serotypes, indicating high diversity in this set of genomes (Fig. S3C, E). Such high diversity is consistent with the fact that the isolates originate from 46 countries across six continents (Fig. S3B). When genome placement of the hlyCABD operon was tested, the majority of the isolates from phylogroup B2 and 8 (out of 9) BAPS subgroups had alpha-haemolysin encoded on the chromosome (97.8%) (Figs S3D and S4). Similarly, 69.5 % of isolates from phylogroup D encoded the operon on the chromosome. Isolates belonging to phylogroups A and C had a comparable distribution between chromosome- and plasmid-encoded hlyCABD. For group B1, most isolates had plasmid-encoded hlyCABD (69.8%). Overall, only 4 % of isolates with plasmid-encoded alpha-haemolysin were detected in phylogroup B2. The majority of isolates belonging to phylogroups B2 and D were isolated from humans (Fig. S3F). Human isolates were also present in other phylogroups, but their prevalence was lowered by animal isolates, mainly from pigs and cattle.
When genome placement was compared with the host origin of the isolate, it was observed that animal isolates possessed plasmid-encoded alpha-haemolysin, whereas human isolates encoded this operon on the chromosome (P<0.001)(Fig. S3G).
Fig. 4.

Phylogenetic relationship and VAGs' distribution in global population of haemolytic E. coli. Core-genome phylogenetic tree for 1122 haemolytic E. coli based on 3032 genes. Phylogroup, BAPS group, ST, serotype and alpha-haemolysin genomic placement, iron acquisition, toxin and adhesin gene prevalence were annotated on the tree with the use of iTOL.

As the presence of adhesin, iron acquisition and toxin genes is directly linked to the haemolytic and virulence abilities of E. coli (Fig. 2, Table 2), these genes were investigated to elucidate global diversity in haemolytic and nonhaemolytic E. coli. As there is no comprehensive database containing adhesin, iron acquisition and toxin genes, they were downloaded from the GenBank nucleotide collection. A database was set up with a total of 124 toxin and iron acquisition genes and 443 adhesin genes, covering 24 iron acquisition and toxin systems and 75 adhesins or adhesion-related molecules. In total, 124 toxin and iron acquisition genes and 275 adhesin genes were detected in at least one genome (Fig. S5A, B). Alpha-haemolysin was detected in all except three isolates. Enterohaemolysin was not present in any of the tested genomes, whereas silent haemolysin was present in isolates from all phylogroups except B2. The opposite observation was made for the group 'Other haemolysins and haemoglobin proteases', which were more prevalent in phylogroup B2 than in groups A and B1 (P<0.001). Iron acquisition genes encoding salmochelin (iroBCDEN) and the SitABCD system (sitABCD) were more prevalent in isolates from phylogroup B2 than in other groups (P<0.001, Fig. 4). Yersiniabactin (ybtAEPSTQUXirp12fyuA) was detected around two times more often in isolates from group B2 than in groups A, B1 and D (P<0.001). The gene bfd, involved in iron storage, was not detected in group B1 but had nearly 100 % prevalence in group B2.
The Fec system (fecABCDEIR) and aerobactin (iucABCDiutA) were more prevalent in phylogroup B2 than in group B1 (P<0.001), but less prevalent than in groups C and D (P<0.01). Prevalence analysis of adhesin and adhesion-associated genes revealed a group of nine adhesins present in nearly all investigated genomes (Fig. 4). Auf, S and P fimbriae were found nearly exclusively in phylogroup B2 (P<0.001), whereas Sfm, Ybg and Yfc were present in all groups except B2. A similar observation could be made for Ycb, Ygi and Yra fimbriae, but the prevalence of these adhesins was significantly lower in group D than in phylogroups A, B1 and C (P<0.001). Stg fimbriae were detected only in phylogroups B1, C and D. In some isolates of group B1 belonging to BAPS subgroup 8, Lda and Paa adhesins were detected. In the same phylogroup, in BAPS subgroups 6 and 8, type 3 secretion system and intimin genes were found. When the frequency of iron acquisition and toxin systems was compared with genome placement, only silent haemolysin was found more frequently (eight times) in isolates with a plasmid-encoded hlyCABD operon (P<0.001). Eight systems, i.e. salmochelin, SitABCD, hemin uptake (chuASTUWVXY), Fec, aerobactin, other haemolysins and haemoglobin proteases, iron storage (bfd) and yersiniabactin, were found 1.5–9.7 times more often in genomes of isolates with alpha-haemolysin encoded on the chromosome (P<0.001). Adhesin gene prevalence was also associated with genome placement of alpha-haemolysin. Auf, S and P fimbriae were found 11–184 times more often when the hlyCABD cluster was encoded on the chromosome (P<0.001), whereas Sfm, Stg, Ybg, Yfc, Ycb, Ygi, Yra and Paa adhesins were identified in genomes with plasmid-encoded alpha-haemolysin 8–71 times more often than in other isolates (P<0.001). Intimin, T3SS and Lda adhesin were found exclusively in isolates with plasmid-encoded alpha-haemolysin.
Pathotyping revealed that the majority of isolates with chromosome-encoded hlyCABD belonged to UPEC (87%), whereas plasmid-encoded hlyCABD was found mainly in nonpathogenic isolates and isolates of undetermined pathotype (50.5%), followed by atypical enteropathogenic E. coli (aEPEC, 29.6%), enterotoxigenic E. coli (ETEC, 14.4%), UPEC (2.3%), neonatal meningitis E. coli (NMEC, 1.85%), STEC (0.9%) and enteroinvasive E. coli (EIEC, 0.5%) (Fig. S6). Taken together, the data suggest that haemolytic E. coli are pathotypically, genetically and antigenically diverse, but their adhesin, toxin and iron acquisition system repertoire is associated with genome placement of the hlyCABD cluster.
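The fold differences and P-values reported above compare how often a system is present in isolates with chromosome-encoded versus plasmid-encoded hlyCABD. The following is a minimal sketch of such a comparison for a single system. It assumes a two-sided Fisher's exact test on a 2x2 presence/absence table; this section does not state which test the authors used, and the function names and counts are hypothetical.

```python
from math import comb


def prevalence_ratio(a, b, c, d):
    """Prevalence in group 1 (a present, b absent) divided by
    prevalence in group 2 (c present, d absent)."""
    return (a / (a + b)) / (c / (c + d))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumption: this test stands in for whatever test the study applied.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in group 1
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: system present in 150 of 200 chromosome-placement
# genomes and in 25 of 100 plasmid-placement genomes.
ratio = prevalence_ratio(150, 50, 25, 75)
p_value = fisher_exact_two_sided(150, 50, 25, 75)
print(f"fold difference: {ratio:.1f}, P = {p_value:.2e}")
```

When many systems are tested in this way, a multiple-testing correction would normally be applied to the resulting P-values; that step is omitted here.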

E. coli with plasmid-encoded alpha-haemolysin has a similar iron acquisition system and adhesin profile to nonpathogenic E. coli

Next, haemolytic E. coli genomes were compared with a set of E. coli genomes that did not possess VAFs typical for well-defined pathotypes (and were therefore considered 'nonpathogenic E. coli') to assess which virulence factors are unique to haemolytic E. coli and which can also be found in nonpathogenic E. coli. Overall, the 2257 genomes of nonpathogenic E. coli belonged to phylogroups A, B1, C and D, and to 14 out of 22 BAPS groups (Fig. 5). Nonpathogenic E. coli were a more diverse group than their haemolytic counterparts. Altogether, 509 STs and 671 serotypes were defined in the 2257 nonpathogenic E. coli, which were collected from 65 different hosts in 64 countries.
Fig. 5.

Phylogenetic relationship and VAGs' distribution in haemolytic and nonpathogenic E. coli. Core-genome phylogenetic tree for 1122 haemolytic and 2257 nonpathogenic E. coli based on 2818 genes. Phylogroup, hlyCABD genome placement, BAPS group, ST, serotype, host, and toxin, iron acquisition and adhesin gene prevalence were annotated on the tree with the use of iTOL.

Analysis of the frequency of iron acquisition and toxin systems revealed that salmochelin, SitABCD, hemin uptake, aerobactin, other haemolysins and haemoglobin proteases, and yersiniabactin were 3–41 times less prevalent in genomes of nonpathogenic isolates than in isolates with plasmid-encoded hlyCABD (P<0.001) (Fig. 6a). When compared with isolates with chromosome-encoded hlyCABD, the aforementioned systems were found 7–296 times less frequently in nonpathogenic isolates (P<0.001).
Fig. 6.

Frequency of toxin, iron acquisition and adhesin systems in haemolytic and nonpathogenic E. coli. Frequency of 124 toxin and iron acquisition genes and 443 adhesin genes, providing information about the presence of 24 toxins or iron acquisition systems and 75 adhesins or adhesion-related molecules, was tested in 1122 haemolytic and 2257 nonpathogenic E. coli. Genome placement is shown on the x-axis. Iron acquisition and toxin systems (a) and adhesins (b) are listed on the y-axis. The colour gradient is proportional to the frequency of each system, and the colour scale is shown in the legend attached to each heatmap.

The adhesin profile of nonpathogenic E. coli was very similar to that of haemolytic E. coli with plasmid-encoded alpha-haemolysin (Fig. 6b). These two groups differed only in the frequency of intimin, T3SS, Lda and Paa adhesins (P<0.001). In contrast, in haemolytic E. coli with chromosome-encoded alpha-haemolysin, Auf, S and P fimbriae were found 25–73 times more often than in nonpathogenic E. coli (P<0.001). However, adhesins such as Paa, Stg, Sfm, Yfc, Ybg, Ycb, Ygi and Yra were identified in nonpathogenic E. coli 6–15 times more often than in haemolytic E. coli with a chromosome-encoded hlyCABD cluster (P<0.001). Overall, our data suggest that isolates with plasmid-encoded alpha-haemolysin are similar to isolates that do not possess VAFs typical for well-defined pathotypes.

Variant prevalence of HlyCABD is associated with genome placement and pathotype of E. coli

To assess the diversity within the hlyCABD cluster, the aforementioned sequences were collected and submitted to amino acid sequence variation analysis. The highest number of variants was detected in HlyA (122), followed by HlyB (74) and HlyD (56), and the lowest number was found for HlyC (37) (Figs 7a and S7A). Interestingly, different protein variants were associated with genome placement of alpha-haemolysin (Figs 7b and S7B). Similarly, in the case of pathotypes, selected variants were specific to one pathotype only (Figs 7c and S7C).
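Counting protein variants of this kind can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: it assumes that a variant is defined as a distinct full-length amino acid sequence, and it pools variants found in fewer than ten isolates into a group labelled 'Other', as described for Fig. 7a. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def count_variants(sequences):
    """Map each distinct amino acid sequence to the number of genomes
    carrying it (assumption: one protein sequence per genome)."""
    return Counter(seq.upper() for seq in sequences)


def group_rare(variant_counts, min_count=10, label="Other"):
    """Pool variants seen in fewer than min_count genomes under one label."""
    grouped = Counter()
    for variant, n in variant_counts.items():
        grouped[variant if n >= min_count else label] += n
    return grouped


# Toy peptides standing in for one protein across 17 genomes
proteins = ["MKT"] * 12 + ["MKS"] * 3 + ["MRT"] * 2
counts = count_variants(proteins)
print(len(counts), "variants")
print(dict(group_rare(counts)))
```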
Fig. 7.

HlyCABD variant frequency in E. coli. Alluvial plots with HlyCABD variant prevalence contextualized with information about alpha-haemolysin genome placement and pathotype of alpha-haemolysin-positive E. coli. Names of proteins are shown on the x-axis and the number of genomes is shown on the y-axis. Colours of the bars and of the streams connecting them represent variants (a), genome placement (b) or pathotypes (c) and are described in the legend on the right side of each plot separately. Variants with more than one pathotype (c) are coloured grey. All variants detected in fewer than ten isolates were joined together in one group 'Other' on plot (a) (see Fig. S7A for alluvial plot on which all variants are shown). Genomes with undetermined genome placement are not shown on plot (b) (see Fig. S7B for alluvial plot on which all genomes are shown). Genomes with Not determined ('NA') pathotype are not shown on plot (c) (see Fig. S7C for alluvial plot on which all genomes are shown).

Phylogenetic analysis based on the HlyA alignment revealed that chromosome-encoded HlyA variants (mainly from UPEC strains) clustered together (Fig. 8). In the case of plasmid-encoded HlyA variants, variants from aEPEC and ETEC clustered together, which is associated with the presence of variable sites specific for plasmid-encoded HlyA. Variable sites were not distributed equally across HlyA (Fig. S8). Taking into consideration InterPro database features, the average number of variable sites in RTX C-terminal, RTX N-terminal, RTX toxin determinant A and RTX calcium-binding nonapeptide repeat was 0.32, 0.17, 0.1 and 0.09, respectively, whereas in the remaining sites the average was 0.18.
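One way to read the per-feature averages above is as the fraction of alignment columns within each feature that are variable. The sketch below follows that reading, which is an assumption; the region names, coordinates and toy sequences are hypothetical and do not correspond to the InterPro annotation of HlyA.

```python
def variable_sites(aligned):
    """Return 0-based column indices at which more than one character is
    observed. Gap characters count as a state in this simplified sketch."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i in range(length) if len({seq[i] for seq in aligned}) > 1]


def variable_fraction(aligned, regions):
    """Fraction of variable columns in each named half-open region
    [start, end) of the alignment."""
    variable = set(variable_sites(aligned))
    return {
        name: sum(1 for i in range(start, end) if i in variable) / (end - start)
        for name, (start, end) in regions.items()
    }


# Toy alignment of three variants; columns 1, 2 and 5 differ
alignment = ["MKTAYLQ", "MKSAYLQ", "MRTAYIQ"]
regions = {"region_1": (0, 4), "region_2": (4, 7)}  # hypothetical features
print(variable_sites(alignment))
print(variable_fraction(alignment, regions))
```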
Fig. 8.

Phylogenetic analysis of HlyA in E. coli. Phylogenetic tree for 96 HlyA variants from alpha-haemolysin-positive genomes. Genome placement, pathotype prevalence and amino acid variable sites (190) were annotated on the tree with the use of iTOL. Numbers on the alignment do not reflect actual site numbers in the whole protein alignment.


Discussion

E. coli can colonize various niches in the host body and the environment [2]. Its success depends on the presence of various genetic factors that provide the bacterium with the ability to survive and propagate in often harsh conditions. One of the VAFs with multiple proposed functions is alpha-haemolysin [24, 25]. In this work, we analysed the VAFs that can be associated with the presence of alpha-haemolysin in E. coli genomes to determine other factors that might influence utilization of alpha-haemolysin by E. coli during host colonization. First, we established a large collection of haemolytic and nonhaemolytic E. coli from various hosts and compared their phylogenetic relatedness. We noticed that several isolates collected from wild animals clustered together with human UTI-causing isolates. Similar results have been obtained in previous studies, where wild animals were identified as carriers of potential human pathogenic E. coli, and our observations provide additional information about wild animals as a reservoir of haemolytic E. coli [55, 56]. The potential of these isolates to cause human infections can be fully assessed only by functional analysis of virulence properties with the use of in vitro and in vivo animal experiments. Analysis of VAF presence in haemolytic and nonhaemolytic E. coli revealed a group of VAFs with high prevalence in both groups, with three adhesins among them (Fig. 2). Our previous work and many other reports have shown that variations in FimH adhesin sequence have an impact on interaction with host cell receptors or biofilm formation [57-59]. ECP and curli fimbriae also mediate binding of E. coli to host cells and take part in biofilm formation, but the effect of sequence variation in EcpD adhesin and CsgA on these processes has not been shown so far [60-62]. We were able to show that different FimH, EcpD and CsgA variant clusters are present in the analysed groups, which indicates different adhesive capabilities of haemolytic and nonhaemolytic E. coli.
We hypothesize that colonization of different niches by haemolytic and nonhaemolytic E. coli leads to point mutations that result in fimbrial functional diversification and altered binding capabilities to biotic or abiotic surfaces [63, 64]. It is also possible that the observed mutations reflect different cluster membership in the population or phylogenetic history; therefore, our hypothesis requires experimental confirmation [65]. To further explore differences between haemolytic and nonhaemolytic E. coli, we carried out GWAS and unveiled genes encoding toxins, iron acquisition and adhesion systems that have not been previously associated with haemolytic E. coli. One of these genes encodes a protein belonging to the porin protein family, functionally classified as an outer membrane protein with a beta-barrel domain [66]. Porins are known to play a role in the regulation of outer membrane permeability, stress response and adhesion to epithelial cells [67, 68]; therefore, we think that this gene requires further characterization to elucidate its role in haemolytic E. coli physiology. Furthermore, genes encoding the type II secretion system were found nearly exclusively in haemolytic E. coli. As this system is utilized in the export of VAFs such as proteases and adhesins that aid host colonization, our analysis suggests that it can be important for haemolytic E. coli and may contribute to virulence traits such as gaining access and adhesion to host cells [69]. During the analysis of 81 haemolytic E. coli genomes, we noticed that the majority of isolates have chromosome-encoded alpha-haemolysin, whereas only a few isolates possess plasmid-encoded alpha-haemolysin. Enriching our analysis with over 1000 genomes of haemolytic E. coli from GenBank allowed us to show that isolates with chromosome- and plasmid-encoded haemolysin differ in VAF prevalence (Fig. 9). Bacteria with plasmid-encoded haemolysin have fewer iron acquisition systems and possess a different set of adhesion factors than bacteria with chromosome-encoded haemolysin.
Moreover, different protein variants of HlyCABD were associated with genome placement of hlyCABD. Taking into consideration that: (1) all investigations concerning the role of alpha-haemolysin were conducted with the use of isolates with chromosome-encoded haemolysin [24, 70], (2) isolates with plasmid-encoded alpha-haemolysin have a similar VAF profile to nonpathogenic E. coli, (3) the majority of isolates with plasmid-encoded alpha-haemolysin were isolated from farm animals and (4) the influence of HlyCABD sequence variation on haemolytic properties of E. coli was investigated only in UPEC [71], we think that plasmid-encoded alpha-haemolysin requires further attention, with a focus on phenotypic characterization of cell exfoliation potential, immunomodulatory properties and the possibility of farm animals as a source of alpha-haemolysin in human pathogenic E. coli.
Fig. 9.

Virulence-associated factors of haemolytic E. coli. Schematic depiction of differences in VAFs of E. coli with chromosome-encoded (a) and plasmid-encoded (b) alpha-haemolysin. Iron acquisition and toxin systems are marked in blue, while adhesins and adhesion-related molecules are marked in green. VAFs present in haemolytic E. coli regardless of hlyCABD operon placement (i.e. alpha-haemolysin, siderophore receptors, Yad, type I, F9, Yeh and curli fimbriae, FdeC adhesin, ECP, autotransporters, T6SS, and OmpA) were omitted for graphic simplicity.

