Sebastian Fischer1, Jens Klockgether1, Marina Gonzalez Sorribes1,2, Marie Dorda1,3, Lutz Wiehlmann1,3, Burkhard Tümmler1,4.
Abstract
Five hundred and thirty-four unrelated Pseudomonas aeruginosa isolates from inanimate habitats, patients with cystic fibrosis (CF) and other human infections were sequenced in 19 genes that had been identified previously as the hot spots of genomic within-host evolution in serial isolates from 12 CF lungs. Amplicon sequencing confirmed a significantly higher sequence diversity of the 19 loci in P. aeruginosa isolates from CF patients compared to those from other habitats, but this overrepresentation was mainly due to the larger share of synonymous substitutions. Correspondingly, non-synonymous substitutions were either rare (gltT, lepA, ptsP) or benign (nuoL, fleR, pelF) in some loci. Other loci, however, showed an accumulation of non-neutral coding variants. Strains from the CF habitat were often mutated at evolutionarily conserved positions in the elements of stringent response (RelA, SpoT), LPS (PagL), polyamine transport (SpuE, SpuF) and alginate biosynthesis (AlgG, AlgU). The strongest skew towards the CF lung habitat was seen for amino acid sequence variants in AlgG that clustered in the carbohydrate-binding/sugar hydrolysis domain. The master regulators of quorum sensing lasR and rhlR were frequent targets for coding variants in isolates from chronic and acute human infections. Unique variants in lasR showed strong evidence of positive selection indicated by dN/dS values of ~4. The pelA gene that encodes a multidomain enzyme involved in both the formation and dispersion of Pel biofilms carried the highest number of single-nucleotide variants among the 19 genes and was the only gene with a higher frequency of missense mutations in P. aeruginosa strains from non-CF habitats than in isolates from CF airways. PelA protein variants are widely distributed in the P. aeruginosa population. In conclusion, coding variants in a subset of the examined loci are indeed characteristic of the adaptation of P. aeruginosa to the CF airways, but for other loci the elevated mutation rate is more indicative of infections in human habitats (lasR, rhlR) or global diversifying selection (pelA).
Keywords: Pseudomonas aeruginosa; amplicon sequencing; cystic fibrosis; niche adaptation; population genetics
Year: 2021 PMID: 35024551 PMCID: PMC8749138 DOI: 10.1099/acmi.0.000286
Source DB: PubMed Journal: Access Microbiol ISSN: 2516-8290
Pseudomonas aeruginosa genes that were identified as hot spots of mutation in serial isolates from cystic fibrosis patients seen at the CF clinic Hannover, Germany.
| Gene name | Locus | Length (bp) | Annotation |
|---|---|---|---|
| | PA3545 | 1632 | Alginate-c5-mannuronan-epimerase AlgG |
| | PA0762 | 582 | Sigma factor AlgU |
| | PA1099 | 1422 | Two-component response regulator (of motility and adhesion to mucin) |
| | PA2586 | 645 | Response regulator GacA |
| | PA3192 | 729 | Two-component response regulator GltR (to presence of glucose) |
| | PA1430 | 720 | Transcriptional regulator LasR |
| | PA0767 | 1800 | GTP-binding protein LepA |
| | PA2647 | 1848 | NADH dehydrogenase I chain L |
| PA4391 | PA4391 | 1002 | Hypothetical protein (orthologue in PA14 strain: PA14_57070) |
| PA5048 | PA5048 | 768 | Probable nuclease (orthologue in PA14 strain: PA14_66700) |
| | PA4661 | 522 | Lipid A 3-O-deacylase |
| | PA3064 | 2847 | PelA (multi-domain enzyme with PEL deacetylase and hydrolase activities) |
| | PA3059 | 1524 | PelF (UDP-GalNAc/GlcNAc-glycosyltransferase) |
| | PA0337 | 2280 | Phosphoenolpyruvate-protein phosphotransferase PtsP |
| | PA0934 | 2244 | GTP pyrophosphokinase |
| | PA3477 | 726 | Transcriptional regulator RhlR |
| | PA5338 | 2106 | Guanosine-3′,5′-bis(diphosphate) 3′-pyrophosphohydrolase |
| | PA0301 | 1098 | Polyamine transport protein 8 (spermidine-binding protein) |
| | PA0302 | 1155 | Polyamine transport protein PotG (ABC transporter ATPase) |

Normalized frequency of sequence variants in the strain panel (%)*
| Gene | Compared to PA14 reference genome | Absent in a reference panel of environmental strains | Unique in one strain |
|---|---|---|---|
| | 8.4 | 6.4 | 2.2 |
| | 10.8 | 8.9 | 4.1 |
| | 10.5 | 7.6 | 2.7 |
| | 7.1 | 5.7 | 3.4 |
| | 8.9 | 6.0 | 1.1 |
| | 14.4 | 12.6 | 7.5 |
| | 7.3 | 4.7 | 1.8 |
| | 6.6 | 4.1 | 1.4 |
| PA4391 | 9.6 | 8.0 | 2.1 |
| PA5048 | 12.1 | 9.0 | 2.2 |
| | 9.2 | 7.5 | 3.8 |
| | 10.4 | 7.5 | 2.3 |
| | 11.7 | 9.3 | 1.9 |
| | 6.4 | 4.2 | 1.2 |
| | 9.4 | 6.6 | 1.6 |
| | 8.3 | 5.9 | 3.0 |
| | 6.0 | 4.1 | 1.7 |
| | 8.4 | 6.1 | 2.2 |
| | 6.0 | 4.9 | 1.4 |

*The frequency of sequence variants as a percentage was determined by (the total number of sequence variants)/(gene length×number of isolates), i.e. hits at the same position were added by counts.
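
The normalization in this footnote can be written as a short function. The following is a minimal sketch, not the authors' pipeline: the function name, argument names and the input numbers in the usage line are hypothetical and serve only to show the arithmetic.

```python
def normalized_variant_frequency(total_variants: int,
                                 gene_length_bp: int,
                                 n_isolates: int) -> float:
    """Return the variant frequency as a percentage.

    Implements (total number of sequence variants) /
    (gene length x number of isolates), scaled to percent.
    Hits at the same position in different isolates are assumed
    to be already summed into total_variants.
    """
    if gene_length_bp <= 0 or n_isolates <= 0:
        raise ValueError("gene length and isolate count must be positive")
    return 100.0 * total_variants / (gene_length_bp * n_isolates)


# Hypothetical input: 50 variant hits in a 100 bp gene across 10 isolates.
print(normalized_variant_frequency(50, 100, 10))
```

Dividing by both gene length and panel size makes values comparable between genes of different length and between panels of different size.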
Habitat-associated abundance of sequence variants in isolates*
| Gene | Environment | Acute infection | COPD | CF |
|---|---|---|---|---|
| | + | – | – | + |
| | – | – | – | + |
| | – | – | – | + |
| | + | – | – | + |
| | – | – | – | + |
| | – | – | – | + |
| | + | – | – | + |
| | + | – | – | + |
| PA4391 | + | – | – | + |
| PA5048 | – | – | – | + |
| | – | – | – | + |
| | + | – | – | + |
| | – | – | – | + |
| | – | – | – | + |
| | – | – | – | + |
| | – | – | – | + |
| | – | – | – | + |
| | – | – | – | + |
| | + | – | – | + |

+, higher abundance than average; –, lower abundance than average.
*The normalized frequency distributions of sequence variants in each selected P. aeruginosa locus were compared between isolates from the environment, acute infection and airways of patients with COPD or CF. The frequency distribution of sequence variants was significantly different in the four habitats for all tested genes (Bonferroni-corrected Pcorr < 0.05). Only sequence variants that are absent in the reference panel of environmental isolates with completely sequenced genomes were considered (see Table S2).
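
For readers unfamiliar with the Bonferroni correction mentioned in this footnote, the sketch below shows the standard adjustment: each raw P value is multiplied by the number of tests and capped at 1. The underlying test statistic is not stated in this section, so the raw P values in the example are hypothetical placeholders and the function name is an assumption.

```python
def bonferroni_correct(p_values: list[float]) -> list[float]:
    """Return Bonferroni-corrected P values (raw P x number of tests, capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw P values for three tests (not taken from the study).
raw = [0.001, 0.02, 0.5]
corrected = bonferroni_correct(raw)
significant = [p < 0.05 for p in corrected]
print(corrected, significant)
```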
Ratio of nonsynonymous to synonymous substitutions dN/dS in the examined genes*

| Gene | Substitutions compared to PA14 reference sequence (CF panel) | Substitutions compared to PA14 reference sequence (non-CF panel) | Substitutions absent in the genomes of environmental strains (Table S2) (CF panel) | Substitutions absent in the genomes of environmental strains (Table S2) (non-CF panel) | Unique substitutions in one strain (CF panel) | Unique substitutions in one strain (non-CF panel) |
|---|---|---|---|---|---|---|
| | 0.14 | 0.11 | 0.83 | 0.44 | 0.87 | 0.075 |
| | 0.0085 | 0.0005 | 0.054 | 0.005 | 1.73 | 0.081 |
| | 0.03 | 0.02 | 0.11 | 0.08 | 0.40 | 0.13 |
| | 0.004 | 0.002 | 0.046 | 0.020 | 0.91 | 0.055 |
| | 0.008 | 0.006 | 0.082 | 0.097 | –/– | 0.36 |
| | 0.023 | 0.017 | 0.22 | 0.35 | 4.24 | 3.91 |
| | 0.007 | 0.007 | 0.043 | 0.034 | 0.089 | 0.12 |
| | 0.085 | 0.090 | 0.11 | 0.11 | 0.21 | 0.19 |
| PA4391 | 0.049 | 0.038 | 0.32 | 0.24 | 0.16 | 0.33 |
| PA5048 | 0.084 | 0.072 | 0.12 | 0.24 | 0.15 | 0.25 |
| | 0.044 | 0.035 | 0.40 | 0.15 | 1.51 | 0.12 |
| | 0.135 | 0.139 | 0.23 | 0.27 | 0.52 | 0.36 |
| | 0.084 | 0.072 | 0.16 | 0.18 | 0.25 | 0.24 |
| | 0.019 | 0.054 | 0.021 | 0.055 | 0.11 | –/– |
| | 0.006 | 0.003 | 0.028 | 0.026 | 0.34 | 0.020 |
| | 0.008 | 0.007 | 0.056 | 0.076 | 1.44 | 0.86 |
| | 0.004 | 0.001 | 0.061 | 0.022 | 0.32 | 0.029 |
| | 0.025 | 0.033 | 0.099 | 0.14 | 0.44 | 0.11 |
| | 0.019 | 0.006 | 0.056 | 0.034 | 0.11 | –/– |

*dN/dS is the ratio of the number of nonsynonymous substitutions per nonsynonymous site to the number of synonymous substitutions per synonymous site. Whole-genome comparisons of unrelated clonal P. aeruginosa complexes revealed an empirical median dN/dS value of 0.1 [21, 62].
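
The definition in this footnote translates directly into code. The sketch below assumes that substitution counts and site counts are already available and uses uncorrected proportions; the estimation method used for the table is not stated here, so this is an illustration of the ratio only. The function name and the example numbers are hypothetical.

```python
from typing import Optional


def dn_ds(nonsyn_subs: float, nonsyn_sites: float,
          syn_subs: float, syn_sites: float) -> Optional[float]:
    """Return dN/dS from raw counts, or None if dS is zero.

    dN = nonsynonymous substitutions per nonsynonymous site
    dS = synonymous substitutions per synonymous site
    No correction for multiple hits is applied (simplifying assumption).
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if syn_subs == 0:
        return None  # ratio undefined without synonymous substitutions
    dn = nonsyn_subs / nonsyn_sites
    ds = syn_subs / syn_sites
    return dn / ds


# Hypothetical counts: 12 nonsynonymous changes at 540 sites,
# 9 synonymous changes at 180 sites.
print(dn_ds(12, 540, 9, 180))
```

Values well below 1 point to purifying selection, values near 1 to neutral evolution, and values above 1, such as the ~4 reported for unique lasR variants, to positive selection.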
Frequency of amino acid sequence variants in the strain panel
| Gene | Missense mutations common in CF and non-CF isolates | Missense mutations in CF isolates only | Missense mutations in non-CF isolates only | Ratio CF/non-CF* | Fisher's test (presence of CF vs non-CF) |
|---|---|---|---|---|---|
| | 20 | 47 | 4 | 6.3 | |
| | 0 | 23 | 1 | 12.3 | |
| | 25 | 21 | 4 | 2.8 | |
| | 0 | 12 | 2 | 3.2 | |
| | 2 | 4 | 4 | 0.50 | |
| | 1 | 47 | 28 | 0.90 | |
| | 6 | 3 | 7 | 0.23 | |
| | 15 | 10 | 7 | 0.77 | |
| PA4391 | 17 | 10 | 13 | 0.41 | |
| PA5048 | 16 | 8 | 5 | 0.85 | |
| | 3 | 19 | 1 | 10.1 | |
| | 63 | 25 | 29 | 0.46 | |
| | 42 | 10 | 10 | 0.53 | |
| | 4 | 6 | 0 | | |
| | 3 | 16 | 3 | 2.8 | |
| | 0 | 14 | 9 | 0.83 | |
| | 2 | 17 | 1 | 9.1 | |
| | 10 | 7 | 4 | 0.93 | |
| | 3 | 5 | 0 | | |

*Normalized ratio of variants detected in 345 CF and 184 non-CF P. aeruginosa isolates.
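
One plausible reading of this normalization, offered as an assumption rather than the authors' stated formula, is that the CF-only and non-CF-only counts are each divided by the size of their isolate panel before taking the ratio. The sketch below implements that reading; the function name and default arguments are hypothetical, with the panel sizes taken from the footnote.

```python
from typing import Optional


def normalized_cf_ratio(cf_only: int, non_cf_only: int,
                        n_cf: int = 345, n_non_cf: int = 184) -> Optional[float]:
    """Return (cf_only / n_cf) / (non_cf_only / n_non_cf).

    Returns None when no non-CF-only variant was observed, because the
    ratio is then undefined.
    """
    if n_cf <= 0 or n_non_cf <= 0:
        raise ValueError("panel sizes must be positive")
    if non_cf_only == 0:
        return None
    return (cf_only / n_cf) / (non_cf_only / n_non_cf)


# Counts from the first table row: 47 CF-only and 4 non-CF-only variants.
print(round(normalized_cf_ratio(47, 4), 1))  # 6.3
```

Under this assumption the function reproduces tabulated ratios such as 6.3 (47 vs 4), 12.3 (23 vs 1) and 0.46 (25 vs 29). It returns `None` for the two rows with zero non-CF-only variants, where the table leaves the ratio blank.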
Distribution of amino acid replacements in the strain panel
| Affected amino acid | No. of missense mutations present (CF and non-CF) | No. of missense mutations present (CF) | No. of missense mutations present (non-CF) | Average Dayhoff score of missense mutations (CF and non-CF) | Average Dayhoff score of missense mutations (CF) | Average Dayhoff score of missense mutations (non-CF isolates) |
|---|---|---|---|---|---|---|
| Alanine | 60 | 37 | 21 | 50 | 47 | 46 |
| Arginine | 13 | 28 | 12 | 8 | 5 | 4 |
| Asparagine | 9 | 9 | 1 | 41 | 37 | 43 |
| Aspartic acid | 16 | 28 | 7 | 42 | 30 | 54 |
| Cysteine | 1 | 7 | 1 | 3 | 7 | 1 |
| Glutamine | 3 | 14 | 6 | 21 | 6 | 11 |
| Glutamic acid | 15 | 11 | 7 | 45 | 21 | 34 |
| Glycine | 8 | 25 | 14 | 45 | 25 | 26 |
| Histidine | 6 | 6 | 5 | 9 | 13 | 8 |
| Isoleucine | 11 | 12 | 4 | 41 | 22 | 13 |
| Leucine | 10 | 22 | 3 | 14 | 9 | 17 |
| Lysine | 4 | 7 | 2 | 36 | 43 | 29 |
| Methionine | 6 | 5 | 0 | 4 | 8 | |
| Phenylalanine | 2 | 4 | 4 | 11 | 17 | 9 |
| Proline | 12 | 21 | 11 | 23 | 15 | 14 |
| Serine | 13 | 13 | 5 | 41 | 32 | 24 |
| Threonine | 18 | 18 | 6 | 30 | 45 | 17 |
| Tryptophan | 1 | 8 | 4 | 0 | 1 | 1 |
| Tyrosine | 2 | 9 | 4 | 13 | 2 | 2 |
| Valine | 20 | 25 | 15 | 41 | 27 | 34 |

Fig. 1. Distribution of amino acid exchanges. The figure summarizes the frequency and potential functional severity of amino acid exchanges observed in the strain panel. The observed numbers of events affecting a certain amino acid either in strains from all habitats, exclusively in CF isolates or in non-CF isolates only are represented by the diameters of the respective circles. Please note that the diameters are adjusted to a logarithmic scale; only the spots for frequency values of 1 are displayed in an estimated size and without a black outer ring. The colour of the circles represents the average Dayhoff score of all exchanges affecting the respective amino acid. Dayhoff scores are taken from a matrix containing frequencies of observed amino acid exchanges in sequences of functionally equivalent proteins. They are therefore taken as probability values for potential functional conservation or changes introduced by single exchanges, with low values indicating potentially severe functional changes and higher values indicating a likely less drastic or even very small effect on protein function.
Fig. 2. Overview of mutation hot spot loci and the corresponding functions. The figure displays the proteins encoded in the loci found to be most frequently mutated in strains from a CF background, their individual traits and functional category. Please note that only 17 of the 19 proteins are shown here, as functional data are still lacking for 2 frequently mutated loci (PA4391 and PA5048), and the encoded proteins are still classified as hypothetical in the databases.
Fig. 3. Amino acid sequence variants in (a) LasR, (b) AlgG and (c) PelA. The cumulative number of mutations causing amino acid exchanges is plotted versus the position in the primary amino acid sequence from the N- to the C-terminus. Each exchange is counted once, independent of its detection in a single strain or in several strains. Divergent exchanges observed at the same position are counted as independent events. For each protein, the observed mutations are summarized individually for the subsets of the 345 CF isolates (black line) and the 189 non-CF isolates (red line). In the case of LasR (a), mutations detected in other studies on CF isolates are also displayed (blue line, Seattle, USA [25]; yellow line, Copenhagen, Denmark [24]). Known functionally important positions and domains within the protein sequences are indicated. For LasR, the DNA-binding region and the major amino acids in the AHL-binding pocket are shown [48]. Similarly, the polymannuronate embedding region containing 24 aa repeats with β-helix folds is shown for AlgG (b). The indicated amino acids designate the major residues of the AlgG catalytic centre and substrate-binding sites [54, 55]. For PelA (c), the structural domains are shown [57].
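
The counting rule described in the Fig. 3 caption can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' plotting code: the function name, the `(position, new_residue)` input format and the example observations are hypothetical.

```python
from itertools import accumulate


def cumulative_exchange_counts(observations, protein_length):
    """Return the cumulative number of distinct exchanges up to each position.

    observations: iterable of (position, new_residue) pairs, one per strain
    in which the exchange was detected (1-based positions).
    Each distinct exchange is counted once regardless of how many strains
    carry it; different replacements at the same position count separately.
    """
    distinct = set(observations)
    per_position = [0] * (protein_length + 1)
    for position, _new_residue in distinct:
        if not 1 <= position <= protein_length:
            raise ValueError(f"position {position} outside protein")
        per_position[position] += 1
    # Running total from the N-terminus (position 1) to the C-terminus.
    return list(accumulate(per_position[1:]))


# Hypothetical data: the same exchange seen in two strains, a divergent
# exchange at the same position, and one exchange further downstream.
obs = [(3, "V"), (3, "V"), (3, "T"), (5, "P")]
print(cumulative_exchange_counts(obs, 6))  # [0, 0, 2, 2, 3, 3]
```

Steep segments in such a curve mark regions where distinct exchanges cluster, which is how domain-specific accumulation becomes visible in the plot.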
Adaptation of P. aeruginosa to the CF environment
| Gene* | Detection of non-conservative coding variants in the CF habitat | CF/non-CF skew | Positive selection | Potential drug target |
|---|---|---|---|---|
| | Yes | Yes | Yes | |
| | Yes | Yes | Yes | |
| | Yes | | | |
| | Yes | Yes | Yes | |
| | Yes | Yes | Yes | |
| | Yes | | | |
| | Yes | Yes | | |
| | Yes | Yes | | |
| | Yes | Yes | | |

*Loci that lacked any coding variant at conserved positions or any missense mutation with a low Dayhoff score are not shown.