Andrew Hutchin, Charlotte Cordery, Martin A Walsh, Jeremy S Webb, Ivo Tews.
Abstract
PAS domains are omnipresent building blocks of multidomain proteins in all domains of life. Bacteria possess a variety of PAS domains in intracellular proteins and the related Cache domains in periplasmic or extracellular proteins. PAS and Cache domains are predominant in sensory systems, often carry cofactors or bind ligands, and serve as dimerization domains in protein association. To aid our understanding of the wide distribution of these domains, we analyzed the proteome of the opportunistic human pathogen Pseudomonas aeruginosa PAO1 in silico. The ability of this bacterium to survive under different environmental conditions, to switch between planktonic and sessile/biofilm lifestyles, or to evade stresses notably involves c-di-GMP regulatory proteins or depends on sensory pathways involving multidomain proteins that possess PAS or Cache domains. Maximum likelihood phylogeny was used to group PAS and Cache domains on the basis of amino acid sequence. Conservation of cofactor- or ligand-coordinating amino acids, aided by structure-based comparison, was used to infer function. The resulting classification presented here includes PAS domains that are candidate binders of carboxylic acids, amino acids, fatty acids, flavin adenine dinucleotide (FAD), 4-hydroxycinnamic acid, and heme. These predictions are placed in the context of previously described phenotypic data, often generated from deletion mutants. The analysis predicts novel functions for sensory proteins and sheds light on functional diversification in a large set of proteins with similar architecture.
IMPORTANCE To adjust to a variety of life conditions, bacteria typically use multidomain proteins, where the modular structure allows functional differentiation. Proteins responding to environmental cues and regulating physiological responses are found in chemotaxis pathways that respond to a wide range of stimuli to affect movement. Environmental cues also regulate intracellular levels of cyclic-di-GMP, a universal bacterial second messenger that is a key determinant of bacterial lifestyle and virulence. We study Pseudomonas aeruginosa, an organism that colonizes a broad range of environments and can switch lifestyle between the sessile biofilm and the planktonic swimming form. We have investigated the PAS and Cache domains, of which we identified 101 in 70 Pseudomonas aeruginosa PAO1 proteins, and have grouped these by phylogeny with domains of known structure. The resulting data set integrates sequence analysis and structure prediction to infer ligand or cofactor binding. With this data set, functional predictions for PAS and Cache domain-containing proteins are made.
Keywords: Cache domain; PAS domain; Pseudomonas; cofactors; phylogenetic analysis; phylogeny; sensory transduction processes
Year: 2021 PMID: 34937179 PMCID: PMC8694187 DOI: 10.1128/spectrum.01026-21
Source DB: PubMed Journal: Microbiol Spectr ISSN: 2165-0497
P. aeruginosa PAO1 proteins with PAS or Cache domains
| Gene | Protein | PAS1 domain boundary | PAS2 domain boundary | PAS3 domain boundary | PAS4 domain boundary |
|---|---|---|---|---|---|
| | SiaA | dCache 102–304 | dCache 102–304 | | |
| | Aer2/TlpG/McpB | 166–287 | | | |
| | | 79–198 | 206–320 | | |
| | | 31–151 | | | |
| | | 50–170 | | | |
| | CreC | sCache 35–179 | | | |
| | | 12–135 | 137–255 | 265–379 | |
| | | 310–426 | 438–550 | 562–675 | 682–797 |
| | AgtS | 323–436 | 446–568 | | |
| | | 142–284 | 444–560 | | |
| | RbdA | 243–363 | | | |
| | PhhR | 82–187 | | | |
| | GacS | 43–161 | | | |
| | FleS | 74–164 | | | |
| | TpbB/YfiN | 46–152 | | | |
| | PhoQ | 33–161 | | | |
| | YegE | 298–415 | 427–542 | 553–674 | |
| | DdaR | 20–132 | | | |
| | | 57–169 | 343–456 | | |
| | IhpR | 1–107 | | | |
| | AauS | dCache 51–346 | dCache 51–346 | | |
| | | 23–129 | | | |
| | BdlA | 3–112 | 116–234 | | |
| | MmnS | 41–166 | | | |
| | Aer/TlpC | 8–121 | | | |
| | | 38–169 | | | |
| | McpS | 17–134 | 139–254 | | |
| | ErcS′ | 97–207 | 226–338 | 339–454 | |
| | ErcS | 41–157 | | | |
| | HbcR | 17–123 | | | |
| | | 301–414 | | | |
| | | 62–180 | 190–308 | | |
| | | 79–182 | | | |
| | | 30–148 | | | |
| | CzcS | 34–171 | | | |
| | CtpM | sCache 42–198 | | | |
| | TlpQ | dCache 50–346 | dCache 50–346 | | |
| | SagS | 56–169 | | | |
| | | 97–211 | 241–348 | | |
| | RocS2 | 110–225 | | | |
| | | 636–751 | | | |
| | RocS1 | 573–687 | | | |
| | EatR | 80–185 | 225–344 | | |
| | | 432–537 | | | |
| | | 343–460 | 491–614 | 626–744 | |
| | BphP | 23–123 | | | |
| | AcoR | 82–191 | 225–344 | | |
| | BfiS | 158–265 | 266–383 | 389–504 | |
| | | 411–520 | | | |
| | PprA | 303–421 | 431–549 | 560–675 | |
| | PctC | dCache 34–275 | dCache 34–275 | | |
| | PctA | dCache 35–273 | dCache 35–273 | | |
| | PctB | dCache 35–274 | dCache 35–274 | | |
| | | 50–154 | 286–395 | | |
| | PilS | 195–296 | | | |
| | RtcR | 52–165 | | | |
| | MorA | 290–411 | 582–705 | 717–845 | 825–967 |
| | | dCache 51–346 | dCache 51–346 | | |
| | CbrA | 630–739 | | | |
| | | 69–166 | | | |
| | FimX | 142–254 | | | |
| | | 53–166 | | | |
| | AruS | 288–388 | | | |
| | DipA | 9–130 | 344–460 | | |
| | NtrB | 3–116 | | | |
| | DctB | dCache 44–291 | dCache 44–291 | | |
| | PhoR | 101–201 | | | |
| | | 275–393 | 401–515 | | |
| | KinB | 257–369 | | | |
| | MifS | dCache 31–298 | dCache 31–298 | | |
Of the 70 genes listed, several encode more than one PAS domain. Domain boundaries were identified by HMM analysis in previous studies (22, 26) or with the SMART domain web server (30, 31).
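The boundary entries in the table above follow a simple pattern (an optional `dCache` or `sCache` prefix, then start and end residue numbers), so domain lengths and per-protein domain counts can be tallied mechanically. The following is a minimal Python sketch of such a tally, not the authors' procedure. The helper names are hypothetical, and treating a dCache boundary that appears in two columns as a single domain is an assumption made for illustration.

```python
import re

# Optional "dCache"/"sCache" prefix, then "start–end" (en dash or hyphen).
# Entries without a prefix are treated as PAS domains (assumption).
BOUNDARY = re.compile(r"^(?:(dCache|sCache)\s*)?(\d+)[–-](\d+)$")

def parse_boundary(cell):
    """Parse one table cell into (kind, start, end)."""
    match = BOUNDARY.match(cell.strip())
    if match is None:
        raise ValueError(f"unrecognised boundary: {cell!r}")
    kind = match.group(1) or "PAS"
    return kind, int(match.group(2)), int(match.group(3))

def unique_domains(cells):
    """Parse all non-empty cells of a row, keeping each distinct domain once.

    Assumption: a dCache boundary repeated in two columns is one domain.
    """
    domains = []
    for cell in cells:
        if not cell.strip():
            continue
        domain = parse_boundary(cell)
        if domain not in domains:
            domains.append(domain)
    return domains

def domain_length(start, end):
    """Number of residues covered by an inclusive boundary."""
    return end - start + 1

# A few rows copied from the table (protein name -> boundary cells).
rows = {
    "SiaA": ["dCache 102–304", "dCache 102–304"],
    "BdlA": ["3–112", "116–234"],
    "CreC": ["sCache35–179"],
    "MorA": ["290–411", "582–705", "717–845", "825–967"],
}

summary = {
    name: [(kind, domain_length(start, end))
           for kind, start, end in unique_domains(cells)]
    for name, cells in rows.items()
}
total_domains = sum(len(domains) for domains in summary.values())
```

For the four example rows, `summary["BdlA"]` holds two PAS domains of 110 and 119 residues, and `total_domains` is 8.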
The reference data set contains sequences from PAS or Cache domain structures, grouped by physiological cofactor or ligand and by protein and species name, together with references to the structural database and literature.
| Cofactor or ligand | Mw of cofactor or ligand | Protein | Organism | PDB | PAS domain boundary from PDB RCSB | Pocket MS vol (Å³) | Comment |
|---|---|---|---|---|---|---|---|
| 4′-Hydroxycinnamic acid | 164.16 | Ppr | | | 25–129 | 397.0 | |
| | 164.16 | PYP | | | 1–125 | 226.5 | |
| Autoinducers | 124.14 | VqmA | | | 16–121 | 318.0 | |
| Aromatics | 92.14 | TodS | | | 5–133 | 216.8 | |
| FAD | 785.55 | MmoS (PAS A) | | | 1–100 | 763.2 | |
| | 785.55 | NifL | | | 16–117 | 548.1 | |
| | 785.55 | Vivid | | | 35–149 | 691.8 | |
| Fatty acids | 228.37 | Caur_2278/MltR | | | 111–292 | 965.3 | |
| | 356.54 | HIF3a9 PAS-B | | | 235–343 | 1,108.0 | |
| | 200.32 | RpfR | | | 7–110 | 453.7 | |
| | 256.42 | Rv1364c | | | 27–132 | 615.0 | |
| FMN | 456.34 | Aureochrome 1a LOV | | | 34–138 | 552.8 | |
| | 456.34 | Cagg_3753 | | | 48–152 | 612.1 | |
| | 456.34 | LOV | | | 32–141 | 858.9 | Pocket open to solvent |
| | 456.34 | EI222 | | | 34–141 | 585.6 | |
| | 376.36 | EL346 (HTCC2694) | | | 15–123 | 625.9 | Riboflavin binding |
| | 456.34 | Env1 | | | 37–146 | 589.6 | |
| | 456.34 | LOV | | | 18–123 | 627.8 | |
| | 456.34 | LOV-HK | | | 26–140 | 573.7 | |
| | 456.34 | NPH1-1 (LOV2) | | | 13–119 | 587.7 | |
| | 456.34 | AUREO1 | | | 16–120 | 633.0 | |
| | 456.34 | PAL PAS B | | | 209–347 | 669.9 | |
| | 456.34 | Phot | | | 17–125 | 730.3 | |
| | 456.34 | Phot1 | | | 15–125 | 629.9 | |
| | 456.34 | Phot2 | | | 16–121 | 741.6 | |
| | 456.34 | Phy3 | | | 929–1032 | 715.6 | |
| | 456.34 | SB1-LOV | | | 16–119 | 1,075.2 | Pocket open to solvent |
| | 456.34 | AUREO1 | | | 51–154 | 581.8 | |
| | 456.34 | YtvA | | | 8–111 | 738.5 | |
| | 456.34 | Ado1 LOV | | | 16–129 | 602.0 | |
| Heme-B | 616.49 | Aer2 | | | 32–135 | 814.2 | |
| | 616.49 | Aer2 | | | 170–280 | 1,007.5 | |
| | 616.49 | DosP | | | 30–132 | 565.1 | |
| | 616.49 | HODM | | | 155–290 | 1,869.9 | |
| | 616.49 | FixL | | | 13–117 | 984.8 | |
| | 616.49 | FixL | | | 26–130 | 907.4 | |
| Heme-C | 616.49 | GSU0582 | | | 45–131 | 24.8 | Non-classical heme cofactor binding |
| | 616.49 | GSU0935 | | | 45–127 | 12.1 | |
| | 618.50 | Tll0287 | | | 26–186 | 1,196.8 | Extended pocket |
| Metals | 107.87 | CusS | | | 38–185 | 58.1 | |
| | 65.39 | CzcS | | | 38–161 | 91.5 | |
| No cofactor or ligand binding | NA | Agp1 (Atu1990) | | | 20–108 | 52.7 | |
| | NA | Agp2 (Atu2165) | | | 21–119 | 175.1 | Pocket open to solvent |
| | NA | AhR | | | 106–253 | 67.5 | |
| | NA | AhR | | | 41–186 | 42.3 | |
| | NA | AhRR | | | A102–A256 | 133.0 | |
| | NA | ARNT (PAS A) | | | 89–189 | 1,188.7 | Open binding groove |
| | NA | ARNT (PAS B) | | | B208–B311 | 128.2 | |
| | NA | ARNT (PAS B) | | | 1–119 | 38.9 | |
| | NA | ARNT (PAS A) | | | 92–263 | 48.1 | |
| | NA | ARNT (PAS B) | | | 282–384 | 137.3 | |
| | NA | BMAL1/ARNTL (PASB) | | | 277–382 | 234.3 | Pocket w/o occupancy |
| | NA | CLOCK (PAS B) | | | 250–353 | 146.5 | |
| | NA | Cph1 | | | 29–126 | 65.8 | |
| | NA | DhaR/YcgU | | | C214–C305 | 90.7 | |
| | NA | BphP | | | 52–144 | 79.7 | |
| | NA | EAG/Kcnh1 | | | B23–B134 | 112.5 | |
| | NA | EAG/Kcnh1 | | | A27–A132 | 186.6 | |
| | NA | PadC | | | 33–123 | 86.7 | |
| | NA | MmoS (PAS B) | | | 122–227 | 88.0 | |
| | NA | NcoA1 PAS B | | | A254–A385 | 410.7 | |
| | NA | NcoA-1/SRC-1 | | | A259–A367 | 204.5 | Pocket w/o occupancy |
| | NA | BphP | | | 25–114 | 136.1 | |
| | NA | PhyB | | | 29–131 | 41.1 | |
| | NA | PpsR (N-PAS) | | | 29–125 | 25.7 | |
| | NA | PpsR (PAS1) | | | 166–261 | 23.4 | |
| | NA | PpsR (PAS2) | | | 284–383 | 56.3 | |
| | NA | BphP1 PAS1 | | | 54–145 | 53.2 | |
| | NA | BphP1 PAS2 | | | 549–646 | 121.2 | |
| | NA | BphP2 | | | 29–121 | 21.0 | |
| | NA | BphP3 | | | 42–138 | 100.1 | |
| | NA | BphP | | | 17–112 | 12.3 | |
| | NA | BphP2 | | | 19–103 | 31.4 | |
| | NA | Soluble guanylate cyclase (sGC) PAS α domain | | | 10–110 | 31.4 | |
| | NA | Soluble guanylate cyclase (sGC) α subunit | | | A288–A386 | 47.1 | |
| | NA | Soluble guanylate cyclase (sGC) β subunit | | | B217–B326 | 72.5 | |
| | NA | XccBphP (N-terminal PAS domain) | | | 33–128 | 97.4 | |
| | NA | XccBphP (C-terminal PAS domain) | | | 534–637 | 852.6 | Open binding groove |
| dCache - amino acids | 89.09 | CtaA | | | 41–269 | 147.3 | |
| | 89.09 | Mlp24/McpX/VC_A0923 | | | 1–226 | 138.5 | |
| | 105.09 | Mlp37 | | | 5–234 | 127.5 | |
| | 149.21 | PctA | | | 29–256 | 334.5 | |
| | 175.21 | PctB | | | 33–256 | 227.8 | |
| | 103.12 | PctC | | | 33–257 | 191.8 | |
| | 115.13 | PscC | | | 23–275 | 192.5 | |
| | 131.17 | Tlp3 | | | 37–285 | 248.9 | |
| | 111.14 | TlpQ | | | 39–323 | 753.7 | Open binding groove |
| dCache - cytosine | 111.10 | Dret_0059 | | | 322–562 | 349.1 | |
| dCache - phosphate | 94.97 | VP0354 (vpHK1S-Z8) | | | 8–269 | 224.1 | |
| dCache - polyamines | 88.15 | McpU | | | 41–300 | 615.0 | |
| Cache - no cofactor or ligand binding | NA | LuxQ | | | 21–240 | 37.0 | |
| | NA | LuxQ | | | 2–221 | 23.7 | |
| dCache - QACs | 144.19 | McpX | | | 38–306 | 229.7 | |
| dCache - cytokinins | 203.24 | AHK4 | | | 126–393 | 528.2 | |
| dCache - carboxylic acids | 118.09 | DctB | | | 48–301 | 134.1 | |
| | 118.09 | DctB | | | 27–285 | 130.3 | |
| | 88.06 | KinD | | | 6–204 | 156.2 | |
| | 90.08 | TlpC | | | 3–261 | 263.2 | |
| sCache - acetate sensing | 59.04 | Adeh_3718 | | | 57–144 | 84.9 | |
| sCache - carboxylic acids | 189.10 | CitA | | | 50–126 | 356.9 | Pocket open to solvent |
| | 134.09 | DcuS | | | 56–130 | 174.9 | |
| | 88.06 | VP0183 | | | 56–146 | 95.9 | |
| | 73.07 | PscD-SD | | | 32–178 | 98.7 | |
| sCache - metals | 58.69 | PhoQ | | | 41–138 | 108.5 | |
| | 40.08 | PhoQ | | | 39–138 | 32.8 | |
| sCache - urea | 60.05 | TlpB | | | 70–156 | 170.9 | |
PAS or Cache domain boundaries are indicated. The pocket or cavity volume is presented along with the molecular weight (Mw) of the cofactor or ligand in the pocket/cavity, where present. MS, pocket volume based on the molecular surface; QAC, quaternary ammonium compound.
FIG 1 Maximum likelihood phylogenetic analysis of Pseudomonas aeruginosa PAO1 PAS or Cache domains with the reference set of structurally characterized domains. The percentage of bootstrap replicates that reproduced each branch is given, with branches corresponding to less than 15% of bootstrap replicates collapsed and rearranged for clarity. PAS, dCache, and sCache domains are labeled with a circle, square, or triangle, respectively. The nature of ligand or cofactor is given in the key and denoted by color, and individual alignments of these groups are found in the supplemental material. Groups discussed in the text are marked with a numbered arrow. The supplement to this article contains an evaluation of different phylogenetic analyses and alignments of individual clades shown in Fig. 1 and discussed in the text.
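Collapsing weakly supported branches, as described in the Fig. 1 legend, turns each internal node with bootstrap support below the cutoff into a polytomy on its parent. The sketch below illustrates that operation on a toy tree with a 15% cutoff. The `Node` class, the helper functions, and the example tree are hypothetical, and the software actually used to draw Fig. 1 is not stated in this section.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """Toy tree node; `support` is the bootstrap % of the branch above it."""
    name: Optional[str] = None
    support: Optional[float] = None
    children: List["Node"] = field(default_factory=list)

def collapse(node, threshold=15.0):
    """Remove internal branches with support below `threshold`.

    Children of a removed node are attached directly to its parent,
    which produces a polytomy. Leaves and the root are never removed.
    """
    kept = []
    for child in node.children:
        collapse(child, threshold)
        weak = (child.children
                and child.support is not None
                and child.support < threshold)
        if weak:
            kept.extend(child.children)
        else:
            kept.append(child)
    node.children = kept
    return node

def to_newick(node):
    """Serialise the toy tree with support values as internal labels."""
    if not node.children:
        return node.name
    inner = ",".join(to_newick(child) for child in node.children)
    label = "" if node.support is None else f"{node.support:g}"
    return f"({inner}){label}"

# Hypothetical example: clade (A,B) has 10% support, clade (C,D) has 80%.
root = Node(children=[
    Node(support=10, children=[Node("A"), Node("B")]),
    Node(support=80, children=[Node("C"), Node("D")]),
])
collapsed = to_newick(collapse(root))
```

In this toy case the weakly supported (A,B) clade dissolves into the root, while the (C,D) clade with 80% support is retained.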
FIG 2 Ligand-binding capacity of the dCache phylogenetic clade investigated by primary/secondary structure conservation analysis. Sequences selected here are highlighted by arrow 1 in Fig. 1. (A) The dCache domain of Rhizobium meliloti DctB in complex with succinate (PDB 3E4O). Residues involved in the coordination of succinate are shown as sticks and are labeled with single-letter amino acid codes and consecutive numbers. (B) Guided sequence alignment using the predicted secondary structure for PAO1 PA1336, PA5165, and PA5512 against the carboxylic acid-binding dCache domains from R. meliloti DctB, Vibrio cholerae DctB, and Bacillus subtilis KinD. The predicted secondary structure used for alignment is denoted as a cartoon under the sequences. The position of amino acids used for ligand binding in DctB is indicated in the alignment.
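The conservation check behind an alignment such as the one in Fig. 2B can be expressed as two steps: map each ligand-coordinating residue of the reference sequence to its alignment column, then read off the residue each query sequence has in that column. The sketch below shows this on a toy alignment. The sequences, names, and residue numbers are made up for illustration and are not taken from DctB or any PAO1 protein, and exact identity is used as the conservation criterion for simplicity.

```python
def column_for_residue(aligned_ref, residue_number, first_residue=1):
    """Map a residue number of the ungapped reference to an alignment column."""
    position = first_residue - 1
    for column, residue in enumerate(aligned_ref):
        if residue != "-":
            position += 1
            if position == residue_number:
                return column
    raise ValueError("residue number outside the reference sequence")

def conservation(alignment, ref_name, binding_residues):
    """For each binding residue, report (query residue, identical?) per query."""
    ref = alignment[ref_name]
    report = {}
    for number in binding_residues:
        column = column_for_residue(ref, number)
        report[number] = {
            name: (seq[column], seq[column] == ref[column])
            for name, seq in alignment.items()
            if name != ref_name
        }
    return report

# Toy, made-up aligned sequences of equal length ("-" marks a gap).
alignment = {
    "ref":    "MK-RTYSL",
    "queryA": "MKARSYSL",
    "queryB": "M--KTF-L",
}

# Hypothetical ligand-coordinating residues 3 (R) and 5 (Y) of "ref".
report = conservation(alignment, "ref", [3, 5])
```

Here `queryA` retains both reference residues, whereas `queryB` carries K and F at the two positions. A real analysis would likely also score conservative substitutions rather than identity alone.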
FIG 3 Combination of phylogeny- and conservation-based assignment. Pseudomonas aeruginosa PAS and Cache domains predicted to bind cofactors or ligands are grouped by method of prediction. The nature of the bound cofactor or ligand is denoted by color.
PAS or Cache domains and predicted cofactors or ligands assigned on the basis of combined phylogeny and sequence-structure alignment
| Cofactor or ligand | Protein | Domain | Known physiological role |
|---|---|---|---|
| Amino acids | PA2654 (TlpQ) | dCache | Chemotaxis toward ethylene and histamine |
| | PA4307 (PctC) | dCache | Chemotaxis toward amino acids |
| | PA4309 (PctA) | dCache | Chemotaxis toward amino acids |
| | PA4310 (PctB) | dCache | Chemotaxis toward amino acids |
| | PA4633 | dCache | Unknown |
| | PA4886 | PAS1 | Unknown |
| Autoinducers | PA1261 (LhpR) | PAS1 | Transcriptional regulator |
| | PA2005 (HbcR) | PAS1 | Regulation of (R)-3-hydroxybutyrate catabolism |
| Carboxylic acids - dCache like | PA1336 (AauS) | dCache | Regulation of genes involved in aspartate, glutamate, and glutamine uptake and catabolism |
| | PA5165 (DctB) | dCache | Regulation of C4-dicarboxylic acid transport systems |
| | PA5512 (MifS) | dCache | Regulation of α-ketoglutarate transport and utilization |
| Carboxylic acids - sCache like | PA2652 (CtpM) | sCache | Chemotaxis toward malate |
| | PA4021 (EatR) | PAS1 | Regulation of ethanolamine catabolism |
| | PA4147 (AcoR) | PAS1 | Regulation of 2,3-butanediol and acetoin metabolism |
| Cytokinins | PA1976 (ErcS′) | PAS3 | Regulation of ethanol oxidation |
| FAD | PA0285 | PAS2 | Regulation of biofilm formation |
| | PA0575 | PAS4 | Regulation of biofilm formation in response to |
| | PA1423 (BdlA) | PAS1 | Regulation of biofilm dispersal |
| | PA1423 (BdlA) | PAS2 | Regulation of biofilm dispersal |
| | PA1561 (Aer/TlpC) | PAS1 | Aerotaxis |
| | PA1930 (McpS) | PAS1 | Regulation of chemotaxis |
| | PA1930 (McpS) | PAS2 | Regulation of chemotaxis |
| | PA4601 (MorA) | PAS4 | Regulation of flagellar development and protease secretion |
| | PA5442 | PAS2 | Unknown |
| Fatty acids | PA0290 | PAS1 | Regulation of biofilm formation and Psl production |
| | PA0847 | PAS2 | Regulation of motility in response to a number of stimuli |
| | PA1196 (DdaR) | PAS1 | Regulation of methylarginine metabolism, role in quorum sensing |
| | PA1243 | PAS1 | Regulation of swimming and biofilm formation |
| | PA1438 (MmnS) | PAS1 | Regulation of efflux pump expression |
| | PA1976 (ErcS′) | PAS2 | Regulation of ethanol oxidation |
| | PA4112 | PAS2 | Histidine kinase of unknown pathway |
| | PA4197 (BfiS) | PAS2 | Regulation of biofilm formation |
| | PA4293 (PprA) | PAS2 | Regulation of outer membrane permeability and of biofilm formation |
| | PA4581 (RtcR) | PAS1 | Homologous to |
| | PA5017 (DipA) | PAS1 | Biofilm regulation, chemotaxis, motility, maintenance of c-di-GMP heterogeneity |
| | PA5442 | PAS1 | Unknown |
| Heme-b | PA0176 (Aer2/TlpG/McpB) | PAS1 | Aerotaxis and virulence |
| | PA1976 (ErcS′) | PAS1 | Regulation of ethanol oxidation |
| | PA2177 | PAS1 | Unknown |
| | PA3271 (MxtR) | PAS1 | Redox sensing and interbacterial signaling |
| | PA5442 | PAS2 | Unknown |
| Metals | PA0464 (CreC) | sCache | Regulation of carbon source catabolism |
| | PA2524 (CzcS) | PAS1 | Regulation of metal detoxification and resistance to carbapenem antibiotics |
| | PA2870 | PAS1 | Diguanylate cyclase involved in biofilm production, Psl production, regulation of swimming motility |
| No cofactor or ligand binding | PA0285 | PAS1 | Regulation of biofilm formation |
| | PA0338 | PAS1 | Regulation of biofilm formation, Psl production, and swimming motility |
| | PA1181 (YegE) | PAS2 | Biofilm dispersal |
| | PA4112 | PAS3 | Histidine kinase of unknown pathway |
| | PA4117 (BphP) | PAS1 | Quorum sensing |
| | PA5124 (NtrB) | PAS1 | Regulation of nitrogen metabolism, rhamnolipid production, biofilm formation, expression of virulence genes, and swarming |
| | PA5442 | PAS1 | Unknown |
FIG 4 Proteins assigned as containing carboxylic acid-binding Cache and PAS domains are involved in various signaling cascades. PA5165 and PA2652 bind carboxylic acids with periplasmic dCache/sCache domains. PA5512 is involved in the transport and metabolism of α-ketoglutarate, a previously undescribed ligand for carboxylic acid-binding dCache domains. From the analysis presented here, PA1336 is predicted to bind polar and acidic amino acids. PA5165, PA5512, and PA1336 are sensor histidine kinases (HisK, His kinase A; HatP, histidine kinase-like ATPase), while PA2652 is a chemoreceptor (MA, methyl-accepting). Another cascade in response to acetaldehyde promotes transcriptional changes through interaction with cytoplasmic PAS domains in PA4021 or PA4147; this ligand has not previously been described for PAS domains.