Lokesh P Tripathi, R Sowdhamini.
Abstract
BACKGROUND: Serine proteases are one of the most abundant groups of proteolytic enzymes and are found in all kingdoms of life. While studies have established significant roles for many prokaryotic serine proteases in physiological processes such as metabolism, cell signalling, defense response and development, the functional associations of a large number of prokaryotic serine proteases remain relatively unknown. The current analysis aims to understand the distribution and probable biological functions of select serine proteases encoded in representative prokaryotic organisms.
Year: 2008 PMID: 19019219 PMCID: PMC2605481 DOI: 10.1186/1471-2164-9-549
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Distribution of the select serine protease families across the representative genomes of the prokaryotic lineages.
| Tryp | Subt | DDPept | Clp | Lon | |
| - | 3 | 1 | 2 | - | |
| 1 | 2 | 1 | 1 | - | |
| 2 | 5 | - | - | - | |
| 1 | 2 | - | - | - | |
| 1 | 3 | - | - | - | |
| - | 2 | - | - | 2 | |
| 1 | 2 | - | - | 1 | |
| - | 2 | - | - | 1 | |
| - | - | - | - | 2 | |
| - | 3 | 1 | - | 2 | |
| - | 2 | 1 | - | 2 | |
| 1 | 3 | - | - | 1 | |
| - | - | 1 | 1 | 2 | |
| - | 3 | - | 1 | 2 | |
| - | 3 | - | - | 2 | |
| - | 2 | - | - | 1 | |
| - | - | - | - | 1 | |
| 7 | 1 | 2 | 2 | - | |
| 7 | 1 | 2 | 2 | - | |
| 4 | 3 | 2 | 2 | 1 | |
| 6 | 4 | 13 | 2 | 1 | |
| 2 | 1 | 2 | 2 | 1 | |
| 7 | 3 | 3 | 1 | 1 | |
| 14 | 15 | 13 | 5 | 2 | |
| 4 | 5 | 9 | 3 | 3 | |
| 3 | - | 3 | 3 | 1 | |
| 10 | 2 | 17 | 2 | 2 | |
| 8 | 2 | 7 | 3 | 1 | |
| 2 | 2 | 7 | 1 | 1 | |
| 8 | 2 | 7 | 1 | 1 | |
| 1 | 2 | 2 | - | - | |
| 4 | 3 | - | 2 | 3 | |
| 2 | 2 | 2 | 2 | 2 | |
| 2 | 1 | 2 | 1 | 2 | |
| 3 | 1 | 1 | - | - | |
| 3 | 2 | 1 | 1 | 1 | |
| - | 6 | 3 | 1 | 1 | |
| 6 | 4 | 2 | 2 | 1 | |
| 1 | 1 | - | 1 | 1 | |
| - | 1 | 1 | - | - | |
| 3 | 3 | 1 | 1 | 1 | |
| 3 | 2 | 2 | 1 | 1 | |
| 3 | 1 | 1 | 1 | 1 | |
| 2 | 5 | 4 | 2 | - | |
| 7 | 4 | 4 | 2 | - | |
| 3 | 1 | 3 | 3 | - | |
| 3 | 1 | 3 | 4 | - | |
| - | 1 | - | - | - | |
| 24 | 15 | 2 | 1 | 3 | |
| 2 | 3 | 2 | 1 | 5 | |
| 1 | 1 | 2 | 1 | 3 | |
| 2 | 4 | 14 | 4 | 3 | |
| 2 | 5 | 4 | 3 | 3 | |
| 2 | 9 | 2 | 3 | 3 | |
| 4 | 7 | 4 | 2 | 3 | |
| 2 | 5 | 17 | 3 | 3 | |
| 2 | - | 3 | 1 | - | |
| 1 | 1 | 6 | 1 | 1 | |
| 1 | 1 | 3 | 1 | 1 | |
| 1 | - | 3 | 1 | - | |
| 1 | - | 1 | 1 | 1 | |
| 3 | 8 | 4 | 2 | 1 | |
| 9 | 1 | 2 | 2 | - | |
| 3 | - | 3 | 1 | - | |
| 1 | 3 | 2 | 1 | - | |
| 1 | - | 2 | 1 | - | |
| 2 | 1 | 1 | 1 | - | |
| 2 | 3 | 2 | 2 | 2 | |
| 3 | 1 | 2 | 1 | 3 | |
| 1 | 2 | - | 1 | 1 | |
| 4 | - | 3 | 4 | 2 | |
| 2 | - | - | 1 | 2 | |
| 8 | 4 | 4 | 2 | 3 | |
| 2 | 2 | 3 | 1 | 2 | |
| 3 | 3 | 7 | 1 | 1 | |
| 2 | 2 | 3 | 1 | 2 | |
| 2 | 2 | 7 | 3 | 3 | |
| 2 | 2 | 7 | 3 | 4 | |
| 2 | 2 | 4 | 3 | 4 | |
| 3 | 2 | 4 | 1 | 4 | |
| 2 | 6 | 2 | 2 | 2 | |
| 3 | 10 | 3 | 1 | 1 | |
| 2 | 3 | 1 | 2 | 1 | |
| 5 | - | 1 | 2 | 1 | |
Tryp- Trypsin (Pfam accession PF00089); Subt- Subtilisin (PF00082); DDPept- D-Ala-D-Ala carboxypeptidase B (DD-peptidase) (PF00144); Clp- Clp protease (PF00574); Lon- Lon protease (PF05362).
Genomes where no serine protease-like proteins were identified: Bacteroides_fragilis_YCH46; Haloarcula_marismortui_ATCC_43049; Legionella_pneumophila_Lens; Nostoc_sp.; Salmonella_enterica_Choleraesuis; Clostridium_acetobutylicum; Clostridium_perfringens.
Distribution of five serine protease families across various prokaryotic taxonomic groups represented in 91 genomes.
| Lineage | Trypsin | Subtilisin | Beta-lactamase | Clp protease | Lon protease |
| Euryarchaeota (11) | 2 | 22 | 3 | 2 | 18 |
| Crenarchaeota (5) | 5 | 15 | 2 | 3 | - |
| Alphaproteobacteria (6) | 32 | 10 | 43 | 10 | 6 |
| Betaproteobacteria (10) | 24 | 24 | 13 | 11 | 12 |
| Gammaproteobacteria (13) | 37 | 38 | 48 | 25 | 31 |
| Firmicutes (18) | 42 | 49 | 75 | 31 | 24 |
| Actinobacteria (8) | 51 | 33 | 46 | 19 | 9 |
| Chlorobi (3) | 8 | 8 | 7 | 4 | 2 |
| Cyanobacteria (3) | 13 | 6 | 10 | 9 | - |
| Deltaproteobacteria (2) | 26 | 18 | 4 | 2 | 8 |
| Others (12) | 7 | 4 | 3 | 15 | 7 |
| Total | 247 | 227 | 254 | 121 | 117 |
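The Total row can be cross-checked against the per-lineage rows with a few lines of Python. This is only a minimal sketch: the counts are transcribed from the table above, "-" is read as 0, and the short column keys (`Tryp`, `Subt`, `DDPept`, `Clp`, `Lon`) and the function names are illustrative choices, not part of the original analysis.

```python
# Per-lineage counts transcribed from the table above ("-" read as 0).
# Column order: Trypsin, Subtilisin, Beta-lactamase (DD-peptidase), Clp, Lon.
FAMILIES = ["Tryp", "Subt", "DDPept", "Clp", "Lon"]

ROWS = {
    "Euryarchaeota": [2, 22, 3, 2, 18],
    "Crenarchaeota": [5, 15, 2, 3, 0],
    "Alphaproteobacteria": [32, 10, 43, 10, 6],
    "Betaproteobacteria": [24, 24, 13, 11, 12],
    "Gammaproteobacteria": [37, 38, 48, 25, 31],
    "Firmicutes": [42, 49, 75, 31, 24],
    "Actinobacteria": [51, 33, 46, 19, 9],
    "Chlorobi": [8, 8, 7, 4, 2],
    "Cyanobacteria": [13, 6, 10, 9, 0],
    "Deltaproteobacteria": [26, 18, 4, 2, 8],
    "Others": [7, 4, 3, 15, 7],
}

# The "Total" row exactly as printed in the table.
STATED_TOTAL = [247, 227, 254, 121, 117]


def column_sums(rows):
    """Sum each family column over all lineages."""
    return [sum(column) for column in zip(*rows.values())]


def mismatches(rows, stated):
    """Map family -> (summed, stated) for columns that do not add up."""
    return {
        family: (summed, total)
        for family, summed, total in zip(FAMILIES, column_sums(rows), stated)
        if summed != total
    }


if __name__ == "__main__":
    print(mismatches(ROWS, STATED_TOTAL))
```

On the values as printed here, four columns agree with the Total row, while the Clp protease rows sum to 131 against a stated total of 121. This table alone does not show which cell is responsible.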
Figure 1. Phylogenetic analysis of the trypsins. A neighbour-joining tree based on an alignment of the trypsin protease domain generated with ClustalW [26] was inferred using the PHYLIP package [27] and drawn using the MEGA program [28] (see text for details). The taxonomic lineages encountered in the analysis are represented in different colours. For clarity, the protein identifiers are suffixed with the abbreviated species IDs (see Additional file 2). Only the protein clusters supported by significant bootstrap values (> 50%) are highlighted with the colour scheme; for the rest, only the gene (and species) identifiers are highlighted. The primary branches in clusters populated by representatives from non-identical lineages (taxa) are shaded in grey. Atypical members in an otherwise strong cluster are highlighted in the colour of their corresponding lineage. The phylogenetic clade corresponding to the trypsin-like proteins that carry the Colicin_V-S1 domain architecture is shaded pink. The colour scheme for the lineages is as follows: Actinobacteria- Magenta; Alphaproteobacteria- Orange; Archaea- Red; Betaproteobacteria- Brown; Chlorobi- Olive green; Cyanobacteria- Green; Deltaproteobacteria- Yellow; Firmicutes- Cyan; Gammaproteobacteria- Blue; Others- Black.
Figure 2. Phylogenetic analysis of the subtilisins, carried out as described in Figure 1. The phylogenetic clades corresponding to subtilisin homologues that carry the S8-Autotrans domain architecture, and to those that carry at least a DUF1034 module C-terminal to the subtilisin protease domain, are marked. The abbreviations and the colour schemes are the same as in Figure 1.
Distribution of domain architectures in prokaryotic SPs; their occurrence in major lineages (indicated by +) and inferred functional associations based on co-existing domains and literature.
| Domain Architecture# | Representative sequence | No. of SPs | Lineage* A | Lineage* B | Lineage* E | Postulated Biological Functional Associations (see text) |
| Trypsin family (S1; Tryp(sin)- PF00089) | | | | | | |
| Tryp | YP_643608.1 | 112 | + | + | + | Proteolysis |
| Tryp-PDZ | NP_441326.1 | 63 | + | + | + | Signalling |
| Tryp-PDZ-PDZ | NP_107958.1 | 49 | - | + | - | Signalling, Heat Shock response |
| Colicin_V-Tryp | NP_338325.1 | 8 | - | + | - | Pathogenesis, Defense |
| Pro_Al_prot-Tryp | NP_827728.1 | 2 | - | + | - | Proteolysis |
| Tryp-Endonuclease_NS | NP_604177.1 | 1 | - | + | - | Nucleic acid metabolism |
| Tryp-(FG-GAP)3 | NP_825221.1 | 1 | - | + | - | Ligand binding and processing |
| Tryp-(Sel1)6 | YP_374752.1 | 1 | - | + | - | Proteolysis |
| Tryp-CW_binding_1-CW_binding_1 | NP_344916.1 | 1 | - | + | - | Cell recognition, pathogenesis |
| FHA-FHA-Tryp | NP_811686.1 | 1 | - | + | - | Metabolism and signalling |
| Pro_Al_prot-Tryp-CBM_5_12 | NP_822175.1 | 1 | - | + | - | Carbohydrate metabolism |
| (Pro_Al_prot)2-Tryp | NP_827729.1 | 1 | - | + | - | Proteolysis |
| Tryp-ANF_receptor | YP_073997.1 | 1 | - | + | - | Ligand binding and processing |
| Tryp-PPC-SCP | YP_434226.1 | 1 | - | + | - | Calcium chelating, signalling |
| Tryp-PPC-PPC | YP_437990.1 | 1 | - | + | - | Carbohydrate metabolism, signalling |
| TerD-Tryp | YP_273108.1 | 1 | - | + | - | Growth in unfavourable environment |
| Subtilisin family (S8; Subt(ilisin)- PF00082) | | | | | | |
| Subt | NP_147093.1 | 142 | + | + | + | Proteolysis |
| Subt-Autotransporter | YP_260308.1 | 15 | - | + | - | Transport, Cell adhesion, Virulence |
| Subtilisin_N-Subt | NP_241550.1 | 14 | + | + | + | Proteolysis |
| Subt-PPC | YP_154554.1 | 9 | + | + | - | Carbohydrate metabolism, signalling |
| Subtilisin_N-Subt-PA | NP_391688.1 | 4 | + | + | + | Proteolysis |
| Subt-PPC-PPC | YP_341139.1 | 4 | + | + | - | Carbohydrate metabolism, signalling |
| Subt-P_proprotein | NP_967370.1 | 3 | - | + | - | Proteolysis |
| Subt-Big_2 | NP_969490.1 | 2 | - | + | - | Cell adhesion, pathogenesis |
| Subt-DUF1034 | YP_194362.1 | 2 | - | + | - | Proteolysis |
| Subt-PA-DUF1034 | NP_693854.1 | 2 | - | + | + | Proteolysis |
| Subt-PKD-PKD | YP_326498.1 | 2 | + | + | + | Carbohydrate metabolism, signalling |
| Subt-P_proprotein-PKD | NP_716498.1 | 2 | - | + | - | Carbohydrate metabolism, signalling |
| GRP-Subt | NP_435320.1 | 1 | - | + | - | Stress response |
| (Hemolys)2-Subt-P_proprotein | NP_747027.1 | 1 | - | + | - | Cell surface binding |
| (Hemolys)3-Subt-P_proprot-Hemolys | NP_927988.1 | 1 | - | + | - | Cell surface binding |
| PPC-Subt | YP_436813.1 | 1 | - | + | - | Carbohydrate metabolism, signalling |
| Subt-BNR | NP_824495.1 | 1 | - | + | - | Proteolysis |
| Subt-(CARDB)9 | NP_954260.1 | 1 | - | + | - | Cell adhesion, pathogenesis |
| Subt-Cleaved_Adhesin-fn3-PKD-PKD | YP_074547.1 | 1 | - | + | - | Virulence, signalling, metabolism |
| Subt-CUB | NP_967057.1 | 1 | - | + | + | Signalling |
| Subt-(Dockerin_1)2 | NP_280653.1 | 1 | - | + | - | Cellulose degradation, metabolism |
| Subt-DUF11 | NP_951948.1 | 1 | - | + | - | Cellular transport |
| Subt-fn3 | YP_446403.1 | 1 | - | + | - | Cell surface binding |
| Subt-(fn3)3-(PKD)3 | YP_565583.1 | 1 | - | + | - | Cell surface binding, signalling, metabolism |
| Subt-Gram_pos_anchor | NP_241562.1 | 1 | - | + | - | Cell invasion, pathogenesis |
| Subt-NosD | NP_616940.1 | 1 | + | - | - | Respiratory metabolism |
| Subt-PA-DUF1034-(Big_2)2-(SLH)2 | NP_624131.1 | 1 | - | + | - | Cell adhesion, pathogenesis |
| Subt-PilZ | NP_969350.1 | 1 | - | + | - | Signalling |
| Subt-(P_proprotein)2 | YP_434175.1 | 1 | - | + | - | Proteolysis |
| Sub_N-Subt-Cleaved_Adhesin | NP_693252.1 | 1 | - | + | - | Virulence |
| Sub_N-Subt-PA-Dockerin | NP_691157.1 | 1 | - | + | - | Cellulose degradation, metabolism |
| Sub_N-Subt-PA-DUF1034-Gram_pos_anchor | NP_345151.1 | 1 | - | + | - | Cell invasion, pathogenesis |
| Sub_N-Subt-PA-DUF1034-(FIVAR)5-Gram_pos_anchor | NP_965819.1 | 1 | - | + | - | Cell recognition and invasion, Sugar binding |
| Sub_N-Subt-PA-PPC | NP_717522.1 | 1 | - | + | - | Carbohydrate metabolism, signalling |
| Sub_N-Subt-PA-PPC-P_proprotein | NP_718668.1 | 1 | - | + | - | Carbohydrate metabolism, signalling |
| Thermopsin-Subt | NP_394205.1 | 1 | + | - | - | Thermostability |
| (W_rich_C)2-(PPC)2-Subt | YP_382882.1 | 1 | - | + | - | Cell surface signalling |
| YSIRK_signal-Subt-PA-DUF1034-(FIVAR)3-Gram_pos_anchor | NP_689039.1 | 1 | - | + | - | Cell recognition and invasion, Sugar binding |
| DD-peptidase family (S12; DD-Pept(idase)- PF00144) | | | | | | |
| DDPept | NP_811352.1 | 249 | + | + | + | Cell wall biosynthesis |
| DDPept-ABC_tran | YP_434618.1 | 1 | - | + | - | Biological transport |
| DDPept-DUF1343 | YP_439122.1 | 1 | - | + | - | Cell wall biosynthesis |
| (Cond-AMP-PPbind)3-DDPept | NP_824819.1 | 1 | - | + | - | Metabolism of antibiotic compounds |
| Glyco_hydr_3-DDPept | NP_811352.1 | 1 | - | + | - | Carbohydrate hydrolysis |
| Glyc_hyd_3_Glyc_hyd_3_C-DDPept | YP_444518.1 | 1 | - | + | - | Carbohydrate hydrolysis, metabolism |
| Clp protease family (S14; Clp(_protease)- PF00574) | | | | | | |
| Clp | NP_811352.1 | 118 | + | + | + | Proteolysis |
| Clp-Nfed | NP_126341.1 | 3 | + | + | - | Proteolysis |
| Lon protease family (S16; Lon_C- PF05362) | | | | | | |
| Lon_C | NP_623361.1 | 39 | + | + | + | Proteolysis |
| LON-AAA-Lon_C | NP_743601.1 | 58 | + | + | + | Signalling, metabolism |
| Sigma54_activat-AAA-Lon_C | YP_183677.1 | 8 | + | + | - | Transcription regulation, metabolism |
| Sigma54_activat-Lon_C | NP_127256.1 | 5 | + | - | - | Transcription regulation |
| Mg_chelatase-Lon_C | NP_248420.1 | 4 | + | - | - | Bacteriochlorophyll metabolism |
| Mg_chelat-Sigma54_activat-Lon_C | NP_578196.1 | 1 | + | - | - | Transcription regulation |
| DnaB_C-Lon_C | YP_160730.1 | 1 | - | + | - | DNA metabolism |
| PDZ-Lon_C | NP_389388.1 | 1 | - | + | - | Signalling |
# Co-existing domains: Tryp- Trypsin; Subt- Subtilisin; DDPept- DD-peptidase; Clp- Clp protease; Lon_C- Lon protease; AAA- ATPase family associated with various cellular activities (PF00004); ABC_tran- ABC transporter (PF00005); AMP- AMP-binding enzyme (PF00501); ANF_receptor- Receptor family ligand binding region (PF01094); Autotransporter- Autotransporter beta-domain (PF03797); Big_2- Bacterial Ig-like domain (group 2) (PF02368); BNR- BNR/Asp-box repeat (PF02012); CARDB- Cell adhesion related domain found in bacteria (PF07705); Colicin_V- Colicin V production protein (PF02674); CBM_5_12- Carbohydrate binding domain (PF02839); Cleaved_Adhesin- Cleaved Adhesin Domain (PF07675); Cond- Condensation domain (PF00668); CUB- CUB domain (PF00431); CW_binding_1- Putative cell wall binding repeat (PF01473); DnaB_C- DnaB-like helicase C terminal domain (PF03796); Dockerin_1- Dockerin type I repeat (PF00404); DUF1034- Domain of unknown function (PF06280); DUF11- Domain of unknown function (PF01345); DUF1343- Protein of unknown function (PF07075); FG-GAP- FG-GAP repeat (PF01839); Endonuclease_NS- DNA/RNA non-specific endonuclease (PF01233); FHA- FHA (Forkhead-associated) domain (PF00498); FIVAR- Uncharacterised Sugar-binding Domain (PF07554); fn3- Fibronectin type III domain (PF00041); Glyco_hydr_3- Glycosyl hydrolase family 3 N terminal domain (PF00933); Glyc_hyd_3_C- Glycosyl hydrolase family 3 C terminal domain (PF01915); Gram_pos_anchor- Gram positive anchor (PF00746); GRP- Glycine rich protein family (PF07172); Hemolys- Hemolysin-type calcium-binding repeat (2 copies) (PF00353); LON- ATP-dependent protease La (LON) domain (PF02190); Mg_chelat- Magnesium chelatase, subunit ChlI (PF01078); Nfed- Nfed-like (PF01957); NosD- Periplasmic copper-binding protein (NosD) (PF05048); P_proprotein- Proprotein convertase P-domain (PF01483); PA- Protease associated domain (PF02225); PDZ- PDZ domain (PF00595); PKD- PKD domain (PF00801); PPbind- Phosphopantetheine attachment site (PF00550); PPC- Bacterial pre-peptidase C-terminal domain (PF04151); Pro_Al_prot- Alpha-lytic protease prodomain (PF02983); Sel1- Sel1 repeat (PF08238); SCP- SCP-like extracellular protein (PF00188); Sigma54_activat- Sigma-54 interaction domain (PF00158); SLH- S-layer homology domain (PF00395); Subtilisin_N (Sub_N)- Subtilisin N-terminal region (PF05922); TerD- Bacterial stress protein (PF02342); Thermopsin- Thermopsin (PF05317); W_rich_C- Tryptophan-rich Synechocystis species C-terminal domain (PF07483).
* Lineage: A- Archaea; B- Bacteria; E- Eukaryotes
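The domain-architecture strings in the table follow a compact notation: domains are listed from N- to C-terminus and joined by hyphens, and `(Name)N` denotes a unit repeated N times. A minimal Python sketch of how such a string could be split into its parts is given below. The function name `parse_architecture` and the regular expression are illustrative assumptions, not tooling from the original study.

```python
import re

# A token is either "(Unit)N" (a parenthesised unit repeated N times) or a
# bare domain name. Bare names contain no hyphen, parenthesis or whitespace;
# in the table, hyphenated names such as FG-GAP occur only inside parentheses.
_TOKEN = re.compile(r"\(([^)]+)\)(\d+)|([^\s()-]+)")


def parse_architecture(arch):
    """Return (unit, copies) pairs in N- to C-terminal order.

    A parenthesised unit is kept exactly as written. It may be a single
    domain, as in (FIVAR)5, or a multi-domain module, as in
    (Cond-AMP-PPbind)3; the notation alone does not distinguish the two.
    Adjacent identical bare names (e.g. PDZ-PDZ) are not merged.
    """
    units = []
    for repeated, count, single in _TOKEN.findall(arch):
        if repeated:
            units.append((repeated, int(count)))
        else:
            units.append((single, 1))
    return units


if __name__ == "__main__":
    for example in (
        "Tryp-(FG-GAP)3",
        "Sub_N-Subt-PA-DUF1034-(FIVAR)5-Gram_pos_anchor",
        "Tryp-PDZ-PDZ",
    ):
        print(example, "->", parse_architecture(example))
```

Stray spaces around hyphens, which appear in some rows, are ignored by the tokeniser, so `"DDPept -ABC_tran"` and `"DDPept-ABC_tran"` parse identically.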
Figure 3. Phylogenetic analysis of the DD-peptidase-like proteins, carried out as described in Figure 1. The abbreviations and the colour schemes are the same as in Figure 1.
Figure 4. Phylogenetic analysis of the Clp proteases, carried out as described in Figure 1. The abbreviations and the colour schemes are the same as in Figure 1.
Figure 5. Phylogenetic analysis of the Lon proteases, carried out as described in Figure 1. The abbreviations and the colour schemes are the same as in Figure 1, except for those employed for the phylogenetic clades comprising LonB proteins. Subclusters of bacterial (green) and archaeal (pink) LonB homologues can be distinguished.
Figure 6. A schematic representation of the abundance of single-domain and multi-domain serine protease-like proteins in the five serine protease families under study.