Supajit Sraphet, Bagher Javadi.
Abstract
The wealth of biological databases is a valuable asset for understanding evolution at the molecular level. This research presents a machine learning approach, an unsupervised agglomerative hierarchical clustering analysis of invariant solvent accessible surface areas and conserved structural features of Amycolatopsis eburnea lipases, to explore enzyme stability and evolution. Amycolatopsis eburnea lipase sequences were retrieved from a biological database. Six structurally conserved regions and their residues were identified. Using total solvent accessible surface area (SASA) and structurally conserved SASA, the unsupervised agglomerative hierarchical algorithm clustered the lipases into three distinct groups (99/96%). The minimum SASA of nucleus residues was related to Lipase-4. The overall side chain SASA was higher than the backbone SASA in all enzymes. The SASA pattern of the conserved regions showed the evolutionarily conserved areas that stabilize Amycolatopsis eburnea lipase structures. This research can bring new insight into protein design based on structurally conserved SASA in lipases with the help of a machine learning approach.
Keywords: Amycolatopsis eburnea; conserved accessible solvent area; enzyme; lipase; protein stability; structural biology analysis
Year: 2022 PMID: 35625380 PMCID: PMC9138565 DOI: 10.3390/biology11050652
Source DB: PubMed Journal: Biology (Basel) ISSN: 2079-7737
Physicochemical and SASA features of Amycolatopsis eburnea lipases.
| Lipase | NAA | MW | pI | Asp + Glu | Arg + Lys | AI | GRAVY | TPS | TAS | TSA | SCS | BBS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 388 | 40,097.38 | 5.75 | 32 | 28 | 85.90 | 0.096 | 4790.13 | 9644.99 | 14,435.12 | 1476 | 1187 |
| 2 | 394 | 41,410.78 | 5.29 | 35 | 29 | 80.84 | −0.028 | 5317.97 | 9602.71 | 14,920.67 | 1488 | 1247 |
| 3 | 436 | 44,666.35 | 5.27 | 34 | 27 | 87.82 | 0.120 | 5554.14 | 10,311.73 | 15,865.87 | 1582 | 1279 |
| 4 | 404 | 42,150.78 | 5.73 | 38 | 31 | 90.97 | 0.111 | 5406.49 | 10,157.69 | 15,564.18 | 1526 | 1286 |
| 5 | 288 | 29,214.18 | 6.13 | 19 | 16 | 89.90 | 0.280 | 3386.73 | 7358.97 | 10,745.70 | 1003 | 793 |
| 6 | 252 | 24,997.27 | 4.52 | 24 | 12 | 97.26 | 0.222 | 3612.33 | 6175.37 | 9787.69 | 874 | 666 |
| 7 | 419 | 44,089.24 | 6.23 | 29 | 26 | 73.89 | −0.097 | 4917.24 | 9462.03 | 14,379.28 | 1310 | 867 |
| 8 | 380 | 40,419.88 | 5.96 | 47 | 39 | 95.87 | −0.110 | 7165.27 | 12,160.81 | 19,326.07 | 1719 | 1104 |
Number of amino acids (NAA), molecular weight (MW), isoelectric point (pI), total number of negatively charged residues (Asp + Glu), total number of positively charged residues (Arg + Lys), aliphatic index (AI), grand average of hydropathicity (GRAVY), Total Polar SASA (TPS), Total Apolar SASA (TAS), Total SASA (TSA), Side Chain SASA (SCS), and BackBone SASA (BBS) of lipases (SASA in Å²).
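The SASA columns can be sanity-checked with a few lines of Python. The sketch below assumes, from the column names only, that Total SASA is the sum of the polar and apolar parts, and it compares side chain SASA with backbone SASA as reported. The tolerance value and the names `SASA_TABLE`, `TOLERANCE`, and `check_row` are illustrative choices, not part of the original study.

```python
# (TPS, TAS, TSA, SCS, BBS) per lipase, copied from the table above (Å²).
SASA_TABLE = {
    1: (4790.13, 9644.99, 14435.12, 1476, 1187),
    2: (5317.97, 9602.71, 14920.67, 1488, 1247),
    3: (5554.14, 10311.73, 15865.87, 1582, 1279),
    4: (5406.49, 10157.69, 15564.18, 1526, 1286),
    5: (3386.73, 7358.97, 10745.70, 1003, 793),
    6: (3612.33, 6175.37, 9787.69, 874, 666),
    7: (4917.24, 9462.03, 14379.28, 1310, 867),
    8: (7165.27, 12160.81, 19326.07, 1719, 1104),
}

# Hypothetical rounding allowance in Å²; not a value taken from the source.
TOLERANCE = 0.05


def check_row(tps, tas, tsa, scs, bbs, tol=TOLERANCE):
    """Return (TPS + TAS matches TSA within tol, SCS exceeds BBS)."""
    return abs(tps + tas - tsa) <= tol, scs > bbs


checks = {lipase: check_row(*row) for lipase, row in SASA_TABLE.items()}

for lipase, (sum_ok, side_chain_larger) in checks.items():
    print(f"Lipase {lipase}: TPS+TAS~TSA={sum_ok}, SCS>BBS={side_chain_larger}")
```

Under these assumptions, every row passes both checks, which is consistent with the abstract's statement that side chain SASA exceeds backbone SASA in all eight enzymes.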
Figure 1. (A) Lipase topology diagram with the strands indicated by arrows and helices by cylinders. (B) Surface model representation of Amycolatopsis eburnea lipase 1: (i) alone, (ii) total SASA, and (iii) structurally conserved SASA in red. (C) Amycolatopsis eburnea lipases modelled with deep learning de novo (structurally conserved SASA in red, ribbon representation of lipase structures in blue).
Lipase model properties were predicted for homology modeling and deep learning de novo modeling. The properties for deep learning de novo are shown with *.
| Lipase | Entry | Oligo State | Ligand | GMQE | QMEAN | Cβ | Solvation | Torsion | Seq Identity | Seq Similarity | Coverage | Range | QSQE | Template |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | A0A3R9KNJ9 | Monomer | None | 0.64 | −4.02 | −1.96 | −1.46 | −3.32 | 29.49% | 0.35 | 0.92 | 25–388 | 0.00 | 2veo.1.A |
| 2 | A0A3R9DUJ4 | Monomer | None | 0.63 | −3.77 | −1.98 | −0.85 | −3.27 | 27.22% | 0.34 | 0.91 | 29–394 | 0.16 | 3zpx.1.A |
| 3 | A0A427T6P4 | Monomer | None | 0.56 | −4.02 | −3.87 | −1.01 | −3.14 | 26.60% | 0.33 | 0.86 | 28–422 | 0.12 | 3zpx.1.A |
| 4 | A0A3R9KMI2 | Monomer | None | 0.63 | −3.74 | −3.15 | −1.82 | −2.73 | 30.41% | 0.35 | 0.90 | 22–403 | 0.00 | 3guu.1.A |
| 5 | A0A3R9EQB2 | Monomer | None | 0.66 | −2.24 | −1.75 | −2.49 | −1.12 | 44.80% | 0.40 | 0.87 | 33–282 | 0.00 | 5h6g.1.A |
| 6 | A0A3R9F8T1 | Monomer | None | 0.50 | −2.54 | −2.32 | −1.60 | −1.48 | 26.39% | 0.32 | 0.86 | 34–251 | 0.00 | 5h6b.1.A |
| 7 | A0A3R9DV90 | Monomer | None | 0.31 | −5.78 | −3.28 | −3.35 | −4.24 | 20.95% | 0.31 | 0.60 | 99–390 | 0.00 | 4bvj.1.A |
| 8 | A0A427T2R3 | Monomer | None | 0.59 | −4.36 | −3.28 | −2.69 | −3.03 | 32.33% | 0.35 | 0.87 | 4–378 | 0.00 | 3skv.1.A |
Ramachandran plot information for the 3-D structures of lipases (HM = Homology Modeling, DM = Deep learning de novo Modeling).
| Lipase | Sequences | Residues in Favored, HM (%) | Residues in Favored, DM (%) | Residues in Outlier, HM (%) | Residues in Outlier, DM (%) |
|---|---|---|---|---|---|
| 1 | A0A3R9KNJ9 | 90.61 | 95.85 | 3.31 | 1.30 |
| 2 | A0A3R9DUJ4 | 90.93 | 96.68 | 2.47 | 0.77 |
| 3 | A0A427T6P4 | 89.82 | 96.77 | 2.80 | 0.92 |
| 4 | A0A3R9KMI2 | 90.79 | 96.02 | 2.89 | 0.25 |
| 5 | A0A3R9EQB2 | 96.77 | 96.85 | 0.40 | 0.00 |
| 6 | A0A3R9F8T1 | 93.06 | 98.40 | 2.31 | 0.40 |
| 7 | A0A3R9DV90 | 88.28 | 91.13 | 4.48 | 1.92 |
| 8 | A0A427T2R3 | 87.40 | 96.03 | 4.29 | 0.00 |
Amino acid compositions of the Amycolatopsis eburnea lipases.
| Lipase | Entry | Length | Ala | Arg | Asn | Asp | Cys | Gln | Glu | Gly | His | Ile | Leu | Lys | Met | Phe | Pro | Ser | Thr | Trp | Tyr | Val |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | A0A3R9KNJ9 | 388 | 62 | 21 | 4 | 22 | 4 | 10 | 10 | 48 | 5 | 11 | 37 | 7 | 2 | 12 | 33 | 21 | 26 | 8 | 16 | 29 |
| 2 | A0A3R9DUJ4 | 394 | 58 | 13 | 10 | 24 | 4 | 18 | 11 | 39 | 4 | 9 | 34 | 16 | 4 | 16 | 30 | 20 | 29 | 5 | 18 | 32 |
| 3 | A0A427T6P4 | 436 | 65 | 18 | 13 | 22 | 2 | 9 | 12 | 50 | 5 | 14 | 40 | 9 | 5 | 11 | 33 | 33 | 34 | 5 | 19 | 37 |
| 4 | A0A3R9KMI2 | 404 | 69 | 20 | 8 | 20 | 4 | 11 | 18 | 39 | 9 | 14 | 38 | 11 | 3 | 14 | 28 | 24 | 21 | 3 | 17 | 33 |
| 5 | A0A3R9EQB2 | 288 | 45 | 8 | 7 | 14 | 6 | 7 | 5 | 37 | 7 | 9 | 28 | 8 | 3 | 9 | 17 | 18 | 22 | 3 | 11 | 24 |
| 6 | A0A3R9F8T1 | 252 | 44 | 9 | 3 | 10 | 4 | 13 | 14 | 31 | 3 | 4 | 26 | 3 | 2 | 0 | 20 | 11 | 20 | 3 | 3 | 29 |
| 7 | A0A3R9DV90 | 419 | 51 | 19 | 13 | 20 | 4 | 20 | 9 | 52 | 8 | 13 | 31 | 7 | 7 | 15 | 25 | 39 | 27 | 10 | 19 | 30 |
| 8 | A0A427T2R3 | 380 | 54 | 34 | 8 | 26 | 2 | 6 | 21 | 38 | 14 | 10 | 48 | 5 | 2 | 11 | 30 | 8 | 26 | 3 | 5 | 29 |
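The aliphatic index (AI) reported in the physicochemical table can be recomputed from these residue counts. The sketch below assumes the standard Ikai definition (mole percent of Ala, plus 2.9 times Val, plus 3.9 times the sum of Ile and Leu), as used by tools such as ProtParam. The source does not state which tool produced its AI values, so treat this as a consistency check rather than the authors' procedure. The names `COMPOSITION` and `aliphatic_index` are illustrative.

```python
# (length, Ala, Val, Ile, Leu, reported AI) per lipase, copied from the tables above.
COMPOSITION = {
    1: (388, 62, 29, 11, 37, 85.90),
    2: (394, 58, 32, 9, 34, 80.84),
    3: (436, 65, 37, 14, 40, 87.82),
    4: (404, 69, 33, 14, 38, 90.97),
    5: (288, 45, 24, 9, 28, 89.90),
    6: (252, 44, 29, 4, 26, 97.26),
    7: (419, 51, 30, 13, 31, 73.89),
    8: (380, 54, 29, 10, 48, 95.87),
}


def aliphatic_index(length, ala, val, ile, leu):
    """Aliphatic index from residue counts, assuming the Ikai weighting.

    AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)),
    where X is the mole percentage of each residue.
    """
    return 100.0 * (ala + 2.9 * val + 3.9 * (ile + leu)) / length


computed = {
    lipase: aliphatic_index(*row[:5]) for lipase, row in COMPOSITION.items()
}

for lipase, row in COMPOSITION.items():
    print(f"Lipase {lipase}: computed AI = {computed[lipase]:.2f}, reported AI = {row[5]:.2f}")
```

The same counts also reproduce the charged-residue columns: for lipase 1, Asp + Glu = 22 + 10 = 32 and Arg + Lys = 21 + 7 = 28.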
The secondary structure of Amycolatopsis eburnea lipase sequences.
| Lipase | Entry | Helix (%) | Sheet (%) | Turn (%) |
|---|---|---|---|---|
| 1 | A0A3R9KNJ9 | 60.8 | 33.0 | 13.1 |
| 2 | A0A3R9DUJ4 | 62.7 | 37.8 | 13.7 |
| 3 | A0A427T6P4 | 51.4 | 34.4 | 11.0 |
| 4 | A0A3R9KMI2 | 68.3 | 50.5 | 12.4 |
| 5 | A0A3R9EQB2 | 56.2 | 60.8 | 9.7 |
| 6 | A0A3R9F8T1 | 59.9 | 51.6 | 10.7 |
| 7 | A0A3R9DV90 | 53.7 | 37.7 | 11.5 |
| 8 | A0A427T2R3 | 67.9 | 35.8 | 10.3 |
Figure 2. Dendrogram of lipase enzyme sequences.
Figure 3. Hierarchical clustering of Amycolatopsis eburnea lipases based on solvent accessible surface area.
Nucleus and surface solvent accessible surface area (SASA) and average total SASA over the two environments (nucleus and surface). Data are presented in square angstroms (Å²).
| Lipase | SASA | Total | Apolar | Backbone | Sidechain | Total Ave SASA |
|---|---|---|---|---|---|---|
| 1 | nucleus | 1612.05 | 1105.63 | 592.96 | 1019.10 | 39.65 |
| 1 | surface | 9087.30 | 6239.72 | 1796.53 | 7290.70 | |
| 2 | nucleus | 1535.62 | 975.20 | 514.03 | 1021.70 | 40.76 |
| 2 | surface | 9201.62 | 5956.73 | 1673.27 | 7528.28 | |
| 3 | nucleus | 1513.82 | 889.97 | 570.11 | 943.72 | 40.16 |
| 3 | surface | 10,211.71 | 6650.06 | 2225.90 | 7985.83 | |
| 4 | nucleus | 1897.73 | 1227.46 | 601.65 | 1295.97 | 40.74 |
| 4 | surface | 10,134.94 | 6760.98 | 1846.70 | 8288.26 | |
| 5 | nucleus | 839.39 | 498.14 | 390.68 | 448.79 | 42.98 |
| 5 | surface | 7280.09 | 5114.07 | 1544.76 | 5735.31 | |
| 6 | nucleus | 941.66 | 630.22 | 411.88 | 529.76 | 44.89 |
| 6 | surface | 6561.57 | 4139.78 | 1521.13 | 5040.43 | |
| 7 | nucleus | 1295.47 | 805.00 | 517.81 | 777.68 | 49.24 |
| 7 | surface | 9195.57 | 5912.66 | 2367.46 | 6828.11 | |
| 8 | nucleus | 1686.72 | 1006.49 | 686.57 | 1000.18 | 51.53 |
| 8 | surface | 13,554.21 | 8567.14 | 2587.99 | 10,966.29 | |
Figure 4. Hierarchical clustering of structurally conserved regions-SASA.
Figure 5. The protein sequences were aligned using Chimera with default parameters. Structurally conserved regions are shown in blue.
Correlation matrix of structurally conserved regions-SASA.
| Correlation Matrix | Lipase 1 | Lipase 2 | Lipase 3 | Lipase 4 | Lipase 5 | Lipase 6 | Lipase 7 | Lipase 8 |
|---|---|---|---|---|---|---|---|---|
| Lipase 1 | 1.00 | | | | | | | |
| Lipase 2 | 0.91 | 1.00 | | | | | | |
| Lipase 3 | 0.54 | 0.72 | 1.00 | | | | | |
| Lipase 4 | 0.65 | 0.63 | 0.36 | 1.00 | | | | |
| Lipase 5 | 0.64 | 0.78 | 0.88 | 0.33 | 1.00 | | | |
| Lipase 6 | 0.55 | 0.69 | 0.85 | 0.48 | 0.92 | 1.00 | | |
| Lipase 7 | 0.60 | 0.72 | 0.60 | 0.36 | 0.80 | 0.83 | 1.00 | |
| Lipase 8 | 0.75 | 0.74 | 0.50 | 0.55 | 0.78 | 0.78 | 0.81 | 1.00 |
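To illustrate how an agglomerative hierarchical clustering can be run on a correlation matrix like this one, the following minimal sketch converts each correlation r into a distance 1 - r and merges clusters by average linkage until three groups remain. Both the distance transform and the linkage rule are assumptions made for illustration; the source does not specify the settings used, so the grouping produced here need not match the published dendrogram. The names `LOWER`, `distance`, `average_linkage`, and `agglomerate` are hypothetical helpers.

```python
from itertools import combinations

# Lower-triangular correlations copied from the table above (Lipase 1..8).
LOWER = [
    [1.00],
    [0.91, 1.00],
    [0.54, 0.72, 1.00],
    [0.65, 0.63, 0.36, 1.00],
    [0.64, 0.78, 0.88, 0.33, 1.00],
    [0.55, 0.69, 0.85, 0.48, 0.92, 1.00],
    [0.60, 0.72, 0.60, 0.36, 0.80, 0.83, 1.00],
    [0.75, 0.74, 0.50, 0.55, 0.78, 0.78, 0.81, 1.00],
]


def distance(i, j):
    """Correlation distance 1 - r between lipases i and j (1-based labels)."""
    hi, lo = max(i, j), min(i, j)
    return 1.0 - LOWER[hi - 1][lo - 1]


def average_linkage(a, b):
    """Mean pairwise distance between two clusters (assumed linkage rule)."""
    return sum(distance(i, j) for i in a for j in b) / (len(a) * len(b))


def agglomerate(labels, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain."""
    clusters = [[label] for label in labels]
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


clusters = agglomerate(range(1, 9), n_clusters=3)
print(clusters)
```

A different linkage (single, complete, Ward) or a different distance transform can change the merge order, so the choice should be reported alongside any dendrogram.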
Figure 6. Residue preferences of conserved regions.