Shohreh Ariaeenejad1, Maryam Mousivand2, Parinaz Moradi Dezfouli2, Maryam Hashemi2, Kaveh Kavousi3, Ghasem Hosseini Salekdeh1.
Abstract
Xylanases are hydrolytic enzymes that are classified into various glycoside hydrolase (GH) families based on their physicochemical properties, structure, mode of action, and substrate specificities. The purpose of this study is to show that the activity of members of the xylanase family under specified pH and temperature conditions can be computationally predicted. The proposed computational regression model was trained and tested with Pseudo Amino Acid Composition (PseAAC) features extracted solely from the amino acid sequences of the enzymes. Xylanases with experimentally determined activities were used as the training dataset to adjust the model parameters. To develop the model, 41 strains of Bacillus subtilis isolated from field soil were screened. Of these, the 28 strains with the largest halo diameters were selected for further study. The performance of the model in predicting xylanase activity was evaluated under three different temperature and pH conditions using stratified cross-validation and jackknife methods. The trained model can be used to determine the activity of newly found xylanases under the specified conditions. Such computational models help reduce experimental costs and save time by identifying enzymes with appropriate activity for scientific and industrial use. Our methodology for predicting xylanase activity can potentially be applied to members of other enzyme families. The availability of sufficient experimental data under specified pH and temperature conditions is a prerequisite for training the learning model and achieving high accuracy.
Year: 2018 PMID: 30346964 PMCID: PMC6197662 DOI: 10.1371/journal.pone.0205796
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
41 different xylanase enzymes were selected for experimental and computational studies.
The GenBank accession number and the corresponding strain code are given for each sequence. The diameter of the halos produced on the screening plates is also included for enzymes with high and medium halo surface. The last three columns show the activities measured for the 28 selected xylanase enzymes under three different pH and temperature conditions.
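| GenBank Accession No. | Strain code | Halo diameter | Halo class (H/M/L) | Activity, condition 1 | Activity, condition 2 | Activity, condition 3 | |
|---|---|---|---|---|---|---|---|
| AGO02713 | a14h | 4.6 | M | 400 | 280 | 320 | |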
| AGO02715 | d16d | 5.7 | H | 295 | 80 | 300 | |
| AGO02724 | d19d | 5 | M | 490 | 360 | 370 | |
| AGO63347 | d3d | 5 | M | 60 | 120 | 150 | |
| AGO63342 | h11h | 5 | M | 590 | 380 | 205 | |
| AGO02730 | h13f | 4.2 | M | 170 | 200 | 150 | |
| AGS78259 | h13h | 4.1 | M | 490 | 370 | 320 | |
| AGO63354 | h14d | 6.5 | H | 740 | 120 | 170 | |
| AGO63351 | h14h | 5 | M | 810 | 130 | 330 | |
| AGO63345 | h16h | 4.8 | M | 330 | 205 | 150 | |
| AGO63356 | k2b | 7 | H | 670 | 230 | 250 | |
| AGO02722 | k32l | 5 | M | 280 | 290 | 150 | |
| AGO02727 | k33l | 5 | M | 440 | 660 | 330 | |
| AGO63344 | k36p88 | 5 | M | 400 | 230 | 180 | |
| AGO63350 | k40b | 6 | H | 610 | 320 | 200 | |
| AGO02714 | k43l | 5 | M | 220 | 180 | 50 | |
| AGO02728 | k46b | 4 | M | 510 | 60 | 210 | |
| AGO02725 | s6a | 5.8 | H | 710 | 420 | 40 | |
| AGO02721 | s7e | 6.5 | H | 890 | 420 | 780 | |
| AGO02726 | S7h | 5 | M | 370 | 350 | 280 | |
| AGO97103 | t27b | 4.3 | M | 530 | 280 | 210 | |
| AGO02717 | t28d | 5 | M | 525 | 170 | 150 | |
| AGO63355 | t31d | 4 | M | 5 | 40 | 50 | |
| AGO63353 | t34b | 4.5 | M | 210 | 0 | 120 | |
| AGO02716 | t37a | 8 | H | 670 | 280 | 590 | |
| AGO02718 | t41a | 5 | H | 505 | 310 | 390 | |
| AGO02729 | W | 4.5 | M | 410 | 110 | 125 | |
| AGO63357 | Y | 4.5 | M | 690 | 390 | 260 | |
| AGO02712 | b16b | 3 | L | ||||
| AGO02719 | s2f | 2.9 | L | ||||
| AGO02720 | s2h | 2.5 | L | ||||
| AGO02732 | a10d | 3.5 | M | ||||
| AGO02723 | d3b | 2.5 | L | ||||
| AGO02731 | b5d | 3 | L | ||||
| AGO02733 | s3d | 2 | L | ||||
| AGO63358 | b9h | 2.7 | L | ||||
| AGO63360 | s5d | 3 | L | ||||
| AGO97104 | h13d | 3 | L | ||||
| AGO02734 | S1d | 3.8 | M | ||||
| AGO63359 | k27k88 | 3.5 | M | ||||
| AGO63349 | b11h | 3.5 | M | ||||
All cloned xylanase gene sequences belong to CAZy GH family 11 according to the Expert Protein Analysis System (ExPASy) PROSITE.
The xylanases in rows 29–41 were excluded because they showed very low activities under all three temperature and pH conditions. The experimentally determined activities of the remaining 28 sequences under the three conditions were used to build and validate a regression model that predicts xylanase activity. The model was validated using stratified k-fold cross-validation and jackknife methods.
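The two validation schemes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that stratification of a continuous activity is done by sorting the samples and dealing them round-robin into folds, and it uses a placeholder model (`mean_baseline`, a hypothetical helper) that ignores the features and predicts the training mean. The jackknife is treated as leave-one-out, that is, cross-validation with one sample per fold.

```python
def stratified_folds(y, k):
    """Deal sample indices into k folds so that each fold spans the target range.

    Sorting by activity and assigning round-robin is one simple way to
    stratify a continuous target (an assumption, not the paper's recipe).
    """
    order = sorted(range(len(y)), key=lambda i: y[i])
    folds = [[] for _ in range(k)]
    for rank, idx in enumerate(order):
        folds[rank % k].append(idx)
    return folds


def cross_val_predictions(X, y, folds, fit_predict):
    """Predict every sample from a model trained on all other folds."""
    preds = [None] * len(y)
    for test_idx in folds:
        held_out = set(test_idx)
        X_train = [x for i, x in enumerate(X) if i not in held_out]
        y_train = [t for i, t in enumerate(y) if i not in held_out]
        for i in test_idx:
            preds[i] = fit_predict(X_train, y_train, X[i])
    return preds


def jackknife_predictions(X, y, fit_predict):
    """Leave-one-out: each sample forms its own single-element fold."""
    return cross_val_predictions(X, y, [[i] for i in range(len(y))], fit_predict)


def mean_baseline(X_train, y_train, x_new):
    """Placeholder model: ignores the features and predicts the training mean."""
    return sum(y_train) / len(y_train)


# First-condition activities of the first six enzymes in the table above;
# the one-element feature vectors are placeholders, not PseAAC features.
activities = [400, 295, 490, 60, 590, 170]
features = [[0.0] for _ in activities]

folds = stratified_folds(activities, k=3)
cv_preds = cross_val_predictions(features, activities, folds, mean_baseline)
jk_preds = jackknife_predictions(features, activities, mean_baseline)
```

Any regressor with the same `fit_predict(X_train, y_train, x_new)` shape could be substituted for the baseline; the fold logic does not depend on the model.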
Fig 1. The amphiphilic coupling between all the third most contiguous amino acids.
The values 0 and 1 for m represent the correlations via hydrophobicity and hydrophilicity indices.
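As a rough illustration of the coupling the figure describes, the sketch below follows the general amphiphilic PseAAC definition: for tier k, the product of the standardized property values of residues i and i + k is averaged over the sequence, so the "third most contiguous" coupling corresponds to k = 3. The scale is an assumption: the Kyte-Doolittle hydropathy values are used only as a stand-in, since the indices used in the study are not given here, and a hydrophilicity scale would be passed in the same way. The function names are hypothetical.

```python
from math import sqrt

# Kyte-Doolittle hydropathy values, used here only as a stand-in scale
# (assumption: the study's own hydrophobicity index may differ).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property to zero mean and unit standard deviation over the 20 residues."""
    values = list(scale.values())
    mu = sum(values) / len(values)
    sd = sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return {aa: (v - mu) / sd for aa, v in scale.items()}


def coupling(sequence, scale, tier):
    """Average product of standardized property values for residues `tier` positions apart."""
    h = standardize(scale)
    n = len(sequence)
    if tier < 1 or tier >= n:
        raise ValueError("tier must satisfy 1 <= tier < len(sequence)")
    products = [h[sequence[i]] * h[sequence[i + tier]] for i in range(n - tier)]
    return sum(products) / (n - tier)


# Third-tier coupling for a short, made-up peptide (not a sequence from the study).
tau_tier3 = coupling("MFKFKKNFLV", KYTE_DOOLITTLE, tier=3)
```

Computing this quantity for each property and each tier up to a chosen maximum yields the sequence-order terms that are appended to the 20 amino acid frequencies.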
Results from three different classifiers.
| 10 fold | Jackknife | 10 fold | Jackknife | 10 fold | Jackknife | 10 fold | Jackknife | 10 fold | Jackknife | |
|---|---|---|---|---|---|---|---|---|---|---|
| 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | |
| 0.857 | 0.854 | 0.854 | 0.829 | 0.856 | 0.831 | 0.861 | 0.834 | 0.854 | 0.829 | |
| 0.964 | 0.910 | 0.951 | 0.902 | 0.951 | 0.900 | 0.954 | 0.901 | 0.951 | 0.902 | |
| 0.688 | 0.636 | 0.317 | 0.610 | 0.307 | 0.604 | 0.608 | 0.600 | 0.317 | 0.610 | |
Fig 2. The activity of xylanase enzymes purified from different Bacillus subtilis strains vs. the activities predicted by four computational models.
The activities were determined under three different pH/temperature conditions: (a) pH = 4, T = 26°C; (b) pH = 4, T = 60°C; (c) pH = 6, T = 26°C.
Performance measures obtained from four different regression models.
The models were validated by stratified 10-fold cross-validation and jackknife methods. The results relate to three different pH/temperature conditions: (a) pH = 4, T = 26°C; (b) pH = 4, T = 60°C; (c) pH = 6, T = 26°C.
| Condition | Measure | Regression Models | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| pH = 4 | MSE | 24529.116 | 22879.562 | 25600.679 | 26193.536 | 35004.789 | 39039.153 | 34166.515 | 29285.824 |
| | RMSE | 156.618 | 151.260 | 160.002 | 161.844 | 187.096 | 197.583 | 184.842 | 171.131 |
| | MAE | 126.693 | 122.028 | 135.536 | 135.893 | 147.176 | 162.334 | 139.415 | 134.655 |
| | R² | -0.21 | -0.128 | -0.263 | -0.292 | -0.726 | -0.925 | -0.685 | -0.444 |
| pH = 4 | MSE | 55753.178 | 57096.944 | 72585.964 | 73541.250 | 123725.753 | 114543.750 | 78659.703 | 77522.968 |
| | RMSE | 236.121 | 238.950 | 269.418 | 271.185 | 351.747 | 338.443 | 280.463 | 278.429 |
| | MAE | 197.666 | 199.404 | 227.179 | 231.821 | 284.171 | 254.821 | 222.875 | 226.616 |
| | R² | -0.192 | -0.221 | -0.552 | -0.572 | -1.646 | -1.449 | -0.682 | -0.658 |
| pH = 6 | MSE | 31188.785 | 29521.580 | 32759.964 | 32268.250 | 62414.683 | 59070.759 | 44906.883 | 44090.080 |
| | RMSE | 176.603 | 171.818 | 180.997 | 179.634 | 249.829 | 243.045 | 211.912 | 209.976 |
| | MAE | 130.15 | 126.440 | 134.179 | 132.321 | 182.262 | 160.089 | 149.517 | 159.520 |
| | R² | -0.282 | -0.213 | -0.347 | -0.326 | -1.565 | -1.428 | -0.846 | -0.812 |
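For readers who want to relate the four measures to one another, the sketch below computes them from paired observed and predicted activities using the standard definitions. It is an assumption that the study used exactly these forms, in particular that the coefficient of determination is 1 minus the ratio of residual to total sum of squares. The function name `regression_metrics` is hypothetical.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n          # mean squared error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot                 # 1 - SS_res / SS_tot
    return {"MSE": mse, "RMSE": sqrt(mse), "MAE": mae, "R2": r2}


# Toy values: predicting the mean gives R2 = 0, anything worse goes negative.
at_mean = regression_metrics([1, 2, 3], [2, 2, 2])
worse_than_mean = regression_metrics([1, 2, 3], [3, 2, 1])

# Consistency check on the first MSE entry reported in the table above.
rmse_from_table = round(sqrt(24529.116), 3)
```

Under this definition, an R² below zero means the squared error of the predictions exceeds that of simply predicting the mean observed activity. The square root of the first reported MSE (24529.116) rounds to 156.618, which agrees with the RMSE listed beside it.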