Literature DB >> 23251045

Modeling and structural analysis of cellulases using Clostridium thermocellum as template.

Nathan Vinod Kumar1, Mary Esther Rani, Rathinasamy Gunaseeli, Narayanan Dhiraviam Kannan, Jayavel Sridhar.   

Abstract

Cellulase is one of the most widely distributed enzymes with wide application. They are involved in conversion of biomass into simpler sugars. Cellulase of Trichoderma longibrachiatum, a known cellulolytic fungus was compared with Clostridium thermocellum [AAA23226.1] cellulase. Blastp was performed with AAA23226.1 as query sequence to obtain nine similar sequences from NCBI protein data bank. The physicochemical properties of cellulase were analyzed using ExPASy's ProtParam tool namely ProtParam, SOPMA and GOR IV. Homology modeling was done using SWISS MODEL and checked quality by RMSD values using VMD1.9.1. Active sites of each model were predicted using automated active site prediction server of SCFBio. Study revealed instability of cellulase of two eukaryotic strains namely Trichoderma longibrachiatum [CAA43059.1] and Melanocarpus albomyces [CAD56665.1]. The negative GRAVY score value of cellulases ensured better interaction and activity in aqueous phase. It was found that molecular weight (M. Wt) ranges between 25-127.56 kDa. Iso-electric point (pI) of cellulases was found to be acidic in nature. GOR IV and SOPMA were used to predict secondary structure of cellulase, which showed that random coil, was dominated. Neighbor joining tree with C. thermocellum [AAA23226.1] cellulase as root showed that cellulases of Thermoaerobacter subterraneus [ZP_07835928] and C. thermocellum [CAA4305.1] were more similar to eukaryotic cellulases supported by least boot strap values. Pseudoalteromonas haloplanktis cellulase was found to be the ideal model supported by least RMSD score among the predicted structures. Trichoderma longibrachiatum cellulase was found to be the best compared to other cellulases, which possess high number of active sites with ASN and THR rich active sites. CYS residues were also present ensuring stable interaction and better bonding. Hydrophilic residues were found high in active sites of all analyzed models and template.

Entities:  

Year:  2012        PMID: 23251045      PMCID: PMC3523225          DOI: 10.6026/97320630081105

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Cellulases are important enzymes in many proposed processes for producing fuels and chemicals from plant biomass [1]. They are multienzyme complexes, comprising three major components; endo- β-glucanase (EC 3.2.1.4), exo- β - glucanase (EC 3.2.1.9.1) and β- gucosidase (EC 3.2.1.2.1), which have been shown to act synergistically in the hydrolysis of cellulose [2]. These enzymes are being isolated from many microbial sources and characterized. Cellulase plays a major role in carbon cycling in the biosphere by contributing towards the major carbon source [3]. The role of cellulase in host- pathogen interaction is quite important when considering the cellulase producing ability of pathogenic strains [4]. These enzymes help in hydrolysis of substrates available and for the utilization of microbial growth and for their metabolism. There are no much reports available on computational characterization of the cellulase enzyme. Clostridium thermocellum is a thermophilic bacterial strain with high cellulolytic activity of about 5.32IU/L [5]. However, when eukaryotic cellulases are concern, Trichoderma sp. are widely used as a cellulase source and reported to possess very high activity. It was found to be the best strain for conversion of substrate into glucose of about 0.92mg/0.5ml which shows its higher cellulolytic activity [6]. Cellulase being an enzyme of wide application needs to be characterized in all aspects to understand the structural and functional relations. Present study is to identify more efficient cellulolytic enzyme producing microorganism for bioprospecting using the computational analysis. Protein sequences of cellulase retrieved from NCBI and were subjected to ProtParam to analyze physicochemical parameter, secondary structure prediction using GOR IV and SOPMA, homology modeling (Swiss model), phylogenetic analysis and active site prediction by SCFBio.

Methodology

Sequence retrieval and alignment:

Cellulase protein sequence of Clostridium thermocellum [AAA23226.1] was retrieved from the National Center for Biotechnology (NCBI) and made as the query sequence for the structure, properties prediction and modeling. Blastp was performed and obtained nine similar sequences of different strains. Clustal W multiple sequence alignment was done for those sequences using BioEdit5.0.

Secondary structure and physicochemical characterization cellulose:

The sequences obtained were analyzed using various softwares available in the ExPASy server [7]. The GOR IV analysis was performed to understand the presence of helices, beta turns and coils in the protein structure [8]. Self-optimized prediction method with alignment (SOPMA) analysis was done for analyzing the structural components [9]. Comparison was made between the GOR IV and SOPMA analysis results. ProtParam software analysis was done to understand about the amino acid composition, molecular weight, instability index, aliphatic index and grand average of hydropathicity (GRAVY) [7]. Hydropathy plot analysis for all cellulase sequences was performed and the nature of amino acid residues were studied using ProtScale [7] based on Kyte and Doolittle scale.

Homology modeling of cellulase:

Homology models were predicted using SWISS- MODEL [10-12] and the quality was analysed using VMD 1. 9.1 [13]. RMSD values were calculated using the RMSD calculator and the best homology model was selected. Ramachandran plot for the best predicted model was depicted by RAMPAGE software [14].

Phylogenetic analysis:

Phylogenetic relation among the aligned cellulase sequences obtained from Blastp were analyzed based on neighbor joining method [15] using MEGA 4.0 [16]. The cellulase sequence of C. thermocellum [AAA23226.1] was considered as the root taxon for the analysis. Confidence level was analyzed using bootstrap of 1000 replications.

Activity validation by active site comparison:

Active sites of the predicted models and the template were analyzed using Automated Active Site prediction AADS server of SCFBio [17]. Amino acid compositions of all the cavities were analyzed and the frequency of amino acid occurrence in the cavities of each models were analyzed.

Discussion

Blast analysis and sequence retrieval:

The cellulase protein sequence of Clostridium thermocellum [AAA23226.1] was used as query sequence and nine sequences were obtained by performing Blastp. Multiple sequence alignment was done in BioEdit software and further used for phylogenetic analysis in MEGA.

Secondary structure and physicochemical analysis:

SOPMA and GOR IV were used to predict the secondary structure, percentage of alpha, extended and random coils of cellulase producing microorganism are presented Table 1 (see supplementary material). SOPMA analysis for the structure prediction was also done and obtained the percentage of alpha, extended, beta and random coils (Table 1). The secondary structure indicates whether a given amino acid lies in a helix, strand or coil [18, 19]. SOPMA was used for structure prediction of cellulase protein [20]. Random coil dominates the other forms in the cellulase analyzed by SOPMA and GOR IV. It was identified that random coils of M. abomyces (58.72%) and T. longibrachiatum (57.88%) were dominant compared to other forms. However, followed by random coils, extended forms ranging from (10%-27%) was dominant over α and β helix. All the cellulases analyzed, α-helix was ranging from (13%-37%) dominates β-helix, which had less percentage of conformation (4%-10%). ProtParam analysis was performed and the number of amino acid residues, molecular weight, pI value, aliphatic index and GRAVY index was obtained for each sequence Table 2 (see supplementary material). Comparison of the amino acid residue occurrence in cellulase sequences were done and the most dominant residues were highlighted Table 3 (see supplementary material). It was found that molecular weight ranging from 25-127 kDa and it was higher in C. thermocellum (83 kDa) and lower in M. albomyces (25kDa). Comparing to the eukaryotic cellulase available, the higher aliphatic index of up to 97.51 was noted in T. subterraneus strains which indicate their stability over a wide range of temperatures. GRAVY value was negative in all species studied. It was notable that the bacterial strains had lower GRAVY values indicating the better possibilities of aqueous interaction. pI value showed that cellulase is acidic in nature. T. subterraneus had a slightly neutral pI value and the highest GRAVY value. Generally it was observed that towards acidic pI values the GRAVY tends to be low. In eukaryotic cellulases, the occurrence of α helices was found to be too low. In case of A. bisporus, α helices was similar to that of lower taxonomic groups. Moreover these cellulases possess higher percentages of random coils. A general pattern of inverse relationship between the percentage of occurrence of α helices and random coils were observed in both higher and lower taxonomic levels. Cellulase of M. albomyces, T. longibrachiatum and R. flavefaciens FD-1 was classified as unstable (II > 40) with an instability index (II) of 53.54, 55.23 and 54.34 respectively. It is notable that the M. albomyces and T. longibrachiatum are eukaryotic isolates and possess the least percentage of alpha helices in their structure. P. haloplanktis and R. flavefaciens FD-1 with dominant amino acid residues Asn (10.1%) and Ser (11.6%) respectively which are hydrophilic residues, all the other sequences had ALA and GLY as dominant residues which are hydrophobic in nature. ALA was dominant in cellulases of A. bisporus, C. thermocellum, P. carotovarum, Saccharophagus sp. and T. subterraneus whereas, Gly was dominant for C. thermocellum, M. albomyces and T. longibrachiatum.

Homology model validation:

SWISS MODEL was used to predict the homology model of the cellulase sequences and the protein structure quality was analyzed. RMSD values for the models were calculated and the model with least value i.e. the best predicted model is shown in (Figure 1). Ramachandran plot for the model was constructed using RAMPAGE software. Residue B 169 -LEU belonged to outlier region and the number of residues in the allowed and favoured region was very close to the expected values. It was observed that 94.8% of residues were in favored region and 5.5% in allowed region. It was found that 0.2% was found in outlier region.
Figure 1

Homology Model of Pseudoalteromonas haloplanktis cellulase based on template 1tvn predicted using SWISS MODEL. The model showed least RMSD value compared to other models

Phylogenetic tree was constructed using the ten sequences based on neighbour joining method with reference sequence C. thermocellum [AAA23226.1] as a root (Figure 2). It was observed that the cellulase of T. subterraneus [ZP_07835928.1] was found to be more related to the eukaryotic cellulases. T. longibrachiatum [CAA43059.1], T. subterraneus [ZP_07835928.1], M. albomyces [CAD56665.1], A. bisporus [CAA83971.1], C. thermocellum [CAA43035.1] were belonging to same group. It can be implied that cellulase sequence of T. subterraneus and C. thermocellum were much similar to eukaryotic cellulase and it is not much evolved from the C. thermocellum [AAA23226.1] cellulase sequence. But the higher boot strap values for the other sequences supports its divergence from the root sequence. However, all the taxa of the group belonged to prokaryotic origin. There was no much influence for evolutionary divergence of the sequence with respect to variations in secondary structure.
Figure 2

Neighbor joining tree showing evolutionary relationship among cellulase sequences of different origin were depicted using MEGA 4.0.Boot strap values are depicted at the nodes and branch lengths are also shown. Clostridium thermocellum CAA43035.1 and T. subterraneus ZP07835928.1 was grouped among eukaryotic cellulase sequences.

Compared to bacterial cellulases, fungal cellulases are widely used. Moreover, the cellulolytic activities are high for fungal cellulases. Highest cellulase activity for C. thermocellum was 12.05IU/ml [5]. P. haloplanktis being a psychrophilic bacterium the cellulase obtained is cold adaptable. Cellulase from the former has conserved five amino acid residues in their active sites [21]. C. thermocellum is a thermophilic bacteria and its cellulase has a better heat stability. It is known to be ethanogenic strain and cellulase from this source has high commercial applications [22]. Cysteine residues contribute to protein thermal stability [22]. Amongst fungi, species of Trichoderma and Aspergillus are well known for cellulolytic potential [23]. Apart from the above, other fungi used for cellulase production are Humicola and Aspergillus sp. [24]. Hydropathy plot for the cellulase sequence was constructed using ProtScale based on Kyte and Doolittle and the hydrophilicity and hydrophobicity nature was observed from the plot. It was observed that the majority of the residues were belonging to the hydrophilic regions confirming the interaction of the enzymes in aqueous medium. Aliphatic residues namely ALA, LEU, ILE and VAL were among the hydrophobic residues in the profile. Similarly, Phe which is an aromatic residue and sulfur containing residues MET and CYS were the other residues belonging to hydrophobic regions of ProtScale profile.

Active site prediction based on active site:

Active sites for each model and template were predicted using Active Site prediction server and tabulated Table 4 (see supplementary material). It was found that T. longibrachiatum had most number of cavities (192). C. thermocellum [AAA23226.1] had 84 cavities which were very close to template with 85 cavities. Comparison of amino acid residues present in the cavities of each models were made. It was inferred that THR rich active sites may be favouring the enzyme activity in extreme environments and ASN rich cavities may be contributing towards better enzyme activity. Among the analysed models, 4 models and the template was found to possess ASN as the dominant residue in its active sites. Both C. thermocellum and R. flavefaciens FD-1 cellulases had LYS rich active sites. ARG was dominant in active sites of M. albomyces [CAD56665.1] and T. subterraneus DSM 13965[ZP_07835928.1] cellulases. However P. haloplanktis, an extremophile had THR dominant active sites. In T. longibrachiatum ASN and THR was found to be dominant in active sites with a frequency of 10.58. It is clearly notable that the hydrophilic amino acid residues are high in the active sites of these enzyme structures ensuring their interaction with substrate in aqueous phase. However the least found residue was CYS which assures stable interaction and bonding. Though the frequency of CYS was too low, it was found in both C. thermocellum and 3 eukaryotic cellulases. So this result validates the higher cellulolytic activity and T. longibrachiatum could be the source of most active cellulase from the present study.

Conclusion

These studies provide an insight for better prospecting of cellulolytic isolates from the environment for various industrial applications. Among the microbial cellulase used in the present work, T. longibrachiatum cellulase was found to be best with high number of active sites.
  11 in total

Review 1.  Microbial cellulose utilization: fundamentals and biotechnology.

Authors:  Lee R Lynd; Paul J Weimer; Willem H van Zyl; Isak S Pretorius
Journal:  Microbiol Mol Biol Rev       Date:  2002-09       Impact factor: 11.056

2.  Kinetic and structural optimization to catalysis at low temperatures in a psychrophilic cellulase from the Antarctic bacterium Pseudoalteromonas haloplanktis.

Authors:  Geneviève Garsoux; Josette Lamotte; Charles Gerday; Georges Feller
Journal:  Biochem J       Date:  2004-12-01       Impact factor: 3.857

Review 3.  Processive and nonprocessive cellulases for biofuel production--lessons from bacterial genomes and structural analysis.

Authors:  David B Wilson
Journal:  Appl Microbiol Biotechnol       Date:  2011-11-24       Impact factor: 4.813

4.  The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling.

Authors:  Konstantin Arnold; Lorenza Bordoli; Jürgen Kopp; Torsten Schwede
Journal:  Bioinformatics       Date:  2005-11-13       Impact factor: 6.937

5.  MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0.

Authors:  Koichiro Tamura; Joel Dudley; Masatoshi Nei; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2007-05-07       Impact factor: 16.240

Review 6.  Plant pathogens as a source of diverse enzymes for lignocellulose digestion.

Authors:  Donna M Gibson; Brian C King; Marshall L Hayes; Gary C Bergstrom
Journal:  Curr Opin Microbiol       Date:  2011-04-30       Impact factor: 7.934

7.  AADS--an automated active site identification, docking, and scoring protocol for protein targets based on physicochemical descriptors.

Authors:  Tanya Singh; D Biswas; B Jayaram
Journal:  J Chem Inf Model       Date:  2011-09-15       Impact factor: 4.956

8.  VMD: visual molecular dynamics.

Authors:  W Humphrey; A Dalke; K Schulten
Journal:  J Mol Graph       Date:  1996-02

9.  The neighbor-joining method: a new method for reconstructing phylogenetic trees.

Authors:  N Saitou; M Nei
Journal:  Mol Biol Evol       Date:  1987-07       Impact factor: 16.240

10.  The SWISS-MODEL Repository and associated resources.

Authors:  Florian Kiefer; Konstantin Arnold; Michael Künzli; Lorenza Bordoli; Torsten Schwede
Journal:  Nucleic Acids Res       Date:  2008-10-18       Impact factor: 16.971

View more
  3 in total

Review 1.  Role of IL-38 and its related cytokines in inflammation.

Authors:  Xianli Yuan; Xiao Peng; Yan Li; Mingcai Li
Journal:  Mediators Inflamm       Date:  2015-03-19       Impact factor: 4.711

2.  Process optimization and production kinetics for cellulase production by Trichoderma viride VKF3.

Authors:  Vinod Kumar Nathan; Mary Esther Rani; Gunaseeli Rathinasamy; Kannan Narayanan Dhiraviam; Sridhar Jayavel
Journal:  Springerplus       Date:  2014-02-17

3.  In-Silico Characterization of Glycosyl Hydrolase Family 1 β-Glucosidase from Trichoderma asperellum UPM1.

Authors:  Mohamad Farhan Mohamad Sobri; Suraini Abd-Aziz; Farah Diba Abu Bakar; Norhayati Ramli
Journal:  Int J Mol Sci       Date:  2020-06-04       Impact factor: 5.923

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.