Edina Wang1, Suresh Chinni1, Subhash Janardhan Bhore1. 1. Department of Biotechnology, Faculty of Applied Sciences, AIMST University, Bedong-Semeling Road, Bedong, 08100, Kedah, Malaysia.
Abstract
BACKGROUND: The fatty-acid profile of the vegetable oils determines its properties and nutritional value. Palm-oil obtained from the African oil-palm [Elaeis guineensis Jacq. (Tenera)] contains 44% palmitic acid (C16:0), but, palm-oil obtained from the American oilpalm [Elaeis oleifera] contains only 25% C16:0. In part, the b-ketoacyl-[ACP] synthase II (KASII) [EC: 2.3.1.179] protein is responsible for the high level of C16:0 in palm-oil derived from the African oil-palm. To understand more about E. guineensis KASII (EgKASII) and E. oleifera KASII (EoKASII) proteins, it is essential to know its structures. Hence, this study was undertaken. OBJECTIVE: The objective of this study was to predict three-dimensional (3D) structure of EgKASII and EoKASII proteins using molecular modelling tools. MATERIALS AND METHODS: The amino-acid sequences for KASII proteins were retrieved from the protein database of National Center for Biotechnology Information (NCBI), USA. The 3D structures were predicted for both proteins using homology modelling and ab-initio technique approach of protein structure prediction. The molecular dynamics (MD) simulation was performed to refine the predicted structures. The predicted structure models were evaluated and root mean square deviation (RMSD) and root mean square fluctuation (RMSF) values were calculated. RESULTS: The homology modelling showed that EgKASII and EoKASII proteins are 78% and 74% similar with Streptococcus pneumonia KASII and Brucella melitensis KASII, respectively. The EgKASII and EoKASII structures predicted by using ab-initio technique approach shows 6% and 9% deviation to its structures predicted by homology modelling, respectively. The structure refinement and validation confirmed that the predicted structures are accurate. CONCLUSION: The 3D structures for EgKASII and EoKASII proteins were predicted. However, further research is essential to understand the interaction of EgKASII and EoKASII proteins with its substrates.
BACKGROUND: The fatty-acid profile of the vegetable oils determines its properties and nutritional value. Palm-oil obtained from the African oil-palm [Elaeis guineensis Jacq. (Tenera)] contains 44% palmitic acid (C16:0), but, palm-oil obtained from the American oilpalm [Elaeis oleifera] contains only 25% C16:0. In part, the b-ketoacyl-[ACP] synthase II (KASII) [EC: 2.3.1.179] protein is responsible for the high level of C16:0 in palm-oil derived from the African oil-palm. To understand more about E. guineensis KASII (EgKASII) and E. oleifera KASII (EoKASII) proteins, it is essential to know its structures. Hence, this study was undertaken. OBJECTIVE: The objective of this study was to predict three-dimensional (3D) structure of EgKASII and EoKASII proteins using molecular modelling tools. MATERIALS AND METHODS: The amino-acid sequences for KASII proteins were retrieved from the protein database of National Center for Biotechnology Information (NCBI), USA. The 3D structures were predicted for both proteins using homology modelling and ab-initio technique approach of protein structure prediction. The molecular dynamics (MD) simulation was performed to refine the predicted structures. The predicted structure models were evaluated and root mean square deviation (RMSD) and root mean square fluctuation (RMSF) values were calculated. RESULTS: The homology modelling showed that EgKASII and EoKASII proteins are 78% and 74% similar with Streptococcus pneumonia KASII and Brucella melitensis KASII, respectively. The EgKASII and EoKASII structures predicted by using ab-initio technique approach shows 6% and 9% deviation to its structures predicted by homology modelling, respectively. The structure refinement and validation confirmed that the predicted structures are accurate. CONCLUSION: The 3D structures for EgKASII and EoKASII proteins were predicted. However, further research is essential to understand the interaction of EgKASII and EoKASII proteins with its substrates.
Oil-palm is an important source of oil and fats. There are two
species of oil-palm, namely, the African oil-palm (Elaeis
guineensis Jacq.) and the American oil-palm (Elaeis oleifera). The E.
guineensis (Tenera) gives more oil yield in comparison to E.
oleifera, and hence, the African oil-palm (Tenera) is cultivated on
a commercial basis. The fatty acid profile of palm-oil is very
unique. Palm-oil obtained from the African oil-palm fruitmesocarp-
tissue do contain about 53% saturated fatty acids, and
palmitic acid (C16:0) content in it is 44% [1]. A high level of
saturated fatty acids in our diet is considered as unhealthy
[2].
However, palm-oil obtained from the American oil-palm fruitmesocarp-
tissue contains about 28% of saturated fatty acids, and
only about 25% of C16:0 [3,
4]. Therefore, palm-oil obtained from
E. oleifera is considered healthier in comparison to palm-oil
obtained from E. guineensis (Tenera).The level of C16:0 in palm-oil is determined by the efficiency of
palmitoyl-acyl carrier protein [ACP] thioesterase (PATE) and β-
ketoacyl-[ACP] synthase II (KASII) enzymes [5,
6]. The KASII is
known to catalyze the reaction of palmitoyl-ACP (C16:0-ACP) and
malonyl-ACP (C3:0-ACP) that leads to the production of stearoyl-
ACP (C18:0-ACP) [7]. If the KASII is competent enough then it
should use most of the C16:0-ACP; as a result of it, C16:0 will be
less in the palm-oil. Therefore, the activity of KASII is considered
as a rate-limiting step in the African oil-palmfatty acid
biosynthesis pathway [8]. However, the level of C16:0 in the palmoil
obtained from the American oil-palm fruit-mesocarp-tissue is
only 25%, which is about 43.2% less in comparison to C16:0
content in palm-oil obtained from the African oil-palm [8].The understanding of KASII in the African and American oilpalms
is important in order to understand the deviation in its
activity in two oil palm species. As a part of it, the full length
KASII cDNA clones were isolated from both E. guineensis
(EgKASII) and E. oleifera (EoKASII) previously
[2]. The EgKASII
and EoKASII cDNAs are 2011 and 2138 base pair in length
[2].As of March 1, 2014, nobody has reported the structural features
of oil-palms KASII protein; which is needed to understand the
differences in KASII efficiency in the African and the Americanoil palm species. This warrants the study on oil palms KASII
protein structures. Protein structures can be studied using X-ray
crystallography and or NMR techniques [9,
10]. However,
protein structures can be predicted using computational tools to
quickly understand the protein structure [11]. Therefore, this
study was undertaken to predict three-dimensional (3D)
structures of the African and American oil-palms KASII protein
by comparative modelling and to elucidate and understand their
unique features. The predicted 3D structures of EgKASII and
EoKASII proteins are being reported in this paper.
Methodology
Protein sequence retrieval:
The nucleotide database of NCBI contains full-length KASII
cDNA sequences for both EgKASII and EoKASII [Gene Bank
Accession Numbers: AF220453 (EgKASII) and FJ940767
(EoKASII)] [2]. The deduced amino acid sequence for EgKASII
and EoKASII proteins are available in protein database of NCBI.
The amino acid sequence of EgKASII and EoKASII proteins were
retrieved from the NCBI's protein database. These retrieved
protein sequences were used in the experiments to predict 3D
structures.
Secondary structure prediction:
The secondary structures in EgKASII and EoKASII proteins were
predicted using PHYRE server [12] and visualized using Pymol
[13]. The MEMSAT-SVM server
[14] was used to predict the
protein topology and to identify the presence of signal peptide
and transmembrane helices within the EgKASII and EoKASII
proteins.
Template selection and 3D structure prediction:
Two comparative molecular modelling approaches namely,
homology modelling by MODELLER [15] and ab-initio by I
TASSER (server) were used in this study to predict the 3D
structure of EgKASII and EoKASII proteins [16,
17].In homology modelling, the templates were identified based on
position-specific profile search method which improves the
accuracy of sequence alignments and also extends the
boundaries of detectable sequence similarity. Position-specific
iterative basic local alignment search tool (PSI-BLAST)
[18] was
used to derive a position-specific scoring matrix (PSSM) or
profile from the multiple sequence alignment (MSA) of
sequences using protein–protein BLAST. After templates
identification, global alignment was carried out between the
query sequence and the identified templates. The best template
was selected based on the E-value (lowest), highest score, highest
matching secondary structures and the most aligned region
between the query and the template.In ab-initio (by I-TASSER) approach of 3D structure prediction,
the 3D structure models were built based on multiple-threading
alignments and iterative template fragment assembly
simulations by Local Meta-Threading Server (LOMETS) [19].
Five top decoys were predicted for EgKASII and EoKASII
proteins. The structures with the lowest c-score were selected as
the best model for EgKASII and EoKASII proteins.
Refinement and validation of predicted 3D structures:
The predicted 3D structures of EgKASII and EoKASII were
processed for the refinement and the validation. Using
GROMACS v4.5.4 [20] and GROMOS96 53a6 force field on a
Linux system, energy minimization and molecular dynamics
(MD) simulation was performed for the predicted 3D structures.
Energy minimization was performed using steepest descent
algorithm and was allowed to run until it converged to machine
precision or to a maximum force on each atom less than 100
kJ/mol/nm. The 3D structures were centered in a rhombic
dodecahedral cell filled with simple point charge (SPC) water
with a box edge set at 1.0 nanometer (nm). Sodium or chloride
ions were added accordingly to neutralize the overall charge of
the system. The position restraints were applied to (all) protein
and heavy atoms and simulations were performed for NVT
equilibration ensemble where number of particles, volume of the
system and temperature were kept constant at 300K for 100
picoseconds (ps) using velocity rescaling method [21] followed
by 100 ps of NPT equilibration ensemble where number of
particles, pressure and temperature were kept constant at 1 bar.
The temperature and pressure were controlled by Nose-Hoover
thermostat and Parrinello-Rahman barostat, respectivel [22,
23].
After the system has been well-equilibrated, we run a 30
nanosecond (ns) of MD simulation for our protein structures
predicted by ab-initio approach. A time-step of 2 femtoseconds
(fs) was used where all bonds were constrained using the linear
constraint solver (LINCS) algorithm [24]. Coulombs potentials
were calculated using Particle Mesh Ewald (PME) electrostatics
[25] using a cubic-spline interpolated grid with 0.16 nm grid
spacing. The stereochemical quality of the models was
determined using PROCHECK [26].
Result
The EgKASII and EoKASII protein sequences were retrieved, and
its secondary structures were predicted. The topology of both
EgKASII and EoKASII proteins to show secondary structures is
shown in Figure 1 &
Figure 2, respectively. The predicted 3D
structure produced for EgKASII and EoKASII proteins by
homology modelling using MODELLER software covers their
120 to 567 and 145 to 562 regions (amino acids), respectively.
The predicted 3D structure produced for EgKASII and EoKASII
proteins by homology modelling using MODELLER as well as
by using ab-initio method (using I-TASSER) are shown in
Figure 3.
Figure 1
The predicted secondary structures of the African oilpalm
[Elaeis guineensis Jacq. (Tenera)] β-ketoacyl-[ACP] synthase
II protein.
Figure 2
The predicted secondary structures of the American oilpalm
[Elaeis oleifera] β-ketoacyl-[ACP] synthase II protein.
Figure 3
The predicted three-dimensional (3D) structures of oilpalm
β-ketoacyl-[ACP] synthase II (KASII) protein; A) EgKASII
3D structure predicted by MODELLER; B) EoKASII 3D structure
predicted by MODELLER; C) EgKASII 3D structure predicted by
I-TASSER; D) EoKASII 3D structure predicted by I-TASSER. The
structures are colored to show their secondary structures. Red
color represents helices, yellow color represents strands, and
green color represents loops.
Superimposition showing active sites of EgKASII (Cys316,
His456, His492) and EoKASII (Cys316, His453, His489) with their
respective templates used in homology modelling is depicted in
Figure 4A & B. Superimposition of EgKASII on EoKASII active
sites from the structures generated by MODELLER and ITASSER
is shown in Figure 4C & D, respectively.
Figure 4
Superimposition showing active residues of EgKASII
and EoKASII proteins; A) EgKASII active residues (C316, H456,
H492) (red) with active residues of the template 1OX0_A (C164,
H303, H337) (blue) with RMSD of 0.141A; B) EoKASII active
residues (C316, H453, H489) (red) with active residues of the
template 3KZU_A (C170, H311, H347) (blue) with RMSD of
0.298A; C) Superimposition of EgKASII active residues (red) with
EoKASII active residues (blue) from the structures generated by
MODELLER with RMSD of 0.137A. B) Superimposition of
EgKASII active residues (red) with EoKASII active residues
(blue) from the structures generated by I-TASSER with RMSD of
1.090A.
The total energy for the protein models after energy
minimization and equilibration is shown in supplementary
Figure 1. All the four models built were evaluated with
PROCHECK for stereochemistry quality of protein structures.
The comparative study of these structures by using
Ramchandran plot is shown in supplementary Figure 2. Other
details such as the main chain, side chain, bond length, bond
angle and planar groups within limits obtained are shown in
Table 1 (see supplementary material). The RMSF for ab-initio
generated models of EgKASII and EoKASII is shown in
supplementary Figure 3. Most of the residues and the active
residues fluctuate within the range of 0.1 to 0.2 nm which is in
the acceptable range. The radius of gyration plot (supplementary
Figure 4) shows that EgKASII and EoKASII protein remains
compact and stably folded after 30 ns (or 30,000 ps) of
simulation.
Discussion
The secondary structures of both EgKASII and EoKASII proteins
are made up of mostly alpha helices (45%) and coils (43%) with
only 12% of beta strands. The EgKASII protein (sequence) is
known to have a high (95%) similarity with the EoKASII protein
[2]. However, secondary structure analyses of both proteins
suggest that both proteins have the same number of alpha
helices, coils and beta strands. The analysis of the secondary
structures also suggests that most of the differences between
EgKASII and EoKASII protein sequence reported previously [2]
are located in the loop regions. It is in line with the commonly
observed evolutionary patterns in the proteins [27].The template, 1OX0_A (Streptococcus pneumonia KASII) was used
for the EgKASII protein structure prediction by considering
resolution (1.3 Angstrom) and the highest score calculated by
MODELLER [15]. However, the best suitable template, 3KZU_A
(Brucella melitensis KASII) was used for EoKASII protein structure
prediction. The 3D model structures produced by homology
modelling using MODELLER software covers 120 to 567 amino
acids of EgKASII and 145 to 562 amino acids of EoKASII. The
alpha carbons superimposition of EgKASII and EoKASII with
their respective templates showed a RMSD of 1.54A and 1.92A,
respectively. This indicates that RMSD value is in the range of
attainable accuracy for a model [28].We successfully predicted the three-dimensional (3D) structures
for KASII proteins of both Elaeis species, and made comparison
at sequence and structural level. We strongly believe that the 3D
structures predicted for KASII proteins should be closer to real
structures of these respective proteins. However, we suggest the
further wet lab experimental work to validate these predicted
structures using X-ray crystallography or NMR technique [9,
10].
Similarly, the active-site residues of KASII proteins has been
determined successfully but have not been tested
experimentally. In order to confirm the predicted structures and
their active sites, molecular docking and simulations for the
formation of complex between the predicted protein structure
and their respective substrates needs to be carried out for the
structures predicted by MODELLER; so that comparative
analysis can be done for the 3D structures predicted by using
MODELLER and I-TASSER.Oil palm derived from the African oil palm contains high
amount (~54%) of saturated fatty acids [3] in comparison to palm
oil obtained from the American oil palm [4]. A systematic study
of oil palms key genes involved in fatty acid biosynthesis
pathway and a comparative modelling of key proteins important
in fatty acid biosynthesis will help to elucidate and understand
their unique features. Molecular modelling can be used for the
understanding and prediction of the microscopic and
macroscopic properties of the proteins [29]. It is also useful in the
study of enzyme's binding affinity [30], in virtual screening of
natural products [31], in predicting molecular interaction
[32],
and saves the time and money. Our aim was to utilize molecular
modelling tools to predict the 3D structures of the African and
American oil palm KASII protein.In the recent past, Malaysian Palm Oil Board (MPOB) and
collaborators published oil palms (Elaeis guineensis Jacq. and E.
oleifera) genome [33]. They have also reported a unique gene that
controls the oil yield in oil palm fruits [34]. These advances in oil
palm research will have a significant impact in oil palm industry.
It is estimated that there are at least 34,802 genes in oil palm
genome. If we want to understand structures of all proteins in oil
palm by doing wet lab work then this is too much experimental
work. However, to understand the structural features of
important proteins in short time molecular modelling will be
useful. For the real understanding of the protein structures,
structures should be determined by using NMR or X-ray
crystallography. However, the 3D structures predicted in this
study could serve as foundation for the further research on oil
palm KASII protein and could be useful in clear understanding
of the fatty acid biosynthesis pathway in oil palm.
Conclusion
We determined the three-dimensional structure for E. guineensis
and E. oleifera KASII protein. The RMSF value for the three active
residues of EgKASII and EoKASII were around 0.1nm. Both the
structures remain compact and stably folded after 30 ns of
simulation at an average of 2.41 nm and 2.38 nm for EgKASII and
EoKASII, respectively. Molecular docking and simulation study
is required to understand the interactions between the predicted
KASII proteins and their substrates. In addition, further research
is required in wet lab to validate the predicted structures.
Conflict of interests
Authors attest that there are no conflicts of interest to declare.
Authors: Anna K Lytle; Sofia S Origanti; Yupeng Qiu; Jeffrey VonGermeten; Sua Myong; Edwin Antony Journal: J Mol Biol Date: 2014-02-24 Impact factor: 5.469