Stéphanie Valleau1, Romain A Studer2, Florian Häse1, Christoph Kreisbeck1, Rafael G Saer3, Robert E Blankenship3, Eugene I Shakhnovich1, Alán Aspuru-Guzik1,4. 1. Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States. 2. European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, U.K. 3. Departments of Biology and Chemistry, Washington University in Saint Louis, One Brookings Drive, St. Louis, Missouri 63130, United States. 4. Bio-inspired Solar Energy Program, Canadian Institute for Advanced Research, Toronto, Ontario M5G 1Z8, Canada.
Abstract
We present a study on the evolution of the Fenna-Matthews-Olson bacterial photosynthetic pigment-protein complex. This protein complex functions as an antenna. It transports absorbed photons-excitons-to a reaction center where photosynthetic reactions initiate. The efficiency of exciton transport is therefore fundamental for the photosynthetic bacterium's survival. We have reconstructed an ancestor of the complex to establish whether coherence in the exciton transport was selected for or optimized over time. We have also investigated the role of optimizing free energy variation upon folding in evolution. We studied whether mutations which connect the ancestor to current day species were stabilizing or destabilizing from a thermodynamic viewpoint. From this study, we established that most of these mutations were thermodynamically neutral. Furthermore, we did not see a large change in exciton transport efficiency or coherence, and thus our results predict that exciton coherence was not specifically selected for.
We present a study on the evolution of the Fenna-Matthews-Olson bacterial photosynthetic pigment-protein complex. This protein complex functions as an antenna. It transports absorbed photons-excitons-to a reaction center where photosynthetic reactions initiate. The efficiency of exciton transport is therefore fundamental for the photosynthetic bacterium's survival. We have reconstructed an ancestor of the complex to establish whether coherence in the exciton transport was selected for or optimized over time. We have also investigated the role of optimizing free energy variation upon folding in evolution. We studied whether mutations which connect the ancestor to current day species were stabilizing or destabilizing from a thermodynamic viewpoint. From this study, we established that most of these mutations were thermodynamically neutral. Furthermore, we did not see a large change in exciton transport efficiency or coherence, and thus our results predict that exciton coherence was not specifically selected for.
The Fenna–Matthews–Olson
complex is a light-harvesting
protein complex found in green sulfur bacteria. These bacteria first
appeared about 1.6 billion years ago, in the Proterozoic era.[1] Green sulfur bacteria survive through anoxygenic
photosynthesis; they use sulfide and other reduced sulfur compounds
or hydrogen as photosynthetic electron donors.[2]In the past decade, much research has been dedicated to understanding
the excitation energy transfer in the Fenna–Matthews–Olson
(FMO) protein complex.[3−13] Once absorbed, photons become molecular excitations, or excitons.
These excitons are then transported through the complex due to their
interactions with neighboring excitations, the protein scaffold, and
the environment. A renewed interest in this complex arose when low
temperature 2D-spectroscopy experiments suggested the presence of
quantum coherence effects[14,15] in the exciton dynamics.
Quantum coherent effects can be thought of as concurrent beats between
electronic excitations which do not occur classically. Following these
experiments, much theoretical work was carried out[6,16−19] with the goal of understanding whether quantum effects were present
and, if so, how they contributed to the exciton transport.Most
of the studies have relied on the existence of an X-ray crystal
structure of the FMO complex of either Chlorobaculum tepidum or Prosthecochloris aestuarii.FMO is found
in all of the anaerobic Chlorobi phyla, and recently
it has also been found in aerobic Chloracidobacterium thermophilum of the Acidobacteria phyla.[20] In these
organisms, the FMO complex forms a homo trimer (see Figure ). Each monomer consists of
7/8 bacteriochlorophyll a (BChl-a) pigment molecules
enclosed in a protein scaffold. The BChl pigment molecules interact
with the protein scaffold through coordination and hydrogen bonds.
The complex is thought to act as an excitonic wire, funneling the
excitation from the baseplate to the reaction center, where charge
separation occurs. This charge separation enables reactions which
are fundamental to the organisms’ survival.
Figure 1
In panel A, an image
describes the location of the FMO complex
in the light-harvesting complex of green sulfur bacteria. The photons
are absorbed by the chlorosome and transported as excitations to the
FMO complex and ultimately to the reaction center. In panel B one
can see the crystal structure of the FMO complex monomer for Chlorobaculum tepidum (PDB: 3ENI). The protein scaffold is colored in
pink for alpha helices, in yellow for beta sheets, and in blue for
loops. The 8 bacteriochlorophyll a (BChl-a) pigments
are colored in cyan (their side chains are not shown for simplicity).
In panel A, an image
describes the location of the FMO complex
in the light-harvesting complex of green sulfur bacteria. The photons
are absorbed by the chlorosome and transported as excitations to the
FMO complex and ultimately to the reaction center. In panel B one
can see the crystal structure of the FMO complex monomer for Chlorobaculum tepidum (PDB: 3ENI). The protein scaffold is colored in
pink for alpha helices, in yellow for beta sheets, and in blue for
loops. The 8 bacteriochlorophyll a (BChl-a) pigments
are colored in cyan (their side chains are not shown for simplicity).The biological origin of the complex
remains a mystery. Olson et
al.[21] speculated that the FMO complex might
have come from an ancient reaction center. They looked for homology
between the FMO protein and PscA, the reaction center protein of green
sulfur bacteria. They found a signature sequence common to both; however,
the sequences only had a 13% identity score. More recently, the role
of some specific site mutations[22] and of
cysteines in the FMO protein has been investigated as well.[23] It was found that cysteines appear to be fundamental
for the photoprotection mechanism of the protein complex.Regarding
exciton transport in this system, there have only been
few efforts to understand whether and how protein evolution might
have influenced it. For instance, Müh et al.[24] computed the BChl transition
energies and looked at the effect of including the point charges coming
from the protein. They also looked at how changes in the polar groups
of amino acids, such as the amino or hydroxyl groups, influence BChl
transition energies. They found that the charges coming from the alpha
helices seem to influence the energies strongly. On the other hand,
no large change in the energies was observed when they modified the
charge distribution of single amino acids as a means to emulate single
point mutations. Experimentally, the Blankenship group[25] looked at comparing the optical properties of
FMO coming from three different species to understand the effect of
the protein scaffold. They observed spectral differences, and these
were assigned to the way the BChl molecules bind to the protein scaffold
in each species. This suggests that one may probe exciton dynamics
and the biological role of coherence through point mutations. In this
sense, we are most interested in mutations occurring in the vicinity
of BChl binding sites as the effect of their charge is expected to
have a stronger influence on the BChl energies.For the above
reasons, we have carried out a theoretical study
of the protein complex from an evolutionary perspective. We aimed
to identify whether quantum coherence in exciton transport and thermodynamic
stability (free energy change upon folding) changed during the evolution
of the protein complex. Hence, we carried out calculations which trace
evolution of exciton transport as well as evolution of thermodynamic
stability. For the thermodynamics, we studied the stability of the
current day protein complex to mutations by computing and comparing
the folding free energy change of the mutant to that of the nonmutated
protein complex. Subsequently, we constructed an ancestor for the
FMO complex and computed the folding free energies due to point mutations
along the phylogenetic tree. Thus, with these two results, we were
able to compare the stability of C. tepidum to the
stability of its ancestors and see how it changed along the tree.
To investigate the role of exciton transport, we computed and compared
exciton transport in the ancestor with that of C. tepidum. Because two-dimensional photon echo (2D-echo) spectroscopy is the
experimental tool employed to reveal coherent phenomena, we computed
this spectrum to determine the role of coherence in the evolution
from ancestor to C. tepidum. Finally, from the reconstructed
phylogenetic tree, we identified sites under positive selection and
determined whether positively selected amino acids were close to BChl
molecules and could influence exciton transport.
Results
Robustness
of Current Day FMO Proteins to Mutations
We computed the
free energy change upon folding, ΔΔG =
ΔGmutantfold – ΔGwildfold, as
a measure of robustness of the FMO complex to mutations. Calculations
were carried out using FoldX[26] for the
trimers of C. tepidum and P. aestuarii for all possible single point mutations (see Methods
and Computational Details for details). In our model, each
single point mutation occurs at the same time in all monomers of the
trimer. We found that most single point mutations would be destabilizing
∼60% (see Figure ). The overall landscape of mutations is the same for C.
tepidum and P. aestuarii (see Figure S4). This is qualitatively similar to
the general statistics of the effects of mutations on stability.[27,28]
Figure 2
Histogram
of free energy variation upon folding, ΔΔG = ΔGmutantfold – ΔGwildfold, for
all possible single point mutations in the FMO complex trimer of P. aestuarii (PDB: 3EOJ). The coloring is shaded to be more red for destabilizing
mutations (positive values) and blue for stabilizing mutations (negative
values). The bins in the histogram are normalized by the area under
the histogram.
Histogram
of free energy variation upon folding, ΔΔG = ΔGmutantfold – ΔGwildfold, for
all possible single point mutations in the FMO complex trimer of P. aestuarii (PDB: 3EOJ). The coloring is shaded to be more red for destabilizing
mutations (positive values) and blue for stabilizing mutations (negative
values). The bins in the histogram are normalized by the area under
the histogram.We also looked at how
the free energy change varies depending on
the location of the mutation in the protein complex monomers of P. aestuarii and C. tepidum and found that
mutations at the interface of monomers are among the most destabilizing
(see Figure S5). These observations rely
on the assumption that each single point mutation (in each monomer)
does not impede the assembling of a trimeric structure.
Ancestor Reconstruction,
Selection of Mutations, and Variation
of the Free Energy Landscape
A phylogenetic tree was generated
using the multiple sequence alignment built from the fmoA amino acid sequences (see details in Methods and
Computational Details). The amino acid sequences were obtained
by translating the fmoA monomer gene sequences. In Figure we show the cladogram
tree (phylogenetic tree is also shown in Figure S2).
Figure 3
Rooted cladogram for fmoA computed using MAFFT
and TranslatorX[29] for the alignment and
PHYML.[30] Above each branch we show a pie-chart
label with a diameter proportional to the number of single point mutations
and color based on the associated free energy change ΔΔG. Red represents destabilizing mutations, blue represents
stabilizing mutations, and gray represents neutral mutations.
Rooted cladogram for fmoA computed using MAFFT
and TranslatorX[29] for the alignment and
PHYML.[30] Above each branch we show a pie-chart
label with a diameter proportional to the number of single point mutations
and color based on the associated free energy change ΔΔG. Red represents destabilizing mutations, blue represents
stabilizing mutations, and gray represents neutral mutations.The tree divides in two main clades:
saltwater (e.g., P.
aestuarii) and freshwater (e.g., C. tepidum) bacteria.At each node of the tree, the sequence and structure
of the protein
scaffold were reconstructed. The ancestral amino acid sequences were
obtained using Bayesian inference, and the scaffold structures were
computed using homology modeling (see Methods and
Computational Details). The pie charts above each branch indicate,
based on the diameter, how many mutations occurred between the two
branch nodes, and based on the color, whether they were stabilizing
(blue), neutral (gray), or destabilizing (red). Most mutations were
neutral in terms of modifying the free energy change upon folding.Given the inferred ancestor amino acid sequence at the root, we
constructed a guess for the full ancestral structure (protein scaffold
and BChl molecules) through a combination of homology modeling and
molecular dynamics simulations (see Methods and Computational
Details). An image of the structure of the putative ancestor
compared to that of C. tepidum for one monomer is
shown in Figure .
The images in each panel are constructed from a snapshot of the molecular
dynamics simulations at 300 K after equilibration. In panel B, we
show two orientations of one monomer of the protein complex. No large
difference in the backbone of the protein is observed. This is due
to the fact that the protein scaffold was constructed with homology
modeling. The differences come from the location of side chains and
the change in charge of the mutated amino acids. In panel B, mutations
are highlighted using a color scheme based on the BLOSUM90 distance.
We see a few red mutations (large negative BLOSUM90 distance) on the
β sheet which could be a problem for the formation of the trimer.
Another effect of the mutations is the change in orientation of BChl
molecules (see panel A). This change in orientation of the BChls influences
the coupling among their first excited states (see Table S1). In particular, the coupling between the first excited
states of BChl 3 and 4 is weaker in the ancestor than in C.
tepidum and, on the other hand, the coupling between the
first excited states of BChl 4 and 5 is stronger in the ancestor than
in C. tepidum. Finally, as can be seen in Table S2, the average distance of the amino acids
from the BChl molecules changes between the ancestor and C.
tepidum, and the differences are on the order of 0.1–0.4
Å.
Figure 4
In panel A, we show the structure of the BChl molecules in a monomer
of the ancestral structure (yellow) overlaid with the BChl molecules
in the current day structure of C. tepidum (light
gray). The images were generated from snapshots of the molecular dynamics
simulations at 300 K. In panel B, for the same snapshots we compare
the backbone of the ancestral FMO structure (yellow) to that of the
current day C. tepidum structure (light gray backbone).
Some amino acids are colored based on the BLOSUM90 matrix, which quantifies
how different they are (positive numbers indicate easy substitution
while negative numbers indicate difficult substitution). These amino
acids are also highlighted in the legend in panel C. The structures
are snapshots from the molecular dynamics simulations at 300 K. The
amino acids which differ in the two structures are colored based on
the BLOSUM90 matrix which quantifies how likely it is to substitute
one amino acid with the other. The amino acid mutants in the ancestral
structure are also shown using the bond and atom representation.
In panel A, we show the structure of the BChl molecules in a monomer
of the ancestral structure (yellow) overlaid with the BChl molecules
in the current day structure of C. tepidum (light
gray). The images were generated from snapshots of the molecular dynamics
simulations at 300 K. In panel B, for the same snapshots we compare
the backbone of the ancestral FMO structure (yellow) to that of the
current day C. tepidum structure (light gray backbone).
Some amino acids are colored based on the BLOSUM90 matrix, which quantifies
how different they are (positive numbers indicate easy substitution
while negative numbers indicate difficult substitution). These amino
acids are also highlighted in the legend in panel C. The structures
are snapshots from the molecular dynamics simulations at 300 K. The
amino acids which differ in the two structures are colored based on
the BLOSUM90 matrix which quantifies how likely it is to substitute
one amino acid with the other. The amino acid mutants in the ancestral
structure are also shown using the bond and atom representation.Given the phylogenetic tree, we
also identified branches under
positive selection (bright green) by using the branch–site
model (details in Methods and Computational Details). In Figure , we
show mutations which were positively selected for along these branches
in the structure of C. tepidum. Most positively selected
mutations are distant from the BChl residues, and therefore we currently
do not believe that they had a strong role in modifying exciton transport.
Given the high computational cost associated with constructing the
full structure (with BChl molecules) at each node, we did not investigate
the role of each positively selected mutation using atomistic ab initio methods. However, we computed the free-energy
changes associated with the mutations.
Figure 5
Positively selected sites
shown in the structure of C.
tepidum (PDB: 3ENI). The color gradient corresponds to the strength of
positive selection (darkest red, strongest selection; blue, no positive
selection). We only plot the side chains of positively selected residues
which are within 6 Å of the magnesium in the BChl molecules.
Positively selected sites
shown in the structure of C.
tepidum (PDB: 3ENI). The color gradient corresponds to the strength of
positive selection (darkest red, strongest selection; blue, no positive
selection). We only plot the side chains of positively selected residues
which are within 6 Å of the magnesium in the BChl molecules.In Figure , panel
A, we show the histogram of free energy changes which connects the
ancestor to current day species in red and, in cyan, the histogram
of folding free energy variation for all possible single point mutations
in the FMO complex trimer of C. tepidum. We found
that 65% of mutations are destabilizing for C. tepidum. On the other hand only 32% of mutations of the ancestral FMO complex
along the phylogenetic tree were destabilizing while 21% were stabilizing
and 47% were neutral. This could indicate that mutations which favored
the folding free energy change variation during evolution led to more
stable current day protein structures. However, we cannot claim this
in general as the trend changes for different current day species.
For instance, as can be seen by comparing the two figures in panel
B of Figure , the
distribution of folding free energy changes for mutations which connect
the ancestor to P. aestuarii is significantly different
from that for mutations which connect the ancestor to C. tepidum. This difference was confirmed by the Kolmogorov–Smirnov
test. Furthermore, we computed that, for the mutations which connect
the ancestor to P. aestuarii, 25% are stabilizing
and 19% are destabilizing. However, in the case of mutations which
lead to C. tepidum, 47% are destabilizing while 22%
are stabilizing. This could indicate that C. tepidum is less stable than the ancestor while P. aestuarii is more stable than the ancestor. For a more accurate comparison,
however, one would need to take into account two environments for
the ancestor (in salt water versus fresh water) as well as account
for the coevolution of the baseplate, chlorosome, and reaction center
which are located above and below the FMO complexes in the light-harvesting
complex.
Figure 6
Panel A: In blue we show a histogram of the folding free energy
changes for all possible single point mutations of the current day C. tepidum FMO complex structure. In red, we show a histogram
of the folding free energy changes for all the single point mutations
which are necessary to connect the ancestral FMO structure to the
current day structures (i.e., all single point mutations along all
branches). Panel B, right-hand side: histogram of the folding free
energy changes for the mutations which connect P. aestuarii to the ancestor. Left-hand side: histogram of the folding free energy
changes for the mutations which connect C. tepidum to the ancestor. Note that, for all histograms, each bin contains
the number of points with energy in that interval normalized by the
area under the entire histogram.
Panel A: In blue we show a histogram of the folding free energy
changes for all possible single point mutations of the current day C. tepidumFMO complex structure. In red, we show a histogram
of the folding free energy changes for all the single point mutations
which are necessary to connect the ancestral FMO structure to the
current day structures (i.e., all single point mutations along all
branches). Panel B, right-hand side: histogram of the folding free
energy changes for the mutations which connect P. aestuarii to the ancestor. Left-hand side: histogram of the folding free energy
changes for the mutations which connect C. tepidum to the ancestor. Note that, for all histograms, each bin contains
the number of points with energy in that interval normalized by the
area under the entire histogram.
Exciton Transport and Coherence
Absorption, Circular Dichroism,
and Linear Dichroism
In Figure we compare
the absorption spectrum of the ancestor to that of C. tepidum. The ancestor spectrum shows three main absorption peaks at 807,
819, and 830 nm while C. tepidum shows strong absorption
at 804, 814, and 826 nm. These trends and the corresponding circular
dichroism spectra (see Supporting Information) of the ancestor and of C. tepidum are very similar,
respectively, to the spectra of FMO in Chlorobiumthiosulfatophilum and Chlorobium
limicola.[31] However, we do not
see strong evidence of a similarity between the ancestor spectrum
and the reaction center absorption spectra reported in that work.
The small bump at about 845 nm (see Figure ) is most likely due to noise, as it does
not correspond to any of the frequencies of the equivalent Fermi golden
rule absorption spectrum. Therefore, our current results do not confirm
that the ancestor is related to current day PscA, the reaction center
protein of green sulfur bacteria; however, we cannot exclude that
it may have been related to ancestral reaction centers.
Figure 7
Comparison
of simulated absorption spectra of the ancestral FMO
complex (blue line) to that of current day C. tepidum at 300 K (gray dashed line) in arbitrary units. Spectra were computed
using the Qy transitions obtained by using QM/MM. See Methods and Computational Details for more details.
Comparison
of simulated absorption spectra of the ancestral FMO
complex (blue line) to that of current day C. tepidum at 300 K (gray dashed line) in arbitrary units. Spectra were computed
using the Qy transitions obtained by using QM/MM. See Methods and Computational Details for more details.In the Supporting Information we also
show the comparison of our C. tepidum spectra to
the experimental absorption, linear dichroism, and circular dichroism
spectra.
Coherence and 2D Spectra
Using the
hierarchy equation
of motion (HEOM) non-Markovian master equation method,[32−34] we computed the 2D-echo spectra of the reconstructed ancestral FMO
complex and of C. tepidum at 150 K. We computed the
spectra at this temperature given that coherent effects are usually
larger at lower temperatures. Therefore, this will allow us to make
a clearer comparison between the ancestral FMO complex and that of C. tepidum. We also computed the population dynamics and
coherences at 300 K, and these are shown in the Supporting Information. To this end, we employed the QMaster software package[35] (for
more details see Methods and Computational Details). The spectral densities which were employed are described in Methods and Computational Details. These were obtained
by fitting the original QM/MM spectral densities to Drude–Lorentz
spectral densities with three peaks. The spectral densities indicated
a similar coupling strength of the system to the bath for the ancestral
and C. tepidumFMO complexes for almost all modes.
The total signal of the 2D-echo spectra, i.e., sum of the stimulated
emission (SE), ground state bleaching (GB), and excited state absorption
(ESA) pathways, is shown in Figure . From the oscillations of the cross peaks in panel
B we cannot deduce an improvement in quantum coherence or a significant
difference between the ancestor and C. tepidum. For
all species, cross-peak beatings in the 2D-echo spectra are dominated
by ground state vibrations[36] superposed
by minor vibronic contributions. This suggests that optimizing exciton
transport through quantum coherence was not part of the evolution
of the FMO complex.
Figure 8
Panel A: Contour plots of the total computed signal for
the 2D-echo
spectra at different delay times at 150 K. The figures in the top
row show results for the ancestor at a delay time of 90 and 400 fs,
and the bottom figures show the analogous plots for C. tepidum. Panel B: The amplitude of oscillation of two specific cross peaks.
The top image is for the ancestor, and the bottom image is for C. tepidum.
Panel A: Contour plots of the total computed signal for
the 2D-echo
spectra at different delay times at 150 K. The figures in the top
row show results for the ancestor at a delay time of 90 and 400 fs,
and the bottom figures show the analogous plots for C. tepidum. Panel B: The amplitude of oscillation of two specific cross peaks.
The top image is for the ancestor, and the bottom image is for C. tepidum.
Conclusion
We have reconstructed
a structure for the ancestral Fenna–Matthews–Olson
complex and found that mutations of its amino acids over time have
influenced the free energy upon folding of the protein complex. In
particular, the current day species
of P. aestuarii was obtained through stabilizing
mutations while mutations which lead to the current FMO complex in C. tepidum were less stabilizing. Regarding exciton transport,
we do not observe a significant change or improvement in the efficiency
of exciton transport or quantum coherent transport. These results
suggest that the complex did not evolve to optimize for quantum coherent
transport. It is also possible that the ancestor was already sufficiently
optimized for exciton transport and therefore further evolution or
optimization was not necessary. One of the results we find is that
mutations most likely led to better binding of the FMO complex to
the baseplate and reaction center. This is confirmed by the larger
instability associated with mutating residues in those locations.
This leaves an open question: perhaps the overall exciton transport,
from the chlorosome to the reaction center, was optimized by favoring
mutations in FMO which led to a stronger binding of FMO to its neighboring
protein systems.For future work, we are interested in looking
at the exciton transport
at each node of the phylogenetic tree and further in expressing the
ancestral sequence in current day C. tepidum to obtain
an experimental structure and 2D-echo spectra for the ancestor. Finally
it would be interesting to look at the role of photoprotection to
see whether the mutations could have influenced it.
Methods and Computational
Details
Stability of the C. tepidum and P.
aestuarii Fenna–Matthews–Olson Complex to Mutations
Each amino acid of the trimer and monomer of the FMO complex of P. aestuarii and C. tepidum was mutated
to each of the other possible 19 amino acids using FoldX.[26] The corresponding free energy variation upon
folding ΔΔG = ΔGmut – ΔGwt was
obtained for each mutation. In order to account for the effect of
mutations at the interface between monomers within the trimer, the
mutations were carried out simultaneously in each monomer of the trimer,
and the resulting free energy variation was normalized by dividing
by three.In FoldX, mutations are modeled as follows. The initial
crystal structure is optimized to remove any eventual steric clashes.
Then, the residue of interest is mutated and its nearest neighbors
are mutated to themselves and conformationally relaxed to remove any
local clash. The nearest neighbors are mutated to themselves so that
their geometry may be optimized together with that of the central
residue. During this procedure the backbone of the protein is kept
fixed, and all other residues that are far from the one of interest
are also kept fixed. The stability, ΔGwt, of this relaxed structure is obtained by using an effective
energy function. Subsequently, the residue of interest is mutated
to each of the other 19 amino acids and its neighboring
amino acids to themselves, and the various ΔGmut are computed.The procedure described above
was repeated 5 times for each mutation
to ensure that the minimum energy conformations of large residues
that have many rotamers were identified. The effective energy function
(“effective energy” here refers to the Helmoltz free
energy of a system (protein + solvent) for a fixed protein conformation)
in FoldX has been optimized for amino acid sequences; thus all BChl-a molecules could not be included directly in this calculation.
To account for their presence, the structures given in input to FoldX
were obtained by homology modeling with Modeller.[37] In the Modeller simulations, the BChl molecules were inserted
and kept fixed as hard spheres. In the FoldX optimization the residues
that are known to bind to BChl-a molecules from the
crystal structure analysis of FMO complexes from current day species
were also kept fixed. A similar procedure was recently employed successfully
for the case of RubisCO where Mg atoms are present.[38]
Determination of Positively Selected Sites:
Site and Branch–Site
Models
Phylogenetic Tree
All fmoA gene sequences
from the EMBL database were gathered and employed to generate an alignment
using translatorX with MAFFT.[29] Phylogenetic
trees were constructed using PhyML[30] with
5024 bootstraps. The parameters and settings used in PhyML were as
follows. The LG substitution model was chosen. We selected to have
four substitution rate categories. The alpha parameter for the gamma
distribution of sites was set to be estimated by the code. Both NNIs
and SPR methods were used to search for the optimal tree topology,
and finally, tree topology, branch lengths, and rate parameters were
chosen to be optimized by the code. A set of three trees were constructed.
In the first tree, all sequences were included excluding FJ210646,
as it was missing about 100 residues. Therefore, including it would
have introduced more error in the tree reconstruction phase. For the
second tree, we removed Chloroacidobacterium thermophilum (ABV27353) as it had the lowest sequence identity percent and thus
also introduced more uncertainty. In the third tree, Chloroherpeton
thalassium (ACF13179) was removed as it was the second most
distant sequence from the rest. We computed ancestral sequences and
searched for positively selected sites on all three trees, but we
present results for the third tree as we believe it has the smallest
error.
Site and Branch–Site models
The branches under
positive selection were identified using the branch–site model[39] as implemented in CodeML. This model allows
for variation of (the ratio of nonsynonymous to synonymous
mutations) among branches and sites. For each simulation, one branch
is selected as a foreground branch and, using CodeML, we computed
the likelihood that the branch was under positive selection. The input
DNA sequence alignment was the same as the one generated to deduce
the phylogenetic tree. Given a branch under positive selection, we
determined which amino acids were under positive selection by using
the Bayes empirical Bayes (BEB) model.[40] We corrected the probability values p-values for
false discovery rate (FDR) by using the q-value.[41,42]
Reconstruction of the Ancestral Protein Structure of FMO
In the Supporting Information, we have
included Figure S1, which reports the procedure
employed to reconstruct the ancestral sequence and structure computationally.
Sequence
Reconstruction
An ancestral sequence was reconstructed
using maximum likelihood as implemented in FastML[43] and the LG+G model for amino acid substitution. The LG+G
model was established to be the best model under the Bayesian Information
Criterion (BIC). With FastML, we obtained the most probable sequences,
together with the posterior probabilities for each character and indel
at each sequence position for each internal node of the tree. As inputs,
we used the phylogenetic trees described in the previous section and rooted them based on the most distant
sequence. The alignments used in the input were the same as those
employed to generate the trees. For the last tree that excludes C. thermophilum and C. thalassium; we used
the midpoint root, as most sequences are similar, (see Figure ) and the branching is almost
identical to that obtained when C. thalassium or C. thermophilum are included. In Figure S3, we report a plot of the marginal probability for each amino
acid reconstructed in the ancestral sequence.
Structure
Reconstruction and Molecular Dynamics
The
ancestral structure was built by using homology modeling with satisfaction
of spatial restraints as implemented in Modeller.[37] The method of modeling by satisfaction of spatial restraints
works as follows. Constraints are generated on the structure of the
target sequence using its alignment to the related protein structures
as a guide. The restraints are obtained based on the assumption that
the structures will be similar in the aligned regions. In Modeller,
the form of the restraints was obtained from a statistical analysis
of the relationships between similar protein structures in a database
of 105 alignments that included 416 proteins of known 3D structure.[37] These restraints are supplemented by stereochemical
restraints such as bond lengths, angles, etc., which are obtained
from a force field. Once all these restraints are established, the
model structure is obtained in Cartesian space by minimizing the violations
of all restraints.We used the structure of P. aestuarii (PDB: 3EOJ) as the homologue and generated 100 possible structures with slow
refinement. The ancestral structure was chosen to be the one with
the best molpdb factor. The initially reconstructed structure only
included the protein scaffold, as there is no current parametrization
for BChl’s in Modeller. Thus, BChl molecules were included
subsequently using minimization as implemented in NAMD.[44] The chromophores were initially positioned as
in the homologous structure, and the structure was optimized by minimizing
the energy in two steps. In the first step, the backbone was kept
fixed, and for the second minimization, the backbone was free to move.
The structure was subsequently equilibrated without constraints using
the Amber14 software package,[45] with the
Amber ff99SB force field.[46] The BChl-a
parameters employed are taken from previous work.[11,47] The protonation states of all amino acids were determined with the
H++ 3.0 software,[48] under neutral pH conditions.
All complexes were solvated using TIP3P periodic water boxes,[49] with a minimum distance of 15 Å between
the complex and the box boundaries. Charges were neutralized by adding
sodium ions for the ancestor and P. aestuarii and
chloride for C. tepidum. Shake constraints were applied
to all bonds containing hydrogen. Minimizations were carried out for
2000 steps for P. aestuarii and C. tepidum and for 10000 steps for the ancestor. Minimizations were followed
by 200 ps adaptation runs to impose a temperature of 300 K and a pressure
of 1 atm on the systems. All three complexes were equilibrated for
50 ns in the same environmental conditions. Long range electrostatic
interactions were calculated using the particle–mesh Ewald
method.[50] For each complex we then carried
out 40 ps production runs with a 1 fs integration step.
Correlating
Biological Stability and Evolution to Exciton Dynamics
QM/MM Hamiltonian
For the exciton dynamics, the FMO
complex is simulated as a system coupled to a bath (the protein environment).
The system Hamiltonian is defined aswhere ε is the
first excited state energy of the ith BChl. The V terms correspond to the
excitonic coupling between the excited
states for the ith and jth molecules.Site energies were computed using TDDFT in QChem[51] with the PBE0 functional[52] and
3-21G basis set for 10000 frames (40 ps) of the MD production run.
Couplings were calculated using two approaches: the point-dipole approximation
(PDA) and transition charges from the electrostatic potential (TrEsp).[53] TrEsp has been shown to be the most accurate,
although it is computationally more demanding than using the PDA.[53] The computed couplings are given in the Supporting Information.The average first
excited state energy of each BChl is shown in Figure . We notice that
there is no large change in the trend of the excited state energies
for the BChl’s.
Figure 9
Average first excited state energies ⟨E⟩ for each BChl molecule in the
FMO complexes of C. tepidum (black) and P.
aestuarii (gray) and for the ancestral structure (blue).
Values were obtained by averaging over the production QM/MM trajectory
at 300 K.
Average first excited state energies ⟨E⟩ for each BChl molecule in the
FMO complexes of C. tepidum (black) and P.
aestuarii (gray) and for the ancestral structure (blue).
Values were obtained by averaging over the production QM/MM trajectory
at 300 K.
Exciton Coherence and 2D
Spectrum
The 2D-echo spectra
in Figure were calculated
as was done by Hein et al.[54] In two-dimensional
electronic spectroscopy the sample is probed by a sequence of three
laser pulses. Adjusting the delay time Tdelay between the second and the third pulse yields a time-resolved picture
of the exciton dynamics. Within the impulsive limit (assuming δ-pulses),
the 2D-echo spectrum is related to the third order response function S(3)(t1, Tdelay, t3). Phase
matching ensures that the components of the detected signal go along
distinct directions, and thus can be experimentally separated. We
considered the rephasing (RP) component of the signalfor which coherent phenomena show as a unique
oscillatory pattern in the cross-peak dynamics as a function of delay
time. Using the notation of double sided Feynman diagrams,[55,56] the total signal is given by three separate pathways: stimulated
emission (SE), ground state bleaching (GB), and excited state absorption
(ESA). The latter involves the double exciton manifold. We numerically
evaluated IRP(ω1, Tdelay, ω3) by propagating the
reduced density matrix, describing the exciton degrees of freedom,
along the double sided Feynman diagrams. Hereby, each interaction
with the laser pulses requires an interruption of the dynamics and
a multiplication of the reduced density matrix with the dipole operator,
either from the right or from the left. The propagation was done using QMaster,[35] a high-performance
implementation of HEOM.[57,58] The rotational average
over random orientations of the probed sample was included by sampling
20 laser polarization vectors aligned with the vertices of a dodecahedron.[54]
Authors: Yihan Shao; Laszlo Fusti Molnar; Yousung Jung; Jörg Kussmann; Christian Ochsenfeld; Shawn T Brown; Andrew T B Gilbert; Lyudmila V Slipchenko; Sergey V Levchenko; Darragh P O'Neill; Robert A DiStasio; Rohini C Lochan; Tao Wang; Gregory J O Beran; Nicholas A Besley; John M Herbert; Ching Yeh Lin; Troy Van Voorhis; Siu Hung Chien; Alex Sodt; Ryan P Steele; Vitaly A Rassolov; Paul E Maslen; Prakashan P Korambath; Ross D Adamson; Brian Austin; Jon Baker; Edward F C Byrd; Holger Dachsel; Robert J Doerksen; Andreas Dreuw; Barry D Dunietz; Anthony D Dutoi; Thomas R Furlani; Steven R Gwaltney; Andreas Heyden; So Hirata; Chao-Ping Hsu; Gary Kedziora; Rustam Z Khalliulin; Phil Klunzinger; Aaron M Lee; Michael S Lee; Wanzhen Liang; Itay Lotan; Nikhil Nair; Baron Peters; Emil I Proynov; Piotr A Pieniazek; Young Min Rhee; Jim Ritchie; Edina Rosta; C David Sherrill; Andrew C Simmonett; Joseph E Subotnik; H Lee Woodcock; Weimin Zhang; Alexis T Bell; Arup K Chakraborty; Daniel M Chipman; Frerich J Keil; Arieh Warshel; Warren J Hehre; Henry F Schaefer; Jing Kong; Anna I Krylov; Peter M W Gill; Martin Head-Gordon Journal: Phys Chem Chem Phys Date: 2006-06-12 Impact factor: 3.676
Authors: Gregory D Scholes; Graham R Fleming; Lin X Chen; Alán Aspuru-Guzik; Andreas Buchleitner; David F Coker; Gregory S Engel; Rienk van Grondelle; Akihito Ishizaki; David M Jonas; Jeff S Lundeen; James K McCusker; Shaul Mukamel; Jennifer P Ogilvie; Alexandra Olaya-Castro; Mark A Ratner; Frank C Spano; K Birgitta Whaley; Xiaoyang Zhu Journal: Nature Date: 2017-03-29 Impact factor: 49.962
Authors: Rafael G Saer; Valentyn Stadnytskyi; Nikki C Magdaong; Carrie Goodson; Sergei Savikhin; Robert E Blankenship Journal: Biochim Biophys Acta Bioenerg Date: 2017-01-31 Impact factor: 3.991
Authors: Kazumi Ozaki; Katharine J Thompson; Rachel L Simister; Sean A Crowe; Christopher T Reinhard Journal: Nat Commun Date: 2019-07-09 Impact factor: 14.919