Siqi Ma1, Shipeng Wang1, Jiawei Cao1, Fengjiao Liu1,2. 1. School of Chemistry and Chemical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China. 2. Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, United States.
Abstract
A well-performing machine learning (ML) model is obtained by using proper descriptors and artificial neural network (ANN) algorithms, which can quickly and accurately predict activation free energy in hydrogen atom transfer (HAT)-based sp3 C–H activation. Density functional theory calculations (UωB97X-D) are used to establish the reaction system data sets of methoxyl (CH3O·), trifluoroethoxyl (CF3CH2O·), tert-butoxyl (tBuO·), and cumyloxyl (CumO·) radicals. The simplified Roberts' equation proposed in our recent study works here [R2 = 0.84, mean absolute error (MAE) = 0.85 kcal/mol]. Its performance is comparable with that of the ANN model using the Mulliken-type electronegativity (χ) as the only descriptor. The ANN model with bond dissociation free energy, χ, α-unsaturation, and Nolan buried volume (%Vburied) improves R2 and MAE to 0.93 and 0.54 kcal/mol, respectively. It reproduces the test set of trichloroethoxyl (CCl3CH2O·) with R2 = 0.87 and MAE = 0.89 kcal/mol and accurately predicts the relative experimental barriers of the HAT reactions with CumO· and the site selectivity of CH3O·.
With the increasing demand for sustainable
development, sp3 C–H bond functionalization in chemical synthesis has
received great attention because it offers a practical way
to upgrade abundant hydrocarbon raw materials into valuable products.[1−11] The hydrogen atom transfer (HAT)-based method provides an effective
strategy for hydrocarbon activation and has been widely used.[2,12−20] However, predicting the reactivity remains a challenge. The
reactivity is closely related to the C–H bond strength and
the choice of H-extracting radical.[21−32] Understanding the reaction mechanisms and predicting the reactivity
should assist a more rational and effective design of C–H
activation reactions.[33−38]

The activation free energy is a key parameter for the mechanism,
reaction rate, and selectivity of chemical reactions. The prediction
of the activation free energy helps to achieve a comprehensive understanding
of chemical reactions and to construct chemical reaction diagrams,
thus accelerating catalyst design. However, the activation free energy
estimation is time-consuming, both experimentally and computationally,
leading to a bottleneck in rapid catalyst design and characterization.

In experimental work, the reaction rate constant is determined
by running multiple experiments at different temperatures, and then
the activation free energy can be obtained by application of the Eyring
equation. In computational science, quantum chemistry methods can
evaluate the activation free energy by identifying the transition
states along a given reaction path. Complex chemical processes involving
many reactions inevitably incur a high computational cost.[39,40] Rapid and accurate prediction of activation free
energies in C–H activation reactions is therefore very valuable.

As an empirical
model, the Bell–Evans–Polanyi (BEP)
correlation is widely used to quickly estimate the activation energy.[41] It correlates the activation energy with the reaction
energy through ΔE‡ = γΔEr×n + ξ. Tedder found
the existence of a BEP correlation between the rate constant and the
C–H bond strength for H-abstraction reactions from alkanes
by various radicals (CH3·, CF3·, Br·,
etc.).[42] Mayer et al. showed the generality
of BEP correlations in the HAT of CrO2Cl2 and
MnO4–.[43] Shaik
et al. used density functional theory (DFT) to study the BEP relationship
of enzyme cytochrome P450 oxidizing a series of C–H bonds and
established a prediction model.[44] However,
as organic systems become more complex, the simple linear relationship between the activation energy
and the bond energy is often no longer observed.[45−48] In addition to bond energy or reaction energy, Roberts pointed out
that polar effects, steric effects, and unsaturation should be considered
when discussing various HAT reactions in the liquid and gas phases.
In addition, his study has emphasized and quantified the polar effects.[49] Tedder proposed that the polar effects in the
HAT of CH3· and CF3· will cause a
deviation from the BEP correlation.[42] Our
previous study used DFT calculations to explore the selectivity of
dimethyldioxirane (DMDO) in the C–H oxidation of various compounds,
and it revealed a different BEP correlation between the C–H
bonds in saturated and unsaturated compounds.[46] This bimodal BEP correlation was also found by Bietti et al. in HAT from sp3 C–H bonds to the cumyloxyl radical. All these
precedents prompted us to consider the influence of nonlinear characteristics
and of factors other than thermodynamics in HAT.

Multiple linear
regression (MLR) is one strategy to account for the
effect of several variables on the activation energy. Roberts and Steel
proposed an improved form of the Evans–Polanyi equation (Roberts’
equation) involving reaction thermodynamics, radical electronegativity
differences, conjugative delocalization of the unpaired electron,
and structural factors, which works well for a variety of HAT reactions
with different H-extracting radicals.[49] Our recent work simplified Roberts’ equation to reaction
energy, radical electronegativity differences, and unsaturation and
gave good results for the HAT reactions of 26 sp3 C–H
bonds with alkoxyl radicals, with R2 = 0.89
and mean absolute error (MAE) and root mean square error (RMSE) values
of 0.9 and 1.1 kcal/mol, respectively (compared with the DFT calculations).[50]

On the other hand, the significant advancement
of machine learning
(ML) technology provides a new strategy for solving various chemical
problems. The artificial neural network (ANN), as one of the most
popular ML methods, is a data-driven adaptive method. Due to its parallel
structure and the ability to simulate arbitrary functions, it can
be regarded as a kind of multiple nonlinear regression that can handle
complex nonlinear problems.[51] By establishing
the potential correlations from the data, the ANN can solve problems
that involve unknown relationships. Nowadays, the ANN is actively
used in various chemical fields, such as quantum chemistry, virtual
screening of molecular materials, and synthetic pathways of chemical
substances.[52,53] While our study was ongoing,
Hong et al. reported an ML approach for reactivity prediction
in photoredox-mediated HAT catalysis.[54]

In the present study, we aim to (1) apply the ANN model to
HAT-based sp3 C–H activation to achieve the goal
of rapid and accurate prediction of the activation free energy, (2)
assess the performance of the simplified Roberts’ equation
proposed in our recent study,[50] and (3)
compare the performance between MLR and ANN with the same physical
variables. The set of substrate C–H bonds has been expanded (Figure 1). Building on our
previous work,[46] we added a series
of hydrogen donor substrates [including allylic, benzylic,
formylic, and α-heteroatom (O, N, S) C–H bonds] to build the model. The activated C–H
bond is highlighted in red. In addition to methoxyl (CH3O·), trifluoroethoxyl (CF3CH2O·),
and tert-butoxyl (tBuO·), cumyloxyl (CumO·)
was also used as the H-extracting radical. Alkoxyl radicals are easily
available and can be highly efficient for the abstraction of H atoms.
They are powerful tools for the catalytic activation of hydrocarbons
under mild reaction conditions. Zuo et al. achieved the C–H
functionalization of short-chain alkanes (methane, ethane, etc.) utilizing
cerium salts and alcohols under mild reaction conditions with CH3O·, CCl3CH2O·, and CF3CH2O· being used as the H-extracting radicals.[55] Bietti carried out systematic kinetic studies
of HAT reactions by CumO·.[48] These
experimental data are helpful to verify the theoretical model.
Figure 1
C–H bonds studied in this work.
Method
Database
We established data sets for the alkane sp3 C–H activation by various alkoxyl radicals: CH3O·, CF3CH2O·, tBuO·,
and CumO· (Tables S1–S4). The
data sets were used to build the model. The C–H bonds listed in Figure 1 were studied. The
substrates and H-extracting radicals were selected to cover polar
effects, steric effects, and α-unsaturation effects. Figure 2 shows an example
of the HAT step by the alkoxyl radical.[55] All calculations were carried out with Gaussian 16.[56] The geometries of minima and transition states were optimized
using UωB97X-D[57] with the 6-31G(d)[58] basis set in the gas phase. Vibrational frequency
analyses confirmed the nature of the structures as either local energy
minima or first-order saddle points (transition states). Single-point
energies with a larger basis set were obtained with UωB97X-D/6-311++G(d,p)[59]
on the optimized geometries. The
solvent effect of CH3CN on the reaction was estimated by
using the SMD model.[60]
Figure 2
Scheme of alkoxyl radical-mediated
HAT activation with methane.

Figure 3 shows an
example of the data sets. The “sub” and “rad”
qualifiers represent the descriptors of the substrate and H-extracting
radical, respectively. The bond dissociation free energies (BDFEs)
of the substrate C–H and alkoxyl radical O–H were both
taken into consideration. The polar effect was quantified by the Mulliken-type
electronegativity. The Mulliken-type electronegativity of the X·
radical (χ) is defined in eq 1, where IE and EA are the X· vertical ionization energy
and vertical electron affinity, respectively.
$$\chi = \frac{\mathrm{IE} + \mathrm{EA}}{2} \tag{1}$$
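As a minimal sketch of how the polar descriptor could be computed, the snippet below evaluates the Mulliken-type electronegativity as the mean of the vertical IE and EA and takes the substrate-minus-radical difference used in the text. The function names and the numerical IE/EA values are hypothetical and serve only as an illustration; they are not taken from the data sets.

```python
def mulliken_electronegativity(ie: float, ea: float) -> float:
    """Mulliken-type electronegativity: mean of vertical IE and vertical EA."""
    return 0.5 * (ie + ea)


def delta_chi(chi_sub: float, chi_rad: float) -> float:
    """Electronegativity difference: substrate radical minus H-extracting radical."""
    return chi_sub - chi_rad


# Hypothetical vertical IE/EA values (eV), for illustration only.
chi_sub = mulliken_electronegativity(ie=9.0, ea=1.0)
chi_rad = mulliken_electronegativity(ie=11.0, ea=2.0)
d_chi = delta_chi(chi_sub, chi_rad)
d_chi_sq = d_chi ** 2  # squared difference, the quantity plotted against the barrier
```

Any consistent energy unit can be used for IE and EA, provided the same unit is used for every substrate and radical.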
Figure 3
HAT reaction space representation
using vectors.

ΔχAB (ΔχAB = χA – χB for A–H
+ B· →
A· + BH) reflects the electronegativity difference between the
substrate and H-extracting radical. The unsaturation effect refers
to whether an unsaturated group adjacent to the investigated C–H
is present. The “0” and “1” qualifiers
are used for “saturated” C–H and “unsaturated”
C–H bonds, respectively. The Nolan buried volume (%Vburied)[61] is used
to consider the steric effect (see Section S8 for the detailed calculation).
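The descriptor set described above can be pictured as a fixed-length vector per reaction. The sketch below is one possible encoding, assuming seven entries (BDFE, χ, and %Vburied for substrate and radical, plus the 0/1 unsaturation flag); the class name, field names, and example values are hypothetical.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class HatDescriptors:
    """One HAT reaction (A-H + B. -> A. + BH) as a descriptor record."""
    bdfe_sub: float    # substrate C-H BDFE, kcal/mol
    bdfe_rad: float    # alkoxyl O-H BDFE, kcal/mol
    chi_sub: float     # Mulliken-type electronegativity of the substrate radical
    chi_rad: float     # Mulliken-type electronegativity of the H-extracting radical
    unsaturated: int   # 0 = "saturated" C-H, 1 = "unsaturated" C-H
    vbur_sub: float    # Nolan buried volume of the substrate, %
    vbur_rad: float    # Nolan buried volume of the radical, %

    def __post_init__(self) -> None:
        if self.unsaturated not in (0, 1):
            raise ValueError("unsaturated must be 0 or 1")

    def as_vector(self) -> list:
        """Return the descriptors as a flat list of floats, in field order."""
        return [float(v) for v in astuple(self)]

    def delta_g_rxn(self) -> float:
        """Reaction free energy as BDFE_sub - BDFE_rad."""
        return self.bdfe_sub - self.bdfe_rad


# Hypothetical values, for illustration only.
example = HatDescriptors(90.0, 100.0, 5.0, 6.5, 0, 30.0, 40.0)
vector = example.as_vector()
```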
ANN Model
Figure 4 gives an overview of the ANN model. Activation barriers
and descriptors were all obtained by DFT methods. The ANN model was
trained on the basis of appropriate descriptors.
Figure 4
Process of the ANN model
predicting activation free energy.

The back-propagation ANN model[62,63] was applied
to predict the activation free energy. Through K-fold cross-validation
(k = 3),[64] the topology
of the ANN model was finally determined with three hidden neurons
and six hidden layers (see Table 1). The activation function was the tanh function, $\tanh(x) = (e^{x} - e^{-x})/(e^{x} + e^{-x})$, which allows the network to map nonlinear
processes. The second-order optimization method was an implementation
of the Levenberg–Marquardt (LM) algorithm.[65−67]
Table 1
Main Parameters of ANN Models
parameter             ANN
hidden layer sizes    6
hidden unit           3
activation            tanh
tolerance             0.001
training function     Levenberg–Marquardt
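To make the topology concrete, the following sketch shows a forward pass through a small tanh network. It assumes that Table 1 describes six hidden layers of three units each, that seven scaled descriptors are fed in, and that the output unit is linear; the weights are random placeholders, and the Levenberg–Marquardt training step is deliberately omitted. All function names are hypothetical.

```python
import math
import random


def init_network(n_inputs, hidden_sizes, seed=0):
    """Return a list of (weights, biases) per layer, filled with small random values."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden_sizes, 1]  # a single output: the activation free energy
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Propagate x through tanh hidden layers and a linear output unit."""
    activations = list(x)
    last = len(layers) - 1
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        # Hidden layers apply tanh; the output layer is left linear.
        activations = z if i == last else [math.tanh(v) for v in z]
    return activations[0]


# Assumed topology: 7 inputs, six hidden layers of three units, one output.
network = init_network(n_inputs=7, hidden_sizes=[3] * 6)
prediction = forward(network, [0.1, -0.2, 0.3, 0.0, 1.0, -0.4, 0.5])
```

Because the weights are untrained, the returned number has no chemical meaning; the sketch only illustrates how the descriptors flow through the layers.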
Results and Discussion
Tables S1–S4 list the activation
free energy (ΔG‡), the reaction
free energy (ΔGr×n), the BDFE,
the Mulliken electronegativity (χ), and the Nolan buried volume
(%Vburied) for HAT reactions promoted
by CH3O·, CF3CH2O·, tBuO·,
and CumO·, respectively. The substrate C–H BDFE ranges
from 71.2 to 96.0 kcal/mol. The ΔG‡ values range from 11.3 to 20.3 kcal/mol for CH3O·,
from 10.6 to 20.1 kcal/mol for tBuO·, from 9.7 to
21.1 kcal/mol for CumO·, and from 8.7 to 20.1 kcal/mol
for CF3CH2O·. Overall, CF3CH2O· is more reactive than CH3O·,
tBuO·, and CumO·.
Empirical Model
In HAT, the BEP correlation can be
expressed as shown in eq 2, where ΔG‡ is the activation
free energy, ΔGr×n is the reaction
free energy, BDFEsub and BDFErad are the BDFEs
of the substrate C–H and alkoxyl radical O–H, respectively,
and γ and ξ are obtained from linear regression analysis.
$$\Delta G^{\ddagger} = \gamma\,\Delta G_{\mathrm{r \times n}} + \xi = \gamma\,(\mathrm{BDFE_{sub}} - \mathrm{BDFE_{rad}}) + \xi \tag{2}$$

Figure 5 shows the linear relationship of ΔG‡ vs ΔGr×n for the reaction data sets of 60 sp3 C–Hs (Figure 1) with four alkoxyl
radicals (CH3O·, CF3CH2O·,
tBuO·, and CumO·). The traditional BEP correlation clearly
does not hold here. Although the correlation is rough, the “saturated”
and “unsaturated” C–Hs tend to fall into
two groups (Figure 5), as found in our previous study.[46] The
effects of the unsaturation and of the radical nature are responsible
for the inadequacy of the BEP correlation.
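A BEP-type fit of this kind is an ordinary univariate least-squares regression of ΔG‡ on ΔGr×n. The sketch below shows how γ and ξ could be obtained; the function name and the four data points are hypothetical and chosen to lie on an exact line so the result is easy to check.

```python
def fit_line(x, y):
    """Ordinary least squares for y = gamma * x + xi; returns (gamma, xi)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0.0:
        raise ValueError("x values must not all be identical")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    gamma = sxy / sxx
    xi = mean_y - gamma * mean_x
    return gamma, xi


# Hypothetical reaction and activation free energies (kcal/mol), for illustration only.
dg_rxn = [-20.0, -15.0, -10.0, -5.0]
dg_act = [10.0, 12.0, 14.0, 16.0]
gamma, xi = fit_line(dg_rxn, dg_act)
```

Fitting the "saturated" and "unsaturated" subsets separately with the same function would be one way to examine the two groups mentioned above.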
Figure 5
Activation free energy
for HAT as a function of ΔGr×n.

Figure 6a–d
shows the scatter diagram of ΔG‡ vs ΔGr×n for each alkoxyl
radical. Compared with CH3O·, the more electronegative
CF3CH2O· radical is more reactive and yields
a worse linear correlation. The steric effect also influences the
BEP correlation, as is evident from the differences
between the ΔG‡ vs ΔGr×n scatter plots for CH3O·,
tBuO·, and CumO·. Because of the influence of the different H-extracting
radicals, the BEP correlation for the whole series is poor, as shown
in Figure 5.
Figure 6
Activation
free energy for the HAT reaction as a function of ΔGr×n for the different alkoxyl radicals:
(a) CH3O·, (b) tBuO·, (c) CF3CH2O·, and (d) CumO·.

Figure 7 shows the
scatter diagram of ΔG‡ vs
Δχ2 (Δχ = χ_sub – χ_rad). This correlation is better than
that of ΔG‡ vs ΔGr×n, and the unsaturation effect is less
marked, to the point that the “saturated” and “unsaturated”
C–Hs can be classified into one category.
Figure 7
Activation free energy
for the HAT reaction as a function of Δχ2 for
different alkoxyl radicals: (a) CH3O·,
(b) tBuO·, (c) CF3CH2O·, (d) CumO·,
and (e) all four alkoxyl radicals (CH3O·, CF3CH2O·, tBuO·, and CumO·).

The results of Figure 7 imply that Δχ2 plays
a more important
role than ΔGr×n in predicting
the activation free energies with univariate linear regression. The
importance of Δχ2 is further supported by the
random forest (RF) algorithm.[68] RF can
provide a measure of the feature importance based on the mean decrease
in impurity (MDI), where the impurity is calculated by the split criterion
of the decision trees (entropy).[68] Figure 8 shows the feature
importance of the unsaturation effect, Δχ2,
and ΔGr×n analyzed by the RF
algorithm. As shown in Figure 8, Δχ2 plays the most important role,
followed by ΔGr×n, and the least
important is the effect of unsaturation. It is worth mentioning, however,
that the feature importance analysis based on MDI is biased toward high-cardinality
features (typically numerical features, e.g., Δχ2) and probably underestimates low-cardinality features (binary
features, e.g., the unsaturation effect).[69] Therefore, the impact of descriptors needs to be further analyzed.
Though Δχ2 shows a good correlation with ΔG‡, this is insufficient to accurately
predict the activation free energy, especially using a simple linear
relationship.
Figure 8
Feature importance of each descriptor analyzed by RF.

Then, the simplified Roberts’ relationship
developed in
our recent work[50] was applied (eq 3). This expression contains
the reaction energy (ΔGr×n,
ΔGr×n = BDFE_sub – BDFE_rad), the unsaturation
term (d), and the Mulliken-type electronegativity
difference (Δχ, Δχ = χ_sub – χ_rad) between the substrate and H-extracting
radical. From an MLR analysis of our data sets, the coefficients for
this simplified Roberts’ relationship are shown in eq 4. The correlation between
ΔG‡DFT and the
ΔG‡Predict obtained
by this multivariate linear relationship,
shown in Figure 9,
is good (R2 = 0.84, MAE = 0.85 kcal/mol, and RMSE
= 1.05 kcal/mol).
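For reference, the three quality measures quoted here can be computed from paired DFT and predicted barriers as sketched below. The sketch assumes that R2 is the coefficient of determination (1 − SS_res/SS_tot); the text does not state whether this or the squared Pearson correlation was used. The function name and the example numbers are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R2, MAE, RMSE) for paired reference and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # coefficient of determination (assumed definition)
    return r2, mae, rmse


# Hypothetical DFT and predicted barriers (kcal/mol), for illustration only.
dft = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
r2, mae, rmse = regression_metrics(dft, predicted)
```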
Figure 9
DFT-computed vs predicted barriers using eq 4 for the reaction data sets of 60 sp3 C–Hs (Figure 1) with four alkoxyl radicals (CH3O·, CF3CH2O·, tBuO·, and CumO·).

The ANN model was trained on the reaction
data sets of 60 sp3 C–Hs (Figure 1) with four alkoxyl radicals (CH3O·, CF3CH2O·, tBuO·, and CumO·)
by using different descriptors. Figure 10a,b shows the activation free energies obtained
by the ANN model (with only the BDFE and χ as descriptors, respectively)
versus the DFT method. The BDFE alone predicts the
activation free energy poorly, as shown by R2 = 0.62, MAE = 1.28, and RMSE = 1.62. The χ works better
than the BDFE when used as the sole descriptor in the ANN model, as
reflected by R2 = 0.85, MAE = 0.79,
and RMSE = 1.03. It is worth noting that the performance of the single
χ descriptor with the ANN model is comparable to the MLR of eq 4. Subsequently, we
added further descriptors. The combination of BDFE_sub, BDFE_rad, χ_sub,
χ_rad, and α-unsaturation improves R2 to 0.92, accompanied by MAE = 0.60
and RMSE = 0.75, as shown in Figure 10c. After adding %Vburied, the ANN model performs best (R2 = 0.93,
MAE = 0.54, and RMSE = 0.68); see Figure 10d.
Figure 10
DFT-computed barriers vs ANN-predicted barriers
for the reaction
data sets of 60 sp3 C–Hs (Figure 1) with four alkoxyl radicals (CH3O·, CF3CH2O·, tBuO·, and CumO·)
by different descriptors: (a) BDFE_sub and BDFE_rad; (b) χ_sub
and χ_rad; (c) BDFE_sub, BDFE_rad, χ_sub, χ_rad,
and α-unsaturation; and (d) BDFE_sub, BDFE_rad, χ_sub, χ_rad,
α-unsaturation, %Vburied_sub, and %Vburied_rad. The “sub” and “rad”
qualifiers represent the descriptors of the substrate and H-extracting
radical, respectively.

To further evaluate the predictive ability
of the model,
we used the reaction data of CCl3CH2O·
as the test set (Table S5). The R2 and MAE values obtained from the ANN model
without %Vburied are 0.80 and 1.10 kcal/mol,
showing that the ANN model trained on the data set composed of CH3O·, CF3CH2O·, tBuO·,
and CumO· can map the influence of the polarity changes of CCl3CH2O· (Figure 11a). By adding the %Vburied term, the R2 and MAE values
improve to 0.87 and 0.89 kcal/mol, respectively, showing the importance
of the steric effect (Figure 11b).
Figure 11
DFT-computed vs ANN-predicted barriers for the test set
of 60 sp3 C–Hs (Figure 1) with CCl3CH2O·
by different descriptors:
(a) BDFE_sub, BDFE_rad, χ_sub, χ_rad, and α-unsaturation;
(b) BDFE_sub, BDFE_rad, χ_sub, χ_rad, α-unsaturation,
%Vburied_sub, and %Vburied_rad.

Furthermore, we predicted the experimental results
by using the
trained model. From Bietti’s kinetic studies of HAT reactions
of CumO·, we collected another 45 sp3 C–Hs
and used the ANN model trained on the DFT computational
data to make predictions. To reduce the prediction bias caused
by the errors of the DFT calculations and the experiments, the relative
activation free energy was used here. Rate constants and relative
activation free energies are provided in the Supporting Information
(Table S6). Figure 12 compares the activation free energies predicted
by the ANN model with the relative activation free energies from Bietti’s
experiments. The prediction results show that the ANN model containing
%Vburied can better map the influence
of the CumO· steric effects and make more accurate predictions
(R2 = 0.70, MAE = 0.65, and RMSE = 0.80).
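One common way to convert measured rate constants into relative activation free energies is through the Eyring equation: for two reactions at the same temperature and with the same transmission coefficient, ΔΔG‡ = −RT ln(k/k_ref). The sketch below applies this relation; whether the values in Table S6 were derived in exactly this way, the temperature of 298.15 K, and the example rate constants are all assumptions, and no statistical correction for the number of equivalent hydrogens is applied.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def relative_barrier(k: float, k_ref: float, temperature: float = 298.15) -> float:
    """Relative activation free energy (kcal/mol) from a rate-constant ratio.

    Uses ddG = -R T ln(k / k_ref), which follows from the Eyring equation
    when both rate constants refer to the same temperature.
    """
    if k <= 0.0 or k_ref <= 0.0 or temperature <= 0.0:
        raise ValueError("rate constants and temperature must be positive")
    return -R_KCAL * temperature * math.log(k / k_ref)


# Hypothetical rate constants (M^-1 s^-1): a substrate reacting 10 times
# faster than the reference has a barrier lower by about 1.4 kcal/mol.
ddg = relative_barrier(k=2.0e5, k_ref=2.0e4)
```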
Figure 12
Experiment
vs ANN-predicted barriers for the test set of 45 additional
sp3 C–Hs with CumO· by different descriptors:
(a) BDFE_sub, BDFE_rad, χ_sub, χ_rad, and α-unsaturation;
(b) BDFE_sub, BDFE_rad, χ_sub, χ_rad, α-unsaturation,
%Vburied_sub, and %Vburied_rad.

We also tried to predict the CH3O·
site selectivity reported by Zuo et al. (Table S7).[70] The BDFE, χ, α-unsaturation,
and %Vburied descriptors were used. Figure 13 includes the relative
ratio of site selectivity and the activation free energies obtained
by the ANN prediction and by the DFT calculation. The numbers marked
in red indicate that the prediction error is over 2 kcal/mol, and
the blue backgrounds indicate the main activation site predicted by
the ANN model. Among the six substrates not included in the training
set, 2,3-dimethylbutane has been widely used as a standard substrate
for evaluating selectivity in C–H bond functionalization. The
tertiary C–H site is functionalized with good selectivity (ratio
97:3) when CH3O· is used as the HAT reagent. The
DFT-calculated and ANN-predicted activation free energies
identify the same reaction site. The tertiary C–H bond of adamantane
is functionalized, and this site is also predicted correctly. n-Hexane
has three different types of C–H bonds in terms of steric hindrance
and bond strength. In this case, CH3O· has high selectivity
for the weaker methylene C–H bonds, and the predicted activation
free energy at this site is also the lowest. In 2,4-dimethylpentane,
CH3O· gives methine functionalization.
Although the activation free energy predicted by the ANN model is
lower, the selectivity is consistent with the experiment.
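To relate predicted barriers to a site-selectivity ratio, one simple estimate assumes kinetic control, with the rate at each site proportional to the number of equivalent hydrogens times exp(−ΔG‡/RT). The sketch below implements that estimate; the model itself, the temperature, the function name, and the barrier values are assumptions made for illustration and are not taken from Table S7.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def site_fractions(barriers, n_equivalent_h, temperature=298.15):
    """Estimated product fraction per site under kinetic control.

    Each site is weighted by n_H * exp(-dG/RT); barriers are in kcal/mol.
    """
    if len(barriers) != len(n_equivalent_h) or not barriers:
        raise ValueError("barriers and n_equivalent_h must be non-empty and equal in length")
    rt = R_KCAL * temperature
    lowest = min(barriers)  # shift by the lowest barrier for numerical stability
    weights = [n * math.exp(-(g - lowest) / rt)
               for g, n in zip(barriers, n_equivalent_h)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical barriers for a tertiary site (2 H) and a primary site (12 H),
# mimicking the hydrogen counts of 2,3-dimethylbutane.
fractions = site_fractions(barriers=[12.0, 14.0], n_equivalent_h=[2, 12])
```

With these placeholder numbers the tertiary site dominates even though it has fewer hydrogens, which is the qualitative behavior discussed above.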
Figure 13
Evaluation
of the ability to predict HAT reaction sites of CH3O·.
Conclusions
We have strived to achieve a rapid and
accurate estimation of activation
free energies in HAT-based C–H activation reactions with both
an empirical method and the ANN model. First, we established a data
set of 300 HAT reactions on the basis of DFT calculations. By simply
analyzing the data set, we found that unsaturation effects are responsible
for the poor performance of the BEP relationship, while the correlation
between ΔG‡ and Δχ2 is better than that between ΔG‡ and ΔGr×n.
The simplified Roberts’ equation proposed in our recent study
also works here. Then, we used an ML method to establish a reactivity
and selectivity prediction model based on appropriate descriptors.
As the sole descriptor in the ANN model, χ works better
than the BDFE. Its performance is comparable with that of the simplified
Roberts’ equation. The introduction of %Vburied improves the generalization across different
H-extracting radicals and gives more accurate predictions, which shows
the importance of steric effects. The combination of BDFE, χ,
α-unsaturation, and %Vburied
improves R2, MAE, and RMSE to
0.93, 0.54 kcal/mol, and 0.68 kcal/mol, respectively. The ANN model reproduces the experimental
CumO· relative activation free energies and CH3O·
selectivities with good accuracy.