
Closed-Loop Electrolyte Design for Lithium-Mediated Ammonia Synthesis.

Dilip Krishnamurthy¹, Nikifar Lazouski², Michal L Gala², Karthish Manthiram², Venkatasubramanian Viswanathan¹.

Abstract

Novel methods for producing ammonia, a large-scale industrial chemical, are necessary for reducing the environmental impact of its production. Lithium-mediated electrochemical nitrogen reduction is one attractive alternative method for producing ammonia. In this work, we experimentally tested several classes of proton donors for activity in the lithium-mediated approach. From these data, an interpretable data-driven classification model is constructed to distinguish between active and inactive proton donors; solvatochromic Kamlet-Taft parameters emerged to be the key descriptors for predicting nitrogen reduction activity. A deep learning model is trained to predict these parameters using experimental data from the literature. The combination of the classification and deep learning models provides a predictive mapping from proton donor structure to activity for nitrogen reduction. We demonstrate that the two-model approach is superior to a purely mechanistic or a data-driven approach in accuracy and experimental data efficiency.


Year: 2021 PMID: 34963899 PMCID: PMC8704027 DOI: 10.1021/acscentsci.1c01151

Source DB: PubMed Journal: ACS Cent Sci ISSN: 2374-7943 Impact factor: 14.553

Introduction

Ammonia is an important industrial chemical that is predominantly used to produce nitrogen-containing fertilizers, as well as many other important classes of nitrogen-containing materials, such as polymers, pharmaceuticals, and explosives.[1,2] In addition to being a useful synthetic molecule, ammonia (NH3) has potential to be an efficient carbon-free energy carrier, as it can be liquefied at moderate pressures (∼10 bar) at room temperature;[3,4] the gravimetric and volumetric energy density of liquid ammonia greatly exceeds that of lithium-ion batteries, and the volumetric energy density is competitive with other carbon-free fuels, such as pressurized and liquid hydrogen.[5] NH3 is typically produced via the Haber–Bosch process, in which fossil-fuel-derived hydrogen and air-derived nitrogen are reacted at high temperatures (450–550 °C) and pressures (up to 200 bar).[6] This process produces up to 1.44% of the world’s carbon dioxide emissions due to its use of fossil fuels as a hydrogen source.[6−8] In addition, the conventional, fossil-fuel-driven process is economically viable only in large, centralized plants.[9] With decreasing prices of renewable electricity,[10] electrochemical methods have been proposed to produce ammonia in a distributed manner from intermittent sources of energy with no CO2 emissions and low capital costs.[7] While a large number of nitrogen reduction systems with various configurations and catalyst compositions have been proposed,[11,12] many of them report Faradaic efficiencies (i.e., selectivities) and production rates too low for practical utilization. 
In addition, calls for rigorous controls and reproducibility in the electrochemical nitrogen reduction field suggest that ammonia is often detected from adventitious sources in many reported works.[13−16] Methods utilizing lithium metal for nitrogen reduction can obtain some of the highest Faradaic efficiencies (FEs) and absolute rates of proposed electrochemical approaches for NH3 synthesis, while demonstrating strict and reproducible controls.[17−21] At a high level, the approach relies on first producing lithium metal via electrochemical reduction of lithium ions (Li+) found in the electrolyte. Metallic lithium spontaneously breaks the nitrogen triple bond to produce lithium nitride,[22] which can react with a proton donor to form ammonia, recovering lithium ions (Figure 1a). The approach has been demonstrated in both batchwise[18−20] and continuous operation systems to produce ammonia.[13,17,21,23] The mechanism for ammonia formation may differ between batchwise and continuous systems.[23,24]

Figure 1

Lithium-mediated ammonia production from nitrogen. (a) The lithium-mediated catalytic cycle, with species flows highlighted. (b) The electrochemical cell setup used for continuous ammonia production and proton donor testing.

In the electrochemical approach, a proton donor is necessary to convert reduced nitrogen (e.g., lithium nitride) to ammonia and recover lithium ions. However, there are reasons to believe that the role of the proton donor goes beyond being a source of hydrogen atoms in ammonia. It could also be responsible for activating the reaction between lithium metal and nitrogen gas.[17,21,25] Theoretical analysis of general electrochemical nitrogen reduction reactions shows that the thermodynamic activity of the proton donor is important for selective continuous nitrogen reduction.[26] A preliminary survey of proton donors has shown that the identity of the proton donor can greatly affect the ammonia yields in the lithium-mediated nitrogen reduction reaction (LM-NRR).[17] However, no detailed surveys of the effect of the proton donor on LM-NRR have been performed. In addition, no design rules exist for selecting proton donors that can be active in LM-NRR and improve the selectivity toward ammonia. Approaches to discovering material design rules typically involve learning a physics-based functional mapping from material choice to performance through governing equations that represent relevant physical laws and underlying physical interactions. These approaches typically have a mechanistic basis and are rationalizable or interpretable, though certain approximations or empirical terms pertaining to hard-to-encode interactions may need to be added for model accuracy. 
On the other hand, with significant increases in computational power, several studies across disciplines have demonstrated the effective use of deep learning models to learn the material-to-performance mapping with increased predictive power albeit with limited interpretability.[27−31] The enhancement in predictive power is in part attributed to the ability of deep-architecture models to accurately learn highly nonlinear physical interactions that are otherwise difficult to identify mechanistically. However, deep-learning models tend to require substantially higher amounts of training data than mechanistic models. In this work, with the overarching goal of identifying novel proton donors, we develop a synchronous prediction-and-testing methodology involving computation and experiments to design the primary component in the electrolyte, the proton donor, for lithium-mediated ammonia production. The computational framework involves a prediction model, which is a mapping from a given proton donor candidate to whether it can promote ammonia production above a threshold Faradaic efficiency. The prediction framework contains two parts in series: (i) a deep-learning regression model to predict the Kamlet–Taft (KT) parameters, which we identify to be the descriptors of the ability to promote ammonia production from our data-driven approach, and (ii) an interpretable classification-tree model, which takes as input the KT parameters, denoted as α and β,[32,33] and predicts whether the candidate triggers ammonia production. The interpretable classification model aligns with our mechanistic understanding since the KT parameters are established scales to quantify the hydrogen-bond accepting and donating tendencies of a molecule, which is rationalizable based on the key chemical role of the proton donor (Figure 1a). 
The KT parameters are typically obtained experimentally using nuclear magnetic resonance (NMR) spectra, and therefore larger scale data from the literature were used to train the deep-learning model. It is worth highlighting that the developed two-part prediction framework leverages interpretability with shallow learning in the low-data regime and leverages the ability to learn the nonlinear mapping with deep learning in the larger-data regime. Within the prediction-and-testing loop, testing of proton donors and feeding the outcomes back to refine the prediction model were synchronously carried out in batches. Through experimental testing, we report that 1-butanol can promote LM-NRR better than the state-of-the-art ethanol. We show that the work lays down concrete and rationalizable design principles for proton donor selection to enable lithium-mediated ammonia production.

Experimental Characterization of Proton Donors for Ammonia Production

The presence of a proton donor in the electrolyte during LM-NRR is necessary for forming ammonia from dinitrogen, regardless of the proposed mechanism, as it is the source of hydrogen in the ammonia. However, ammonia may not be detected following electrolysis of a lithium-ion-containing solution at low concentrations of proton donor, as seen in experiments where ethanol is used as the proton donor.[17,21] This suggests that the proton donor may also have a role in promoting the reaction to fix the nitrogen, either electrochemically or thermochemically. The promoting ability of a proton donor appears to depend on its structure.[17] Several classes of proton donors including alcohols, carboxylic acids, esters, phenols, and thiols were tested for activity in LM-NRR. Proton donors containing nitrogen were excluded from testing, so as to avoid possible false positive ammonia production via electrolyte decomposition. Although nitrogen-containing proton donors may have desirable properties, conclusively validating ammonia production in their presence was deemed too resource-consuming and is outside the scope of the present work. The compounds were tested in a previously described setup.[21] A flooded stainless steel electrode was used for nitrogen reduction, and the proton donor concentration was varied to determine whether a given proton donor can promote LM-NRR. No adventitious ammonia was detected in control experiments using the same setup with nitrogen-free compounds in prior work.[21] A proton donor is classified as active in LM-NRR if the Faradaic efficiency (FE) toward ammonia in at least one operating condition exceeds 0.5%; if all experiments lead to FEs below 0.5%, then the proton donor is considered inactive. 
This threshold was chosen based on the minimum quantifiable FE (∼0.1%) and the spread in FE typically observed at low production rates (∼0.1%); a threshold value of 0.5% increases the likelihood that a given proton source is indeed active for LM-NRR when ammonia is detected and reduces the likelihood that the ammonia signal is spurious or comes from adventitious sources.[16,34−36] In general, only a subset of compounds containing hydroxyl groups were found to be active for LM-NRR (Figure 2). Of the active compounds, 1-butanol was found to give the highest FE, consistently exceeding that obtainable by using ethanol as a proton donor (15.6% vs 13.2%). See the Supporting Information for a list of FE values and associated error bars based on the standard deviation from three repeat experiments. We believe that 1-butanol should be the proton donor of choice in future LM-NRR studies aimed at high yields of ammonia.
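
The binary labeling rule described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical function name and made-up FE values; it only encodes the stated criterion that a donor is active if at least one operating condition gives an FE above 0.5%.

```python
from typing import Sequence

FE_THRESHOLD_PERCENT = 0.5  # activity threshold stated in the text


def classify_proton_donor(fe_percent: Sequence[float],
                          threshold: float = FE_THRESHOLD_PERCENT) -> str:
    """Label a donor from its measured FEs (in %), one per operating condition.

    Hypothetical helper: "active" if any condition exceeds the threshold,
    otherwise "inactive".
    """
    if not fe_percent:
        raise ValueError("at least one FE measurement is required")
    return "active" if max(fe_percent) > threshold else "inactive"


# Illustrative (not measured) values for two hypothetical donors.
print(classify_proton_donor([0.1, 0.3, 3.6]))
print(classify_proton_donor([0.1, 0.4]))
```

Because the label depends only on the best condition, it says nothing about how FE varies with concentration, which is consistent with the caution that maximum FEs should not be compared directly between donors.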

Figure 2

Highest values of Faradaic efficiencies toward ammonia for a variety of tested proton donors. Proton donors for which FE values are in green are classified as active (ammonia FE > 0.5%), those in red are classified as inactive (ammonia FE < 0.5%). Two proton donors, tert-butanol and 1,2-propanediol, were classified as inactive as the maximum obtained FE (labeled in orange) did not exceed 0.5% when accounting for the error in the measurements. Note that the conditions at which maximum reported FEs were obtained differ between proton sources (Table S2). Proton donors labeled with a star (*) were used in closed-loop improvement of an interpretable model (see below), while those labeled with two stars (**) were selected for validation of a deep-learning model (see below).

Despite the consistently higher activity of 1-butanol when compared to ethanol, as confirmed by a relatively large number of experiments, we would like to highlight that the maximum obtained FEs between most other proton donors should not be directly compared. As the goal was to obtain a binary proton donor activity classification, the operating conditions were not optimized for every proton donor. In addition, the concentration at which the highest measured FE is obtained differs between proton donors (Supporting Information Table 2). As the differences in activity are a function of proton donor structure, several simple hypotheses could be proposed to explain the differences in activity between various classes of compounds. For instance, one could propose that the activity of the proton donor is correlated to its acidity (pKa value). For highly acidic donors, such as carboxylic acids, the reaction between lithium metal and the proton donor, or even direct electrochemical reduction of the proton donor to hydrogen gas without lithium deposition, may be favored over the nitrogen reduction reaction. 
Weakly acidic proton donors, on the other hand, may be inert in electrochemical reactions or reactions involving lithium (e.g., the reaction between t-butanol and lithium is slow[37]), thus not promoting nitrogen reduction significantly. Therefore, an intermediate pKa value could be desired for nitrogen reduction. In light of this, pKa and other potential descriptors were examined for the ability to distinguish between active and inactive proton donors. No significant classification ability was observed for simple chemical and steric descriptors such as pKa and Bader volume (Supporting Information Figure 4). This shows that there may be competing effects at play, and a more complex mapping may be necessary. We turned to a more rigorous, data-driven approach to identify descriptors that can map experimental activity.

Identifying Desirable Properties of Proton Donors

We employed a data-driven approach to determine the descriptors of proton donors that can be used to predict binary activity in LM-NRR. Several quantitative properties of proton donors, curated from existing literature, and our own density functional theory (DFT) calculations were fed into a training data set; the exact parameters used can be found in the Supporting Information. Out of several classification models that were fitted to the experimental data, we found that a decision tree (Figure a) which utilizes Kamlet–Taft parameters denoted as α and β was associated with high classification ability (∼96% accuracy) while being highly interpretable based on the key protonation reaction. The selected parameters α and β quantify solvent hydrogen-bond donating and accepting ability.

Figure 3

Interpretable classification model with the identified molecular descriptors of activity toward ammonia production. (a) A range of proton donors plotted in the α–β space with values either experimentally measured[38−40] or predicted by the developed deep-learning model. (b) A smaller section of the α–β space with several measured candidates annotated for information on the structure and functional groups.

The obtained decision tree (Figure a) identifies a simple criterion for above-threshold activity toward electrochemical ammonia production: α > αT = 0.78 and β > βT = 0.59. The need for high basicity can be rationalized as the key nitrogen fixation reaction (likely 6Li + N2 → 2Li3N) involves formation of undercoordinated lithium ions (Li+), the closest chemical analogue to a proton, during formation of lithium nitride; these ions can be stabilized by the basicity of the proton donor (β), thus accelerating nitrogen fixation. In addition, the hydrogen-bond accepting ability being predictive of ammonia activity can be rationalized based on the fact that free protons will more likely be reduced to form hydrogen gas than be used to protonate Li3N, and therefore a balance of basicity appears necessary to achieve maximal FE. The need for a threshold solvent acidity (α) can be rationalized by the fact that the nitrogen must be protonated to ultimately produce ammonia; stabilization of deprotonated forms of nitrogen during reduction may accelerate the fixation reaction. Alternatively, proton donating character may be necessary for promoting the formation of defect sites in the lithium metal, which may be necessary for formation of lithium nitride.[25] An inherent proton donating–accepting trade-off emerges in the α–β space (Figure 3b), where only a small fraction of candidates strike a balance above the identified threshold values. 
A vast majority of compounds identified to be promising for ammonia production from the first set of experimentally tested candidates are recovered (Table S3), which indicates the robustness of the developed classification model. Several additional candidates with experimentally known KT parameter values were then tested to more accurately determine the decision boundary, αT and βT (Figure , Tables S2 and S4). This closed-loop refinement of the interpretable model (Figure 4) was performed thrice after initial experiments, which decreased the uncertainty in fitted αT and βT values. The limited number of proton donors exceeding these threshold KT parameter values (Figure 3b, Supporting Information Figure 5) highlights that identifying novel candidates is challenging due to the narrow diversity of chemical structures that occur within these thresholds for α and β.
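
The two-threshold criterion from the decision tree can be written as a single predicate. The sketch below is a minimal illustration using the reported thresholds (0.78 for α, 0.59 for β); the function name and the example (α, β) pairs are hypothetical and not taken from the study's data.

```python
ALPHA_T = 0.78  # reported threshold on Kamlet-Taft alpha (H-bond donating)
BETA_T = 0.59   # reported threshold on Kamlet-Taft beta (H-bond accepting)


def predicted_active(alpha: float, beta: float,
                     alpha_t: float = ALPHA_T, beta_t: float = BETA_T) -> bool:
    """Return True when both KT parameters exceed their thresholds.

    Hypothetical helper mirroring the rule "alpha > alpha_T and beta > beta_T".
    """
    return alpha > alpha_t and beta > beta_t


# Illustrative (alpha, beta) pairs, not measured values.
for alpha, beta in [(0.85, 0.80), (0.85, 0.40), (0.30, 0.90)]:
    print(alpha, beta, predicted_active(alpha, beta))
```

Keeping the thresholds as arguments reflects that αT and βT were refitted after each batch of experiments.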

Figure 4

Closed-loop learning of the material-activity mapping. (a) The two-part model, consisting of the interpretable decision tree model and the deep-learning model, used to propose the next batch of experiments to test in order to learn the most about the material-activity mapping with every successive batch of experiments. (b) A schematic showing information flow toward identifying novel proton donors and learning the material–activity relationship.

Physical Significance of the Emerged Descriptors

The Kamlet–Taft parameters are experimental measures or scales of acidity and basicity of the hydrogen bond(s) in the proton donor molecule. While the emergence of the α and β KT parameters as the descriptors of activity toward ammonia production is intuitive based on the expected importance of the acidity and the basicity of the hydrogen bond, it is worth noting that our data-driven approach finds that other seemingly equally relevant descriptors, such as the acid dissociation constant (pKa) and the donor and acceptor numbers (DNs and ANs), have no or only weak correlation with the Faradaic efficiency of tested proton donors, highlighting the usefulness of the employed data-driven shallow-learning approach. In the subsequent section, we discuss a high-dimensional model to predict the KT parameters for a wide range of molecules. The physical significance of the KT parameters as descriptors has also enabled validation of predictions of these parameters based on known ranges of acidity and basicity for certain functional groups in the proton sources of interest.

Deep-Learning Framework for Prediction of KT Parameters

Kamlet–Taft parameters are experimentally known (measured) only for a few hundred compounds, which limits the ability of the model to predict the activity of novel proton donors. While approaches to predict KT parameters have been proposed,[41,42] their associated computational cost is too high for rapidly exploring a large chemical space. Therefore, we hand-curated a data set for the KT parameters and then developed a deep-learning model to predict the parameter values for any compound in order to assess the activity in LM-NRR of the entire chemical space. The model was trained on a carefully curated data set of compounds for which experimentally measured values for α and β are reported in the literature;[38−40] the data set size was n = 222 compounds (low-data regime), thereby requiring careful and robust model training using an ensemble of models. Using an ensemble of models, i.e., a population of independently trained models with varied initial starting configurations, allowed us to quantify the uncertainty of predictions for novel compounds and families of compounds.[43,44] We employed a deep-learning-based model as implemented in the DeepChem package[45] to predict the KT parameters because deep-learning models have proved to be powerful in the low-data regime.[46] In addition, the mapping from molecular features to activity is likely high-dimensional due to the complex underlying chemical physics. The deep-learning model (material–descriptor relationship) coupled with the classification model (descriptor–activity relationship) was used to predict the activity of tested and novel proton donors (Figure ). In order to evaluate the robustness of predictions associated with various proton donors and to determine promising candidates to experimentally test, for each candidate we computed the c-value (confidence value ∈ [0,1])[47] from an ensemble of deep-learning models. 
The c-value for a given material, cM, is computed as the fraction of ensemble models that predict the candidate to exhibit desirable performance. The approach involving an ensemble of models allows us to identify candidates for which there is disagreement between individual models, indicating that additional training data are necessary for higher certainty. The solvatochromic parameters α and β were predicted for 1 000 000 compounds from the PubChem database. We observed that a large fraction of the compounds have predicted KT parameter values that lie outside the active region described by the decision tree obtained from experiments. Only ∼0.54% of the 1 million compounds have c-values exceeding 0.5, and only ∼0.19% have c-values exceeding 0.7, suggesting that compounds which the models predict to be active with a high degree of confidence are rare. Linear aliphatic alcohols, which were experimentally determined to be active in LM-NRR, are recovered through the models with high c-values. However, the vast majority of candidates with high c-values are biological compounds with both hydrogen-bond donating (hydroxyl) and accepting (amine) groups, hence their large α and β. These candidates could not be tested for activity in LM-NRR as they contained nitrogen and were not readily commercially available. They require further exploration in future studies. The goal of further experiments was to stress-test the descriptor–activity relationship and identify the delineating surface between active and inactive candidates with greater accuracy. It is worth highlighting that after every batch of performed experiments, the experimental activity was used to augment the input data to the classification model in a closed-loop fashion to more accurately learn and update the descriptor–activity relationship (Figure 4).
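
The c-value defined above is a simple vote count over the ensemble. The sketch below is a minimal illustration, assuming each ensemble member has already produced an (α, β) prediction for the candidate and that "desirable performance" means passing the reported decision-tree thresholds; the function name and the example predictions are hypothetical.

```python
from typing import Sequence, Tuple

ALPHA_T = 0.78  # reported decision-tree threshold on alpha
BETA_T = 0.59   # reported decision-tree threshold on beta


def c_value(ensemble_predictions: Sequence[Tuple[float, float]],
            alpha_t: float = ALPHA_T, beta_t: float = BETA_T) -> float:
    """Fraction of ensemble members whose (alpha, beta) lands in the active region."""
    if not ensemble_predictions:
        raise ValueError("ensemble must contain at least one prediction")
    votes = sum(1 for alpha, beta in ensemble_predictions
                if alpha > alpha_t and beta > beta_t)
    return votes / len(ensemble_predictions)


# Four hypothetical ensemble members predicting (alpha, beta) for one candidate.
predictions = [(0.80, 0.60), (0.70, 0.60), (0.90, 0.70), (0.80, 0.50)]
print(c_value(predictions))  # two of four members vote "active"
```

A value near 0.5 flags disagreement between ensemble members, which is the situation the text identifies as calling for more training data.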

Experimental Validation of Models

In order to assess and improve the predictive capability of the interpretable decision tree and deep-learning models, we selected a number of the candidates close to the delineating surface from various regions of the α–β space for experimental validation (Figure 5). A set of novel proton donors were tested based on KT parameters from the deep-learning model, in addition to a few with literature-reported KT parameters, with the goal of accurately learning the delineating surface between active and inactive candidates toward promoting NRR. A total of seven candidates were selected for further experimental testing; two candidates were found to be active: 2-ethyl-1-butanol (c-value = 0.67) and 2,2-dimethyl-1,3-propanediol (c-value = 0.36) produced ammonia with FEs of 3.62% and 0.84%, respectively (Figure 5). On the other hand, candidates with low c-values were predicted and found to be inactive toward promoting NRR: formic acid and ethyl acetate, both with a c-value = 0. Other candidates with intermediate c-values were also tested and found to be inactive: triethylene glycol (c-value = 0.37); 4-methoxybutan-1-ol (c-value = 0.17); 1,4-cyclohexanedimethanol (c-value = 0.37). As mentioned earlier, the outcome of the experimental testing after every successive testing batch was used to augment the classification model to refine and more accurately learn the response surface separating candidates that promote NRR from those that are inactive.

Figure 5

Experimental testing of candidates suggested from deep-learning models. (a) Predicted α and β values from an ensemble of models for select proton donors for brevity. (b) Experimentally measured maximum FEs toward NH3 for the set of proton donors with their c-values for activity. Within the representative set of proton donors, a majority of the predictions (4/7) agree with our experimentally tested activity. FEs for these candidates reported in green represent agreement with predictions as part of the closed-loop methodology to learn the material-to-activity mapping.

Discussion

We highlight that the two-model approach involving the material-descriptor mapping (deep-architecture model) in conjunction with the descriptor-activity mapping (shallow-learning model) is a novel paradigm in material discovery. The shallow-learning model allows the interpretation of identified descriptors. In the current work, the ability of the solvatochromic parameters to describe the activity toward ammonia production is rationalized based on mechanistic hypotheses regarding the key nitrogen fixation reaction. The developed design principles in the form of above-threshold constraints on Kamlet–Taft α and β parameters provide a rationale not only for the promising candidates identified in this study but also others reported in the literature. For example, a concurrent work by Suryanto et al.[48] reports a phosphonium-based proton shuttle with high Faradaic efficiency for a very similar scheme, the performance of which can be rationalized based on our design rule given that the phosphonium cation exhibits high KT parameter values as reported from other literature.[49,50] In light of this independent corroboration, we identify ionic liquids that exhibit high KT parameter values to be promising candidates for exploration in a subsequent effort. The ionic liquids include those with cations such as ammonium, azepanium, benzimidazolium, 1,8-diazabicyclo[5.4.0]undec-7-ene (DBU), guanidinium, imidazolium, morpholinium, octanium, oxazolidinium, phosphonium, piperidinium, pyrazolium, pyridinium, pyrimidinium, pyrrolidinium, sulfonium, and triazolium, and anions such as sulfonate, sulfate, phosphonate, phosphate, bis(trifluoromethanesulfonyl)imide (NTf2), nitrate, halide, dicyanamide, carboxylate, BF4, acetate, phosphite, perchlorate, tricyanomethanide, thiocyanate, PF6, SbF6, and dimethoxy(oxo)phosphanuide. 
Within the computational prediction framework, the deep-learning model has the ability to capture the potentially nonlinear mapping between material structure and the descriptors. A purely deep-learning approach to predict experimental activity directly from tested compounds would be limited to a few tens of experimental training data points. A key advantage in the current approach is its ability to learn the mapping on hundreds of relevant experimentally derived training data pertaining to solvatochromic parameters, the importance of which was shown via the interpretable model. On the other hand, a purely mechanistic approach (shallow model) would enable activity prediction only on a few hundred materials for which solvatochromic parameters are known. The developed methodology allows interpretability while enabling predictions on the entire chemical space with the ensemble of models as a way to calibrate the associated confidence. This combination approach offers a new paradigm that maintains interpretability while gaining the accuracy benefits of deep-learning. Future work will involve developing the ability within the computational framework to not only separate active candidates toward ammonia production but also robustly order candidates based on the likelihood of success in terms of the Faradaic efficiency on testing. Within the candidates classified as active toward ammonia, it is worth mentioning that there is insufficient data from this study to suggest that candidates with higher KT parameter values would lead to higher Faradaic efficiencies on testing, which relates to potential inherent trade-offs between acidity and basicity.

Conclusions

In the present work, we determined the effect of chemical structure of proton donors on lithium-mediated nitrogen reduction by testing a number of families of proton donors for activity. From these experiments, 1-butanol was discovered as the most effective proton donor for LM-NRR. After failing to explain observed structure–activity trends with simple parametrization models, a rigorous data-driven approach was used to identify descriptors of activity toward ammonia production. Solvatochromic Kamlet–Taft parameters α and β were found to best describe proton donors’ ability to promote nitrogen reduction, leading to an interpretable classification model involving the two parameters. The fact that these solvatochromic parameters emerge as the descriptors can be rationalized based on the mechanistic hypothesis that the solvent’s hydrogen-bond donating (captured by α) and accepting (captured by β) abilities are important in the key reaction of simultaneous lithium-ion stabilization and protonation of nitrogen by the proton donor. We developed a deep-learning-based model that enables the prediction of relevant Kamlet–Taft parameters on a wide range of candidates based on molecular-scale features. The model in conjunction with our interpretable classification tree provides a computational pipeline to predict whether a candidate has the ability to trigger ammonia production. Experiments were guided by these predictions, and their outcomes in turn informed our computational framework. The closed-loop approach between experiments and theory has enabled an increase in the fraction of tested active candidates from 30% during the initial exploration to 65% during the combined effort, leading to the discovery of a few novel active proton donors. We believe that the developed design principles in this work provide a rationalizable basis for further exploration of candidates that can lead to electrochemical ammonia production.

Methods

Experimental Characterization of Proton Donors

The activity of proton donors was quantified in a previously described setup.[21] Briefly, a 1 M LiBF4 in tetrahydrofuran (THF) electrolyte was used in a two-compartment electrochemical cell with a platinum foil anode, stainless steel foil cathode, and polyporous Daramic separator (Figure 1b, Supporting Information Figure 1). A range of concentrations of proton donor were added to the electrolyte prior to electrolysis. Nitrogen gas was flowed through the cathode compartment, while a constant current was applied across the cell (Figure 1b). The ammonia content of the resulting electrolyte solutions was quantified via a colorimetric assay. For a detailed description of the experimental methods, see the Supporting Information.
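
For orientation, the Faradaic efficiency follows from the quantified ammonia and the charge passed at constant current. The sketch below is a minimal illustration, assuming three electrons per NH3 molecule (from N2 + 6H+ + 6e− → 2NH3); the function and argument names are hypothetical, and the authors' exact procedure is described in the Supporting Information.

```python
FARADAY = 96485.33212  # Faraday constant, C per mol of electrons


def faradaic_efficiency(n_nh3_mol: float, current_a: float, time_s: float,
                        electrons_per_nh3: int = 3) -> float:
    """Faradaic efficiency toward NH3, in percent.

    FE = (electrons per NH3 * F * moles NH3) / (current * time) * 100,
    where current * time is the total charge passed at constant current.
    """
    charge_c = current_a * time_s
    if charge_c <= 0:
        raise ValueError("total charge passed must be positive")
    return 100.0 * electrons_per_nh3 * FARADAY * n_nh3_mol / charge_c


# Illustrative numbers only: 10 micromoles of NH3 after 20 mA for ~1447 s.
print(round(faradaic_efficiency(1e-5, 0.02, 1447.28), 2))
```

The three-electron factor is what ties the colorimetrically measured ammonia amount to the charge-based efficiency used for the 0.5% activity threshold.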

Computational Methodology

The computational framework provides two important quantities for each candidate proton donor: (i) a prediction of whether the candidate can promote ammonia production above a threshold Faradaic efficiency or not, and (ii) a c-value (confidence value described below) associated with the prediction, which quantifies the fraction of models in the ensemble that agree with the prediction. The c-value is used internally in the experiment-theory loop as a way to quantify the likelihood of activating ammonia production on experimental testing. Overall, the prediction framework provides a mapping from a given material to a binary activity classification (associated with a metric for confidence). This mapping contains two parts in series. The first is a deep-learning regression model, which takes as input a given material and outputs the prediction of the Kamlet–Taft (KT) parameters, which are the identified descriptors of the ability to promote ammonia production. These parameter predictions are fed into the second part, an interpretable classification model, which outputs the binary activity classification. It is worth noting that for candidates with KT parameters known from the literature (experimentally or otherwise), the classification model is sufficient for activity classification, and the developed deep-learning model is used to widen our candidate search space. Overall, the deep-learning model (material–descriptor mapping) along with the classification model (descriptor–activity mapping) enables the material–activity predictions of the computational method.

The Data-Driven Classification Model

The classification model is aimed at learning the mapping from potential descriptors to the binary experimental activity. Model Input: The input used to train this model is a matrix whose rows contain the following values (potential descriptors chosen on the basis of intuition) for every proton donor that had already been tested experimentally: acid dissociation constant (pKa), donor number (DN), dielectric constant (ϵr), Kamlet–Taft parameters (α, β, π*), highest occupied molecular orbital level (HOMO), lowest unoccupied molecular orbital level (LUMO), band gap (BG), and Bader volume (BV). Model Output: The output, with respect to which training minimizes the loss, is a vector containing the binary activity classification based on a threshold Faradaic efficiency (0/1 for inactive/active). Model Choice and Optimization: To learn the true descriptors and the mapping, we built and trained a range of models (linear and nonlinear supervised learning models, regression models, decision trees). The resulting models were optimized through cross-validation to balance model complexity against misclassification error. Model training and optimization were carried out in MATLAB (R2017a). See the Supporting Information for details on the model optimization algorithm, which is based on the cross-validation score.
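As an illustration of how cross-validation can score a simple, interpretable classifier on a descriptor matrix, the sketch below fits a one-split decision rule (a "stump") and estimates its misclassification error by leave-one-out cross-validation. This is an assumption-laden Python sketch, not the authors' MATLAB workflow: the helper names are hypothetical, and the two-column data set is synthetic and carries no chemical meaning.

```python
from typing import List, Tuple

Stump = Tuple[int, float]  # (descriptor index, threshold): predict 1 if x[index] > threshold


def fit_stump(X: List[List[float]], y: List[int]) -> Stump:
    """Pick the single descriptor and threshold with the fewest training errors."""
    best, best_err = None, len(y) + 1
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            err = sum((row[j] > t) != bool(label) for row, label in zip(X, y))
            if err < best_err:
                best, best_err = (j, t), err
    if best is None:
        raise ValueError("no descriptor has two distinct values")
    return best


def loo_error(X: List[List[float]], y: List[int]) -> float:
    """Leave-one-out cross-validated misclassification rate of the stump."""
    wrong = 0
    for i in range(len(y)):
        j, t = fit_stump(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        wrong += (X[i][j] > t) != bool(y[i])
    return wrong / len(y)


# Synthetic descriptor matrix (rows: tested donors, columns: two descriptors).
X = [[16.0, 0.90], [4.8, 0.85], [15.5, 0.30],
     [9.9, 0.20], [12.0, 0.95], [10.0, 0.10]]
y = [1, 1, 0, 0, 1, 0]  # 0/1 for inactive/active

feature, threshold = fit_stump(X, y)
print(feature, threshold, loo_error(X, y))
```

Repeating the same scoring for models of increasing complexity (for example, deeper trees) and keeping the simplest one with an acceptable cross-validated error is one common way to trade complexity against misclassification.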

The Deep-Learning Model

The deep-learning model is aimed at predicting KT parameters (activity descriptors) for any given candidate proton donor. Model Input: The input data prior to featurization consisted of a list of simplified molecular-input line-entry system (SMILES) strings corresponding to all candidate proton donor molecules with known (experimentally obtained) KT parameter values (α and β). Featurization and Model Choice: The SMILES representation contains information about all the chemical species and the atomic-scale bonding environment of atoms. The Weave featurization[51] module within the DeepChem package (version 2.3.0) uses the SMILES representation to encode the local chemical environment around each atom and the connectivity between atoms in any given molecule. It generates an atom feature vector for each atom in the molecule and a pair feature matrix for every pair of atoms. The Weave featurization then “weaves” (see Figure S12 in Supporting Information for more details) these atom and pair features to generate the featurization for molecules. The pair feature matrix encodes connectivity information including bond properties, graph distance, and ring information. This is similar to graph convolution in its use of atomic feature vectors, but the Weave featurization encodes connectivity in more detail through pairwise features rather than a simple neighbor list. This approach is best leveraged by graph-based models that make use of properties of both the atoms and the bonding environment. The Weave model (a deep neural network)[52] emerged as the most effective at learning the input–output relationship in terms of the cross-validation score. A more detailed description of the featurization scheme, neural network architecture, and training routines can be found in the Supporting Information. Model Output and Loss Function: The model was trained to predict both α and β values for a given compound.
Hence, a bi-task training approach was used, minimizing the root-mean-square loss with respect to the ground-truth output matrix containing all available experimentally obtained (literature) values of the α and β parameters. Ensemble of Models for Uncertainty Quantification: To improve the likelihood of success on experimental testing, uncertainty estimates on the predictions are crucial. Therefore, an ensemble of models with different initial conditions was trained on randomly chosen subsets of the training data. The confidence value (c-value) is the chosen metric to quantify the agreement in activity predictions among the deep-learning models in the ensemble. The c-value represents the fraction of members in the ensemble that predict α and β values above the respective threshold values, 0.78 and 0.59, and is defined as

$$c = \frac{1}{N_{\mathrm{ens}}} \sum_{n=1}^{N_{\mathrm{ens}}} \Theta\left(\alpha_{\mathrm{pred}}^{n} - \alpha_{T}\right)\, \Theta\left(\beta_{\mathrm{pred}}^{n} - \beta_{T}\right)$$

where $N_{\mathrm{ens}}$ is the number of models in the ensemble, $\Theta$ is the Heaviside function, $\alpha_{\mathrm{pred}}^{n}$ and $\beta_{\mathrm{pred}}^{n}$ are the predicted values of the KT parameters from the nth model in the ensemble, and $\alpha_{T}$ and $\beta_{T}$ represent the threshold values identified by the classification model.
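The c-value described above, the fraction of ensemble members whose predicted α and β both exceed their thresholds, can be computed in a few lines. The Python sketch below is illustrative only: the function names are hypothetical, the ensemble predictions are made-up numbers, and the Heaviside function is assumed to return 0 at exactly zero, so a prediction equal to a threshold does not count as above it.

```python
from typing import Sequence, Tuple


def heaviside(x: float) -> int:
    """Step function; the value at x == 0 is taken as 0 (an assumption)."""
    return 1 if x > 0 else 0


def c_value(
    predictions: Sequence[Tuple[float, float]],
    alpha_t: float = 0.78,
    beta_t: float = 0.59,
) -> float:
    """Fraction of ensemble members predicting alpha > alpha_t and beta > beta_t.

    `predictions` holds one (alpha_pred, beta_pred) pair per ensemble member.
    """
    if not predictions:
        raise ValueError("the ensemble must contain at least one model")
    hits = sum(
        heaviside(alpha - alpha_t) * heaviside(beta - beta_t)
        for alpha, beta in predictions
    )
    return hits / len(predictions)


# Made-up predictions from a four-member ensemble for one candidate.
ensemble_predictions = [(0.85, 0.70), (0.80, 0.55), (0.70, 0.65), (0.90, 0.62)]
print(c_value(ensemble_predictions))  # 0.5: two of the four members exceed both thresholds
```

The product of the two step functions ensures that a member counts toward the c-value only when both parameters clear their thresholds, which matches the two-threshold form of the classification rule.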
