Machine-learning-assisted discovery of highly efficient high-entropy alloy catalysts for the oxygen reduction reaction.

Xuhao Wan¹, Zhaofu Zhang²,³, Wei Yu¹, Huan Niu¹, Xiting Wang¹, Yuzheng Guo¹.

Abstract

High-entropy alloys (HEAs) have recently been applied in heterogeneous catalysis, benefiting from their vast chemical space. However, this huge chemical space also makes a comprehensive study of HEAs by traditional trial-and-error experiments extremely challenging. Therefore, a machine learning (ML) method is presented to investigate the oxygen reduction reaction (ORR) catalytic activity of millions of reactive sites on HEA surfaces. A well-performing ML model is constructed based on the gradient boosting regression (GBR) algorithm with high accuracy, generalizability, and simplicity. In-depth analysis of the results demonstrates that adsorption energy is a mixture of the individual contributions of coordinated metal atoms near the reactive site. An efficient strategy is proposed to further boost the ORR catalytic activity of promising HEA catalysts by optimizing the HEA surface structure, which recommends a highly efficient HEA catalyst of Ir48Pt74Ru30Rh30Ag74. Our work offers a guide to the rational design and nanostructure synthesis of HEA catalysts.

Entities: Chemical

Keywords: absorption energies; density functional theory; high-entropy alloys; machine learning; oxygen reduction reaction

Year: 2022 PMID: 36124306 PMCID: PMC9481945 DOI: 10.1016/j.patter.2022.100553

Source DB: PubMed Journal: Patterns (N Y) ISSN: 2666-3899

Introduction

High-entropy alloys (HEAs) are multi-principal-component alloys that consist of five or more elements in near-equimolar proportions and commonly form complex solid solutions.1, 2, 3, 4, 5 HEAs are stable because the large number of element species and the completely disordered atom positions raise the entropy of the system. The alloys have attracted worldwide attention since they were first discovered in 2004, owing to their easily tunable mechanical properties such as elastic modulus, hardness, and tensile strength. Recently, HEAs have been applied in catalysis, because their huge chemical space, arising from the enormous number of element combinations and inherent surface complexity, makes it possible to achieve higher activity, selectivity, and stability.8, 9, 10 Thus far, HEAs have been studied as highly efficient catalysts for many reactions, including the hydrogen evolution reaction (HER), the oxygen reduction reaction (ORR),12, 13, 14 the carbon dioxide reduction reaction (CO2RR), and methanol oxidation; most of these studies are experimental. In the past several decades, theoretical simulations have been widely used to better understand catalytic processes and to design highly efficient catalysts ahead of experiments.18, 19, 20 The huge configuration space of HEAs can provide a surface with a very large number of unique reactive sites. However, the millions of distinct active-site environments of HEAs, together with currently limited computing power, make it extremely difficult to study the catalytic performance of all sites using the traditional density functional theory (DFT) method alone. To break this computing bottleneck, one of the most reasonable strategies is to discover new algorithms or improve existing ones so as to markedly reduce the total computational cost.
As a result, machine learning (ML) has attracted attention around the globe, as it is very helpful in both directions.21, 22, 23 ML methods can dramatically reduce the computational cost of the traditional DFT method while maintaining high accuracy in predicting catalytic activities. In addition, ML methods can reveal the intrinsic descriptors of catalytic reactions by elucidating the nonlinear relationship between the structure and properties of materials. Consequently, state-of-the-art ML-assisted theoretical computation has become a rising star in the catalysis field. In 2015, Ma et al. realized high-throughput screening of highly efficient CO2 electroreduction catalysts by combining ML and theoretical calculation. Many researchers have since extended the ML-assisted theoretical method to complicated chemical systems and different reactions, such as MXene ordered binary alloys for HER, dual-metal-site catalysts (DMSCs) for ORR, and perovskites for the oxygen evolution reaction (OER). ML has also been widely applied in studies of HEAs, such as searching for HEAs with high hardness and predicting HEA phases. This body of research inspired us to explore and predict the catalytic performance of HEAs with the advanced ML-assisted theoretical method. In this work, the stability of six types of quinary HEAs was investigated first. Then, the ORR volcano curve relating the ORR catalytic activity of reactive sites on HEA surfaces to the OH∗ adsorption energies was established. An ML model was constructed through data extraction, feature engineering, and model validation to predict, with high accuracy, the OH∗ adsorption energies of millions of reactive sites on different crystal facets of HEAs. Finally, the predicted results and the ML models were further analyzed to find highly efficient HEA catalysts and reveal the origin of their ORR activity.

Results and discussion

The components and stabilities of HEAs

Six transition metal (TM) elements are considered as the constituent elements of quinary HEAs: Ir, Pt, Ru, Rh, Ag, and Fe. They can form six types of quinary HEAs. These elements were chosen because they have similar atomic radii and close lattice constants in the face-centered cubic (fcc) crystal structure, as listed in Table S1. In addition, their corresponding elemental metals have been reported to be highly efficient ORR catalysts.34, 35, 36, 37, 38, 39 fcc HEAs are constructed in this work because most of the six selected elements adopt the fcc structure. The constructed HEA IrPtRuRhAg is shown in Figure 1A as an example, with 256 atoms in the supercell. Several HEAs with similar structures have been synthesized and show promising catalytic activities.40, 41, 42

Figure 1

The typical structure and reactive sites of HEA IrPtRuRhAg

(A) The geometric structure of the equimolar IrPtRuRhAg HEA with an fcc crystal configuration.

(B) Schematic diagram of finding all possible sites on the HEA surface, including atop (blue), bridge (red), and hollow (black) sites, by the Delaunay triangulation algorithm.

The high disorder resulting from the completely random configuration space of HEAs increases their configurational mixing entropy, which favors the formation of a stable single-phase solid solution rather than brittle intermetallic compounds. Whether HEAs can form stable solid solutions is closely related to properties such as atomic radius, energy differences, and configurational mixing entropy. According to the Hume-Rothery rules, the difference between atomic radii and the ratio of formation enthalpy to entropy can be used to evaluate the stability directly.44, 45, 46 Hereafter, the atomic radius difference factor (δ) and the ratio of formation enthalpy to entropy (Ω) are used to describe the phase of HEAs. The detailed computation methods for δ and Ω are given in Note S1. Instead of simulating a large number of metal atoms in a whole crystal lattice, we calculated smaller 32-atom supercells with periodic boundary conditions, given the huge computational cost of ab initio calculations on so many highly disordered metal atoms. This supercell approximation had a negligible influence on the DFT-calculated values, as shown in Note S2 and Figure S1. In addition, for each type of HEA, 10 different HEAs, with different structures and with element ratios adjusted within ±15%, were studied to evaluate the stability more reliably and thoroughly. That is, 60 HEAs with different structures were studied in total. The detailed components and ratios of the 60 HEAs are listed in Table S2.
As shown in Figure 2, all 60 HEAs fall within the solid-solution region, defined by δ < 6.6% and Ω > 1.1 holding simultaneously. HEAs meeting both criteria are quite likely to form a single solid solution, owing to their small atomic-size mismatch and their mixing energetics.
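The screening criterion can be illustrated with a minimal sketch. The authors' exact definitions of δ and Ω are in Note S1, which is not reproduced here; the sketch assumes the forms commonly used in the HEA literature, δ = sqrt(Σ c_i (1 − r_i/r̄)²) and Ω = T_m·ΔS_mix/|ΔH_mix|. The function names and all numerical inputs (radii, melting temperature, mixing enthalpy) are hypothetical placeholders, not values from the paper.

```python
import math

R_GAS = 8.314  # gas constant, J/(mol*K)


def size_mismatch(fractions, radii):
    """Atomic radius difference factor delta (assumed standard form)."""
    r_mean = sum(c * r for c, r in zip(fractions, radii))
    return math.sqrt(sum(c * (1 - r / r_mean) ** 2 for c, r in zip(fractions, radii)))


def mixing_entropy(fractions):
    """Ideal configurational mixing entropy, -R * sum(c ln c), in J/(mol*K)."""
    return -R_GAS * sum(c * math.log(c) for c in fractions if c > 0)


def omega(t_melt, s_mix, h_mix):
    """Omega = T_m * dS_mix / |dH_mix| (assumed standard form; h_mix in J/mol)."""
    return t_melt * s_mix / abs(h_mix)


def in_solid_solution_region(delta, omega_value):
    """Criteria quoted in the text: delta < 6.6% and Omega > 1.1."""
    return delta < 0.066 and omega_value > 1.1


# Hypothetical equimolar quinary alloy; radii (pm), T_m (K), and dH_mix (J/mol)
# are placeholder inputs chosen only to exercise the functions.
fractions = [0.2] * 5
radii = [180, 177, 178, 173, 165]
delta = size_mismatch(fractions, radii)
s_mix = mixing_entropy(fractions)
omega_value = omega(t_melt=2000.0, s_mix=s_mix, h_mix=-5000.0)
print(in_solid_solution_region(delta, omega_value))
```

For an equimolar five-component alloy the ideal mixing entropy reduces to R·ln 5, which is why adding principal elements raises the entropy term that stabilizes the solid solution.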

Figure 2

The distribution of δ and Ω parameters of HEAs

The HEAs within the purple area are more likely to form a solid solution. Six types of quinary HEAs are studied, and 10 different component ratios are considered for each type of HEA. The highly efficient ORR catalyst Ir48Pt74Ru30Rh30Ag74 is marked by a red star.


The ORR activity volcano of HEA surfaces

After confirming the stabilities of the HEAs, the possible active sites for ORR were studied. As shown in Figure 1B, the Delaunay triangulation algorithm was applied to find all possible sites on the HEA surface, including atop, bridge, and hollow sites, for this high-throughput study. Stability analysis of the reaction intermediates showed that, across the different HEA surfaces, the adsorbates are mostly unstable on atop and hollow sites and stable only on bridge sites, which is consistent with previous experimental studies. Therefore, only bridge sites were considered as possible reactive sites on the two fcc HEA Miller index surfaces, (100) and (111), which are the most common surfaces. We then focused on the ORR catalytic activities of different reactive sites on HEA surfaces. According to the Sabatier principle, the adsorption energies of reaction intermediates are generally good descriptors of catalytic activity. Accordingly, a volcano curve relating the adsorption energies of intermediates to ORR catalytic activity can be established, which makes it simple to explore the activities of millions of possible active sites on HEA surfaces. To obtain a comprehensive understanding of ORR catalytic activity on HEAs, the intermediates (OH∗, O∗, and OOH∗) involved in the reaction were fully optimized by DFT at 20 different sites on different HEA surfaces. The 20 sites were selected randomly and in equal numbers from the different bridge sites and Miller index surfaces to ensure the reliability of the results. Previous research has shown that similar TM−O interactions between TMs and the ORR intermediates (OH, O, and OOH) lead to similar adsorption behavior on metal surfaces, meaning that the adsorption energies of OH, O, and OOH on metal surfaces are likely to follow similar trends.
As shown in Figure 3A, the adsorption energies of OH, O, and OOH on HEA surfaces exhibit clear scaling relations, as anticipated above. Specifically, the relationship between ΔG_O∗ and ΔG_OH∗ can be expressed as ΔG_O∗ = 2.21 × ΔG_OH∗ + 1.44, with a high coefficient of determination (R²) of 0.91, while that between ΔG_OOH∗ and ΔG_OH∗ is ΔG_OOH∗ = 1.02 × ΔG_OH∗ + 3.31, with an R² of 0.85. Based on these scaling relations, the volcano curve was established with ΔG_OH∗ as the activity descriptor, as illustrated in Figure 3B. According to the Sabatier principle, both too strong and too weak adsorption of reaction intermediates harm catalytic performance: too strong adsorption hinders desorption and poisons the catalyst, whereas too weak adsorption impedes the activation of intermediates. Therefore, the sites with the best catalytic activity sit at the peak of the volcano, where the adsorption energies of intermediates are moderate. As Figure 3B shows, the chosen reactive sites all lie on the left side of the volcano, and the reference point of Pt (111) is very close to the peak. Our goal then became to find sites on HEA surfaces with higher activity than Pt (111), that is, to pass Pt (111) and reach the peak of the ORR volcano (∼0.16 eV weaker adsorption than Pt (111)).
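How two scaling relations collapse the activity onto a single descriptor can be sketched as follows. The slopes and intercepts are the fitted values quoted above; everything else is an assumption made for illustration: the four-step associative ORR pathway, the computational hydrogen electrode convention with a total reaction free energy of 4.92 eV, and ΔG_OH∗ expressed on the same reference scale as the fits. The paper's own volcano construction may differ in detail, and the function names are hypothetical.

```python
E_EQ = 1.23      # ORR equilibrium potential, V
G_TOTAL = 4.92   # free energy of O2 + 4(H+ + e-) -> 2 H2O, eV (assumed convention)


def dg_o(dg_oh):
    """Scaling relation quoted in the text: dG_O* = 2.21 * dG_OH* + 1.44."""
    return 2.21 * dg_oh + 1.44


def dg_ooh(dg_oh):
    """Scaling relation quoted in the text: dG_OOH* = 1.02 * dG_OH* + 3.31."""
    return 1.02 * dg_oh + 3.31


def overpotential(dg_oh):
    """Theoretical ORR overpotential (V) for the assumed associative mechanism."""
    steps = [
        dg_ooh(dg_oh) - G_TOTAL,       # O2 + H+ + e-   -> OOH*
        dg_o(dg_oh) - dg_ooh(dg_oh),   # OOH* + H+ + e- -> O* + H2O
        dg_oh - dg_o(dg_oh),           # O* + H+ + e-   -> OH*
        -dg_oh,                        # OH* + H+ + e-  -> H2O
    ]
    # The least exergonic step sets the limiting potential U_L = -max(steps).
    return E_EQ + max(steps)


# Scan the descriptor to see the two legs of the volcano.
for x in (0.0, 0.4, 0.8, 1.2):
    print(f"dG_OH* = {x:.1f} eV -> overpotential = {overpotential(x):.2f} V")
```

Under these assumptions, the strong-binding (left) leg is limited by OH∗ removal and the weak-binding (right) leg by OOH∗ formation, which is the Sabatier trade-off described in the text.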

Figure 3

The volcano curve between ΔG_OH∗ and overpotential

(A) Scaling relations between the adsorption energies of reaction intermediates (ΔG_O∗ versus ΔG_OH∗ in red, and ΔG_OOH∗ versus ΔG_OH∗ in blue) on HEA (100) and (111) surfaces.

(B) The ORR volcano curve of reactive sites on HEA surfaces, plotting overpotential against ΔG_OH∗. Pt (111) is marked with a purple star as the reference point. The high-activity area is marked in red.


ML process

Although the volcano curve links catalytic activity to the adsorption energy of OH∗, it is practically impossible to calculate with DFT the adsorption energies of millions of reactive sites on HEAs. Therefore, ML was introduced to link the local atomic environment around a site with the adsorption strength, which makes it feasible to assess the activity of millions of sites on HEAs. First, 360 reactive sites on HEAs with different Miller index surfaces and component elements were randomly selected, and the adsorption energies of OH∗ intermediates on them were obtained by DFT calculations to construct the original dataset for building the ML models. The 360 sites are divided into 12 groups, defined by 2 Miller index surfaces and 6 quinary HEAs, with 30 sites in each group. The 12 groups and the positions of the 360 sites can be found in Note S3 and Figure S2. Then, through feature engineering, some easily obtainable physical and chemical properties were selected to describe the local atomic environment of reactive sites. This is the most significant step because it sets the upper limit of the accuracy of the ML model. We cast the problem of describing the local environment of a site as generating a dataset of the characteristics of the atoms coordinated to the two metal atoms that make up the bridge site, where the characteristics are physical and chemical features of each coordinated atom.
There are several common guidelines for choosing features: they should comprehensively describe the atomic structure, the electronic structure, and the local environment of reactive sites, and they should be physically intuitive to ensure the robustness of the model. Based on these guidelines, prior knowledge, and previous studies, 11 features were selected: atomic radius (r), relative atomic mass (M), atomic number (AN), Pauling electronegativity (N), number of d electrons (e), d-band center of the corresponding pure metal surface (ε), electron affinity (A), first ionization energy (I), oxide formation enthalpy (H), generalized coordination number (CN) of the coordinated atoms, and the distance between the coordinated atoms and OH∗ (d). Among these features, r, M, and AN are atomic structure features; N, e, ε, A, I, and H are electronic structure features; and CN and d are local environment features. Because the wrapper strategy is time-consuming and the embedded strategy is applicable only to specific algorithms, the filter strategy was chosen to determine the final features. The criterion of the filter strategy was that the final features should be relatively independent of one another and have larger variance. In addition, because the features that describe the local environment are simple and uncorrelated, only the atomic and electronic structure features were filtered. First, the Pearson correlation matrix of the features was calculated, as shown in Figure S3. The results indicate that all of the atomic structure features are strongly correlated. There are also two groups of correlated electronic structure features: the first consists of N, A, and I, and the second of ε and H. Then, the variances of these correlated features were obtained and are shown in three colors at the bottom of Figure S3. Within the three groups, r, N, and ε show the highest variance, so the other features were excluded, as features with higher variance are easier for ML models to learn.
It should be noted that the variances are computed after normalization; therefore, their absolute values are relatively small. In total, six features were selected as the final input features: r, N, e, ε, CN, and d, as listed in Table S3. The whole input dataset is thus a [360 × n × 6] array, where n is 15 for sites on the HEA (111) surface and 14 for sites on the HEA (100) surface, as shown in Figure S4 and Table 1. The values of r, M, N, A, I, and e come from the PTable database.
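The filter step described above (group strongly correlated features, then keep the member with the largest variance after normalization) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: the correlation threshold of 0.8, the min-max normalization, the greedy grouping, and the toy feature columns are all hypothetical.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def min_max(values):
    """Scale values to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def filter_features(columns, threshold=0.8):
    """Greedily group features with |r| >= threshold; keep the highest-variance one per group."""
    names = list(columns)
    grouped, selected = set(), []
    for name in names:
        if name in grouped:
            continue
        group = [other for other in names
                 if other not in grouped
                 and abs(pearson(columns[name], columns[other])) >= threshold]
        grouped.update(group)
        selected.append(max(group, key=lambda n: pvariance(min_max(columns[n]))))
    return selected


# Toy columns: "a" and "b" are nearly collinear, "c" is weakly correlated with both.
toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [5, 1, 4, 2, 3],
}
print(filter_features(toy))
```

A filter of this kind needs only the feature table itself, which is why it is cheaper than wrapper methods that retrain a model for every candidate subset.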

Table 1

Example input data used to establish the ML models, shown for 5 reactive sites on HEA surfaces

Site	A₁	A₂	A_n	A₁₅	ΔG_OH∗ (eV)
1	[180, 2.20, 7, −2.25, 8, 2.28]	[177, 2.28, 9, −2.42, 8, 2.28]	…	[0, 0, 0, 0, 0, 0]	0.0373
2	[156, 7.83, 6, −0.81, 8, 2.40]	[180, 2.20, 7, −2.25, 8, 2.40]	…	[0, 0, 0, 0, 0, 0]	−0.6672
3	[173, 2.28, 8, −2.18, 9, 2.22]	[177, 2.28, 9, −2.42, 9, 2.22]	…	[177, 2.28, 9, −2.42, 9, 4.91]	−0.5263
4	[178, 2.20, 7, −1.95, 9, 2.22]	[180, 2.20, 7, −2.25, 9, 2.22]	…	[173, 2.28, 8, −2.18, 9, 4.88]	0.1562
5	[156, 7.83, 6, −0.81, 9, 2.23]	[173, 2.28, 8, −2.18, 9, 2.23]	…	[180, 2.20, 7, −2.25, 9, 4.88]	−0.6126

A_n denotes the nth coordinated metal atom, ordered by distance. The values in the square brackets correspond to r, N, e, ε, CN, and d. The first 2 sites are on HEA (100), while the last 3 are on HEA (111). The A₁₅ entries are zero-filled for sites 1 and 2 because sites on HEA (100) have only 14 coordinated atoms.
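The zero-filled A₁₅ entries in Table 1 suggest that each site is padded to a fixed 15 × 6 block so that (100) and (111) sites share one input shape. A minimal sketch of that padding is given below; the helper names are hypothetical, and flattening each site to a 90-element vector is an assumption about how the array would be passed to a tabular regressor, not something the text states.

```python
N_FEATURES = 6      # r, N, e, eps, CN, d
MAX_NEIGHBORS = 15  # coordinated atoms of a bridge site on HEA (111)


def pad_site(neighbors):
    """Pad a list of per-atom feature vectors with zero rows up to MAX_NEIGHBORS."""
    if len(neighbors) > MAX_NEIGHBORS:
        raise ValueError("too many coordinated atoms for one site")
    if any(len(atom) != N_FEATURES for atom in neighbors):
        raise ValueError("each atom needs exactly 6 features")
    padded = [list(atom) for atom in neighbors]
    padded += [[0.0] * N_FEATURES for _ in range(MAX_NEIGHBORS - len(padded))]
    return padded


def flatten(site):
    """Flatten a 15 x 6 block atom by atom into one 90-element feature vector."""
    return [value for atom in site for value in atom]


# Placeholder (100) site: 14 copies of the A1 vector of site 1 in Table 1.
site_100 = [[180, 2.20, 7, -2.25, 8, 2.28] for _ in range(14)]
block = pad_site(site_100)
vector = flatten(block)
print(len(block), len(vector))
```

Because the padded row sits in a fixed position, a model can in principle use it to tell the two facets apart.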

After feature engineering and data extraction, the pre-processed dataset was used to train seven ML models with seven different regression algorithms: gradient boosting regression (GBR), feedforward neural network (FNN), random forest regression (RFR), support vector regression (SVR), k-nearest neighbors regression (KNR), kernel ridge regression (KRR), and least absolute shrinkage and selection operator (LASSO) regression, as illustrated in Figure 4. Then, a first round of model selection was performed by coarsely tuning the hyperparameters of the 7 ML models and comparing model metrics on the test set under 4-fold cross-validation. The results, shown in Figure S5, indicate that the GBR and RFR models performed clearly better than the other 5 models, with the lowest root-mean-square error (RMSE) (both <0.15) and the highest R² score (both >0.95). The FNN performed relatively poorly in this round, but it was considered likely to improve after comprehensive hyperparameter tuning. As a result, the GBR, RFR, and FNN algorithms were selected for further model building. Hereafter, 4-fold cross-validation repeated 500 times was applied to evaluate model performance, so as to reduce the random effect of the dataset split and the risk of overfitting. Table S4 indicates that 500 repetitions are enough to eliminate the sampling error and guarantee the generalization ability of the ML models.
By manually and repeatedly adjusting the hyperparameters, the GBR, FNN, and RFR models were brought to their best performance; the tuned hyperparameters are listed in Table S5. A more detailed flow chart is shown in Figure S6.
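The repeated 4-fold protocol can be sketched with the standard library alone. The splitter below reshuffles the sample indices on every repeat and yields train/test index pairs; `fit_mean` is a deliberately trivial placeholder standing in for GBR, RFR, or FNN, and all names are hypothetical rather than taken from the authors' code.

```python
import math
import random


def repeated_kfold(n_samples, n_splits=4, n_repeats=500, seed=0):
    """Yield (train_idx, test_idx) pairs; indices are reshuffled on every repeat."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        rng.shuffle(indices)
        folds = [indices[k::n_splits] for k in range(n_splits)]
        for k, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train_idx, test_idx


def cv_rmse(xs, ys, fit, **kfold_args):
    """Return (mean, min, max) test RMSE; fit(train_x, train_y) must return a predictor."""
    scores = []
    for train_idx, test_idx in repeated_kfold(len(xs), **kfold_args):
        predict = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        squared = [(predict(xs[i]) - ys[i]) ** 2 for i in test_idx]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return sum(scores) / len(scores), min(scores), max(scores)


def fit_mean(train_x, train_y):
    """Placeholder model: always predicts the mean of the training targets."""
    average = sum(train_y) / len(train_y)
    return lambda x: average


# With 360 sites and 4 folds, every split trains on 270 sites and tests on 90.
splits = list(repeated_kfold(360, n_splits=4, n_repeats=3))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Reporting the minimum and maximum alongside the mean mirrors the error bars described for the repeated cross-validation. With scikit-learn available, `sklearn.model_selection.RepeatedKFold` provides the same kind of splitter.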

Figure 4

The scheme of the machine learning process

The results of model training and testing are shown in Figure 5. As shown in Figures 5A and 5B, the GBR and RFR models both performed well on the training and test sets, with low RMSE (∼0.1 eV) and high R² score (∼0.95). The average metrics over the 500-times repeated 4-fold cross-validation were close on the training and test sets, indicating little risk of overfitting. In addition, the error bars of the two models were relatively short, suggesting good model robustness. The FNN model, by contrast, performed poorly and had long error bars, showing that it was unstable. The poor accuracy of the FNN model can be attributed to the limited size of the original DFT dataset and the difficulty of hyperparameter tuning. Of the two well-performing models, GBR had better test-set metrics, with lower RMSE (0.112 eV) and higher R² score (0.961) than RFR (RMSE of 0.129 eV and R² of 0.948). Therefore, the GBR model was chosen as the optimal ML model for describing the underlying relationship between the local atomic environment and the ORR catalytic activity of reactive sites on HEA surfaces.

Figure 5

Model performance of 3 different ML models

(A and B) Comparison of the RMSE and the R² score of 3 ML models on (A) the training set and (B) the test set. The error bar denotes the range of RMSE and R² in the process of 500-time repeated 4-fold cross-validation.

(C) Parity plot of DFT-calculated ΔG_OH∗ with those predicted by the GBR model and model performance metrics of the optimal GBR model. Dotted lines indicate ±0.2 eV deviation.

(D) The learning curve of the GBR model.

The parity plot of the GBR model applied to the whole dataset is shown in Figure 5C; the GBR-predicted adsorption energies are consistent with those from DFT calculations. Nearly all of the points fall within the ±0.2 eV deviation band, indicating that the trained GBR model achieves high prediction accuracy by learning the underlying relationship between the local environment and the adsorption strength on HEA surfaces. The learning curve of the GBR model is plotted in Figure 5D; it suggests little risk of overfitting, as the RMSEs on the training and test sets gradually converge with increasing training size and the converged values are close to one another. The high efficiency of the ML-assisted method is shown in Note S4.
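The quantities behind a parity plot of this kind (RMSE, R², and the share of points inside the ±0.2 eV band) are simple to compute. In the sketch below, the reference energies are the five ΔG_OH∗ values from Table 1, while the "predictions" are made-up numbers offset by 0.1 eV purely to exercise the functions; the function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fraction_within(y_true, y_pred, tol=0.2):
    """Share of predictions within +/- tol (eV) of the reference values."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


dft = [0.0373, -0.6672, -0.5263, 0.1562, -0.6126]  # dG_OH* values from Table 1 (eV)
predicted = [value + 0.1 for value in dft]          # hypothetical predictions
print(rmse(dft, predicted), r2_score(dft, predicted), fraction_within(dft, predicted))
```

A constant 0.1 eV offset keeps every point inside the band yet still lowers R², which shows why the two metrics are reported together.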

The ORR activity of HEA surfaces

The well-performing GBR model enabled us to obtain reliable adsorption energies of reactive sites and thereby explore the catalytic activities of HEAs across the huge chemical space. To this end, 12,000 different bridge sites on HEAs with different Miller index surfaces and component elements were generated randomly and in equal numbers: 2,000 sites for each type of HEA, divided into 1,000 sites on (100) surfaces and 1,000 on (111) surfaces. These sites were then fed into the GBR model, and the predicted adsorption energies were collected and sorted. The frequency distributions of the OH∗ adsorption energies of sites on the HEA (100) and (111) surfaces are illustrated in Figure 6. As shown, the distribution of energies is continuous, suggesting that the catalytic activity of HEAs can be fine-tuned by adjusting the coordination environments of reactive sites. Distinct individual peaks emerge in the total frequency distribution, which prompted us to classify all energy points according to the type of bridge site, namely the identities of its two metal atoms, in order to decompose the frequency distribution and uncover the underlying mechanism. The decomposed (colored) peaks for the different site types have similar shapes and widths, indicating that the effect of the coordinated atoms on the adsorption strength of reaction intermediates has a similar physical origin across site types. Each decomposed peak has its own distribution center, which is also the average adsorption energy of that type of bridge site. For the identical-metal reactive sites with the XX pattern, the average adsorption energies follow the same order on both the (100) and (111) surfaces. The order is FeFe < RuRu < RhRh < IrIr < PtPt < AgAg, which is consistent with the trend in oxygen-intermediate adsorption energies on the corresponding monometallic surfaces.
For the dual-metal reactive sites with the XY pattern, the adsorption energies are distributed between the distribution centers of the XX and YY sites. For example, the adsorption energies of PtFe sites all lie between the PtPt and FeFe distribution centers. In addition, the frequency of XX sites is lower than that of XY sites because XY sites comprise both the XY and YX arrangements, whereas XX sites have only one possible arrangement.
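The factor-of-two multiplicity of XY sites follows from simple counting, sketched below under the assumption that the two bridge atoms are occupied independently and at random according to the element fractions (an idealized solid solution; surface segregation is ignored). The function name is hypothetical. The second composition is Ir48Pt74Ru30Rh30Ag74 from the abstract, expressed as fractions of 256 atoms.

```python
from itertools import combinations_with_replacement


def bridge_pair_probabilities(fractions):
    """Probability of each unordered bridge pair for independent random occupancy."""
    probabilities = {}
    for x, y in combinations_with_replacement(sorted(fractions), 2):
        multiplicity = 1 if x == y else 2  # XY and YX are the same site type
        probabilities[(x, y)] = multiplicity * fractions[x] * fractions[y]
    return probabilities


equimolar = bridge_pair_probabilities({el: 0.2 for el in ("Ir", "Pt", "Ru", "Rh", "Ag")})
tuned = bridge_pair_probabilities(
    {"Ir": 48 / 256, "Pt": 74 / 256, "Ru": 30 / 256, "Rh": 30 / 256, "Ag": 74 / 256}
)
print(equimolar[("Ag", "Ag")], equimolar[("Ag", "Pt")])
print(tuned[("Ag", "Ag")], tuned[("Ag", "Pt")])
```

In the equimolar case every XX type has probability c² and every XY type 2c², so mixed sites appear twice as often. The same counting indicates how raising the Pt and Ag fractions increases the share of AgAg and PtAg pairs.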

Figure 6

The frequency distribution of OH∗ adsorption energies of 12,000 reactive sites

(A) The sites on HEA (100) surface. (B) HEA (111). The bottom row shows the total energy distribution with a gray color, while the top 3 rows show the decomposed varicolored peaks according to the identity of the 2 bridge site atoms.

As shown in Figure 6, the adsorption energies of the same sites on HEA (100) are more negative than those on HEA (111), with differences of ∼0.4 eV. To directly compare the adsorption energy distributions of the two Miller index surfaces and study the ORR activity of HEAs, the total frequency distributions of the two surfaces are plotted together with the volcano curve in Figure 7A. The maximum, minimum, and mean adsorption energies of HEA (100) are all lower than those of HEA (111). It can be concluded that lower-coordinated environments result in stronger OH∗ binding on the HEA surface; this can be attributed to the fact that higher coordination means stronger electronic coupling between metal atoms, which weakens the binding of the adsorbate. In addition, only some of the reactive sites on HEA (111) lie in the high-activity area; these are AgAg, PtAg, and part of the IrAg sites.

Figure 7

Further analysis of differences between HEA(100) and HEA(111) and the feature importance of the GBR model

(A) Frequency distribution of OH∗ adsorption energies of reactive sites on 2 different Miller index surfaces and the volcano curve. The mean values of adsorption energies, high ORR activity area, and the position of Pt (111) on the volcano curve are marked. (B and C) The feature importance obtained from the GBR model in (B) the coordinated atom dimension and (C) the physical or chemical property dimension.

To reveal the intrinsic mechanism of OH∗ adsorption through model analysis, the feature importance along the coordinated atom dimension and along the physical or chemical property dimension was studied and plotted in Figures 7B and 7C, respectively. The specific values of feature importance are listed in Tables S6 and S7. For the coordinated atom dimension, the atoms are sorted and numbered from nearest to farthest; A1 and A2 are the two bridge site atoms. Figure 7B shows that the feature importances of the two bridge metal atoms are high and close to each other (39.29% and 39.47%), indicating that the two bridge metal atoms are equivalent and are the coordinated atoms that most strongly influence OH∗ adsorption, which is consistent with physical intuition. For the same reason, the third and fourth atoms have similar feature importance. Notably, the feature importance decreases as the distance between the coordinated atom and the oxygen intermediate increases, indicating that closer coordinated atoms have a stronger influence on the adsorption strength of the OH intermediate. Interestingly, the 15th coordinated atom shows an anomalous increase because it determines whether the surface is (100) or (111), confirming that the abstract ML model reflects physical reality. The specific positions of the coordinated atoms are displayed in Figure S4.
For the physical and chemical property dimension, ε is the most important descriptor of OH∗ adsorption, with a high feature importance of 71.18%, in agreement with the theory that the d-band center is directly and strongly related to adsorption strength. The feature importances of CN and d are 7.04% and 11.24%, respectively, making them more important than the other three descriptors (each <5%). Given the central role of ε, we compared it with the average adsorption energies of XX-pattern HEA sites. As shown in Figure S7, ε and the average adsorption energies of the corresponding metal XX sites are correlated, and a more negative ε leads to weaker binding, which is consistent with d-band center theory. Based on these analyses, it can be concluded that (1) the adsorption energy on the HEA surface is a mixture of the individual contributions from all surrounding coordinated metal atoms; (2) the two bridge metal atoms directly bonded to OH∗ are the dominant factor in determining the adsorption energy of the adsorbed ORR intermediate; and (3) the closer a coordinated atom is to OH∗, the stronger its influence. As a result, the peaks decomposed by the identities of the two bridge-site atoms have individual distribution centers, and the adsorption energies of XY sites lie between those of XX and YY sites. In addition, the coordination environments of the two bridge atoms are statistically similar across site types, leading to the similar shapes and widths of the decomposed peaks. The feature importance analysis is also consistent with these rules. The two bridge metal atoms account for ∼80% of the importance, far more than the other coordinated atoms. ε accounts for ∼70% of the importance, as it represents the adsorption ability of a single coordinated atom, while CN and d together account for ∼20%, as they reflect the strength of each individual contribution.
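Reading one importance vector along two dimensions amounts to summing it by rows and by columns. The sketch below assumes that the model input is a flattened 15 × 6 block ordered atom by atom (A1's six properties first, then A2's, and so on) and that the per-feature importances sum to 1; both are assumptions, as are the names used. The uniform vector is placeholder data only.

```python
PROPERTIES = ["r", "N", "e", "eps", "CN", "d"]
N_ATOMS = 15


def aggregate_importance(flat_importance, n_atoms=N_ATOMS):
    """Sum a flattened (atom-major) importance vector per atom and per property."""
    n_props = len(PROPERTIES)
    if len(flat_importance) != n_atoms * n_props:
        raise ValueError("importance vector has the wrong length")
    by_atom = [
        sum(flat_importance[a * n_props:(a + 1) * n_props]) for a in range(n_atoms)
    ]
    by_property = {
        name: sum(flat_importance[a * n_props + j] for a in range(n_atoms))
        for j, name in enumerate(PROPERTIES)
    }
    return by_atom, by_property


# Placeholder: a uniform importance vector over the 90 input columns.
uniform = [1.0 / 90] * 90
by_atom, by_property = aggregate_importance(uniform)
print(by_atom[0], by_property["eps"])
```

Both aggregates sum to the same total, so the atom view (Figure 7B) and the property view (Figure 7C) are two projections of one table of importances.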

Optimizing HEA for activity enhancement

As discussed in the previous section, AgAg, PtAg, and some IrAg sites on the HEA(111) surface show favorable OH∗ adsorption energies; such sites are commonly called active sites and lie around the peak of the volcano curve. Therefore, among the six types of HEAs studied in this work, the equimolar HEAs IrPtRuRhAg, IrPtRuAgFe, and IrPtRhAgFe, which contain the key elements Ag, Pt, and Ir, are promising highly efficient ORR catalysts. For each of these three HEAs, 10,000 reactive sites on the (111) surface were randomly generated and fed into the ML model to obtain their adsorption energies. The active-site coverage (the proportion of active sites among all reactive sites) of IrPtRuRhAg, IrPtRuAgFe, and IrPtRhAgFe is 16.59%, 15.02%, and 15.84%, respectively. Among them, IrPtRuRhAg is the optimal HEA catalyst because it has the most active sites: coordination environments containing Ru and Rh are more likely than RuFe and RhFe environments to weaken the OH adsorption energies of IrAg sites, as listed in Table S8. However, even for the best catalyst, IrPtRuRhAg, the ORR activity is limited by the relatively low active-site coverage. We therefore propose a strategy to enhance the ORR activity of HEA catalysts by optimizing the surface so that highly efficient active sites with the desired OH∗ adsorption energies become more likely. The strategy is to adjust the elemental composition ratio of the HEA to increase the number of active sites. Taking the optimal HEA IrPtRuRhAg as the target, we adjusted its composition ratio and investigated the change in the number of active sites. Because AgAg, PtAg, and some IrAg sites are active, the Ir ratio was fixed to maintain the number of possible IrAg active sites without occupying the configurational space of the key elements Pt and Ag.
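The active-site coverage defined above is the fraction of randomly generated sites whose predicted OH∗ adsorption energy falls inside the high-activity window. The sketch below illustrates that counting step only. It assumes that each coordinated atom is drawn independently with probability proportional to the composition ratio and that a site is described by 15 coordinated atoms; the predictor `toy_predict`, the per-element values in `TOY_ENERGY`, and the window bounds are made-up placeholders that stand in for the trained GBR model and the paper's activity window.

```python
import random

def random_site(composition, n_atoms, rng):
    """Draw the element of each coordinated atom, weighted by the composition ratio."""
    elements = list(composition)
    weights = [composition[e] for e in elements]
    return tuple(rng.choices(elements, weights=weights, k=n_atoms))

def active_site_coverage(composition, predict, window, n_sites=10_000, n_atoms=15, seed=0):
    """Fraction of random sites whose predicted energy lies inside `window`."""
    rng = random.Random(seed)
    low, high = window
    hits = 0
    for _ in range(n_sites):
        site = random_site(composition, n_atoms, rng)
        if low <= predict(site) <= high:
            hits += 1
    return hits / n_sites

# Placeholder predictor (NOT the GBR model): mean of made-up per-element values
# for the two bridge atoms, taken here to be the first two entries of the site.
TOY_ENERGY = {"Ir": -0.4, "Pt": 0.0, "Ru": -0.8, "Rh": -0.3, "Ag": 0.3}

def toy_predict(site):
    return (TOY_ENERGY[site[0]] + TOY_ENERGY[site[1]]) / 2

equimolar = {"Ir": 1, "Pt": 1, "Ru": 1, "Rh": 1, "Ag": 1}
coverage = active_site_coverage(equimolar, toy_predict, window=(0.0, 0.2))  # hypothetical window
```

Replacing `toy_predict` with a trained regressor and `window` with the actual high-activity range would give the kind of coverage figure quoted in the text.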
The ratios of Pt and Ag were increased, while those of Ru and Rh were decreased in the same proportion, to increase the number of highly efficient AgAg and PtAg active sites. The GBR model trained on data from nearly equimolar HEAs may be inaccurate when predicting the activities of HEAs with different elemental compositions. Therefore, an active learning strategy was used to make the GBR model more reliable for such compositions. First, 10 IrPtRuRhAg HEAs were generated with random compositions, and 1 site was randomly selected from each. Second, the OH∗ adsorption energies of these 10 sites were calculated and compared with the values predicted by the GBR model trained on equimolar HEAs. Third, the 10 new sites were added to the original dataset and a new GBR model was trained. Finally, 10 further sites were generated to evaluate the new GBR model, which was then retrained; this loop was repeated until the model performance reached the stopping criterion. Model performance was measured by RMSE, and the stopping criterion was an RMSE of 0.1 eV, which is close to the stable test-set RMSE of the GBR model for equimolar HEAs, as shown in Figure 5D. The iteration results are plotted in Figure S8. The RMSE of the new GBR model decreases gradually and reaches the criterion after 5 complete active learning loops, with a low RMSE of 0.098 eV. Details of the 60 new sites on HEAs with different compositions can be found in Table S9. The result demonstrates that tuning the original GBR model in this way effectively improves its performance for HEAs with different compositions.
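The loop described above (evaluate on a fresh batch, stop if the RMSE is at or below the threshold, otherwise add the batch and retrain) can be sketched as follows. This is an illustrative skeleton rather than the authors' implementation: `active_learning`, `fit_slope`, and the one-dimensional toy problem are hypothetical stand-ins for GBR retraining and for the reference energy calculations, and the `max_loops` guard is an added safety assumption.

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference))

def active_learning(train, fit, label, sample_sites, threshold=0.1, batch=10, max_loops=20):
    """Grow `train` in batches until the RMSE on a fresh batch is <= `threshold`.

    train:        list of (site, energy) pairs from the original dataset
    fit:          callable(train) -> model, where model(site) returns an energy
    label:        callable(site) -> reference energy for a new site
    sample_sites: callable(n) -> list of n new sites
    """
    history = []
    model = fit(train)
    for _ in range(max_loops):
        sites = sample_sites(batch)
        energies = [label(s) for s in sites]
        error = rmse([model(s) for s in sites], energies)
        history.append(error)
        if error <= threshold:
            break
        train = train + list(zip(sites, energies))  # add the new points, then retrain
        model = fit(train)
    return model, history

def fit_slope(train):
    """Toy model: least-squares line through the origin (stand-in for GBR retraining)."""
    slope = sum(x * y for x, y in train) / sum(x * x for x, _ in train)
    return lambda x: slope * x

model, history = active_learning(
    train=[(1.0, 1.0)],                                       # deliberately biased start
    fit=fit_slope,
    label=lambda x: 2.0 * x,                                  # toy reference values
    sample_sites=lambda n: [(i + 1) / n for i in range(n)],   # fixed grid for reproducibility
)
```

`history` records the RMSE of each batch before it is added, which corresponds to the kind of per-iteration curve the text refers to in Figure S8.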
In addition, even the RMSE in iteration 1 differs only slightly from the RMSE of the GBR model for equimolar HEAs, indicating that the error of applying the same GBR model to HEAs with different compositions is small. The GBR model based on equimolar HEAs therefore already captures knowledge of HEAs with other compositions. This can be attributed to the fact that the input dataset as a whole is statistically related to the elemental composition, whereas any single input data point is not strictly tied to it. For each IrPtRuRhAg HEA with a different composition ratio, 10,000 reactive sites on the (111) surface were generated and fed into the GBR model to obtain the corresponding OH∗ adsorption energies. The variation of the active-site coverage of these HEAs is illustrated in Figure 8A, which shows that the coverage first increases and then decreases as the composition ratio changes. The active-site coverage peaks at Ir48Pt74Ru30Rh30Ag74, with a coverage of 31.54%, approximately twice that of equimolar IrPtRuRhAg (16.59%), indicating the effectiveness of the strategy. The increase in active-site coverage was expected, but the subsequent decrease was not. We surmise that the new coordination environments, which contain far more Pt and Ag atoms than the equimolar HEA, excessively weaken the adsorption of OH on the promising AgAg, PtAg, and IrAg sites, so that their statistical distribution moves away from the high-activity area. To test this hypothesis, the distribution of OH∗ adsorption energies around the high-activity area for the reactive sites on a typical HEA, Ir48Pt84Ru20Rh20Ag84 (which shows decreasing ORR activity), is shown in Figure 8B.
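The composition scan can be illustrated by enumerating the formulas it visits. The sketch below assumes a 256-atom basis with Ir fixed at 48 and a starting point of 52 atoms for each of the other four elements; this basis is inferred from the two formulas quoted in the text (both sum to 256) and is not stated explicitly by the authors. The function names and the step size of 2 are likewise hypothetical.

```python
def shifted_composition(shift, base=52, ir=48):
    """Move `shift` atoms each from Ru and Rh to Pt and Ag while keeping Ir fixed."""
    if not 0 <= shift <= base:
        raise ValueError("shift must lie between 0 and base")
    return {
        "Ir": ir,
        "Pt": base + shift,
        "Ru": base - shift,
        "Rh": base - shift,
        "Ag": base + shift,
    }

def formula(composition):
    """Render a composition dict as a string such as Ir48Pt74Ru30Rh30Ag74."""
    return "".join(f"{element}{count}" for element, count in composition.items())

# Hypothetical scan grid; the step size actually used is not given in the text.
scan = [formula(shifted_composition(s)) for s in range(0, 53, 2)]
```

Under these assumptions, a shift of 22 reproduces the composition with the highest reported coverage and a shift of 32 reproduces the composition analyzed in Figure 8B. The reported coverages give a ratio of 31.54/16.59 ≈ 1.90 between the peak and the equimolar alloy.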
As shown, the OH∗ adsorption energies of the promising AgAg, PtAg, and IrAg sites become more positive, and their distribution as a whole shifts to the right because the new coordination environments weaken the adsorption strength. The AgAg sites even move out of the high-activity area, which is the main reason for the anomalous decrease in catalytic activity. It is also noteworthy that the increase in the OH∗ adsorption energies of active sites brings more sites closer to the vertex of the volcano, meaning that more individual active sites have lower energy barriers and better activity. With this strategy, the ORR catalytic activity of equimolar IrPtRuRhAg is roughly doubled. The stability of Ir48Pt74Ru30Rh30Ag74 was then checked, and it is likely to form a stable solid solution, as shown in Figure 2 (star). DFT calculations were performed to compare the GBR-predicted and DFT-calculated values for 10 sites on Ir48Pt74Ru30Rh30Ag74; the error is small enough to confirm the high activity of the recommended catalyst, as listed in Table S10. Overall, the discovery of the highly efficient Ir48Pt74Ru30Rh30Ag74 HEA catalyst offers rational guidance for the experimental nanostructure synthesis of excellent ORR HEA catalysts, including the choice of component elements, composition ratio, and crystal phase.

Figure 8

The activity results of HEAs with different compositions

(A) The variation trend of the active sites coverage of HEA IrPtRuRhAg with the change of metal element component ratio.

(B) Frequency distribution of OH∗ adsorption energies around the high activity area of the reactive sites on the (111) surface of the typical HEA Ir48Pt84Ru20Rh20Ag84, which shows decreasing ORR activity.

Conclusions

In this study, considering that HEA surfaces can provide a near-continuum distribution of oxygen intermediate adsorption energies owing to their huge configurational and chemical space, the potential of six types of quinary HEAs as highly active ORR electrocatalysts was investigated with the aid of ML. The stability of these HEAs was confirmed because they are quite likely to form a stable single-phase solid solution according to the Hume-Rothery rule. The ORR volcano curve was established based on the Sabatier principle, providing a way to study catalytic activity through the key ORR descriptor, the adsorption energy of the OH∗ intermediate. A GBR model with high accuracy, generalizability, and simplicity was then constructed through reasonable data extraction, feature engineering, and model validation. Using this GBR model, the adsorption energies of millions of reactive sites on HEA surfaces with different coordination combinations, which are impractical to study by traditional DFT calculations or experiments, can be predicted with high fidelity. The ML-predicted results and further model analysis demonstrate that the adsorption energy on HEA surfaces is approximately a mixture of the individual contributions of the metal atoms near the reactive site. Finally, we proposed a strategy to engineer the HEA surface structure by tuning the elemental composition ratio so that the adsorption energy distribution moves closer to the peak of the volcano, which approximately doubles the ORR activity. Our proposed DFT-ML scheme shows the potential to become a leading approach in HEA catalysis because it can handle the extremely broad configurational and chemical space and offers a rational and direct guide to the practical nanostructure synthesis of highly efficient HEA catalysts.

Experimental procedures

Resource availability

Lead contact

Requests for further information should be directed to and will be fulfilled by the lead contact, Yuzheng Guo, PhD (yguo@whu.edu.cn).

Materials availability

This study did not generate chemical reagents.

DFT calculations

In this work, spin-polarized DFT calculations were conducted with the Synopsys QuantumATK simulation package. Detailed computational methods and parameters are listed in Notes S5, S6, and S7.

ML

The ML process was conducted with DMCP, together with the Scikit-learn and PyTorch packages. The methods for computing the RMSE and R² score are listed in Note S8. The original dataset was normalized. The FNN model uses a three-layer neural network (input, hidden, and prediction layers) with ReLU as the activation function. The specific hyperparameters of the ML models can be found in Table S11.
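For reference, the standard definitions of the two metrics are given below. They are assumed to match those in Note S8, which is not reproduced here. The symbols are introduced for illustration: $y_i$ is the calculated adsorption energy of site $i$, $\hat{y}_i$ is the ML prediction, $\bar{y}$ is the mean of the calculated values, and $n$ is the number of sites.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
```

With these definitions the RMSE carries the unit of the target (eV here), while $R^2$ is dimensionless and equals 1 for a perfect fit.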

Limitations of the study

The study does not include experimental validation, given the workload and difficulty of synthesizing the corresponding HEAs, precisely controlling their composition ratios, and evaluating their ORR performance. The work is intended to guide later experimental studies, and its reliability is supported by the accuracy of the DFT calculations and a reasonable cross-validation process.
