Weike Ye, Chi Chen, Zhenbin Wang, Iek-Heng Chu, Shyue Ping Ong.
Abstract
Predicting the stability of crystals is one of the central problems in materials science. Today, density functional theory (DFT) calculations remain comparatively expensive and scale poorly with system size. Here we show that deep neural networks utilizing just two descriptors, the Pauling electronegativity and ionic radii, can predict the DFT formation energies of C3A2D3O12 garnets and ABO3 perovskites with low mean absolute errors (MAEs) of 7–10 meV atom−1 and 20–34 meV atom−1, respectively, well within the limits of DFT accuracy. Further extension to mixed garnets and perovskites with little loss in accuracy can be achieved using a binary encoding scheme, addressing a critical gap in the extension of machine-learning models from fixed-stoichiometry crystals to the infinite universe of mixed-species crystals. Finally, we demonstrate the potential of these models to rapidly traverse vast chemical spaces to accurately identify stable compositions, accelerating the discovery of novel materials with potentially superior properties.
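The two-descriptor input encoding described above can be sketched as follows: each of the three garnet cation sites (C, A, D) contributes the Pauling electronegativity and ionic radius of its species, giving a 6-dimensional input vector (matching the 6-24-1 network discussed later). This is a minimal illustration, not the paper's code; the lookup tables below are small illustrative placeholders, not the dataset used in the study.

```python
# Illustrative lookup tables (placeholder values, not data from the paper).
PAULING_EN = {"Y": 1.22, "Al": 1.61, "Ga": 1.81}      # Pauling electronegativity
IONIC_RADIUS = {"Y": 1.019, "Al": 0.535, "Ga": 0.62}  # ionic radius in angstroms

def garnet_descriptor(c_species, a_species, d_species):
    """Return the 6-element descriptor vector [EN_C, r_C, EN_A, r_A, EN_D, r_D]."""
    vec = []
    for sp in (c_species, a_species, d_species):
        vec.append(PAULING_EN[sp])
        vec.append(IONIC_RADIUS[sp])
    return vec

# e.g. Y3Al2Al3O12 (YAG): C = Y, A = Al, D = Al
print(garnet_descriptor("Y", "Al", "Al"))
```

A real pipeline would also need a convention for which coordination environment's radius to use on each site; that detail is not specified here.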
Year: 2018 PMID: 30228262 PMCID: PMC6143552 DOI: 10.1038/s41467-018-06322-x
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Crystal structures of garnet and perovskite prototypes. a Crystal structure of the C3A2D3O12 garnet prototype. Green (C), blue (A), and red (D) spheres are atoms in the 24c (dodecahedron), 16a (octahedron), and 24d (tetrahedron) sites, respectively. The orange spheres are oxygen atoms. b Crystal structure of the Pnma ABO3 perovskite prototype. Green (A) and blue (B) spheres are atoms in the 4c (cuboctahedron) and 4d (octahedron) sites, respectively. The orange spheres are oxygen atoms
Fig. 2 General schematic of the artificial neural network. The artificial neural network (ANN) comprises an input layer of descriptors (the Pauling electronegativity and ionic radii on each site), followed by a number of hidden layers, and finally an output layer (E). The large circle in the centre shows how the output of the ith neuron in the lth layer, a_i^(l) = σ(Σ_j w_ij^(l) a_j^(l−1) + b_i^(l)), is related to the inputs received from the (l−1)th layer, a_j^(l−1). w_ij^(l) and b_i^(l) denote the weight and bias between the jth neuron in the (l−1)th layer and the ith neuron in the lth layer. σ is the activation function (rectified linear unit in this work). The ANN models were implemented using the Keras[39] deep learning library with the TensorFlow[40] backend
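The layer equation in the Fig. 2 caption, a_i^(l) = σ(Σ_j w_ij^(l) a_j^(l−1) + b_i^(l)) with σ the rectified linear unit, can be sketched in plain Python as a single fully connected layer. The weights and biases below are illustrative toy values, not trained parameters from the paper.

```python
def relu(x):
    """Rectified linear unit: the activation function used in this work."""
    return max(0.0, x)

def layer_forward(inputs, weights, biases):
    """One fully connected layer: a_i = relu(sum_j w_ij * a_j + b_i)."""
    return [relu(sum(w * a for w, a in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Toy 3-input, 2-neuron layer with made-up parameters.
W = [[0.5, -0.2, 0.1],
     [0.3, 0.8, -0.5]]
b = [0.1, -0.2]
print(layer_forward([1.0, 2.0, 3.0], W, b))
```

Stacking such layers (6 inputs, 24 hidden neurons, 1 output for the garnet model) reproduces the 6-24-1 architecture; in practice the authors built the network with Keras rather than by hand.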
Fig. 3 Performance of artificial neural network (ANN) models. a Plot of the ANN-predicted E against the DFT E of unmixed garnets for the optimized 6-24-1 ANN model. The histograms at the top and right show that the training, validation, and test sets contain a good spread of data across the entire energy range of interest, with standard deviations of 122–134 meV atom−1. Low mean absolute errors (MAEs) in E of 7, 10, and 9 meV atom−1 are observed for the training, validation, and test sets, respectively. b MAEs in E of unmixed and mixed samples in the training, validation, and test sets of all garnet models. The C-, A- and D-mixed deep neural networks (DNNs) have MAEs similar to the unmixed ANN model, indicating that the neural network has learned the effect of orderings on E. Each C-, A- and D-mixed composition has 20, 7, and 18 distinct orderings, respectively, which are encoded using 5-bit, 3-bit, and 5-bit binary arrays, respectively. c Plot of the ANN-predicted E against the DFT E of unmixed perovskites for the optimized 4-12-1 ANN model. The training, validation, and test sets similarly contain a good spread of data across the entire energy range of interest, with standard deviations of 104–122 meV atom−1. Low MAEs in E of 21, 34, and 30 meV atom−1 are observed for the training, validation, and test sets, respectively. d MAEs in E for the training, validation, and test sets of all perovskite models. Each A- and B-mixed perovskite composition has ten distinct orderings, both encoded using 4-bit binary arrays. The black dashed lines in (a, c) are identity lines serving as references
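The binary ordering-encoding described for the mixed models can be sketched as follows: each symmetrically distinct ordering of a mixed composition is labelled by an integer index and encoded as a fixed-width bit array (e.g. 5 bits suffice for the 20 C-mixed garnet orderings). The exact index-to-ordering mapping used in the paper is not reproduced here; this only illustrates the encoding itself.

```python
def encode_ordering(index, n_bits):
    """Encode an ordering index as a list of n_bits binary digits (MSB first)."""
    if index >= 2 ** n_bits:
        raise ValueError("index does not fit in the given bit width")
    return [(index >> k) & 1 for k in reversed(range(n_bits))]

# The 20th C-mixed ordering (index 19) in a 5-bit array:
print(encode_ordering(19, 5))
```

These bit arrays are appended to the site descriptors so that a single network can distinguish orderings of the same mixed composition.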
Fig. 4 Accuracy of stability classification. Plots of the accuracy of stability classification of the ANN models compared to DFT as a function of the Ehull threshold for a garnets and b perovskites. The accuracy is defined as the sum of the true positive and true negative classification rates. A true positive (negative) means that the Ehull for a particular composition predicted from the optimized artificial neural network model and from DFT are both below (above) the threshold. For the mixed compositions, an Ehull is calculated for all orderings (20, 7, and 18 orderings per composition for C-, A-, and D-mixed garnets, respectively, and ten orderings per composition for both A- and B-mixed perovskites)
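The classification accuracy in Fig. 4 can be sketched directly from its definition: a composition is classified correctly when the ANN and DFT Ehull values fall on the same side of the threshold (true positive: both below; true negative: both above). The Ehull values below are illustrative, not results from the paper.

```python
def classification_accuracy(ehull_ann, ehull_dft, threshold):
    """Fraction of compositions where ANN and DFT agree on stability,
    i.e. the sum of the true positive and true negative rates."""
    correct = sum(
        (a <= threshold) == (d <= threshold)
        for a, d in zip(ehull_ann, ehull_dft)
    )
    return correct / len(ehull_dft)

ann = [0.005, 0.040, 0.120, 0.000]   # eV/atom, illustrative ANN predictions
dft = [0.010, 0.025, 0.090, 0.002]   # eV/atom, illustrative DFT references
print(classification_accuracy(ann, dft, 0.030))
```

Sweeping the threshold over a range of Ehull values, as in Fig. 4, traces out how agreement varies with the strictness of the stability criterion.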