Literature DB >> 35921428

Physics-Guided Descriptors for Prediction of Structural Polymorphs.

Bastien F Grosso¹, Nicola A Spaldin¹, Aria Mansouri Tehrani¹.

Abstract

We develop a method combining machine learning (ML) and density functional theory (DFT) to predict low-energy polymorphs by introducing physics-guided descriptors based on structural distortion modes. We systematically generate crystal structures utilizing the distortion modes and compute their energies with single-point DFT calculations. We then train a ML model to identify low-energy configurations on the material's high-dimensional potential energy surface. Here, we use BiFeO3 as a case study and explore its phase space by tuning the amplitudes of linear combinations of a finite set of distinct distortion modes. Our procedure is validated by rediscovering several known metastable phases of BiFeO3 with complex crystal structures, and its efficiency is proved by identifying 21 new low-energy polymorphs. This approach proposes a new avenue toward accelerating the prediction of low-energy polymorphs in solid-state materials.

Entities: Chemical

Year: 2022 PMID： 35921428 PMCID： PMC9376952 DOI： 10.1021/acs.jpclett.2c01876

Source DB: PubMed Journal: J Phys Chem Lett ISSN： 1948-7185 Impact factor: 6.888

Computational materials science has undergone a recent paradigm shift with the advent of data-driven methods such as machine learning (ML). These techniques are now considered a standard tool, along with widely used methods such as density functional theory (DFT), molecular dynamics, or Monte Carlo simulations. One central task for effectively applying machine learning algorithms to materials science problems is developing appropriate descriptors. Descriptors are vector-based numerical representations that should uniquely define the material and are mainly based on compositional or structural features or a mixture of both. In addition, these descriptors should provide meaningful connections to the physics of the materials by establishing a unique and invariant “barcode” for each one of them. The advantage of descriptors based only on the chemical composition, such as atomic number, covalent radius, and number of valence electrons,[1] is that no prior knowledge of the system is required. However, while this approach has been successful in predicting different properties such as band gap, hardness, and thermodynamic stability,[2−5] it is not capable of distinguishing between different structures of the same composition (polymorphs). This problem is highly important because, while metastable in bulk, structural polymorphs can in many cases be stabilized through strain, pressure, or electrostatic boundary conditions and often exhibit new properties absent in the bulk ground state enabling new applications.[6,7] It is therefore crucial to incorporate the influence of structural features to correctly describe phenomena such as metal–insulator transitions or ferroelectricity among others.[8−10] Descriptors based on structural features, such as the smooth overlap of atomic positions (SOAP),[11] minimum bounding ellipsoid (MBE),[12] ordered eigenvalues of the Coulomb matrix,[13] or universal fragment descriptors,[14] have been developed to capture the local environment of the atoms and encode structural features. A natural approach to representing a crystal structure is to use the concept of irreducible representations (irreps), in which the distortions from a higher symmetry reference structure are decomposed into a set of normal modes that describe the transformation of the reference structure into the considered one.[15,16] This approach has been used to understand and explain various phenomena such as thermopower anisotropy, negative thermal expansion, proper and improper ferroelectricity, and antiferroelectricity.[17−21] Furthermore, in the context of phase transitions in transition-metal compounds, the distortion modes have been used as features in statistical analysis of the correlation between distortions and functionalities.[8,22,23] More recently, polyhedral distortions have been used as descriptors to explain trends in behaviors across perovskites.[24] In this work, we explore the idea of using the distortion modes as descriptors in ML. Our method utilizes distortion modes to explore the Born–Oppenheimer potential energy surface (PES) and identify local minima corresponding to metastable phases (polymorphs). While the method that we introduce remains applicable to any crystalline material, here we focus on perovskites for which the distortion modes are simply obtained as a decomposition of the high-symmetry prototype perovskite parent structure with Pm3̅m symmetry. We apply the method to multiferroic bismuth ferrite, BiFeO3 (BFO), which exhibits simultaneous ferroelectricity and magnetic ordering and has a rich structural landscape composed of many local minima with interesting technological properties.[25,26] The ground state structure of BFO (R3c) has a 10-atom unit cell and is reached from the Pm3̅m prototype by a combination of polar displacement of the Bi cation (mode at Γ) and antiphase rotation of consecutive oxygen octahedra in all three Cartesian directions (mode at R).[27] Several metastable phases have been established computationally, many with larger unit cells.[26] Furthermore, the experimental stabilization of many new phases of BFO[28−30] motivates our choice of BFO both to test the predictive power of the method and to identify new metastable phases for future experimental study. When a structure experiences a symmetry lowering, part of its symmetry elements are lost, and the remaining symmetry operations constitute a subgroup of the initial parent space group. If there exists a group–subgroup relation between the structures, one can decompose the distorted one into symmetry-adapted modes of the parent structure that encode patterns of displacements of the atoms. For example, the perfect perovskite structure with space group Pm3̅m can be considered as the higher symmetry parent structure of all lower symmetry structures that are distorted versions of the Pm3̅m phase. It becomes then clear that different subgroups of Pm3̅m can share common symmetry modes. Nevertheless, each structure is characterized by a unique combination of symmetry modes with given relative amplitudes. As each mode consists of a displacement pattern of the atoms, the combination of different modes can either lower or increase the structure’s total energy compared to the undistorted perovskite structure. The increase in energy can occur, for example, in the case of the interatomic distances between anions and cations becoming too small. While chemical intuition could, in principle, allow one to create simple subgroups of distortion modes, resulting in (meta)stable structures, the number of possibilities and the complexity of many distortions are prohibitively large, and a more systematic approach, which we propose here, is required. Our study starts with selecting the distortion modes that will form our descriptors. To select appropriate distortion modes, we extract from the phases previously reported in refs (25) and (26) all the different modes that are contained in unit cells of 80 atoms or fewer using ISODISTORT.[15,16] This yields the 15 modes shown in Figure a. Because certain modes result in displacements that can arbitrarily be applied along different directions, we include six additional building block structures obtained by linear combinations of the nonrotational distortion modes along the directions defined by the lattice vectors. We show an example in Figure b with the X5– mode, where we take the linear combination of the structures obtained by this mode applied along two different directions (a and b) and obtain a structure where the distortions are along ab. While the structures with the distortions along a and b are identical, the new structure with the distortions along the diagonal of the unit cell has a different symmetry and can then be considered as a building block, keeping the uniqueness of the descriptors. We then obtained 21 building blocks constructed by distorting a perfect cubic supercell according to 15 different modes and six of their spatial variations.

Figure 1

(a) Structures of the different distortion modes used as descriptors. The structures are gathered according to the number of atoms each primitive unit cell contains and contain 10, 20, 40, and 80 atoms per unit cell for the blue, green, red, and pink boxes, respectively. Note that we distinguish between distortion modes with the same symmetry by indicating the corresponding symmetry index in parentheses. (b) Structure of a distortion mode along orthogonal directions illustrates how we create extra descriptors with different orientations. Because the structures corresponding to the distortions imposed by the X5– mode along a and b are identical, we include in our set of building blocks the symmetrically distinct structures obtained from X5––a the X5––ab. The remaining structures are shown in Figure S1. Using symmetry considerations, we reduce the unit cell size of the distorted structure to its minimal possible size. We then fix its volume to match that of a supercell of the Pm3̅m structure (Figure a) coinciding with the distorted structure. This constrains the structure to four possible unit cell sizes displayed in Figure b, all of which accommodate G-type antiferromagnetic order. Furthermore, we normalize all the modes in such a way that the sum of the displacements multiplied by is equal to 1 Å, equivalent to a mode amplitude of 1, where Vp and Vs are the primitive and supercell volumes, respectively. This allows us to describe evenly all modes, no matter the size of their unit cell.

Figure 2

(a) Basic cubic unit cell labeled 111. (b) Possible supercells. Each supercell is constructed from multiples of the cubic unit cell structure in (a) along each Cartesian direction and labeled accordingly. (c) Example of a combination of four modes to build a structure. All structures of the modes are scaled to the size of the structure of the third mode (which has the largest supercell), and for each mode, we create a distortion vector capturing the displacement from the 124 structure displayed in (b) and the considered mode. Each vector is then multiplied by the corresponding amplitude factor (α, β, γ, or δ), and the vectors are summed to give the final structure. Finally, a structure can be created by displacing the atoms in the parent structure by a sum of displacements resulting from each normalized mode multiplied by a scalar amplitude, as represented in Figure c. The displacements are computed in supercells corresponding to the largest structure present in the set of modes selected (see Figure b). In Figure , we show our combined DFT and ML approach, which proceeds in two rounds. In the first round, the model is trained by building a training set on distortion modes and computing the energies of the structures by using DFT single-point calculations. In this stage, only the electrons are relaxed while the ions are kept fixed to determine the energies as a function of the modes’ amplitudes. In the second round, using the same methodology, we generate a large number of structures and utilize our ML model, instead of the more expensive DFT calculations, to predict their energies. Finally, we perform DFT ionic relaxation for a selected number of low-energy structures. This approach allows us to start the DFT relaxation on initial structures predicted to be low energy by our ML model and thus are likely to be close to a local minimum of the potential energy surface.

Figure 3

Overview of our DFT-ML procedure. The first round is indicated by solid arrows and the second by dashed ones. (a) (yellow) Construction of the training set. For a given value of N, we start from a list of N nonzero random amplitudes (first panel), where each row of the list corresponds to a structure and each column to the amplitude of a given distortion mode. Nonzero amplitudes are represented by full squares and zero amplitudes by empty squares. From the list, we discard (red) the structures in which two atoms are closer than the minimum distance allowed (distmin) (second and third panels). Finally, we generate the input files corresponding to the structures (fourth panel) that we retain for the DFT section. (b) (blue) DFT calculations for electronic relaxation (b, top) or electronic, ionic, and volumetric relaxation (b, bottom). We compute the energy with a single-point DFT calculation for each line of the list. (c) Using ML, we use the list of amplitudes gathered in the previous step as descriptors and fit the corresponding energies. We then obtain a model that can predict the energy of a given combination of amplitudes. Dashed arrows represent the second round. Following the same steps as described in the first round, we (a) generate structures with different values of N. Then, we use the ML model created to predict the energy of those structures, select the lowest energy ones, and create the corresponding structures (c). Finally, we use these structures as starting configurations and fully relax the ions, lattice parameters, and electrons (b). The first round starts with the generation of the training set. We consider the potential energy surface of BFO in a 21-dimensional space where each dimension is given by a distortion mode, and each coordinate in the space is given by the amplitude of the modes. To span the energy surface unbiasedly, we randomly generate the amplitudes of the modes in the interval [−1.2; 1.2]. This approach generates amplitudes corresponding to the sum of the atomic displacements from the parent structure. The choice of the [−1.2; 1.2] range, even though arbitrary, is set to capture a wide range of the potential energy for each mode. Furthermore, to account for the physics of each mode individually and to capture any favorable or unfavorable couplings up to the seventh order between different modes, we include structures generated by including up to seven modes with amplitudes different than zero. Our procedure starts (Figure a) by choosing an order of coupling N, equal to the number of nonzero amplitudes, and generating a matrix of L lines and 21 random numbers in the interval [−1.2; 1.2] for each line, where L is the number of potential structures. We then obtain a matrix of dimensions L × 21. Second, we randomly set amplitude entries per line to zero (21 – N). We then discard nonphysical structures, determined by computing the minimum distance between pairs of atoms in each structure and removing the structures for which any minimum distance is too small. On the basis of the distribution of the Fe–O bond lengths in all structures containing these ions in the ICSD database,[31] we keep only structures with the minimum distance between any pair of atoms larger than 1.8 Å. Note that the Fe–O bond lengths was chosen because it is usually the smallest distance in regular BFO. This procedure is repeated for each value of N considered and results in a set of structures ready for the following step. Note that all structures’ volumes are fixed to integer multiples of the cubic undistorted cell. We then take all the created structures and compute their energies with a single-point DFT calculation in which only one electronic loop is computed. As a result, we obtain the energy for each structure described by a row of amplitudes in the matrix (Figure b, top). We present in Figure S2 the energies of the different structures included in the training set as well as the number of structures per mode. As shown in Figure S2, most of the structures included in the training set have energy lying between that of the R3c phase constrained to the cubic parent structure volume and angles and that of the cubic parent structure. Furthermore, when the number of modes present in the structure increases, the energy tends also to increase. This is a result of our choice of random amplitudes, for which additional distortions reduce the likely distance between ions and consequently increase the energy. The last step of the first round (see Figure c) consists of using the training set to fit the energies as a function of the modes’ amplitudes. We use a Support Vector Regression (SVR) algorithm, which outperformed other machine learning algorithms that we tried, such as Random Forest and Extreme Gradient Boost (XGB), to construct a model evaluated on the test set. Figure a illustrates the reasonable agreement between the machine learning predicted total energies and DFT calculations. The coefficient of determination (r2) is around 0.77, and the root-mean-square error (RMSE) is determined to be 0.2 eV/f.u. It is worth noting that within the range of energy contained by the constrained R3c and parent cubic structures (lower left of the plot) the points are less spread around the diagonal line (ML = DFT) than they are in the upper right of the plot. This can be attributed to the lower representation of high-energy structures in the training set.

Figure 4

Evaluation of the prediction accuracy of our ML models. (a) Regression plot of predicted (ML) energies versus computed (DFT) energies on the test set. (b) Predicted energy versus computed energy when all the structures containing only one mode were removed from the training set. Each blue dot represents the energy of a structure. The black dashed line represents the position of perfect predictions, where the energy predicted by the ML model perfectly agrees with the DFT energy. The red and green lines are the energy of the cubic and R3c-like structures, respectively. (c) Computed energies (blue) compared to the energies predicted with the model fitted on the data without the one-mode structures (orange) for four often discussed modes. The red and green lines have the same meaning as in (a) and (b). Next, we use the permutation feature importance technique to evaluate which features (distortion modes) significantly contribute to our SVR model prediction of the test set. In this technique, we randomly shuffle the value of each feature 30 times and evaluate the variation in the r2 metric. A feature is deemed important if the shuffling leads to substantial changes in the model’s error. Figure S3 shows the permutation importance of each feature based on changes in r2 sorted from highest to lowest. On the basis of this analysis, the X5– mode along ab is the most important feature, and as expected, the Γ4– modes are also significant. However, surprisingly, modes corresponding to 40-atom supercells, such as T2(24) or Δ5–(24), are also crucial for the model to predict the energy of various structures, demonstrating the importance of including these less often discussed modes. We further evaluate the extrapolative power of our model by removing all the structures that contain only one distortion mode (1922 points) from the training set before predicting their energies. Figure b shows the predicted energy values by using the SVR model against the DFT energies, constructed on the training set without single distortion modes. As expected, the r2 evaluated on the test set is now lower (≈74%) with RMSE = 0.21 eV/f.u. compared to the full test set. Next, utilizing the model built without the single distortion modes, we test whether our machine learning model accurately reproduces the energy variation of the distortion modes with respect to the amplitude by comparing the ML energies to their corresponding DFT energies. Our results are shown in Figure c. We see that the SVR model correctly predicts the signature double-well behavior for the Γ4––a, X5––a, R5–(12)–a, and M2––a modes as well as determining the high energy barrier between the two minima for R5–(12)–a and M2––a. These curves are also plotted for the rest of the individual distortion modes in Figure S4, where we see that good agreement between DFT and ML predictions is obtained for most of them. Thus, despite showing imperfect statistical metrics on the test set, the ML model successfully captures the structural energetics of BFO. In the second round, we use the constructed model to predict the energies of a large number of structures built with the random composition of 1–7 modes combined. First, we construct 457081 structures in the same way as described earlier and as shown schematically in Figure a and predict their energies. Using the model, we select the structures within 250 meV/f.u. of the ground state and classify them into subsets according to their distortion modes. Finally, we select the lowest energy structure of each subset. As a result, we obtain 290 structures, all with different combinations of the modes and containing 20, 40, or 80 atoms, which we then fully relax with DFT. Focusing only on the structures displaying an energy of 100 meV/f.u. or less above the R3c ground state after full relaxation, we obtain 29 different structures, among which eight are rediscovered phases: four phases are the common R3c, Pnma, Pbam, and Pmc21[25,32,33] and four were reported in ref (26) and labeled Pc(1), Pnma(1), Cmc21(2), and P21/c. We report the 21 new phases in Table S1. We observe that the majority of the phases found have 80 atoms per unit cell and have an energy around 40–50 meV/f.u. above the ground state. While this is the largest unit cell size allowed by our way of building the structures (see Figure b), it suggests that even larger unit cell could display stable phases. While a full analysis of all these phases are out of the scope of this work, we provide information about their structures in Table S1 and focus on the analysis of the lowest-energy phase found with P21 symmetry and present its crystal structure in Figure . Note that all the crystal structures are available at the open-source GitHub provided at https://github.com/grossoba/DistortionModesDescriptors. Its unit cell, as shown in Figure a, is similar to the Pc(1) phase referenced in ref (26), and it has the same energy difference with respect to the ground state (23 meV/f.u.). While in the Pc(1) structure the Bi cations are displaced perpendicular to the long axis, with three cations moving in one direction and one in the opposite direction, in the P21 structure, two Bi cations move in one direction and the next two in the opposite direction (see Figure b). The P21 structure also has a different rotation pattern with aα̅βaα̅βcα̅βγδ̅ rotations, as opposed to the aα̅βaα̅βcα̅βγδ̅ rotations in Pc(1) (notation adopted from ref (26)). Considering the isotropic Born effective charges of 4.86 [e] for Bi, 3.99 [e] for Fe, and −2.95 [e] for O in units of the electronic charge magnitude, and multiplying the atom displacements with respect to the cubic parent structure by the corresponding isotropic charge,[26] we evaluate the spontaneous polarization to be around 54 μC/cm2 along the c-axis (long axis). In Figure c we present views of the structures along the pseudocubic orientations and the structures that could be experimentally observed by using high-resolution transmission electron microscopy. In particular, we see that for the view down the c-axis in-plane a “2up–2down” displacement pattern of the Bi cations appears along the [100] direction, reminiscent of the recently discovered antiferroelectric Pnma phase.[29]

Figure 5

Crystal structure of the P21 phase. (a) Full unit cell. (b) b–c, a–c, and a–b projections of the unit cell. (c) P21 supercells imaged along the three pseudocubic directions.

Crystal structure of the P21 phase. (a) Full unit cell. (b) b–c, a–c, and a–b projections of the unit cell. (c) P21 supercells imaged along the three pseudocubic directions. In summary, we introduced a new DFT-ML approach to efficiently explore the energy landscape of complex solid-state materials by implementing distortion modes as descriptors. Our approach was successfully implemented within the BiFeO3 phase space, rapidly rediscovering low-energy crystal structures that had previously been identified by using various computational and experimental methods. In addition, we predicted 21 new low-energy polymorphs of BFO (<100 meV/f.u. above the ground state), including one of the lowest-energy polymorphs of BFO reported, with the P21 symmetry and showing a large polarization. Our predictions (all crystallographic information of the predicitons can be accessed through the GitHub link provided at https://github.com/grossoba/DistortionModesDescriptors) further highlight the rich phase space of BFO and hopefully motivates additional experimental work to synthesize these predictions. Our approach of utilizing distortion modes as building blocks provides a facile way to generate crystal structures and navigate the energy landscape by controlling the coefficients of their amplitudes. Additionally, while we demonstrated the relevance of the use of distortion modes as descriptors through the search for new metastable phases in BFO, our framework can easily be adapted to other materials and could be used for example in materials undergoing multiple polymorph transitions such as metallic halide perovskites. Moreover, our method could provide significant insights for many other applications. One example is Landau theory for phase transitions.[34] To elaborate, building a Landau model to study phase transitions in materials involves the study of couplings between distortion modes, which is often limited to second or third order[21] due to the high computational cost. Our model implicitly contains these couplings, which could readily be extracted at minimal computational cost if one would beforehand relax the volume of the structures (in the training set) to avoid the presence of resulting stress contributions to the energy. Another potential application of our methodology is in the field of ML-driven interatomic potentials with a focus on crystal structure prediction.[35] Constructing machine-learning-guided interatomic potentials involves creating an efficient training set of structures. While a random sampling of the phase space is a valid strategy and has presented great success in elemental systems,[36] we believe that our approach can be a useful complement to efficiently exploring phase spaces and predicting energetically accessible polymorphs. Finally, while the building blocks of the modes in our example are the atomic displacements, the use of group–subgroup relations and the corresponding irreducible representations as descriptors in machine learning should be more widely applicable. For example, symmetry-allowed combinations of magnetic moments could be used as a basis to search for low-energy magnetic orderings in complex antiferromagnetic systems.

Computational Details

The DFT calculations are performed by using VASP[37−40] with the PAW method[41,42] and explicit treatment of the following valence electrons: 6s26p3 for Bi, 3d74s1 for Fe, and 2s2 2p4 for O. A 8 × 8 × 6 k-point Γ-centered Monkhorst–Pack mesh[43] is used to sample the Brillouin zone of a 20-atom unit cell, and an energy cutoff of 700 eV for the plane-wave basis is chosen. For the xcs functional we use PBEsol[44] with an effective Hubbard-like correction, Ueff = 4 eV, for the Fe d orbitals according to Dudarev’s approach.[45] G-type antiferromagnetism is adopted for all the calculations. The machine learning model is created based on a Support Vector Machine Regression (SVR) method[46,47] as implemented in the Scikit-learn[48] library. We use a SVR model with a radial basis function (RBF) as the kernel function. Using the approach illustrated in Figure , we generate 9569 structures and energies using DFT (see Sec. and split the data into a training set (80%, 7655 data points) and a test set (20%, 1914 data points). We further perform a 5-fold cross-validation scheme to optimize the cost constant (C) and the RBF free parameter (γ) to be C = 10 and γ = 1 for the best performance. We finally use these parameters to evaluate the model on the test set.

22 in total

1. Ab initio molecular-dynamics simulation of the liquid-metal-amorphous-semiconductor transition in germanium.

Authors:
Journal: Phys Rev B Condens Matter Date: 1994-05-15

2. Fast and accurate modeling of molecular atomization energies with machine learning.

Authors: Matthias Rupp; Alexandre Tkatchenko; Klaus-Robert Müller; O Anatole von Lilienfeld
Journal: Phys Rev Lett Date: 2012-01-31 Impact factor: 9.161

3. Restoring the density-gradient expansion for exchange in solids and surfaces.

Authors: John P Perdew; Adrienn Ruzsinszky; Gábor I Csonka; Oleg A Vydrov; Gustavo E Scuseria; Lucian A Constantin; Xiaolan Zhou; Kieron Burke
Journal: Phys Rev Lett Date: 2008-04-04 Impact factor: 9.161

4. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set.

Authors:
Journal: Phys Rev B Condens Matter Date: 1996-10-15

5. Projector augmented-wave method.

Authors:
Journal: Phys Rev B Condens Matter Date: 1994-12-15

6. Designing a robustly metallic noncenstrosymmetric ruthenate oxide with large thermopower anisotropy.

Authors: Danilo Puggioni; James M Rondinelli
Journal: Nat Commun Date: 2014-03-17 Impact factor: 14.919

7. Finding the Next Superhard Material through Ensemble Learning.

Authors: Ziyan Zhang; Aria Mansouri Tehrani; Anton O Oliynyk; Blake Day; Jakoah Brgoch
Journal: Adv Mater Date: 2020-12-04 Impact factor: 30.849

8. Universal fragment descriptors for predicting properties of inorganic crystals.

Authors: Olexandr Isayev; Corey Oses; Cormac Toher; Eric Gossett; Stefano Curtarolo; Alexander Tropsha
Journal: Nat Commun Date: 2017-06-05 Impact factor: 14.919

9. Ellipsoidal analysis of coordination polyhedra.

Authors: James Cumby; J Paul Attfield
Journal: Nat Commun Date: 2017-02-01 Impact factor: 14.919

10. Liberating a hidden antiferroelectric phase with interfacial electrostatic engineering.

Authors: Julia A Mundy; Bastien F Grosso; Colin A Heikes; Dan Ferenc Segedin; Zhe Wang; Yu-Tsun Shao; Cheng Dai; Berit H Goodge; Quintin N Meier; Christopher T Nelson; Bhagwati Prasad; Fei Xue; Steffen Ganschow; David A Muller; Lena F Kourkoutis; Long-Qing Chen; William D Ratcliff; Nicola A Spaldin; Ramamoorthy Ramesh; Darrell G Schlom
Journal: Sci Adv Date: 2022-02-02 Impact factor: 14.136