
How Machine Learning Will Revolutionize Electrochemical Sciences.

Aashutosh Mistry¹, Alejandro A Franco²,³,⁴,⁵, Samuel J Cooper⁶, Scott A Roberts⁷, Venkatasubramanian Viswanathan⁸.

Abstract

Electrochemical systems function via interconversion of electric charge and chemical species and represent promising technologies for our cleaner, more sustainable future. However, their development time is fundamentally limited by our ability to identify new materials and understand their electrochemical response. To shorten this time frame, we need to switch from the trial-and-error approach of finding useful materials to a more selective process by leveraging model predictions. Machine learning (ML) offers data-driven predictions and can be helpful. Herein we ask if ML can revolutionize the development cycle from decades to a few years. We outline the necessary characteristics of such ML implementations. Instead of enumerating various ML algorithms, we discuss scientific questions about the electrochemical systems to which ML can contribute.


Year: 2021 PMID: 33869772 PMCID: PMC8042659 DOI: 10.1021/acsenergylett.1c00194

Source DB: PubMed Journal: ACS Energy Lett Impact factor: 23.101

Clean energy, pure water, reduced air pollution, and sustainable fuels are some of the most urgent global challenges that must be answered within the next few decades.[1] Electrochemical systems are promising technologies for many of these quests.[2,3] These devices function via interconversion of electric charge and chemical species. In turn, they intrinsically offer a direct control over the desired chemical transformation by externally modulating electricity. For example, the chemical energy stored in a battery can be converted to electricity on demand. Another example is electrochemical conversion of CO2 to useful fuels, where the amount and selectivity can be controlled by the electrochemical driving force. However, the successful implementations of electrochemical systems are rather limited, as we lack the material systems that exhibit the desired performance and longevity for these applications. These materials typically perform multiple functions, and the challenge is to find not only materials with appropriate functionalities but also the ones exhibiting these functions efficiently. To further complicate this process, the electrochemical systems contain multiple material phases—electrode and electrolyte in the simplest form—and the overall functionality strongly relates to how these phases interact with each other (in addition to their individual behavior). Accordingly, the development times have been historically very long, e.g., the first commercial Li-ion battery took about two decades, and all subsequent chemistries have required a decade or longer for the lab-to-market transition.[4] Traditionally, this development has been through trial and error for discovering promising materials and subsequently a sequential process of understanding their individual and joint electrochemical responses. We must shorten this time frame to come up with feasible solutions to the aforementioned global challenges. 
One can condense the development cycle for any electrochemical system into answering the four essential why questions identified in Figure 1:

Figure 1

Research, development, and deployment tasks in any electrochemical system involve fundamentally four why questions. Each implicitly identifies the length and time scales of interest, thus specifying how to answer these questions using experiments and modeling as the tools. The sub-figures in the bottom panel are drawn as modules of energy storage systems and can be used to represent equivalent examples of other electrochemical systems. [Reprinted with permission from ref (5). Copyright 2020 The Electrochemical Society.]

① Relationship between structure and relevant property, e.g., how the molecular structure of an electrolyte relates to properties describing ion transport. Here structure can be the atomic/molecular structure, the crystal structure of bulk phases, or the porous structure of electrodes. Equivalently, the relevant properties differ.
② Property ↔ performance relationship describes how different properties (and, in turn, the corresponding processes) come together to define an observable electrochemical response.
③ Design and control deal with how to scale up to commercial systems and their operation, for example, how to combine cells to make a battery pack and modulate its operation.
④ Comparing viability of different electrochemical systems for a given task: a battery designed for electric vehicles is not suitable for electric aircraft or storing energy on the grid.

These four questions are valid across any electrochemical system, since the fundamental interactions, such as ion transport, reactions, porous electrodes, etc., are the common denominator.[6] Given the authors’ primary research focus, batteries are used as tangible examples illustrating the concepts, but one can easily find equivalent specific examples for any electrochemical system of interest. Of these four questions, the smaller scale questions, ① and ②, represent the electrochemical sciences and prolong the development process.
Any new material comes with its own peculiarities, and its behavior has to be understood sufficiently for commercialization. Electrochemical sciences examine these smaller scale phenomena that are strongly material dependent and prohibit us from naively assuming similarities to previously explored materials (larger scales are comparatively material agnostic). Physics-based analysis has increasingly become commonplace to quantitatively describe structure ↔ property and/or property ↔ performance relationships and facilitate predictability across scales.[7−16] Such model predictions decrease the experimental efforts as well as identify the rate-limiting processes to guide material development, thus rationalizing the otherwise empirical development scheme. An implicit assumption in these physics-based models is that the physics of the material response is accurately known. While the fundamental laws governing material behavior, e.g., conservation of mass, energy balance, etc., are unambiguously known, multiple processes simultaneously contribute to each of these; for example, reactions and transport both contribute to species balance. Furthermore, one has to sufficiently characterize these processes (in terms of relevant constitutive relations and corresponding material properties). Machine learning (ML), on the other hand, is a type of data-driven modeling that makes predictions without knowing the underlying physics. The data-driven nature of ML substitutes knowledge of the underlying physical mechanisms with many observations of system behavior. This has revolutionized many domains in the past decade,[17,18] especially where large datasets are available. Successful ML applications typically rely on abundant data, be it speech patterns to train personal assistants (e.g., Apple’s “Siri”), purchase history to predict consumer preferences (e.g., Amazon), or video data to train self-driving cars (e.g., Comma’s “openpilot”). 
This success of ML in the technology sector might lead one to expect a similar shift in the sciences.[19,20] However, breakthroughs in science have traditionally relied on our ability to understand, reason, and formalize underlying physical mechanisms. The data-based character of ML appears insufficient to answer such scientific questions. Accordingly, the time scale and nature of the ML revolution in sciences will be different. This dichotomy between the physics-based nature of scientific discoveries and data-driven nature of ML has cornered its visible scientific applications to the data-heavy end of the spectrum, such as automated experiments[21] and data-driven predictions for battery aging.[22] The electrochemical sciences are meant to offer rational guidelines for designing electrochemical systems. The predictability of the material response is essential to the rational design. Both data-driven and physics-based approaches facilitate predictability and offer complementary information. Accordingly, the choice of analysis is driven by the questions the investigator chooses to ask (a secondary criterion is the efforts required in pursuing each approach). For example, consider making a high-performing Li-ion porous electrode using prescribed materials, such as nickel manganese cobalt oxide (NMC). A data-driven solution is to make multiple porous electrodes—each with different material compositions (active material : carbon : binder weight fractions), porosities, and thicknesses—and carry out electrochemical measurements of the resulting performance across (dis)charge rates of interest. Once such a dataset of controlled factors (compositions, porosities, and thicknesses) and corresponding outcomes (e.g., energy and power) is available, data analysis identifies an optimally performing electrode. 
Such an approach identifies the optimal electrode within the design space studied, but it does not offer any insight into why this electrode configuration is the optimal one. Therefore, if one were to change the active material to a different chemistry or even just change the particle morphology, the previously generated dataset would lose nearly all usefulness. The physics-based understanding of the porous electrode performance answers the why question by relying on intrinsic material properties (e.g., diffusivities, reaction rate constants, etc.) and predicting the performance differences across a variety of electrodes having different geometrical arrangements. The underlying cause for the resultant performance is precisely identified in this approach, and any ambiguity is related to inaccurate properties or incomplete physics. Alternatively, if we combine both approaches, we would use the measured performance (data) and the physics rules to characterize the geometrical properties of the electrodes.[23] This amounts to creating a structure–property–performance mapping, a generalized thought across many material systems (Figure 1), that provides more, as well as quantitatively precise, information (e.g., uncertainty bounds) than either of the approaches alone and answers the following questions: What electrode specifications lead to better performance? Why does a particular electrode specification lead to better performance? How can the understanding developed by studying a particular set of electrode materials be translated to other materials? Thus, instead of the either-or fallacy, we should explore combinations of physics- and data-driven predictions to unlock the true potential of ML for sciences. A judicious combination of data-driven and physics-based approaches can speed up scientific discoveries by translating mechanistic information across systems using physics (i.e., causation) and substituting unknown or complex physics via data (i.e., correlation).
With the help of physics, one can partially relax the data overhead since the physics-constrained behavior can be approximated using a limited dataset. This is particularly suited for ML applications in sciences[19] where the observed response satisfies fundamental laws such as energy conservation, entropy generation, charge neutrality, etc. The goal is to improve predictive accuracy while minimizing efforts. This approach also aids in the development of transferable functions. In a conventional physics-based analysis, the accuracy is improved by progressively introducing advanced constitutive relations, while the fundamental laws remain unchanged (for example, replacing dilute solution theory with concentrated electrolyte transport). In a typical data-driven model, the accuracy is improved by adding data points. If sufficient data is available, the underlying physics can be approximated, and if the physics is accurately known, the observed behavior can be explained. However, either approach becomes prohibitively expensive as more accuracy is desired. If pursued alone, accuracy and efforts scale positively for each approach. For scientific discoveries, neither sufficient data nor accurate physics is known, and a suitable combination of the two approaches is an efficient path forward to simultaneously improve accuracy and reduce efforts. The subsequent electrochemical examples will illustrate these ideas. The examples are presented in the order of increasing length and time scales in Figure 1.

Predicting Material Properties

For Li intercalation materials such as NMC, the thermodynamic energy storage response is prescribed as voltage for different extents of intercalated Li.[24] Density Functional Theory (DFT) calculations can, in principle, provide this information. However, the task becomes computationally prohibitive if one wishes to compute the open-circuit voltage for all possible combinations of Ni, Mn, and Co contents over multiple Li intercalation states.[25] The problem becomes even less tractable in the presence of additional dopants/impurity atoms. Herein, ML surrogates offer a reasonable solution. Based on selected DFT calculations, an ML model can be developed that accurately predicts the inter-species interactions and honors the requisite geometrical symmetries and invariances.[26] Using these ML potentials, one can accurately explore the open-circuit voltage over a quaternary composition space of Li, Ni, Mn, and Co. This approach effectively changes how we answer the first question in Figure 1. ML potentials have vastly improved in accuracy and reliability[27,28] and are approaching the accuracy of ab initio methods at a minuscule fraction of the computational cost. Such computational improvements relate to the choice of regression (i.e., approximation of the underlying trends) as well as featurization of the structure information.[29,30] The featurizations are also necessary and effective for unsupervised learning in materials classification and inference.[31] Additionally, these techniques have been shown to accurately and efficiently expand to many-component systems,[32] enabling design searches that were not possible previously.
In recent work, featurization using atom-centered symmetry functions, with a neural network as the regressor, was used to generate the voltage profile and lattice structure dynamics as a function of Li intercalation states for any arbitrary NMC composition, marking the first step toward a computationally feasible optimization workflow for relevant performance properties of cathode[24] and anode materials.[33] The ML potentials are seeing incredible progress[34] toward increasing the generalizability, extrapolation capabilities, and principled selection of featurization and hyperparameters.[31] Such progress can lead to mapping high-fidelity multi-component (n > 5) phase diagrams to discover new battery electrode and electrolyte materials in the coming years.
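To make the idea of a featurization that honors geometrical invariances concrete, the following is a minimal sketch of one widely used radial atom-centered symmetry function (the Behler-Parrinello G2 form). It is not the implementation used in the cited works: the toy coordinates and the values of `eta`, `r_s`, and `r_c` are hypothetical, and chemical species are ignored for brevity. The descriptor depends only on interatomic distances, so it is unchanged when the structure is rotated or translated.

```python
import math

def cutoff(r, r_c):
    """Smooth cosine cutoff: 1 at r = 0, decaying to 0 at and beyond r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def radial_symmetry_function(positions, i, eta, r_s, r_c):
    """G2-type radial descriptor of the environment around atom i."""
    total = 0.0
    for j, pos_j in enumerate(positions):
        if j == i:
            continue
        r_ij = math.dist(positions[i], pos_j)  # only distances enter
        total += math.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)
    return total

# Hypothetical four-atom cluster (coordinates in angstrom).
atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 5.0)]
g = radial_symmetry_function(atoms, 0, eta=0.5, r_s=2.0, r_c=6.0)

# Rotate by 90 degrees about z, (x, y, z) -> (-y, x, z), then translate.
rotated = [(-y + 1.0, x - 3.0, z + 0.5) for x, y, z in atoms]
g_rot = radial_symmetry_function(rotated, 0, eta=0.5, r_s=2.0, r_c=6.0)

print(g, g_rot)  # the two descriptor values agree
```

In practice a set of such functions with different `eta` and `r_s` values (plus angular terms) would form the input vector to the regressor; a single function is shown here only to illustrate the invariance.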

Rational Electrode Manufacturing

A philosophically equivalent question arises while defining the mapping from porous electrode structure (mesostructure) to corresponding effective properties such as tortuosity factor. As the mesostructure is set during the electrode manufacturing stage, one can go a step further and correlate electrode manufacturing to mesostructure properties. For the same electrode materials, the mesostructure properties describe the variations in the electrochemical performance. While the physical modeling of the manufacturing processes has received some attention,[35−37] the data-driven approaches[38] are just emerging. We essentially face two interrelated challenges: unraveling the influence of manufacturing parameters (e.g., recipe, calendering pressure) and determining the role of different processing steps on the final electrode mesostructure. Classically, physical models can be used to simulate each process step and combine them through sequential multiscale coupling.[15] For example, calculated electrode slurries[40] can be used in the simulation of their drying,[35] and the dried electrode mesostructures can be used as inputs for calendering simulations.[41] The resulting geometrical arrangement of the electrodes can then be used in electrochemical performance simulators to establish the manufacturing–mesostructure–performance links.[36] ML models are efficient tools in ensuring the experimental validity of such involved multiscale computational models.
For instance, ML models have been used to correctly parameterize force fields used in the coarse-grained simulation of electrode slurries.[40] They ensured a proper matching of calculated and experimental properties (e.g., viscosity vs applied shear rate) with about 20 times reduction in efforts (from 6 months to 8 days) compared to manual parameterization.[40] ML can also be used in combination with surrogate models to bypass these expensive physical simulations (which usually solve the dynamics of a very large number of particles[37,40]) and to accelerate the optimization of manufacturing parameters. For instance, a surrogate modeling approach informed with experimental data to predict electrode mesostructures in three dimensions and their properties has been recently proposed.[39] The experimental data and the surrogate model results are used to successfully train an ML model to predict the influence of calendering conditions on the electrode properties, such as the tortuosity factor (Figure 2a).

Figure 2

(a) Example of a workflow coupling experimental data, a surrogate electrode mesostructure predictor, and ML (Sure Independent Screening and Sparsifying Operator) to predict the impact of electrode composition, initial porosity, and calendering pressure on the electrode tortuosity factor. [Reprinted with permission from ref (39). Copyright 2020 Elsevier.] (b) Example of a classification machine learning algorithm (Support Vector Machine) able to predict the impact of the percentage of NMC active material, solid-to-liquid ratio, and viscosity of the slurry on the final porosity of a lithium ion battery positive electrode. [Reprinted with permission from ref (38). Copyright 2019 Wiley-VCH GmbH.]

Another way to approach these problems is to apply ML directly to experimental data. This works only if accurate experimental measurements are available for electrodes prepared under different conditions (composition, solid-to-liquid ratio, etc.). ML has been employed to map electrode properties, e.g., porosity as a function of the manufacturing conditions, as shown in Figure 2b.[38] Once such a mapping is generated, it is used to identify optimal conditions for electrode manufacturing.

Accurate 3D Mesostructures

Instead of sequentially building mesostructure ↔ effective properties and effective properties ↔ electrochemical performance relationships, if detailed mesostructure information is available, one may directly simulate electrochemical interactions at the pore scale. X-ray computed tomography (XCT) and other advances in 3D imaging allow us to study the composition and structure of critical materials as manufactured, rather than using idealized representations. The use of such realistic geometries is directly related to higher fidelity predictions of the electrochemical responses of these materials. However, many challenges are prevalent in obtaining accurate 3D mesostructures, including image segmentation (i.e., assigning correct material phase to each voxel) and the effort required for 3D imaging, resulting in limited datasets. Convolutional neural networks (CNNs) are particularly suited for image segmentation using supervised learning methods. Unlike 2D image analysis in other fields, electrodes are 3D and require appropriate customization to typical CNN algorithms.[42,44] Figure 3a–c shows a recent application of CNN-based image segmentation for graphite anode materials. In this and other cases,[44] CNNs are shown to produce more convincing segmentations than several conventional segmentation approaches. Amazingly, CNNs can even generate segmentations that are, in a sense, more reliable than the training data used to produce them, as they apply their learned rules consistently over the whole volume, which can be difficult for a human when manually segmenting billion-voxel volumes. Crucially, the segmentations are based on features resulting from 3D convolutions, meaning that non-trivial (i.e., not “thresholded”) segmentations result and imaging artifacts (such as varying brightness) can be overcome.
The training itself is the computationally intensive step for CNNs, but once trained, inferences are very fast (orders of magnitude faster than manual segmentation) and repeatable. Such CNNs are specific to particle morphology, i.e., segmenting graphite vs NMC electrodes. In other words, a CNN trained on one electrode can be used to convincingly segment many electrode samples of the same type, but likely not a different particle morphology without additional training.

Figure 3

(a–c) Comparison between human (b) and CNN (c) segmentations of 3D XCT images. (d) Bayesian CNNs used to quantify the uncertainty in image segmentations.[42] (e, f) Application of GANs to create unique, yet realistic, mesostructures.[43]

Since training data derived from real images is never perfect, it is important to characterize associated uncertainties. An emerging direction is to combine Bayesian inference with CNNs to quantify uncertainties. By probing the trained variances in the weights of such networks, uncertainty maps can be generated (Figure 3d). 3D image uncertainties can then be propagated to subsequent physics calculations, for example, porosity, effective property, and electrochemical predictions (unpublished results). In addition, following segmentation, Generative Adversarial Networks (GANs) are now being developed to learn the phase arrangement in segmented data and generate mesostructure realizations with customized properties in volumes larger than could be obtained from imaging alone (Figure 3e,f).[43]
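As an illustration of how a segmentation uncertainty map might be propagated to a derived quantity, the sketch below samples binary pore/solid realizations from per-voxel pore probabilities and reports the resulting spread in porosity. This is a simplified Monte Carlo scheme, not the method of the cited work: the probability values are hypothetical, and the voxels are assumed statistically independent, which ignores the spatial correlation present in real images and therefore tends to understate the uncertainty.

```python
import random
import statistics

def sample_porosity(pore_probability, n_samples=200, seed=0):
    """Monte Carlo estimate of porosity mean and spread.

    pore_probability: per-voxel probability that the voxel is pore
    (assumed to come from an uncertainty-aware segmentation).
    """
    rng = random.Random(seed)
    porosities = []
    for _ in range(n_samples):
        # Draw one binary realization of the segmented volume.
        pore_voxels = sum(1 for p in pore_probability if rng.random() < p)
        porosities.append(pore_voxels / len(pore_probability))
    return statistics.mean(porosities), statistics.stdev(porosities)

# Hypothetical flattened volume: confidently solid voxels, confidently
# pore voxels, and an uncertain band near phase boundaries.
probs = [0.02] * 500 + [0.98] * 400 + [0.5] * 100
mean_eps, std_eps = sample_porosity(probs)
print(f"porosity = {mean_eps:.3f} +/- {std_eps:.3f}")
```

The same sampled realizations could, in principle, be passed to an effective-property or electrochemical solver in place of the porosity calculation, at correspondingly higher cost per sample.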

Estimating Properties from Experiments

Typically the effective mesostructure properties ↔ electrochemical performance mapping is used to explore how performance varies with effective properties. This mapping can be inverted to characterize effective properties if appropriate performance measurements are available. As shown in Figure 4, first physics-based performance calculations are carried out for multiple effective property combinations. Once such a dataset is available, the data-driven modeling is used to generate such mappings. Subsequently, it is used to estimate mesostructure properties from performance measurements.[23] The data-driven modeling avoids explicitly solving the governing equations for all possible combinations of property values, which is prohibitively expensive.
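The three steps described above (a few physics-based runs, a data-driven mapping built from them, and inversion against measurements) can be sketched as follows. Everything specific here is an assumption made for illustration: `physics_model` is a hypothetical closed-form stand-in for an expensive porous-electrode simulation, the single unknown property is a tortuosity factor `tau`, the surrogate is a simple piecewise-linear interpolation, and the "measurement" is synthetic.

```python
def physics_model(c_rate, tau):
    """Hypothetical stand-in for an expensive simulation: fraction of
    capacity delivered at a given C-rate for tortuosity factor tau."""
    return 1.0 / (1.0 + 0.1 * tau * c_rate)

c_rates = [0.5, 1.0, 2.0, 4.0]          # multiple currents give sensitivity
tau_samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Step 1: run the "expensive" model for a select few property values.
library = {tau: [physics_model(c, tau) for c in c_rates] for tau in tau_samples}

def surrogate(tau):
    """Step 2: data-driven mapping, here linear interpolation in tau."""
    lo = max(t for t in tau_samples if t <= tau)
    hi = min(t for t in tau_samples if t >= tau)
    if lo == hi:
        return library[lo]
    w = (tau - lo) / (hi - lo)
    return [(1.0 - w) * a + w * b for a, b in zip(library[lo], library[hi])]

def misfit(tau, measured):
    """Squared difference between measurement and surrogate prediction."""
    return sum((m - s) ** 2 for m, s in zip(measured, surrogate(tau)))

# Synthetic "measurement" generated with a tau the surrogate never saw.
measured = [physics_model(c, 3.4) for c in c_rates]

# Step 3: search the cheap error landscape for the best-matching property.
grid = [1.0 + 5.0 * k / 500 for k in range(501)]
tau_est = min(grid, key=lambda t: misfit(t, measured))
print(f"estimated tortuosity factor: {tau_est:.2f}")
```

The estimate carries a small bias from the interpolation error of the surrogate, which is one reason the source stresses checking the recovered properties against the measurements afterward.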

Figure 4

(a) Measured electrode performance is interpreted using (b) physics-based electrochemical description. (c) The difference between the two is mapped in terms of mesostructure properties using data-driven modeling. The most representative properties are retrieved using this error landscape. (d) Experiments and predictions using interpreted mesostructure properties are shown to illustrate reliability of analysis. [Used with permission from Mistry et al., ref (23).]

For example, consider identifying mesostructure properties, e.g., tortuosity factor, from the electrochemical performance of porous electrodes, as shown in Figure 4. Not every mesostructure property ↔ electrochemical performance mapping can be inverted, and accordingly one must ensure that the mapping is sensitive to every property one wishes to estimate. Figure 4c is an example mapping generated for a given experimental dataset (Figure 4a) and physics-based porous electrode theory responses (Figure 4b) based on a select few property combinations. Herein the sensitivity to each property is achieved by comparing performance at multiple currents (C-rates). The accuracy of such an approach is presented in Figure 4d by comparing measurements against the physics-based predictions using the estimated mesostructure properties.

In essence, ML builds reduced order (or surrogate) models from data. The model building is an iterative process where the reliable approximation of the datasets is not known beforehand (refer to “Model Parameters and Data Accuracy” in the Supporting Information). If pursued as a purely data-driven problem, the usefulness of such models is limited. The fidelity of ML predictions is constrained by (i) the quality and quantity of the training data and (ii) the appropriateness of the function representation. It is implicitly assumed that, given sufficient data and suitable function, the necessary trends can be learned efficiently.
It is possible that the chosen representation is effort-intensive to learn, and either a customized learning approach (to find model coefficients faster) or a different representation (to speed up learning) is required for a practical ML implementation. To illustrate these nuances, consider having a set of discrete measurements of diffusivity, D, at different temperatures, T. This discrete information needs to be converted into a continuous function for further analysis, such as obtaining activation energies from the slope or using the D = D(T) property relation in a temperature-dependent analysis. Figure 5 shows three different datasets in each of the columns, and two different Neural Network (NN) representations are used to learn the underlying trends (each row respectively). The datapoints contain inaccuracies (noise in the measurements). The learning ensures that the model predicts the training data accurately, while a similar accuracy is not necessarily guaranteed for predicting datapoints not part of the training set. For example, Figure 5e,f shows that predicted trends exhibit drastic changes away from the training datapoints. Note that not just extrapolation but also interpolation in between the two data clusters is questionable.
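The D(T) discussion can be reproduced in miniature with the sketch below. It assumes an Arrhenius form, D = D0 exp(−Ea/(kB T)), so that ln D is a straight line in 1/T whose slope gives the activation energy. The ground-truth parameters, temperatures, and noise values are all hypothetical. A polynomial that passes exactly through every noisy training point stands in for the over-flexible representation; it is not one of the neural networks used in the source. Both models are then scored on held-out points lying between the two data clusters.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def fit_line(x, y):
    """Ordinary least-squares straight line y = slope * x + intercept."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def lagrange(x_nodes, y_nodes, x):
    """Polynomial through every training point (zero training error)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x_nodes, y_nodes)):
        weight = 1.0
        for j, xj in enumerate(x_nodes):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total

# Hypothetical ground truth: Ea = 0.30 eV, D0 = 1e-3 (arbitrary units).
E_A_TRUE = 0.30
LN_D0_TRUE = math.log(1.0e-3)

def ln_d_true(x):
    """ln D as a function of x = 1000 / T, with T in kelvin."""
    return LN_D0_TRUE - E_A_TRUE / K_B * x / 1000.0

# Two clusters of training points with fixed, made-up measurement noise.
x_train = [2.0, 2.1, 2.2, 2.3, 3.0, 3.1, 3.2, 3.3]
noise = [0.05, -0.04, 0.06, -0.05, -0.04, 0.06, -0.05, 0.03]
y_train = [ln_d_true(x) + e for x, e in zip(x_train, noise)]

# Simple model: straight line in (1000/T, ln D); slope -> activation energy.
slope, intercept = fit_line(x_train, y_train)
ea_fit = -slope * K_B * 1000.0

# Held-out points between the clusters, never used for fitting.
x_test = [2.5, 2.65, 2.8]
line_err = max(abs(slope * x + intercept - ln_d_true(x)) for x in x_test)
flex_err = max(abs(lagrange(x_train, y_train, x) - ln_d_true(x)) for x in x_test)

print(f"fitted Ea = {ea_fit:.3f} eV")
print(f"held-out error in ln D: line = {line_err:.3f}, flexible = {flex_err:.3f}")
```

The flexible model reproduces the training data perfectly yet deviates strongly between the clusters, whereas the straight line tolerates the noise. Only the held-out comparison reveals this, since the training error alone favors the flexible model.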

Figure 5

Data-dependent characteristics of ML are illustrated by learning D(T) relation from discrete datapoints using two NN representations (with Sigmoid activation functions) shown in the insets. Columns represent different data complexity, while rows express model complexity. The solid red line is the trained model in each plot.

Approaching this as a data-driven modeling question, testing the model accuracy on a dataset not used for training can help expose and manage artifacts. The model complexity is intrinsically tied to the accuracy of the dataset. Compare Figure 5, panels b and e, having identical datapoints: the simpler representation in (b) is reliable if the data contains inaccuracies, while the more complex representation in (e) is meaningful if the datapoints are reliable. (“Model Parameters and Data Accuracy” in the Supporting Information further discusses the connection between model complexity and data reliability; model complexity often scales with the number of model parameters.) Alternatively, the physics can guide through this impasse. The slope of log(D) vs 1/T in Figure 5 represents activation energy and is typically a positive and slowly varying property (if at all). Accordingly, the trends in Figure 5e,f are likely unphysical. These qualifications are easier to make from Figure 5 where a one-dimensional dataset is explored, but become quite difficult to identify when higher dimensional datasets are studied. Appropriately pre-processing datasets using physical symmetries or geometrical invariances (known as feature engineering), for example, training log(D) vs 1/T, instead of D vs T, helps considerably with building data-driven models. Since any ML implementation relies on data, data generation and curation are crucial steps. If data is generated through experiments, one must ensure repeatability and reproducibility of measurements.
Such precautions minimize systematic errors so that the remaining variability is a true random error and can be analyzed statistically. If data is instead generated using physics-based calculations, the accuracy of computed trends in deterministic simulations and reliability of statistics in stochastic simulations must be ensured. Essentially, one should be mindful of the confidence in the raw data and how the uncertainty propagates to predictions. One must also be wary of over-fit models (often nicer-looking fits of the data) that may not be useful or predictive outside of the scope in which they are fit. Typically, the datasets are not so simple as D = D(T) that one can visually assess the reliability of the data-driven model. In addition to rigorous verification of model accuracy, we should also focus on interpreting these approximations. Either our intuition needs to evolve to comprehend the information flow or we need to visually express the data-driven models for human interpretation. The interpretation is essential to generating insights from data, identifying limiting mechanisms, and making decisions. When combined with physics, the overall analysis scheme offers both more accurate correlations and clearer causality.[45] Most of the examples discussed so far train ML on explicit physics-based calculations (physics-informed mappings). An alternative is to modify the training process to explicitly follow physics-based governing equations[46−48] (which should be referred to as physics-encoded mappings). Materials discovery[49−52] is a promising ML application. Atomic- or molecular-scale calculations are performed over a wide range of compounds to map atomic/molecular variations to macroscopically relevant properties.
For example, electrolytes with different solvent molecules can be analyzed to map molecular structure to ionic conductivity.[53] Such structure-to-property maps (① in Figure 1) reliably compute properties for new structures without having to do explicit physics-based calculations once the map is built. For target property values, these maps can be used in an inverse fashion to identify essential structural attributes for the property targets.[54] A seemingly different but philosophically equivalent application is the calculation of effective properties from 3D mesostructures. The traditional approach is to solve 3D species conservation equations. ML can speed this up by mapping 3D mesostructures to corresponding effective properties.[39,55] Afterward, new 3D mesostructures of a similar type do not require 3D physics calculations since the physics is implicitly captured in the mapping. Taking this idea a step further, ML can streamline electrode manufacturing–mesostructure–effective properties–electrochemical performance mapping in a physically consistent fashion (Figure ). Such a mapping allows one to track the influence of a processing step on performance and, in turn, rationally design porous electrodes for the target performance. Present-day electrode processing controls the bulk specifications such as composition and porosity, but with advances in 3D printing, in the future, we should be able to explicitly control electrode arrangement by leveraging the aforementioned structure–property–performance mapping. An alternative to building such structure ↔ property and property ↔ performance mappings (① and ② in Figure 1) is to simultaneously resolve all scales using a suitable physics-based approach.
A new paradigm of exascale computing has been introduced recently that aims to build computing solutions catering to such expensive problems.[56] Exascale computing is ideally suited for simultaneously resolving multiple length scales, such as performing DFT or ab initio calculations for length and time scales approaching continuum behavior or simulating electrochemical interactions of large 3D porous electrodes (∼100 μm thick and ∼1000 × 1000 μm² cross-section) with pore-scale resolution. Alternatively, an appropriate combination of ML and physics-based simulations may offer a computationally less expensive solution where physics-based simulations work at different scales and these scales are coupled through ML. For example, as discussed earlier, the force fields from a DFT simulation can be machine learned and separately used in molecular dynamics or Monte Carlo simulations. Such a solution essentially replaces the hardware (e.g., exascale computing) requirements with specialized software development. As these physics-based simulations produce larger and larger datasets, their interpretation becomes challenging. ML can parse these datasets to identify the relevant information that researchers should visualize. Consider a 3D simulation of an intercalating porous electrode[13,36] where multiple small-scale entities jointly reproduce a macroscopic response. Given the sheer number of such entities, it is infeasible (and unnecessary) to visually track each of them. Rather, the interest is in visualizing norms and outliers. For this electrode, the representative particles are the ones whose lithiation follows the macroscopic response (the norms) and those severely lagging or leading (i.e., outliers). 
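As a hedged illustration of the norms-and-outliers idea just described, the sketch below labels particles by how far their lithiation leads or lags the electrode-average response. It uses a simple robust z-score (median and median absolute deviation), which is a swapped-in stand-in and not a method taken from the cited works. The synthetic trajectories, the equal-weight average used as the macroscopic response, the threshold, and the function name `classify_particles` are all assumptions made for illustration.

```python
import numpy as np

def classify_particles(soc, threshold=3.5):
    """Label each particle as 'norm', 'lagging', or 'leading'.

    soc: array of shape (n_particles, n_times) with particle state of charge.
    Returns (labels, macro), where macro is the particle-averaged response.
    """
    macro = soc.mean(axis=0)                # assumed macroscopic response
    offset = (soc - macro).mean(axis=1)     # time-averaged lead (+) or lag (-)
    med = np.median(offset)
    mad = np.median(np.abs(offset - med))   # robust spread estimate
    scale = 1.4826 * mad if mad > 0 else 1.0
    z = (offset - med) / scale              # robust z-score per particle
    labels = np.where(z > threshold, "leading",
                      np.where(z < -threshold, "lagging", "norm"))
    return labels, macro

# Synthetic data (assumption): each particle lithiates linearly in time at
# its own rate until full; two particles are planted as outliers.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
rates = rng.normal(1.0, 0.02, size=200)
rates[0] = 0.5   # severely lagging particle
rates[1] = 1.6   # severely leading particle
soc = np.clip(rates[:, None] * t[None, :], 0.0, 1.0)

labels, macro = classify_particles(soc)
```

Only the particles labeled as lagging or leading, plus a few labeled as norms, would then need to be visualized instead of the full population.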
Unsupervised learning is suitable to parse through the simulation data and identify such representative events.[31,57,58] Alternatively, the dimensionality of the data can be reduced to correlate the most essential features.[59] An operational constraint in executing such a multiscale investigative scheme is the development time of the physics-based simulation for mesoscale interactions. Smaller (quantum, atomic, molecular) and larger (porous electrode and above) scales have relatively mature computational methods, while the interactions at intermediate scales (mesoscale) range widely. Consequently, many methods exist, e.g., phase-field modeling, discrete element method, kinetic Monte Carlo, etc., each suitable for a specific set of interactions, and no off-the-shelf simulation tool can be directly applied to any new material system. ML can speed up this development by (at least partially) eliminating the overhead for manually learning a new method. Not only can it sift through literature to suggest solutions for a new problem, but it can also iterate through multiple simulations and automatically identify meaningful conditions. The hope is to let the researcher focus on understanding mechanisms and automate the tools used to probe these mechanisms. A philosophically similar example is Sony’s recently proposed music creation paradigm, which allows the artist to focus on creating the music without having to worry about the required instruments.[60] While ML offers a new toolset for scientific discoveries, not all ML can revolutionize electrochemical sciences. Any meaningful ML implementation needs to help identify promising materials or pinpoint mechanisms limiting material behavior so that the development cycle for the electrochemical systems can be shortened. 
Hence, we should focus on adopting and developing ML that provides more insights than before or allows us to pursue questions that have remained unanswered due to effort-intensive existing approaches.

20 in total

1. Building better batteries.

Authors: M Armand; J-M Tarascon
Journal: Nature Date: 2008-02-07 Impact factor: 49.962

2. Unsupervised machine learning in atomistic simulations, between predictions and understanding.

Authors: Michele Ceriotti
Journal: J Chem Phys Date: 2019-04-21 Impact factor: 3.488

3. Controlling Binder Adhesion to Impact Electrode Mesostructures and Transport.

Authors: Ishan Srivastava; Dan S Bolintineanu; Jeremy B Lechman; Scott A Roberts
Journal: ACS Appl Mater Interfaces Date: 2020-07-16 Impact factor: 9.229

Review 4. Inverse molecular design using machine learning: Generative models for matter engineering.

Authors: Benjamin Sanchez-Lengeling; Alán Aspuru-Guzik
Journal: Science Date: 2018-07-26 Impact factor: 47.728

5. Stability of Electrodeposition at Solid-Solid Interfaces and Implications for Metal Anodes.

Authors: Zeeshan Ahmad; Venkatasubramanian Viswanathan
Journal: Phys Rev Lett Date: 2017-08-04 Impact factor: 9.161

6. Closed-loop optimization of fast-charging protocols for batteries with machine learning.

Authors: Peter M Attia; Aditya Grover; Norman Jin; Kristen A Severson; Todor M Markov; Yang-Hung Liao; Michael H Chen; Bryan Cheong; Nicholas Perkins; Zi Yang; Patrick K Herring; Muratahan Aykol; Stephen J Harris; Richard D Braatz; Stefano Ermon; William C Chueh
Journal: Nature Date: 2020-02-19 Impact factor: 49.962

7. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost.

Authors: J S Smith; O Isayev; A E Roitberg
Journal: Chem Sci Date: 2017-02-08 Impact factor: 9.825

8. Quantitative Mapping of Molecular Substituents to Macroscopic Properties Enables Predictive Design of Oligoethylene Glycol-Based Lithium Electrolytes.

Authors: Bo Qiao; Somesh Mohapatra; Jeffrey Lopez; Graham M Leverick; Ryoichi Tatara; Yoshiki Shibuya; Yivan Jiang; Arthur France-Lanord; Jeffrey C Grossman; Rafael Gómez-Bombarelli; Jeremiah A Johnson; Yang Shao-Horn
Journal: ACS Cent Sci Date: 2020-06-18 Impact factor: 14.553

9. Machine-learning-revealed statistics of the particle-carbon/binder detachment in lithium-ion battery cathodes.

Authors: Zhisen Jiang; Jizhou Li; Yang Yang; Linqin Mu; Chenxi Wei; Xiqian Yu; Piero Pianetta; Kejie Zhao; Peter Cloetens; Feng Lin; Yijin Liu
Journal: Nat Commun Date: 2020-05-08 Impact factor: 14.919

10. Exascale applications: skin in the game.

Authors: Francis Alexander; Ann Almgren; John Bell; Amitava Bhattacharjee; Jacqueline Chen; Phil Colella; David Daniel; Jack DeSlippe; Lori Diachin; Erik Draeger; Anshu Dubey; Thom Dunning; Thomas Evans; Ian Foster; Marianne Francois; Tim Germann; Mark Gordon; Salman Habib; Mahantesh Halappanavar; Steven Hamilton; William Hart; Zhenyu Henry Huang; Aimee Hungerford; Daniel Kasen; Paul R C Kent; Tzanio Kolev; Douglas B Kothe; Andreas Kronfeld; Ye Luo; Paul Mackenzie; David McCallen; Bronson Messer; Sue Mniszewski; Chris Oehmen; Amedeo Perazzo; Danny Perez; David Richards; William J Rider; Rob Rieben; Kenneth Roche; Andrew Siegel; Michael Sprague; Carl Steefel; Rick Stevens; Madhava Syamlal; Mark Taylor; John Turner; Jean-Luc Vay; Artur F Voter; Theresa L Windus; Katherine Yelick
Journal: Philos Trans A Math Phys Eng Sci Date: 2020-01-20 Impact factor: 4.226
