Jaekyun Hwang1, Yuta Tanaka2, Seiichiro Ishino3, Satoshi Watanabe1,4.
Abstract
We propose a novel descriptor of materials, named 'cation fingerprints', based on the chemical formula or the concentrations of raw materials and their respective properties. To test its performance, we used it to predict the viscosity of glass materials from the experimental database INTERGLAD. Using artificial neural network models, we predicted the temperature at which a glass reaches a specific viscosity with a root-mean-square error of 33.0 °C. We were also able to evaluate the effect of a particular target raw material using a model trained without any data containing that raw material. The results show that cation fingerprints combined with a neural network model can predict some unseen combinations of raw materials. In addition, we propose a method for estimating the prediction accuracy by calculating the cosine similarity between the input features of the material to be predicted and those of the training data.
Keywords: 107 Glass and ceramic materials; 404 Materials informatics/Genomics; 60 New topics/Others; Chemical composition; glass; machine learning; materials informatics; neural network; oxide; viscosity
Year: 2020 PMID: 32939174 PMCID: PMC7476533 DOI: 10.1080/14686996.2020.1786856
Source DB: PubMed Journal: Sci Technol Adv Mater ISSN: 1468-6996 Impact factor: 8.090
Figure 1. Schematic diagram of the generation of cation fingerprints.
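Only the caption of Figure 1 survives here, so the following is a hedged sketch of one way such a fingerprint could be built, not the authors' exact procedure. It assumes that each cation's molar fraction is accumulated into a histogram over one physical property (Pauling electronegativity in this example), using the 20 bins reported as optimal in the hyperparameter table, and that one histogram per property would then be concatenated. The function name `cation_fingerprint`, the bin range, and the example composition are illustrative assumptions.

```python
def cation_fingerprint(cation_fractions, property_values, n_bins=20, lo=0.0, hi=4.0):
    """Histogram of cation fractions over one physical property.

    cation_fractions: dict cation -> molar fraction (assumed to sum to 1)
    property_values:  dict cation -> property value (e.g. electronegativity)
    lo, hi:           assumed property range covered by the bins
    """
    fingerprint = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for cation, fraction in cation_fractions.items():
        value = property_values[cation]
        # Clamp the index so out-of-range values land in the first or last bin.
        index = int((value - lo) / width)
        index = max(0, min(index, n_bins - 1))
        fingerprint[index] += fraction
    return fingerprint


# Pauling electronegativities of three cations.
electronegativity = {"Si": 1.90, "Al": 1.61, "Li": 0.98}
# Hypothetical cation fractions, for illustration only (not INTERGLAD data).
composition = {"Si": 0.70, "Al": 0.20, "Li": 0.10}

fp = cation_fingerprint(composition, electronegativity)
```

With these assumptions the fingerprint is a fixed-length vector whose entries sum to the total cation fraction, regardless of how many raw materials the glass contains.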
Hyperparameter tuning results. The optimization details of the other prediction models are listed in the supplementary material.
| Hyperparameter | Search space | Optimized value |
|---|---|---|
| Prediction model | Neural network, random forest, support vector machine, kernel ridge regression, linear ridge regression | Neural network |
| Activation function | Sigmoid, Tanh, ReLU | ReLU |
| Number of hidden layers | 1 to 6 | 2 |
| Number of neurons in each hidden layer | 70, 100, 140, 200 | 140 |
| L2 regularization scale | 0, 10^−5 to 10^−2 | 0.0001 |
| Number of physical properties | 1 to 7 | 7 |
| Number of bins | 5, 8, 10, 15, 18, 20, 24, 30, 50, 100 | 20 |
Figure 2. Prediction error of ANN models for different numbers of bins with seven physical properties. The two parallel dashed lines represent the prediction errors obtained with the elemental attributes and the molar ratio, respectively.
Test set root mean squared error (°C) of ANNs trained on one (diagonal components) or two physical properties. Training used (a) cation fingerprints and (b) elemental attributes. Each physical property is indicated by one capital letter (A: electronegativity, B: coordination number, C: density, D: ionic radius, E: formation enthalpy, F: formation entropy, and G: melting point).
Figure 3. Test error of ANNs trained on fewer physical properties. Each line represents the worst and the best choices from each combination.
Nine target materials with their respective test set size and training set size.
| Raw materials | MgO | Li2O | ZrO2 | PbO | SrO | P2O5 | TiO2 | Sb2O3 | SnO2 |
|---|---|---|---|---|---|---|---|---|---|
| Test set size | 3631 | 3044 | 2673 | 2593 | 2518 | 2036 | 1895 | 1775 | 1307 |
| Training set size | 8472 | 9059 | 9430 | 9510 | 9585 | 10,067 | 10,208 | 10,328 | 10,796 |
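In every column the test and training set sizes sum to 12,103, which is consistent with holding out every glass that contains the target raw material and training on the rest. A minimal sketch of such a split follows, assuming each sample is a dict mapping raw-material names to mol %. The `split_by_target` helper and the toy samples are hypothetical.

```python
def split_by_target(samples, target):
    """Hold out every sample that contains `target`; train on the rest."""
    test = [s for s in samples if s.get(target, 0.0) > 0.0]
    train = [s for s in samples if s.get(target, 0.0) <= 0.0]
    return train, test


# Toy compositions in mol %, for illustration only (not INTERGLAD data).
samples = [
    {"SiO2": 75.0, "Na2O": 25.0},
    {"SiO2": 70.0, "Na2O": 20.0, "MgO": 10.0},
    {"SiO2": 80.0, "Li2O": 20.0},
]

train, test = split_by_target(samples, "MgO")
```

A split of this kind guarantees that the model never sees the target raw material during training, so the test error measures extrapolation to an unseen component rather than interpolation.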
Figure 4. Penalty dependence of the averaged root mean squared error of the nine test sets from Table 4.
Root mean squared error (RMSE) of test sets with various L2 regularization scales.
| Raw materials | MgO | Li2O | ZrO2 | PbO | SrO | P2O5 | TiO2 | Sb2O3 | SnO2 | Average |
|---|---|---|---|---|---|---|---|---|---|---|
| RMSE of test set (°C) with L2 = 0.0001 | 63.06 | 92.87 | 53.32 | 148.43 | 56.69 | 453.72 | 62.14 | 73.58 | 31.23 | 115.00 |
| RMSE of test set (°C) with L2 = 0.001 | 57.92 | 80.36 | 52.67 | 129.28 | 55.13 | 271.85 | 52.94 | 62.45 | 30.56 | 88.13 |
| RMSE of test set (°C) with L2 = 0.01 | 51.71 | 76.77 | 52.56 | 100.73 | 54.20 | 168.40 | 48.71 | 45.73 | 30.55 | 69.93 |
| RMSE of test set (°C) with L2 = 0.3 | 48.68 | 64.50 | 48.61 | 60.89 | 54.81 | 81.91 | 47.34 | 39.74 | 30.27 | 52.97 |
| RMSE of test set (°C) with L2 = 1.0 | 53.33 | 66.21 | 50.93 | 86.53 | 58.00 | 80.52 | 47.16 | 39.15 | 37.20 | 57.67 |
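The Average column is the arithmetic mean of the nine per-material RMSE values. As a quick check, the sketch below recomputes it from the table rows and picks the L2 scale with the lowest mean. The variable names are illustrative.

```python
# Per-material test RMSE (°C) copied from the table, in the order
# MgO, Li2O, ZrO2, PbO, SrO, P2O5, TiO2, Sb2O3, SnO2.
rmse_by_l2 = {
    0.0001: [63.06, 92.87, 53.32, 148.43, 56.69, 453.72, 62.14, 73.58, 31.23],
    0.001: [57.92, 80.36, 52.67, 129.28, 55.13, 271.85, 52.94, 62.45, 30.56],
    0.01: [51.71, 76.77, 52.56, 100.73, 54.20, 168.40, 48.71, 45.73, 30.55],
    0.3: [48.68, 64.50, 48.61, 60.89, 54.81, 81.91, 47.34, 39.74, 30.27],
    1.0: [53.33, 66.21, 50.93, 86.53, 58.00, 80.52, 47.16, 39.15, 37.20],
}

# Arithmetic mean over the nine test sets for each regularization scale.
mean_rmse = {l2: sum(values) / len(values) for l2, values in rmse_by_l2.items()}

# Regularization scale with the lowest mean test RMSE.
best_l2 = min(mean_rmse, key=mean_rmse.get)

for l2, mean in mean_rmse.items():
    print(f"L2 = {l2}: mean RMSE = {mean:.2f} °C")
```

The recomputed means agree with the Average column, and the lowest mean occurs at L2 = 0.3, well above the 0.0001 selected by hyperparameter tuning on randomly split data.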
Figure 5. Distribution of the first two principal components of the fingerprints. The test sets are (a) a randomly selected 20% of the data, (b) the P2O5 test case, (c) the TiO2 test case, and (d) the Li2O test case.
Figure 6. Maximum cosine similarity to the training data versus absolute error in the Li2O and SnO2 test sets from Table 4. For example, a dot at 0.99 means that no training sample has a cosine similarity larger than 0.99 with that test sample. The dashed line shows how the mean absolute error changes over the similarity range.
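The quantity on the similarity axis can be sketched as follows: for each test fingerprint, compute the cosine similarity to every training fingerprint and keep the maximum. This is a minimal pure-Python illustration. The function names and the toy vectors are assumptions, and real fingerprints would be much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def max_similarity_to_training(test_vector, training_vectors):
    """Largest cosine similarity between one test vector and any training vector."""
    return max(cosine_similarity(test_vector, t) for t in training_vectors)


# Toy fingerprints, for illustration only.
training = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0]]
similar_case = max_similarity_to_training([1.0, 0.0, 0.0], training)
dissimilar_case = max_similarity_to_training([0.0, 0.0, 1.0], training)
```

A low maximum similarity flags a test material that lies far from everything in the training set, where a larger prediction error should be expected.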
One example of Li2O test material and the top three training materials with similar fingerprints.
| Raw oxides composition (mol %) | Isokom temperature at 10^6.6 Pa·s | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Fingerprint similarity | SiO2 | Al2O3 | PbO | As2O3 | CaO | La2O3 | TiO2 | Li2O | Experimental value | Predicted value | |
| Test material | 70.77 | 14.91 | 1.88 | 0.10 | 12.34 | 944 | 980.4 | ||||
| Training materials | 0.9916 | 81.33 | 13.69 | 2.50 | 2.48 | 760 | 741.1 | ||||
| 0.9906 | 76.63 | 13.89 | 3.17 | 6.31 | 740 | 732.3 | |||||
| 0.9900 | 82 | 12 | 4 | 2 | 1110 | 1087.5 | |||||