Rui Wang1, Fei Liang1,2, Zheshuai Lin3,4.
Abstract
Combining high-throughput screening with machine learning models is a rapidly developing direction in the exploration of novel optoelectronic functional materials. Here, we employ a random forests regression (RFR) model to investigate the second harmonic generation (SHG) coefficients of nonlinear optical (NLO) crystals with distinct diamond-like (DL) structures. Sixty-one DL structures from the Inorganic Crystal Structure Database (ICSD) are selected, and four distinctive descriptors (band gap, electronegativity, group volume and bond flexibility) are used to model and predict the second-order nonlinearity. The RFR model is shown to reach the accuracy of first-principles calculations and gives validated predictions for a variety of representative DL crystals. Additionally, the model shows promise for exploring new crystal materials in quaternary DL systems with superior mid-IR NLO performance. Two new potential NLO crystals, Li2CuPS4 with an ultrawide band gap and Cu2CdSnTe4 with a giant SHG response, are identified by this model.
Year: 2020 PMID: 32103085 PMCID: PMC7044425 DOI: 10.1038/s41598-020-60410-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. (a) Crystal structures of cubic-stacked DL compounds, AX, ABX2 and A2BCX4. (b) The anion-centered tetrahedron as the basic unit in DL structures. (c) The screening workflow for DL-type crystals stacked cubically.
Figure 2. Overall workflow for the machine learning model. Four atomic or structural features are generated for the random forests regression, in which “bootstrapping”[55] and performance evaluation are used to validate the model, leading to a predictive model for the SHG coefficient of NLO crystals.
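The bootstrapping step named in the caption can be illustrated with a minimal sketch. It assumes the usual meaning of the term: structures are resampled with replacement for training, and the structures that were never drawn (the out-of-bag set) are held back for validation. The function name `bootstrap_split` and the seed are hypothetical, and the caption does not confirm that the authors split their data this way.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; indices never drawn are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag

rng = random.Random(42)   # hypothetical seed, only for reproducibility
n_structures = 61         # number of DL structures quoted in the abstract

# Repeat the resampling to see how much data is typically left out-of-bag.
oob_fractions = []
for _ in range(200):
    in_bag, out_of_bag = bootstrap_split(n_structures, rng)
    oob_fractions.append(len(out_of_bag) / n_structures)

mean_oob = sum(oob_fractions) / len(oob_fractions)
print(f"mean out-of-bag fraction over 200 resamples: {mean_oob:.3f}")
```

For a data set of this size, each resample is expected to leave a little over a third of the structures out-of-bag, which is what makes the left-out set usable for validation without a separate hold-out split.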
Space groups, band gaps and SHG coefficients of DL-type crystals in the test set.
| Formula | Space group | Band gap (eV) | SHG P.v.§ (pm/V) | SHG C.v.§ (pm/V) | SHG E.v.§ (pm/V) |
|---|---|---|---|---|---|
| CdSe | F-43m | 1.74 | 39.18 | 42.16 | 40 |
| GaAs | F-43m | 1.42 | 94.73 | 126.46 | 119 |
| AgGaS2 | I-42d | 2.64 | 11.23 | 16.64 | 13.4 |
| AgGaSe2 | I-42d | 1.80 | 73.83 | 67.99 | 41.4 |
| ZnGeP2 | I-42d | 2.05 | 74.03 | 78.57 | 68.9 |
| CdGeAs2 | I-42d | 0.57 | 254.32 | 194.04/(904.08*) | 236 |
| Cu2CdSnS4 | I-42m | 1.80 | 34.46 | 25.42 | 31 |
| Li2SrGeS4 | I-42m | 3.75 | 5.48 | 4.75 | 0.5 × AGS |
| Li2SrSnS4 | I-42m | 3.10 | 5.76 | 6.64 | 0.8 × AGS |
| Cu3SbS4 | I-42m | 0.88 | 56.47 | 66.06 | |
| Cu2CdSnSe4 | I-42m | 0.98 | 61.68 | 71.80/(223*) | |
| Cu2CdSnTe4 | I-42m | 0.80 | 239.05 | 209.05/(528*) | |
| Li2CuPS4 | I-4 | 3.30 | 7.66 | 6.40 | |
*First-principles value obtained after upshifting the bands to match the experimental band gap.
§P.v., C.v. and E.v. denote the RFR-predicted value, the first-principles calculated value and the experimental value, respectively.
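The agreement between the three SHG columns can be checked directly from the tabulated numbers. The sketch below is not the authors' evaluation code: it copies the values from the table, computes the RMSE and R² of the RFR predictions against the first-principles values, and compares both against experiment for the rows that list an absolute experimental value. Leaving out the band-gap-corrected values in parentheses and the entries given relative to AGS is an assumption made here for simplicity, and the variable and function names are illustrative.

```python
import math

# (formula, RFR-predicted, first-principles, experimental or None), copied from the table.
# Band-gap-corrected values in parentheses and AGS-relative entries are left out.
rows = [
    ("CdSe",        39.18,  42.16,  40.0),
    ("GaAs",        94.73, 126.46, 119.0),
    ("AgGaS2",      11.23,  16.64,  13.4),
    ("AgGaSe2",     73.83,  67.99,  41.4),
    ("ZnGeP2",      74.03,  78.57,  68.9),
    ("CdGeAs2",    254.32, 194.04, 236.0),
    ("Cu2CdSnS4",   34.46,  25.42,  31.0),
    ("Li2SrGeS4",    5.48,   4.75,  None),
    ("Li2SrSnS4",    5.76,   6.64,  None),
    ("Cu3SbS4",     56.47,  66.06,  None),
    ("Cu2CdSnSe4",  61.68,  71.80,  None),
    ("Cu2CdSnTe4", 239.05, 209.05,  None),
    ("Li2CuPS4",     7.66,   6.40,  None),
]

def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination of y_pred with respect to y_true."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# RFR prediction versus first-principles value, all 13 test crystals.
pred = [r[1] for r in rows]
calc = [r[2] for r in rows]
rmse_pred_vs_calc = rmse(calc, pred)
r2_pred_vs_calc = r2(calc, pred)

# Both models versus experiment, only rows with an absolute experimental value.
measured = [r for r in rows if r[3] is not None]
expt = [r[3] for r in measured]
rmse_pred_vs_expt = rmse(expt, [r[1] for r in measured])
rmse_calc_vs_expt = rmse(expt, [r[2] for r in measured])

print(f"P.v. vs C.v.: RMSE = {rmse_pred_vs_calc:.2f} pm/V, R2 = {r2_pred_vs_calc:.3f}")
print(f"P.v. vs E.v.: RMSE = {rmse_pred_vs_expt:.2f} pm/V (n = {len(measured)})")
print(f"C.v. vs E.v.: RMSE = {rmse_calc_vs_expt:.2f} pm/V (n = {len(measured)})")
```

Because RMSE is dominated by the crystals with the largest coefficients (here the narrow-gap compounds), a relative error per crystal would be a reasonable complementary measure.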
Figure 3. Performance evaluation and model prediction. (a) RMSE and R² of the training and validation sets as a function of the number of estimators. (b) Comparison of DFT training data or experimental data with RFR model predictions for dij. Blue circles represent the training and validation data; red squares represent the test data plotted against the calculated values (y axis); black squares represent the test data plotted against the experimental values (y axis). The closer a point lies to the green line (y = x), the smaller the error.
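The kind of sweep shown in panel (a) can be sketched with a toy ensemble. Everything below is an assumption made for illustration: the data are synthetic (a made-up target that falls with a band-gap-like first feature, not the paper's data set), the trees are depth-1 stumps rather than full regression trees, and `fit_stump`, `fit_forest`, `predict` and `synthetic_sample` are hypothetical helper names. The sketch only shows the mechanics: bootstrap resampling, a random feature subset per tree, averaging, and validation RMSE and R² recorded for several ensemble sizes.

```python
import math
import random

def fit_stump(X, y, features):
    """Depth-1 regression tree: best single threshold over the given features."""
    mean = sum(y) / len(y)
    # Start from "no split"; an infinite threshold sends every sample to the left mean.
    best = (sum((v - mean) ** 2 for v in y), features[0], math.inf, mean, mean)
    for f in features:
        order = sorted(range(len(X)), key=lambda i: X[i][f])
        for k in range(1, len(order)):
            lo, hi = X[order[k - 1]][f], X[order[k]][f]
            if lo == hi:
                continue  # cannot split between identical feature values
            left = [y[i] for i in order[:k]]
            right = [y[i] for i in order[k:]]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if sse < best[0]:
                best = (sse, f, (lo + hi) / 2, ml, mr)
    return best[1:]  # (feature, threshold, left_mean, right_mean)

def fit_forest(X, y, n_estimators, max_features, rng):
    """Bagged stumps, each fitted on a bootstrap sample and a random feature subset."""
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feats = rng.sample(range(d), max_features)   # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, x):
    """Average the predictions of all stumps."""
    preds = [ml if x[f] <= t else mr for f, t, ml, mr in forest]
    return sum(preds) / len(preds)

def rmse_r2(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return math.sqrt(ss_res / len(y_true)), 1.0 - ss_res / ss_tot

rng = random.Random(0)  # hypothetical seed

def synthetic_sample():
    """Synthetic placeholder data: four features and a made-up target."""
    gap = rng.uniform(0.5, 4.0)                 # band-gap-like feature
    others = [rng.random() for _ in range(3)]   # three further descriptors
    target = 100.0 / gap + 5.0 * others[0] + rng.gauss(0.0, 2.0)
    return [gap] + others, target

data = [synthetic_sample() for _ in range(80)]
X = [x for x, _ in data]
y = [t for _, t in data]
X_train, y_train, X_val, y_val = X[:60], y[:60], X[60:], y[60:]

results = {}  # number of estimators -> (validation RMSE, validation R2)
for n in (1, 4, 16, 64):
    forest = fit_forest(X_train, y_train, n, max_features=2, rng=rng)
    results[n] = rmse_r2(y_val, [predict(forest, x) for x in X_val])
    print(f"n_estimators = {n:2d}: RMSE = {results[n][0]:.2f}, R2 = {results[n][1]:.3f}")
```

In practice a library implementation with full-depth trees would replace the stumps; the sweep loop and the two metrics stay the same, and the curves typically flatten once enough estimators are averaged.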