Yu Mao, Ningning Dong, Lei Wang, Xin Chen, Hongqiang Wang, Zixin Wang, Ivan M Kislyakov, Jun Wang.
Abstract
Defects introduced during the growth process greatly affect the device performance of two-dimensional (2D) materials. Here we demonstrate that machine-learning-based analysis can distinguish the continuous monolayer film from defect areas of molybdenum disulfide (MoS2) using position-dependent information extracted from its Raman spectra. The random forest method analyzes multiple Raman features together and identifies samples with high recognition accuracy, overcoming the problem that no single variable identifies them effectively. Even some dispersed nucleation-site defects can be predicted, which would commonly be overlooked under an optical microscope because of their low optical contrast. The successful application to classification and analysis highlights the potential of machine learning to exploit classical methods more deeply in 2D materials research.
Keywords: 2D materials; Raman spectrum; machine learning; random forest algorithm
Year: 2020 PMID: 33182274 PMCID: PMC7695331 DOI: 10.3390/nano10112223
Source DB: PubMed Journal: Nanomaterials (Basel) ISSN: 2079-4991 Impact factor: 5.076
Figure 1. (a) Optical image of the MoS2 sample. The inset shows the atomic force microscopy (AFM) height profile taken along the gray line drawn on the optical image. (b) Photoluminescence (PL) spectra of the monolayer and bilayer areas. (c) Raman spectra of the monolayer, crack and bilayer areas.
Figure 2. (a) K-means-clustered image of the selected sample with monolayer (cyan), bilayer (dark cyan), and crack (light cyan) regions. Raman spectral mapping of (b) Pos(E¹2g) and (c) Pos(A1g). (d) Raman spectral mapping of the frequency difference between Pos(E¹2g) and Pos(A1g).
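The clustering behind a map like Figure 2a can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes, for illustration only, that clustering is done on the frequency difference between the two Raman peak positions; the caption does not state which features were clustered. The function names, the deterministic center initialisation, and all numbers are hypothetical, and the invented values are not assigned to monolayer, bilayer, or crack.

```python
def frequency_difference(pos_e2g, pos_a1g):
    """Per-point difference Pos(A1g) - Pos(E2g), in cm^-1."""
    return [a - e for e, a in zip(pos_e2g, pos_a1g)]


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's k-means on scalars. Returns (centers, labels).

    Assumption: centers start evenly spaced between min and max so the
    result is deterministic; real implementations usually seed randomly.
    """
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * j / max(k - 1, 1) for j in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: abs(v - centers[j]))
                  for v in values]
        # Update step: move each center to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, labels


# Invented peak positions (cm^-1), chosen only to form three separable groups.
pos_e2g = [385.0, 385.1, 384.9, 383.5, 383.4, 383.6, 386.0, 386.1, 385.9]
pos_a1g = [405.0, 405.3, 404.9, 405.5, 405.6, 405.4, 404.0, 404.2, 403.9]

diffs = frequency_difference(pos_e2g, pos_a1g)
centers, labels = kmeans_1d(diffs, k=3)
print("cluster label per point:", labels)
```

Each label would then be painted as one colour at the corresponding spatial position to produce the clustered image.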
Figure 3. Raman spectral intensity mapping of the E¹2g and the A1g peaks. Gray areas and the newly added yellow and red areas indicate where the E¹2g and the A1g peak intensity drops below (a,d) 68%, (b,e) 70% and (c,f) 72% of the monolayer signal, respectively.
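The thresholding described in this caption can be sketched as below. This is an illustration under stated assumptions rather than the authors' procedure: the function name `low_intensity_mask`, the use of a single reference monolayer intensity, and the sample intensities are all hypothetical.

```python
def low_intensity_mask(intensity_map, monolayer_intensity, fraction):
    """Return a boolean map that is True where the peak intensity falls
    below `fraction` of the reference monolayer intensity."""
    if monolayer_intensity <= 0:
        raise ValueError("monolayer_intensity must be positive")
    cutoff = fraction * monolayer_intensity
    return [[value < cutoff for value in row] for row in intensity_map]


# Hypothetical 2 x 3 map of peak intensities (arbitrary units).
peak_map = [
    [100.0, 71.0, 69.0],
    [67.0, 98.0, 73.0],
]

for fraction in (0.68, 0.70, 0.72):
    mask = low_intensity_mask(peak_map, monolayer_intensity=100.0,
                              fraction=fraction)
    flagged = sum(cell for row in mask for cell in row)
    print(f"threshold {fraction:.0%}: {flagged} point(s) flagged")
```

Raising the threshold can only add flagged points, which matches the caption's description of newly added areas at 70% and 72%.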
Figure 4. Basic architecture of the learning procedure in the random forest method. Each small square represents a spatial measurement point carrying Raman characteristic information. The subtraining sets from 1 to n are acquired by a bootstrap sampling process, and then decision trees based on these subtraining sets can be built. The out-of-bag data of each tree can be used to estimate the effectiveness of the trained random forest model.
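The bootstrap sampling and out-of-bag split described in this caption can be sketched in a few lines. This is a minimal illustration only: the function name `bootstrap_split`, the number of measurement points, the number of trees, and the random seed are assumptions, and no decision tree is actually trained here.

```python
import random


def bootstrap_split(n_points, rng):
    """Draw n_points indices with replacement.

    Returns (in_bag, out_of_bag): the sampled indices (with repeats) and
    the indices that were never drawn.
    """
    in_bag = [rng.randrange(n_points) for _ in range(n_points)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_points) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)   # arbitrary fixed seed for repeatability
n_points = 1000          # hypothetical number of measurement points
n_trees = 5              # hypothetical number of subtraining sets

for tree_id in range(1, n_trees + 1):
    in_bag, oob = bootstrap_split(n_points, rng)
    # A tree would be fitted on `in_bag` and scored on `oob`.
    print(f"tree {tree_id}: {len(set(in_bag))} distinct in-bag points, "
          f"{len(oob)} out-of-bag points")
```

For a large data set, each point is left out of a given bootstrap sample with probability close to 1/e (about 37%), so every tree has a sizeable held-out set for the out-of-bag estimate.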
Figure 5. (a) Basic architecture of the prediction procedure in the random forest method. New samples from the untrained data are passed through each tree in turn, and the final output is obtained by majority voting. (b–e) Predicted maps for different samples with crack (brown), monolayer (grass green) and bilayer (dark green) areas. The dispersed dots shown in Figure 5b are predicted to be bilayer. The insets in Figure 5b show the corresponding optical micrograph (left inset) and the Raman mapping of the input variable ε (right inset). The insets in the other panels show the corresponding optical micrographs. Scale bars indicate 1 μm.
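The prediction step in panel (a) amounts to collecting one vote per tree and returning the most common label. The sketch below illustrates this under loud assumptions: the three "trees" are hand-written rules, the feature names `freq_diff` and `rel_intensity` are hypothetical, and the thresholds are arbitrary numbers, not values from the paper.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


def forest_predict(trees, sample):
    """Pass one sample through every tree and combine the votes."""
    votes = [tree(sample) for tree in trees]
    return majority_vote(votes), votes


# Hypothetical stand-ins for trained decision trees (arbitrary thresholds).
def tree_a(s):
    return "bilayer" if s["freq_diff"] > 21.0 else "monolayer"


def tree_b(s):
    return "crack" if s["rel_intensity"] < 0.70 else "monolayer"


def tree_c(s):
    if s["freq_diff"] > 21.5:
        return "bilayer"
    return "crack" if s["rel_intensity"] < 0.65 else "monolayer"


sample = {"freq_diff": 22.0, "rel_intensity": 0.90}  # invented measurement point
label, votes = forest_predict([tree_a, tree_b, tree_c], sample)
print("votes:", votes, "-> predicted:", label)
```

Applying `forest_predict` to every measurement point and colouring each point by its label would give a predicted map of the kind shown in panels (b–e).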
Figure 6. (a) Receiver operating characteristic (ROC) curves for crack and bilayer identification using only the Raman frequency difference as the feature. Cyan and green curves show the ROC for the two-class classifications crack/others and bilayer/others, respectively. The red line corresponds to a random guess. (b) Precision-recall (PR) curves for crack and bilayer identification using only the Raman frequency difference as the feature.
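ROC and PR curves for a single feature come from sweeping a decision threshold over that feature. The sketch below shows the computation for a two-class bilayer/others case. It is an illustration, not the authors' evaluation code: the function names are hypothetical, the frequency-difference values and labels are invented, and it assumes that a larger frequency difference indicates the positive class.

```python
def _counts(scores, labels, threshold):
    """True and false positives when score >= threshold is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp, fp


def roc_points(scores, labels):
    """List of (false positive rate, true positive rate), from (0, 0) to (1, 1)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fp = _counts(scores, labels, threshold)
        points.append((fp / negatives, tp / positives))
    return points


def pr_points(scores, labels):
    """List of (recall, precision), one point per distinct threshold."""
    positives = sum(labels)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp, fp = _counts(scores, labels, threshold)
        points.append((tp / positives, tp / (tp + fp)))
    return points


def auc(points):
    """Trapezoidal area under points already ordered by increasing x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Invented frequency differences (cm^-1); label 1 = bilayer, 0 = others.
scores = [22.3, 22.0, 21.7, 21.2, 20.6, 20.4, 20.1, 19.9]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

print("ROC AUC:", round(auc(roc_points(scores, labels)), 3))
print("PR points:", pr_points(scores, labels))
```

A random guess traces the diagonal of the ROC plot, with an area of 0.5, which is the reference line mentioned in the caption.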