Development of spectral decomposition based on Bayesian information criterion with estimation of confidence interval.

Hiroshi Shinotsuka¹, Kenji Nagata¹, Hideki Yoshikawa¹, Yoh-Ichi Mototake², Hayaru Shouno³, Masato Okada¹,⁴.

Abstract

We develop an automatic peak fitting algorithm using the Bayesian information criterion (BIC) fitting method with confidence-interval estimation in spectral decomposition. First, spectral decomposition is carried out by adopting the Bayesian exchange Monte Carlo method for various artificial spectral data, and the confidence interval of fitting parameters is evaluated. From the results, an approximated model formula that expresses the confidence interval of parameters and the relationship between the peak-to-peak distance and the signal-to-noise ratio is derived. Next, for real spectral data, we compare the confidence interval of each peak parameter obtained using the Bayesian exchange Monte Carlo method with the confidence interval obtained from the BIC-fitting with the model selection function and the proposed approximated formula. We thus confirm that the parameter confidence intervals obtained using the two methods agree well. It is therefore possible to not only simply estimate the appropriate number of peaks by BIC-fitting but also obtain the confidence interval of fitting parameters.

Keywords: 404 Materials informatics / Genomics; 502 Electron spectroscopy; Bayesian estimation; X-ray photoelectron spectroscopy; exchange Monte Carlo method; pseudo-Voigt function; spectral decomposition

Year: 2020 PMID: 32939165 PMCID: PMC7476551 DOI: 10.1080/14686996.2020.1773210

Source DB: PubMed Journal: Sci Technol Adv Mater ISSN: 1468-6996 Impact factor: 8.090

Introduction

High-throughput measurements have become increasingly important for the efficient development of science and technology, and there is an urgent need to accumulate large amounts of spectral data. In X-ray photoelectron spectroscopy (XPS), which is a time-consuming characterization technique, the use of high-intensity synchrotron radiation and a high-sensitivity detector enables a rapid accumulation of large amounts of spectral data [1-3]. Matsumura et al. performed peak shift analysis of high-throughput XPS spectra using the expectation-maximization algorithm [4]. High-throughput data processing is therefore required for efficient spectral data analysis. Peak fitting is performed in the analysis of XPS spectra. Such fitting is usually carried out using the gradient method. This technique faces three main problems. The first is that the technique tends to find a local solution, the second is that the number of peaks cannot be estimated, and the third is that the confidence interval of fitting parameters cannot be evaluated. In the gradient method, the initial value of the parameter must first be given, and the result is readily affected by the initial value. Peak fitting requires the number of peaks to be determined at the beginning, but the gradient method does not show how many peaks are appropriate. The method is also susceptible to spectral noise, and although it is intuitively understood that the confidence interval of the fitting parameter is wide when there is much noise, there is no framework for evaluating the confidence interval. By incorporating informatics knowledge, we previously developed a low-cost and efficient method of obtaining appropriate models in terms of not only the fitting of parameters but also the number of peaks, even though the developed method is based on the gradient method [5]. 
Having many initial models allows us to search for pseudo-global solutions, and a Bayesian information criterion (BIC) allows us to obtain a model with an appropriate number of peaks. We refer to this technique as BIC-fitting in the present paper. However, it remains difficult to evaluate the confidence intervals of the fitting parameters with this technique. A spectrum decomposition technique based on Bayesian estimation has been proposed for quantitative evaluation of the confidence interval of fitting parameters [6]. This technique solves all three problems described above. By carrying out the model selection to a given spectrum on the basis of Bayesian estimation, we may be able to estimate not only peak parameters such as the peak position but also the number of peaks. Furthermore, when adopting this technique, global solutions can be searched for efficiently by performing optimization using algorithms in what is called the exchange Monte Carlo (EMC) method, even in the case that there is a local optimal solution. Through Bayesian estimation, all peak parameters can be optimized, and the confidence intervals of the parameters can also be determined using the standard deviation (STD) of the Bayesian posterior distribution. However, the EMC method has a huge computational cost and is difficult to use when analyzing many spectra. In this study, therefore, we develop an algorithm to calculate the confidence interval of fitting parameters obtained by the EMC method from the results of BIC-fitting. By generating various spectral data on a computer and applying Bayesian estimation, we obtain the behavior of model selection and the STD of the Bayesian posterior distribution for each peak parameter by computer simulation. In particular, in this paper, we cover the peak-to-peak distance between two peaks and the signal-to-noise (S/N) ratio of spectral data. 
As a result, we succeed in deriving an approximated model formula representing the relationship of the STD of the posterior distribution with the peak-to-peak distance and S/N ratio. We also apply the approximated model formula to real spectra. The confidence interval of the peak parameters obtained using the EMC method is compared with that obtained by applying BIC-fitting to the approximated formula. As a result, it is confirmed that the parameter confidence interval obtained by the EMC method can be reproduced by BIC-fitting and using the approximated formula. This approximated formula is applicable to not only BIC-fitting but also other optimization methods and can be used to estimate the parameter confidence interval to the same extent as when adopting the EMC method.

Calculation methods

Fitting model: pseudo-Voigt function

We first describe the model function used in this study. We consider fitting spectral data $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $N$ is the number of spectral data points, by summing pseudo-Voigt functions $\phi(x; \theta_k)$:

$$f(x; \theta) = \sum_{k=1}^{K} \phi(x; \theta_k), \qquad (1)$$

where $K$ is the number of peaks. The pseudo-Voigt function is frequently used in spectral decomposition. We here adopt the sum type of the pseudo-Voigt function, defined as a linear combination of the Gaussian and Lorentzian functions:

$$\phi(x; \theta_k) = h_k \left[ r_k \frac{w_k^2}{(x - \mu_k)^2 + w_k^2} + (1 - r_k) \exp\left( -\ln 2 \left( \frac{x - \mu_k}{w_k} \right)^2 \right) \right].$$

The fitting parameters are $\theta_k = (h_k, \mu_k, w_k, r_k)$, where $h_k$ is the peak height, $\mu_k$ is the peak position, $w_k$ is the half width at half maximum (HWHM) of the peak, and $r_k$ is the Lorentz–Gauss mixing ratio of the pseudo-Voigt function. In the peak fitting of XPS, the appropriate basis function is the Voigt function, defined by the convolution of a Lorentzian function derived from the natural width and a Gaussian function derived from the instrument. In practice, the pseudo-Voigt function, an approximated form of the Voigt function, is commonly used because of the computational difficulty of peak fitting with the Voigt function [7].

A least-squares method is often used to optimize the fitting parameters. In this method, the parameters are obtained so as to minimize the error function $E(\theta)$ representing the difference between the model function $f(x; \theta)$ and the spectral data $D$:

$$E(\theta) = \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - f(x_i; \theta) \right)^2.$$

In spectral decomposition, this is a nonlinear least-squares problem, and it is difficult to derive the optimum solution analytically. It is therefore common to find the parameters that minimize the error function using the gradient method. However, the fitting result is easily trapped in a local solution depending on the selection of initial values. In addition, the number of peaks cannot be determined objectively from the data, and the confidence interval of the fitting parameters cannot be obtained. Bayesian estimation can solve these problems, as we will see below [6].
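The model and error function described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes that the mixing ratio weights the Lorentzian component and that the error function carries a 1/(2N) normalization.

```python
import numpy as np

def pseudo_voigt(x, h, mu, w, r):
    """Sum-type pseudo-Voigt peak with height h, position mu, and HWHM w.

    Assumption: r weights the Lorentzian part and (1 - r) the Gaussian part.
    """
    u = (x - mu) / w
    lorentz = 1.0 / (1.0 + u**2)
    gauss = np.exp(-np.log(2.0) * u**2)
    return h * (r * lorentz + (1.0 - r) * gauss)

def model(x, peaks):
    """Sum of pseudo-Voigt peaks; `peaks` is a list of (h, mu, w, r) tuples."""
    return sum(pseudo_voigt(x, *p) for p in peaks)

def error(x, y, peaks):
    """Squared misfit between data and model, normalized by 2N (assumed)."""
    return 0.5 * np.mean((y - model(x, peaks)) ** 2)
```

Because both components equal 1 at the peak position and 1/2 at a distance of one HWHM, the peak height and half-maximum width do not depend on the mixing ratio; only the tails do.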

Bayesian spectral deconvolution

Bayesian estimation is a framework in which the process of generating data is formulated as a probabilistic model and an estimation is made by tracing back the causal relationship using Bayes' theorem [6,8]. By combining Bayesian estimation with the exchange Monte Carlo (EMC) method, we can not only perform spectral deconvolution but also obtain confidence intervals for the fitting parameters through the Bayesian posterior probabilities. It is also possible to select a good model by comparing the Bayesian free energies of models with different numbers of peaks. In this study, we call this method the Bayesian EMC method. Details are given in Appendix A.
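For readers unfamiliar with the exchange step, the generic replica-exchange acceptance rule can be sketched as follows. This is the textbook rule for replicas sampling distributions proportional to exp(-beta * energy); the function name is hypothetical, and the exact energy definition used by the authors (given in their Appendix A) is not reproduced here.

```python
import math

def swap_probability(beta_a, beta_b, energy_a, energy_b):
    """Probability of exchanging the states of two replicas.

    Generic rule: min(1, exp((beta_a - beta_b) * (energy_a - energy_b))).
    """
    exponent = (beta_a - beta_b) * (energy_a - energy_b)
    # Return 1 directly for non-negative exponents to avoid overflow in exp.
    return 1.0 if exponent >= 0.0 else math.exp(exponent)
```

A swap that moves the lower-energy state to the colder replica (larger beta) is always accepted, which is how states found at high temperature can escape local solutions and reach the low-temperature replicas.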

BIC-fitting

It is usually difficult to evaluate the Bayesian free energy for model selection analytically because it requires multiple integration over the parameter space. The BIC is obtained by approximating this integration under the assumption that the likelihood function can be approximated by a Gaussian distribution for all parameters. The BIC is expressed as the sum of a likelihood term and a penalty term, and the model is selected on the basis of the trade-off between goodness of fit and model complexity. We have developed a low-cost and efficient method of obtaining appropriate models in terms of not only the fitting parameters but also the number of peaks, using many initial models and the BIC [5]. The method generates many initial fitting models by changing the degree of smoothing, and then optimizes the peak parameters using the modified Levenberg–Marquardt method [9-11], which is one of the gradient methods. The optimized models are ranked on the basis of the BIC, written as

$$\mathrm{BIC} = -2 \ln L + N_p \ln N,$$

where $L$ is the maximum likelihood calculated from the measured spectrum and the model function obtained as a result of optimization, and $N_p$ is the number of parameters included in the model function. When we ignore the background, $N_p = 4K$ is obtained from the number of peaks $K$. The logarithm of the maximum likelihood is evaluated from the residual between the measured spectrum and the optimized model function. Using the BIC values of the optimized models as a criterion for model selection, we can select a simple model with reasonably good agreement and a moderate number of peaks. We hereafter refer to this technique as BIC-fitting. BIC-fitting can perform spectral fitting and model selection, but cannot by itself provide confidence intervals for the fitting parameters. The purpose of this study is to extend the BIC-fitting method so that the confidence intervals of the fitting parameters can be obtained at the same time, using the results of simulation by the Bayesian EMC method.

Models used in spectral decomposition are singular models, in which the correspondence between the parameters and the model function is not one-to-one [12,13]. In this case, the BIC may have penalty terms different from those in the exact evaluation of the free energy, and the approximation underlying the BIC may affect the result of model selection. In Section 4, we compare the model obtained using the Bayesian EMC method with the model obtained from BIC-fitting for a measured XPS spectrum, and we discuss the effectiveness of BIC-fitting.
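The BIC ranking described above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes independent Gaussian noise whose variance is estimated from the fit residuals, and the function names are hypothetical.

```python
import numpy as np

def gaussian_log_likelihood(y, y_fit):
    """Maximized Gaussian log-likelihood of the residuals.

    Assumption: the noise variance is replaced by its maximum-likelihood
    estimate, the mean squared residual.
    """
    n = y.size
    sigma2 = np.mean((y - y_fit) ** 2)
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)

def bic(y, y_fit, n_peaks, params_per_peak=4):
    """BIC = -2 ln L + N_p ln N, with N_p = 4K when the background is ignored."""
    n_params = params_per_peak * n_peaks
    return -2.0 * gaussian_log_likelihood(y, y_fit) + n_params * np.log(y.size)
```

Among several optimized candidate models, the one with the smallest BIC would be selected: adding a peak lowers the likelihood term only if it reduces the residual by more than the 4 ln N penalty it incurs.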

Simulation with artificial spectra

In the present study, we use the Bayesian spectral decomposition framework described in Section 2 to clarify the effects of the peak-to-peak distance in the true spectra and the S/N ratio of the data on the confidence interval of the estimated parameters. In this section, we discuss the simulations performed for verification.

Settings

In the simulation, we use spectral data artificially generated on a computer. For the data set $D = \{(x_i, y_i)\}_{i=1}^{N}$, we take the number of data points $N = 301$, with $x_i$ between [0.0, 3.0] in steps of 0.01. Assuming that the number of peaks is $K = 2$, we define the true spectral function used for data generation as the sum of two pseudo-Voigt functions with true parameters $\theta_k^* = (h_k^*, \mu_k^*, w_k^*, r_k^*)$. The true peak heights, widths, and mixing ratios are held fixed, and the true peak positions are set so as to give various peak-to-peak distances $\Delta$ for discussion. We generate data for 32 values of $\Delta$. We add noise that follows a Gaussian distribution with zero mean and variance $\sigma^2$ to the data and prepare 14 values of $\sigma$ in the range [0.0005, 10.0]. Hence, the total number of prepared data sets is $32 \times 14 = 448$. Three examples of artificially generated spectral data are shown in Figure 1, where Figure 1(a–c) present spectral data for three different combinations of $\Delta$ and $\sigma$.
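The data-generation procedure can be sketched as below. The grid and the 14 noise levels follow the text and Table 1; the peak width of 0.1, the mixing ratio of 0.5, the midpoint of 1.5, and the symmetric placement of the two peaks are illustrative assumptions, since the true parameter values are not preserved in this copy. All names are hypothetical.

```python
import numpy as np

def pseudo_voigt(x, h, mu, w, r):
    """Sum-type pseudo-Voigt peak (r assumed to weight the Lorentzian)."""
    u = (x - mu) / w
    return h * (r / (1.0 + u**2) + (1.0 - r) * np.exp(-np.log(2.0) * u**2))

# Noise levels listed in Table 1 (14 values).
SIGMAS = [0.0005, 0.001, 0.002, 0.005, 0.01, 0.02, 0.05,
          0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0]

def make_spectrum(delta, sigma, seed=0, h=1.0, w=0.1, r=0.5, centre=1.5):
    """Return (x, y, truth) for two equal peaks separated by `delta`.

    Assumed for illustration: w, r, centre, and symmetric peak placement.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 3.0, 301)  # [0.0, 3.0] in steps of 0.01
    truth = (pseudo_voigt(x, h, centre - delta / 2.0, w, r)
             + pseudo_voigt(x, h, centre + delta / 2.0, w, r))
    y = truth + rng.normal(0.0, sigma, size=x.size)  # zero-mean Gaussian noise
    return x, y, truth
```

Looping `make_spectrum` over 32 separations and the 14 entries of `SIGMAS` would reproduce the 448-spectrum layout of the study, with S/N = 1/sigma for unit peak height.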

Figure 1.

Three examples of artificially generated spectra with Gaussian noise. The solid line is the true curve and the dots are the artificially generated spectral data. Panels (a)–(c) correspond to three different combinations of the peak-to-peak distance $\Delta$ and the noise level $\sigma$.

The S/N ratio of the data can be defined using the noise level $\sigma$. The peak height of the true spectrum is 1.0, and the signal intensity is thus 1.0. In this study, we define the S/N ratio as $1/\sigma$. The range of the S/N ratios in the simulation is [0.1, 2000].

We perform a Bayesian estimation for all data sets. We assume that the candidate numbers of peaks are $K = 1$ and $K = 2$, and a prior distribution $p(K)$ is defined over these two candidates. The prior distributions of the peak parameters are a gamma distribution for the peak height $h_k$ and the HWHM $w_k$, a Gaussian distribution for the peak position $\mu_k$, and a uniform distribution for the Lorentz–Gauss mixing ratio $r_k$. The prior distribution of the parameters of interest can be chosen by the analyst, and an appropriate distribution shape can be predicted by considering the characteristics of the spectrum. For example, the peak height and width must be positive, and their approximate ranges can be inferred from the structure of the spectrum. We adopted the gamma distribution as a distribution function with those characteristics. On the other hand, the peak position can move to either the positive or the negative side, so we adopted a Gaussian distribution without any special boundary. Since it is clear that the peak position is between 1 and 2, we set a Gaussian distribution with a mean of 1.5 and an STD of 0.2, as in Eq. (12). The Lorentz–Gauss mixing ratio ranges from 0 to 1 by definition. In this simulation, we decided not to impose any further constraints and adopted a uniform distribution for the Lorentz–Gauss mixing ratio.

As settings of the Monte Carlo method, we use 50,000 Monte Carlo steps (MCSs) as a burn-in and then use 30,000 MCSs for sampling. The inverse temperatures in the EMC method are defined following Ref. [14] (Eq. (18)). The value of $\gamma$ and the total number of inverse temperatures $M$ in the equation are set according to the level of noise added to the data (Table 1).

Table 1.

Settings of the inverse temperature corresponding to Eq. (18).

σ	0.0005	0.001	0.002	0.005	0.01	0.02	0.05	0.1	0.2	0.5	1	2	5	10
S/N	2000	1000	500	200	100	50	20	10	5	2	1	0.5	0.2	0.1
γ	1.2	1.5	1.5	1.5	1.5	1.5	1.5	1.5	1.5	1.5	1.5	1.5	1.5	1.5
M	128	64	64	64	44	44	32	32	32	32	32	32	32	32
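As a rough illustration of how the two settings in Table 1 could define a temperature ladder, the sketch below builds a geometric sequence from gamma and M. The specific form (first inverse temperature equal to zero, the rest given by gamma raised to the power l - M) is an assumption commonly used in the exchange Monte Carlo literature; Eq. (18) itself is not preserved in this copy, so this should not be read as the authors' definition.

```python
def inverse_temperatures(gamma, n_temps):
    """Geometric ladder (assumed form): beta_1 = 0, beta_l = gamma**(l - M)."""
    return [0.0] + [gamma ** (l - n_temps) for l in range(2, n_temps + 1)]

# Example with the Table 1 setting for sigma = 0.05 (gamma = 1.5, M = 32).
betas = inverse_temperatures(1.5, 32)
```

With this form the last replica always samples the true posterior (inverse temperature 1), while a larger M at fixed gamma extends the ladder toward higher temperatures. This is one way to read the larger M values assigned to the low-noise data in Table 1.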

Results of Bayesian EMC method and derivation of approximated formula

We first show the results of model selection through Bayesian estimation. The results of model selection corresponding to the spectral data in Figure 1 are shown in Figure 2. The line represents the free energy $F(K)$ and the histogram represents the posterior probability $p(K \mid D)$. It is seen that the correct number of peaks, $K = 2$, is estimated in Figure 2(a), whereas $K = 1$ is estimated in Figure 2(b). In the case of Figure 2(b), the peak-to-peak distance is too small to extract the information needed to divide the spectrum into two peaks at the given S/N ratio. In the case of Figure 2(c), there is no significant difference between $K = 1$ and $K = 2$, and we cannot estimate the number of peaks because the S/N ratio is too small. Therefore, by performing model selection, we expect to be able to determine the peak-to-peak distance and S/N ratio that are necessary for estimating the correct structure from the spectral data.
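The relationship between the free energy and the posterior probability of the number of peaks can be sketched as follows. The sketch assumes that the free energy is the negative logarithm of the marginal likelihood, so that the posterior is proportional to the prior times exp(-F); the function name and the default uniform prior are illustrative assumptions.

```python
import numpy as np

def posterior_over_k(free_energy, prior=None):
    """Posterior p(K | D) proportional to p(K) * exp(-F(K)).

    Assumption: F(K) is the negative log marginal likelihood of model K.
    """
    f = np.asarray(free_energy, dtype=float)
    if prior is None:
        p0 = np.full(f.size, 1.0 / f.size)  # uniform prior over candidates
    else:
        p0 = np.asarray(prior, dtype=float)
    log_w = -f + np.log(p0)
    log_w -= log_w.max()  # subtract the maximum for numerical stability
    w = np.exp(log_w)
    return w / w.sum()
```

Under this assumption, a free-energy difference of only a few units between two candidate models already corresponds to a strongly peaked posterior, while nearly equal free energies, as in Figure 2(c), give probabilities close to one half each.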

Figure 2.

(a)–(c) Results of model selection by Bayesian estimation respectively corresponding to spectral data in Figure 1(a)–(c).

We then perform model selection for various peak-to-peak distances and S/N ratios. Results are shown in Figure 3. The abscissa represents the S/N ratio and the ordinate represents the peak-to-peak distance $\Delta$. The values in the figure indicate the posterior probability for $K = 2$. It is seen that when S/N < 0.5, the number of peaks cannot be estimated regardless of the peak-to-peak distance. Furthermore, we can clearly distinguish the regions where $K = 1$ and $K = 2$ are selected.

We next discuss the posterior distributions of the parameters $\theta$. The posterior probabilities of each parameter when Bayesian estimation is performed on the spectral data in Figure 1(a) are shown in Figure 4. The dashed lines in the figures indicate the true parameter values used to generate the spectral data. In the case of the peak position, the histogram shows the differences of the peak positions $\mu_1$ and $\mu_2$ from their true values $\mu_1^*$ and $\mu_2^*$, respectively. The results show that each parameter is estimated with good accuracy, in that the distribution roughly includes the true value and resembles a Gaussian distribution. As exceptions, the HWHM and Lorentz–Gauss mixing ratio of one of the peaks are distributed away from the true values. This might be due to the properties of the pseudo-Voigt function and to the large variability around the true values depending on the particular noise realization. Details are given in Appendix B.

Figure 3.

Results of model selection for various peak-to-peak distances and S/N ratios. The values in the figure indicate the posterior probability for $K = 2$.

Figure 4.

Posterior distribution of each parameter when Bayesian estimation is performed on the spectral data in Figure 1(a). Dashed lines indicate the true parameter values used to generate the spectral data.

On the basis of the above results, we examine the STDs of the posterior distributions for various $\Delta$ values and S/N ratios to obtain the confidence intervals of the parameters. A plot of the relationship between the S/N ratio and the STD of the posterior distribution for $\Delta = 0.5$ is shown in Figure 5. The results suggest that estimation of the Lorentz–Gauss mixing ratio is the most difficult, because its STD is larger than the STDs of the other parameters. We also find that the STD is inversely proportional to the S/N ratio, $\mathrm{STD} \propto (\mathrm{S/N})^{-1}$, for any peak parameter. An exception is that the Lorentz–Gauss mixing ratio deviates from this relationship at low S/N ratios. We believe that this is because the STD of $r$ in the posterior distribution cannot exceed the STD of the prior distribution defined by Eq. (14).

The STD as a function of the peak-to-peak distance for S/N = 100.0 is shown in Figure 6. The results indicate that the STD is stable for all parameters when $\Delta$ > 0.4. Conversely, for smaller $\Delta$, the STD increases as $\Delta$ decreases, indicating that the estimation is unstable. The estimation thus becomes unstable when $\Delta$ is small because the two peaks overlap. According to this analysis, peak overlap begins to affect parameter estimation when $\Delta$ is less than about 4 times the HWHM.
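The ceiling on the STD of the mixing ratio mentioned above can be made concrete with a short calculation, assuming (as stated in the Settings) that the prior on the mixing ratio is uniform on [0, 1]:

```latex
\mathrm{Var}[r] = \int_0^1 \left(r - \tfrac{1}{2}\right)^2 \, dr = \frac{1}{12},
\qquad
\mathrm{STD}[r] = \frac{1}{\sqrt{12}} \approx 0.289 .
```

When the data carry almost no information about the mixing ratio, the posterior approaches this prior, so the posterior STD is expected to saturate near 0.29 instead of continuing to grow in proportion to the noise level.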

Figure 5.

STD of the posterior distribution as a function of the S/N ratio for Δ = 0.5. Triangles and inverted triangles are respectively the STDs of the parameters for the first and second peaks.

Figure 6.

STD as a function of the peak-to-peak distance for S/N = 100.0. Triangles and inverted triangles are respectively the STDs of the parameters for the first and second peaks.

As a result of simulations for various artificial spectral data, we found the following features. First, when the peak-to-peak distance $\Delta$ is large, the STD of any parameter is inversely proportional to the S/N ratio. Next, when the STD scaled by the noise level was plotted against the peak-to-peak distance for each parameter, we found that the curves for arbitrary noise levels can be approximated by a single curve. Also, as $\Delta$ decreases, the scaled STD diverges to positive infinity. Although a function with such characteristics is not unique, one candidate is a power function plus a baseline. However, it is necessary to adjust the position of the asymptote where the values diverge, the curvature of the curve, and the position where it reaches the baseline. Considering these requirements, we propose the approximated formula of Eq. (19), in which $h$ is the true peak height and $\Delta / w$ is the peak-to-peak distance scaled by the HWHM $w$ of the true peak.

Considering the characteristics of each peak parameter $j \in \{h, \mu, w, r\}$, we define a scaled STD for each parameter and perform regression using Eq. (19). The fitting parameters in Eq. (19) are the coefficients listed in Table 2. Figure 7 shows schematic diagrams of the values obtained using Eq. (19). Figure 7(a) shows the scaled STD as a function of S/N for two settings of the coefficients, where the peak-to-peak distance is sufficiently large. Figure 7(b) shows the scaled STD as a function of the peak-to-peak distance for several settings of the coefficients, with the prefactor of Eq. (19) held fixed. By adjusting the coefficients, we can express the STD of any peak parameter. A description of the features of these functions and of how to optimize their parameters is given in Appendix D.

In the present experiments, we fix $D_j = 2.5$. Although there is no particular reason to fix the value of $D_j$ and it was decided heuristically, we confirmed that the fitting result was very good, as shown in Figure 8. Using the same value for all parameters simplifies the formula and improves usability. The remaining three coefficients are used for regression, and the values obtained are shown in Table 2. The regression results are shown in Figure 8. The results show that a good approximation is achieved under all conditions. In addition, the range of S/N ratios over which the equation is applicable differs between peak parameters; the ranges are presented in Appendix D.

Figure 7.

Schematic diagrams of values obtained using Eq. (19). (a) Scaled STD as a function of S/N for two settings of the coefficients, where the peak-to-peak distance is sufficiently large. (b) Scaled STD as a function of the peak-to-peak distance for several settings of the coefficients, with the prefactor of Eq. (19) held fixed.

Figure 8.

Results of regression using the fitting function in Eq. (19) of the STDs of the posterior distributions for each parameter. Parameters are the (a) Lorentz–Gauss mixing ratio $r$, (b) peak position $\mu$, (c) peak height $h$, and (d) HWHM of the peaks $w$.

Table 2.

Values of the fitted parameters in Eq. (19).

Parameter	B_j	μ_j	D_j (fixed)	E_j
r	1.708	−4.626	2.5	−1.394
μ	0.324	−3.271	2.5	−0.216
h	0.355	−5.756	2.5	−0.540
w	0.504	−3.158	2.5	−0.760

As will be described later in detail, even when the Bayesian EMC method is not used, the confidence interval of a peak parameter can be estimated using Eq. (19) once a fitted spectrum has been obtained by an optimization method such as BIC-fitting. As a further use of this formula, when the S/N ratio of the measured spectrum is known, we can estimate the peak-to-peak distance needed to achieve a certain confidence interval. Alternatively, when the peak-to-peak distance of the measured spectrum is known from the chemical shift, we can estimate the S/N ratio required to obtain the desired confidence interval of the parameter. This makes it possible to use Eq. (19) in experimental planning, such as the setting of measurement time and energy resolution for individual measurements.

Simulation using a real spectrum

In this section, we analyze a real XPS spectrum to confirm the practicability of the approximated formula derived in the previous section. As an example of a real spectrum, we select a valence spectrum of SiO2 from the XPS spectrum databases provided in the COMPRO software [15] (Figure 9). The binding energy varies from −10 to 40 eV with an energy step of 0.1 eV, resulting in 501 data points. There is a strong peak assigned to O2s, and at lower binding energies there is a peak structure derived from the hybridization of O2p, Si3s, and Si3p. We apply the Bayesian EMC method and BIC-fitting to this spectrum.

Figure 9.

Fitted spectra from Bayesian estimation (a) and BIC-fitting (b) for the experimental valence spectrum of SiO2. Open circles are the experimental spectrum, the orange line is the fitted spectrum, the green line is the background, and the black lines are all peaks above the background.

Model function with background

A real spectrum has a background. In addition to the superposition of the pseudo-Voigt peaks of Eq. (1), the Shirley background $B(x)$ is added to the model function [5]:

$$B(x) = (I_h - I_l) \frac{A_l(x)}{A_l(x) + A_h(x)} + I_l,$$

where $I_h$ and $I_l$ are respectively the intensities of the spectrum on the high-binding-energy side and the low-binding-energy side of the analysis range, and $A_h(x)$ and $A_l(x)$ are respectively the area of the peak intensity from the high-binding-energy side to $x$ and the area of the peak intensity from the low-binding-energy side to $x$.
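A Shirley-type background of the kind described above can be sketched as follows. This is a simplified, non-iterative illustration that takes the background-free sum of peaks as input and assumes that the background rises from the low-binding-energy intensity to the high-binding-energy intensity in proportion to the accumulated peak area; the function and argument names are hypothetical, and the authors' exact procedure is given in Ref. [5].

```python
import numpy as np

def shirley_background(x, peak_sum, i_low, i_high):
    """Shirley-type background for a model spectrum.

    x        : binding energies in ascending order
    peak_sum : background-free sum of peak functions evaluated at x
    i_low    : spectrum intensity at the low-binding-energy end
    i_high   : spectrum intensity at the high-binding-energy end
    """
    # Trapezoidal area of the peaks from the low-binding-energy end up to x.
    steps = 0.5 * (peak_sum[1:] + peak_sum[:-1]) * np.diff(x)
    area_low = np.concatenate(([0.0], np.cumsum(steps)))
    total_area = area_low[-1]  # equals area_low(x) + area_high(x) for any x
    return i_low + (i_high - i_low) * area_low / total_area
```

The background therefore equals `i_low` at the low-binding-energy end, equals `i_high` at the other end, and steps up most steeply underneath the strongest peaks.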

Settings of the Bayesian EMC method and BIC-fitting

In carrying out the Bayesian EMC method, we consider a range of candidate numbers of peaks $K$ and set prior distributions for each peak parameter in the same manner as in the simulation with artificial spectra. The inverse temperatures in the EMC method are specified by the values of $\gamma$ and $M$ in Eq. (18). As a setting of the Bayesian EMC method, 80,000 MCSs are used for the burn-in and a subsequent 80,000 MCSs for the sampling. The computational conditions in BIC-fitting are the same as those in the literature [5].

Results and discussion

With the Bayesian EMC method, we can estimate the number of peaks as previously mentioned. By plotting the marginal likelihood and free energy as a function of the number of peaks $K$, as shown in Figure 10(a), we estimated that $K = 4$ with a probability of 97%. BIC-fitting can also be used to estimate the number of peaks. Figure 10(b) shows the BIC values as a function of the number of peaks $K$, and the model with the smallest BIC is found at $K = 4$. Considering the properties of the singular model, the results of model selection using the BIC and the free energy are not always in agreement [12,13], but in the case of this real spectrum, the same number of peaks is selected with the two methods. Note that the computation by the Bayesian EMC method takes 10.5 h, whereas BIC-fitting is completed in less than 3 min.

Figure 10.

Results of model selection through Bayesian estimation (a) and BIC-fitting (b) for the experimental valence spectrum of SiO2. The red circle in (b) indicates the model with the minimum BIC.

Figure 9(a) shows the fitted spectrum of the optimum solution obtained from the sampling results for $K = 4$ by the Bayesian EMC method. The spectrum obtained by BIC-fitting is shown in Figure 9(b). We find that both methods reproduce the original spectrum well, and that the positions and shapes of the individual peaks are similar. It is notable that BIC-fitting derived a model equivalent to the optimal solution obtained using the Bayesian EMC method, despite its limited search space.

Using the sampling results of the Bayesian EMC method, we obtain the confidence interval from the posterior probability of each parameter. Principal component analysis (PCA) is performed to find trends among the sampled models. Figure 11 shows a two-dimensional heat map of the first and second principal components obtained by PCA with respect to the peak positions of the four peaks. Most models belong to the group enclosed by a red circle in the figure, and the optimum solutions are included in this group. Models with features different from those of the optimal solution also appear, with a posterior probability as high as 0.2%. We exclude such minority models on the basis of the PCA results and then evaluate the posterior probability.
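The PCA step used to spot minority groups among the sampled models can be sketched with a plain singular value decomposition. The input layout (one row per Monte Carlo sample, one column per peak position) and the function names are assumptions made for illustration.

```python
import numpy as np

def pca_scores(samples, n_components=2):
    """Project samples (rows) onto their leading principal components."""
    centred = samples - samples.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

def score_histogram(samples, bins=30):
    """Two-dimensional histogram of PC1 and PC2, as used for a heat map."""
    scores = pca_scores(samples, 2)
    counts, pc1_edges, pc2_edges = np.histogram2d(
        scores[:, 0], scores[:, 1], bins=bins)
    return counts, pc1_edges, pc2_edges
```

In such a histogram the dominant solution appears as one dense cluster, and sparsely populated islands away from it indicate alternative peak assignments that could be masked out before computing posterior STDs.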

Figure 11.

Two-dimensional histogram of PC1 and PC2 obtained by PCA of EMC sampling.

We here focus on two neighboring peaks that are similar in height and overlap each other. We set the IDs of the two peaks to $k = 1$ and $k = 2$. Figure 12 shows the posterior probability densities of their peak parameters $(h_k, \mu_k, w_k, r_k)$. The shapes of the distributions in Figure 12 are almost Gaussian for the peak position, height, and width, indicating that the sampling was performed appropriately. The distribution of the Lorentz–Gauss mixing ratio is widely scattered within the defined region, which suggests that the ratio is difficult to estimate. The mixing ratio is related to the shape of the tail of the pseudo-Voigt function. Our results suggest that the shape of the tail of the peak is difficult to estimate because the noise of the target spectrum is relatively high. The STDs of the posterior probability distributions of these peaks are given in Table 3.

Figure 12.

Posterior probability densities of the peak parameters for the two overlapping peaks ($k = 1$ and $k = 2$) in the valence spectrum of SiO2.

Table 3.

Calculated confidence intervals of the posterior distribution for all peak parameters and those estimated using Eq. (19).

Parameter	Confidence interval for peak k = 1	Confidence interval for peak k = 2	Confidence interval estimated using Eq. (19) and the BIC-fitting model
μ	0.22	0.21	0.34
h	2.35	3.69	5.48
w	0.33	0.32	0.40
r	0.12	0.27	−

We next estimate the confidence intervals of the parameters of the two peaks using the approximated formula (19) and the peak parameters obtained by BIC-fitting. Only the estimated peak parameters are required; the Bayesian EMC method is not used. The approximated formula assumes that the heights and widths of the two peaks are identical, but this is not so for real spectra. In applying the approximated formula to the real spectrum, we use the average of the two peak heights for $h$ and the average of the two peak widths for $w$. The STD of the noise, $\sigma$, is estimated as the root mean square of the residual between the original spectrum and the fitted spectrum. We further use the peak-to-peak distance $\Delta$ obtained from the fitted spectrum. These quantities, all taken from the two peaks obtained by BIC-fitting, give the estimated S/N ratio. The estimated confidence interval is given in the right-hand column of Table 3. The approximated formula reproduces well the actual confidence interval obtained from the posterior distribution. The S/N ratio is less than 10, and the Lorentz–Gauss mixing ratio is thus outside the applicable range of the approximated formula (19), so the confidence interval of $r$ cannot be calculated. The results confirm that, when the fitting is good, the gradient method combined with the approximated formula (19) gives a confidence interval comparable to that estimated by the Bayesian EMC method. The approximated formula (19) is also applicable to the case where the heights of the two peaks differ more strongly. Details are given in Appendix E.

The approximated formula (19) was derived assuming that the spectrum consists of two pseudo-Voigt functions. These two functions represent a pair of nearest-neighbor peaks in a spectrum that may consist of more than two peaks. However, care is needed when a target peak is sandwiched between two peaks that are almost equally spaced, because both tails of the target peak overlap with those of the neighboring peaks. The Lorentz–Gauss mixing ratio contributes to the shape of the tail of the peak. Therefore, it is difficult to estimate the Lorentz–Gauss mixing ratio of an inner target peak whose tails are not clearly distinguished on either side. By using the Bayesian EMC method, we may in principle be able to obtain the confidence intervals of arbitrary parameters even when a target peak is sandwiched between two peaks, but we would have to consider which parameters should be used to construct a model formula for the STD. This is left for future work.
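The quantities fed into the approximated formula for a real spectrum can be collected as in the sketch below. It follows the averaging and root-mean-square steps described above; the definition of the S/N ratio as the averaged peak height divided by the noise STD is an assumption that generalizes the unit-height simulation, and the function name and dictionary keys are hypothetical. Eq. (19) itself is not reproduced here.

```python
import numpy as np

def formula_inputs(y, y_fit, peak1, peak2):
    """Inputs for the approximated formula from a fitted two-peak pair.

    peak1, peak2 : dicts with keys "h" (height), "mu" (position), "w" (HWHM)
    """
    sigma = np.sqrt(np.mean((y - y_fit) ** 2))  # RMS residual as noise STD
    h = 0.5 * (peak1["h"] + peak2["h"])          # average peak height
    w = 0.5 * (peak1["w"] + peak2["w"])          # average HWHM
    delta = abs(peak2["mu"] - peak1["mu"])       # peak-to-peak distance
    return {
        "sigma": sigma,
        "snr": h / sigma,                # assumed definition of S/N
        "scaled_distance": delta / w,    # distance in units of the HWHM
    }
```

The returned S/N ratio can also be checked against the applicable range of the formula before use; in the SiO2 example the value below 10 is what rules out an estimate for the mixing ratio.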

Conclusions

We developed a BIC-fitting method with confidence-interval estimation in spectral decomposition. By adopting the Bayesian EMC method, we may be able to not only estimate the number of peaks but also optimize peak parameters, such as the Lorentz–Gauss mixing ratio, in addition to the peak intensity, peak position, and peak width. Using this method, we may also be able to obtain the confidence interval through the STD of the Bayesian posterior distribution. We set various peak-to-peak distances and S/N ratios to generate data on a computer, and then applied Bayesian estimation to obtain the behavior of model selection and the STD of Bayesian posterior distributions for each peak parameter through computer simulation. As a result, an approximated formula expressing the relationship between the obtained STD and the peak-to-peak distance or S/N ratio was derived. In terms of practical use, we confirmed the usefulness of the approximated formula for a real valence spectrum of SiO2. The confidence interval of each parameter was estimated using the peak parameter obtained by BIC-fitting, and it was confirmed that the value agreed well with the confidence interval obtained directly from the posterior probability obtained using the Bayesian EMC method. In short, even with low-cost optimization methods such as BIC-fitting, we can now estimate confidence intervals of fitting parameters that are comparable to those estimated by high-cost Bayesian EMC methods. Using the approximated formula derived in this study, we may be able to estimate the S/N ratio required to obtain the desired parameters with the desired confidence interval, which will be useful in experimental planning such as the setting of measurement time and energy resolution for individual measurements. In this study, we treated the peak shapes of the XPS spectra as pseudo-Voigt functions. 
In practice, the suitable basis function is a Voigt function defined by the convolution of a Lorentzian function derived from the natural widths and a Gaussian function derived from a device. We will consider peak fitting based on convoluted Voigt functions in our future work.
