Reverse Intensity Correction for Raman Spectral Library Search.

Jun Zhao, Kristen Frano, Jack Zhou.

Abstract

A reverse intensity correction method was developed for spectral library searches to correct for instrument response without the side effect of magnifying the noise in the low responsivity region of test spectra. Instead of applying relative intensity correction to the sample test spectra to match the standardized library spectra, a reverse intensity correction is applied to the standardized library spectra to match the uncorrected sample spectrum. This simple procedural change improves library search performance, especially for dispersive charge-coupled device Raman analyzers using near-infrared excitations, where the instrument response often varies greatly across the spectral range, and signal-to-noise ratio in the low responsivity regions is typically poor.

Keywords:  CCD; Intensity correction; Raman spectroscopy; SEP; charge-coupled device; handheld Raman spectrometer; hit quality index; reverse intensity correction; separating power; smooth reverse intensity correction; spectral library search

Year:  2017        PMID: 28357910      PMCID: PMC5542133          DOI: 10.1177/0003702817694381

Source DB:  PubMed          Journal:  Appl Spectrosc        ISSN: 0003-7028            Impact factor:   2.388


Introduction

Raman spectroscopy is gaining widespread acceptance as a chemically specific identification tool, as manifested by the rapid growth of handheld analyzers in the past decade. A commonly used method to identify unknown material is searching through spectral libraries by means of hit quality index (HQI), which is a measure of similarity between a pair of spectra. There are a multitude of published algorithms to compute the HQI, for example, methods based on spectral correlation, Euclidean distance, least square, sum of absolute difference, and vector dot product.[1,2] Frequently, the test spectra and the library spectra are acquired on different instruments. For meaningful comparison, such spectra must be intensity corrected to calibrate out the unit-to-unit variations in spectral response. Earlier methods developed to standardize Raman spectral relative intensities used white light irradiance sources.[3-5] McCreery’s group pioneered the work of using luminescent glass as a convenient source that can easily reproduce the sampling condition.[6-8] This methodology was later adopted by the National Institute of Standards and Technology (NIST),[9] which now produces certified luminescent glass relative intensity standards for various excitation wavelengths. Both white light and luminescence standards are used presently in research and commercial instrumentation and have facilitated meaningful comparison of Raman spectra acquired on different instruments, even of different excitation wavelengths. Consequently, commercial Raman spectral libraries are all standardized with relative intensity correction. To match the standardized library spectra, the sample spectrum acquired on a test instrument must also be corrected for the instrument response. This is a widely accepted practice, and is incorporated in many commercial Raman library search algorithms. 
However, this approach has a side effect: the intensity correction magnifies the contribution to the HQI from spectral regions of lower responsivity, where the signal-to-noise (S/N) ratio is often poor. As will be demonstrated in this manuscript, this results in reduced specificity. This effect is particularly prominent in handheld analyzers using near-infrared (NIR) excitations and charge-coupled device (CCD) detectors, where the instrument response often varies greatly across the spectral range, and S/N ratios in the low responsivity regions are typically poor. In this paper, we propose an alternative approach to spectral intensity correction for HQI computation. Instead of applying the NIST-recommended intensity correction to the test spectra to match the standardized library spectra, we apply the instrument’s response curve in reverse to the standardized library spectra to match the uncorrected test spectra. It will be demonstrated that this reverse intensity correction (RIC) method and its variant, smooth reverse intensity correction (SRIC), significantly improve spectral search performance for instruments with relatively high hardware noise and highly varying response curves, which is typical of handheld CCD Raman analyzers with NIR excitations.

Theory

The instrument response curve of a Raman spectrometer, represented here by a vector $c$, is obtained by measuring a spectrum $s$ of a standard source of known spectral shape $s_0$:

$$c = \frac{s_0}{s} \qquad (1)$$

Here the division is performed point by point for each wavelength element, as are all other division and multiplication operations between spectral vectors in subsequent equations, and the wavelength indices are omitted for simplicity. An intensity corrected Raman spectrum $t_c$ of a sample is obtained by measuring its uncorrected Raman spectrum $t$ and multiplying it by the response curve $c$:

$$t_c = t \, c \qquad (2)$$

This is compared with the intensity corrected library spectrum $l$ to yield a hit quality index for library searching. In practice, the derivative of the spectra is normally used, to remove the influence of backgrounds.[10] The RIC algorithm works in a similar way, except that the standardized library spectrum $l$ is corrected with the inverse of the response curve of the test instrument (not necessarily the instrument that measured the library spectrum):

$$l_r = \frac{l}{c} \qquad (3)$$

The hit quality index for the pair ($t$, $l_r$) is then used to rank the library candidates. Typically, the spectra in commercial libraries are collected on high performance instruments with sufficient intensity to provide good S/N ratio over the entire spectral range, such that applying RIC to $l$ does not produce significant side effects due to noise magnification. The spectrum $l_r$ represents what would be obtained if the library spectrum were acquired on the test instrument without intensity correction, which is now directly comparable with the test spectrum $t$.

Some instruments incorporate filters or back thinned CCD sensors that have etaloning effects, which cause periodic modulations in the response curve. The standard intensity correction process, when done properly, has the benefit of removing such modulations, which would be difficult to deal with by other means. The RIC method in this case would suffer from unremoved modulation that may negatively impact the results.

We realize that whatever correction curve is applied to the test or library spectra, the end effect is the assignment of weighting factors to the spectral elements in the HQI calculation. The standard intensity correction applies higher weights to the lower responsivity (and frequently low S/N ratio) region of the test spectrum, while the RIC method keeps the test spectrum unchanged, i.e., applies a flat curve to avoid magnifying the noisy region. Therefore, the straight RIC procedure can be modified such that it first performs the standard intensity correction, which removes the modulation, followed by division by a weighting curve that is free of such modulations yet models the overall response curve shape. If we fit the measured spectrum $s$ with a polynomial of sufficiently low order, or smooth it by removing the high frequency details, for example with Fourier smoothing, we can obtain a modulation-free spectrum $s'$. The choice of polynomial order or degree of smoothing can be somewhat subjective, so long as the overall shape of $s$ is retained. A smooth response curve $c'$ that reflects the overall shape of the response curve can be obtained by

$$c' = \frac{s_0}{s'} \qquad (4)$$

The SRIC works by forward correcting the test spectrum $t$ with $c$, and then reverse correcting the result, along with the library spectrum $l$, with $c'$:

$$t' = \frac{t \, c}{c'} \qquad (5)$$

$$l' = \frac{l}{c'} \qquad (6)$$

We call $t'$ the smooth uncorrected test spectrum and $l'$ the smooth reverse intensity corrected library spectrum. The HQI between the pair ($t'$, $l'$) is then used to quantify their similarity. It will be demonstrated through simulation that the SRIC procedure achieves the dual purpose of removing the modulation and maintaining the overall shape of the instrument response curve.

The performance of an HQI algorithm can be measured by how well it can separate spectrally similar compounds. If a large number of tests are performed using each of two compounds A and B, and the HQI is calculated against their library spectra, then we can define a parameter, the separating power SEP(A, B), from the means and standard deviations of the resulting HQI values, where $\overline{H}_{AB}$ and $\sigma_{AB}$ are the mean and standard deviation of the HQI values of test compound A against library candidate B. Normally, $\overline{H}_{AA} > \overline{H}_{AB}$, and SEP is non-negative. In general, A and B are not commutable. The higher the SEP, the better the separation. Given the same test data set, the algorithm that yields higher SEP will in general produce a higher specificity.
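The point-wise corrections described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the toy vectors are hypothetical, spectra are plain lists on a common axis, and the exact form of SEP used in `sep` (difference of mean HQIs divided by the sum of their standard deviations) is an assumption, since only its ingredients are stated in the text.

```python
from statistics import mean, stdev

def standard_correction(t, c):
    """Standard correction: multiply the uncorrected test spectrum by the response curve."""
    return [ti * ci for ti, ci in zip(t, c)]

def reverse_correction(l, c):
    """RIC: divide the standardized library spectrum by the response curve."""
    return [li / ci for li, ci in zip(l, c)]

def sric(t, l, c, c_smooth):
    """SRIC: forward-correct t with c, then reverse-correct t and l with the smooth curve."""
    t_s = [ti * ci / csi for ti, ci, csi in zip(t, c, c_smooth)]
    l_s = [li / csi for li, csi in zip(l, c_smooth)]
    return t_s, l_s

def sep(hqi_aa, hqi_ab):
    """ASSUMED form of separating power: (mean_AA - mean_AB) / (std_AA + std_AB)."""
    return (mean(hqi_aa) - mean(hqi_ab)) / (stdev(hqi_aa) + stdev(hqi_ab))

# Toy vectors (hypothetical): response falls off toward the high-shift end,
# so the correction curve c grows there.
c = [1.0, 2.0, 4.0, 8.0]
l = [4.0, 8.0, 6.0, 2.0]        # standardized library spectrum
t = [4.0, 4.0, 1.5, 0.25]       # same compound as seen by the test instrument

corrected_test = standard_correction(t, c)   # matches l
reversed_lib = reverse_correction(l, c)      # matches t
t_s, l_s = sric(t, l, c, c)                  # with c_smooth == c, SRIC reduces to RIC
```

Both routes bring the pair onto a common intensity scale; they differ only in which spectrum carries the weighting, which is why RIC leaves the noisy end of the test spectrum unmagnified.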

Experimental

All chemicals used were purchased from Sigma Aldrich. The National Institute of Standards and Technology SRM 2241 luminescence standard was purchased from NIST. Solid materials were placed inside 0.057 mm thick low-density polyethylene bags and their Raman spectra acquired through the bag. Liquid samples were placed inside glass vials and measured using a glass vial adaptor. Being able to identify contents without opening the container is a key advantage of Raman over some other analytical techniques, and this is how handheld Raman analyzers are typically used by end users. The instrument performance, including the Raman shift and the overall Raman intensity, was validated prior to use with the factory supplied polystyrene reference sample and the built-in validation function.

Benzonitrile spectra were measured using a handheld analyzer TacticID (B&W Tek) and a liquid vial adaptor. The analyzer is equipped with a CCD sensor and a CW 785 nm laser with an unpolarized output of 300 mW at full power. The CCD temperature was not controlled and was well above ambient during operation. All other spectra were measured using a handheld analyzer NanoRam (B&W Tek), which is equipped with a temperature controlled CCD sensor and a laser of the same specification as the TacticID. For both instruments, the wavelength axis was calibrated with emission lines from an argon pen lamp,[11] and the laser wavelength was calibrated using the cyclohexane 801.3 cm−1 Raman peak.[12] Both analyzers have the same specified spectral range of 176–2900 cm−1, but the actual spectral coverages extend beyond 3200 cm−1. The spectrograph resolution, measured as the full width at half-maximum (FWHM) of the 912 nm argon emission line, was 7.3 cm−1 and 8.6 cm−1 for the TacticID and NanoRam, respectively. The Raman resolution, measured using the 1085 cm−1 calcite peak according to the method developed by Bowie et al. and adopted by ASTM E2529-06, was 9.6 cm−1 and 9.1 cm−1, respectively.[13,14] This differs from the spectrograph resolution because (a) it incorporates the laser linewidth and (b) it is measured at a different wavelength.

The laser power was set to 90% for all measurements, and the integration time was automatically controlled to produce at least 20 000 counts at the pixel of strongest intensity. For benzonitrile, this was typically 0.5–1.0 s. For the stearic acid and magnesium stearate samples, this was typically 10–20 s. After each Raman acquisition, the laser was immediately turned off and a dark spectrum automatically acquired with the same integration time. The dark spectrum was subtracted from the Raman spectrum to yield the pure Raman spectrum. The intensity correction curve for each instrument was obtained by measuring the spectrum of the luminescence standard and applying the intensity polynomial supplied by the NIST certificate.

For HQI calculation, spectra were first interpolated to 4 cm−1 spacing, and then their second derivatives were calculated using Savitzky–Golay filters.[15,16] The Pearson correlation coefficient, following the definition of Baer et al.,[17] was used as the HQI between the two derivative spectra.

Results in Figures 1–4 were obtained by downloading the spectra from the analyzers and processing them in Excel using macro programs written in Visual Basic. The library spectra in these figures were obtained by averaging 20 scans, and the test spectra were all single scans. Results in Table 1 were obtained using the Investigation mode of the NanoRam’s onboard software NOS. The standard intensity correction algorithm was implemented in NOS V5.12. A library containing the test compounds was created in NOS V5.12 using the onboard Library utility, where each library spectrum was the average of three scans. Twenty scans of each test compound were performed using the Investigation mode and the HQI values recorded. The NOS was then upgraded to V5.14, which implemented the RIC algorithm. Another set of 20 scans was taken for each test sample after the software upgrade.
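The HQI computation described above (resampling to 4 cm−1, Savitzky–Golay second derivative, Pearson correlation) can be sketched as follows. This is only an illustration under stated assumptions: the filter window is not given in the text, so a 5-point quadratic Savitzky–Golay kernel is assumed here; the function names, the linear interpolation, and the synthetic Gaussian peak are likewise hypothetical and stand in for the authors' Excel/Visual Basic processing.

```python
from bisect import bisect_right
from math import exp, sqrt

def resample(x, y, lo, hi, step=4.0):
    """Linearly interpolate (x, y) onto a uniform grid from lo to hi (x must be increasing)."""
    n = int(round((hi - lo) / step)) + 1
    out = []
    for i in range(n):
        g = lo + i * step
        j = min(max(bisect_right(x, g) - 1, 0), len(x) - 2)
        w = (g - x[j]) / (x[j + 1] - x[j])
        out.append(y[j] + w * (y[j + 1] - y[j]))
    return out

# ASSUMPTION: 5-point quadratic Savitzky-Golay second-derivative kernel (times 1/7).
SG_D2 = (2, -1, -2, -1, 2)

def second_derivative(y, step=4.0):
    """Second derivative at interior points; two points at each edge are dropped."""
    return [sum(k * y[i + d] for k, d in zip(SG_D2, range(-2, 3))) / (7 * step ** 2)
            for i in range(2, len(y) - 2)]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))

def hqi(x_t, y_t, x_l, y_l, lo, hi, step=4.0):
    """HQI as the correlation between second derivatives on a shared grid."""
    d_t = second_derivative(resample(x_t, y_t, lo, hi, step), step)
    d_l = second_derivative(resample(x_l, y_l, lo, hi, step), step)
    return pearson(d_t, d_l)

# Synthetic check: a single peak compared with a scaled, offset copy of itself.
x = list(range(200, 400, 3))
y = [exp(-((xi - 300) / 10) ** 2) for xi in x]
score = hqi(x, y, x, [3 * v + 5 for v in y], 200, 396)
```

Because the second derivative removes constant and linear backgrounds and the correlation is scale invariant, the scaled and offset copy scores essentially 1 against the original.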
Figure 1.

Reduced precision in HQI caused by magnified noise as a result of the relative intensity correction of the sample spectrum of benzonitrile. (a) Uncorrected spectrum; (b) instrument response curve; (c) intensity corrected spectrum; (d) intensity corrected benzonitrile library spectrum; (e) HQI value distribution of 24 consecutive scans using the spectral range of 200–3200 cm−1; and (f) HQI value distribution of the same 24 scans using the spectral range of 200–2800 cm−1. Y-axes for (a)–(d) are intensities in arbitrary units.

Figure 2.

Spectral comparison of the standard intensity correction and the reverse intensity correction of two weak Raman scatterers, magnesium stearate (a)–(d), and stearic acid (e)–(h), measured through 0.057-mm thick polyethylene bags: (a) and (e) uncorrected sample spectra, with the circled area exhibiting differences; (b) and (f) intensity corrected sample spectra; (c) and (g) library spectra; and (d) and (h) reverse intensity corrected library spectra. Y-axes in all figures are spectral intensities in arbitrary units.

Figure 3.

Hit quality index values of 20 magnesium stearate spectra and 20 stearic acid spectra using the standard intensity correction of sample spectra at three different spectral ranges (a)–(c), in comparison with using the RIC of library spectra (d)–(f). Note the difference in scale: (a) and (d) 200–3200 cm−1; (b) and (e) 200–2800 cm−1; and (c) and (f) 200–2500 cm−1. The markers are slightly offset in the x-axis for the two reference materials for clarity.

Figure 4.

Separating power of stearic acid, magnesium stearate shown in (a), and SEP of magnesium stearate, stearic acid in (b), using the standard intensity correction and RIC at three different spectral ranges. The high limits of the three spectral ranges in Raman shift (cm−1) are indicated by the x-axis labels.

Table 1.

Performance comparison of standard intensity correction and reverse intensity correction for Raman spectral library search, listing the mean and standard deviation (STD) of HQI values of 20 scans for seven compounds.

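| Compound | Intensity corrected: Mean | Intensity corrected: STD | Intensity corrected: Mean/STD | RIC: Mean | RIC: STD | RIC: Mean/STD | Improvement (%) |
|---|---|---|---|---|---|---|---|
| Hypromellose | 0.971 | 0.00746 | 130 | 0.986 | 0.00371 | 266 | 104 |
| Cellulose | 0.630 | 0.0582 | 10.8 | 0.852 | 0.0212 | 40.2 | 271 |
| Caffeine | 0.996 | 0.00135 | 738 | 0.999 | 0.00018 | 5.55 × 10^3 | 652 |
| Sucrose | 0.971 | 0.0203 | 47.7 | 0.977 | 0.0149 | 65.5 | 37 |
| Sodium bicarbonate | 0.998 | 0.00074 | 1.35 × 10^3 | 1 | 0.0001 | 1.00 × 10^4 | 641 |
| Cyclohexane | 0.994 | 0.00208 | 478 | 0.999 | 0.00009 | 11 100 | 2.22 × 10^3 |
| Ethanol | 0.998 | 0.00063 | 1.58 × 10^3 | 1 | 0.00008 | 1.25 × 10^4 | 689 |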
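The Improvement (%) column appears to be the relative increase in the Mean/STD ratio when moving from the standard correction to RIC; this reading is an inference from the tabulated numbers, not a definition given in the text. A short sketch (the function name is hypothetical) reproduces the sucrose and hypromellose rows to within rounding:

```python
def improvement_percent(mean_std, std_std, mean_ric, std_ric):
    """Relative increase (in %) of Mean/STD from standard correction to RIC."""
    before = mean_std / std_std
    after = mean_ric / std_ric
    return 100.0 * (after - before) / before

# Values taken from Table 1.
sucrose = improvement_percent(0.971, 0.0203, 0.977, 0.0149)
hypromellose = improvement_percent(0.971, 0.00746, 0.986, 0.00371)
print(round(sucrose), round(hypromellose))
```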

Simulation

The etaloning effect is simulated by multiplying the measured Raman spectra and the luminescence spectrum of a NIST 2241 standard material by a sine wave modulation whose amplitude is set by the parameter depth. The modulation-free spectrum $s'$ of the NIST 2241 standard is obtained by fitting its modulated spectrum with a fourth order polynomial. The response curve $c$ and the modified smooth response curve $c'$ are then obtained via Eqs. 1 and 4, respectively. The parameter depth is varied and HQIs are calculated using both the RIC and SRIC algorithms. While actual modulations in real devices are usually more complex, the sine wave simulation provides an easy way to gauge the effectiveness of the two algorithms.
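A rough sketch of this simulation is given below. Several details are assumptions rather than the authors' choices: the modulation is taken as 1 + depth·sin(2πx/period) with an arbitrary period of 300 cm−1, the "measured" standard spectrum and known shape are synthetic smooth curves, and the fourth order polynomial fit is done with a small hand-written least-squares solver to keep the example dependency-free.

```python
from math import pi, sin

def modulate(y, x, depth, period=300.0):
    """Multiply by 1 + depth*sin(2*pi*x/period); the form and period are assumptions."""
    return [yi * (1 + depth * sin(2 * pi * xi / period)) for xi, yi in zip(x, y)]

def polyfit_eval(x, y, order=4):
    """Least-squares polynomial fit evaluated at x (normal equations, x scaled to [-1, 1])."""
    lo, hi = min(x), max(x)
    u = [2 * (xi - lo) / (hi - lo) - 1 for xi in x]
    n = order + 1
    A = [[sum(ui ** (j + k) for ui in u) for k in range(n)] for j in range(n)]
    b = [sum(yi * ui ** j for ui, yi in zip(u, y)) for j in range(n)]
    for i in range(n):  # Gaussian elimination with partial pivoting
        p = max(range(i, n), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, n):
            f = A[r][i] / A[i][i]
            A[r] = [arj - f * aij for arj, aij in zip(A[r], A[i])]
            b[r] -= f * b[i]
    a = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        a[i] = (b[i] - sum(A[i][k] * a[k] for k in range(i + 1, n))) / A[i][i]
    return [sum(ak * ui ** k for k, ak in enumerate(a)) for ui in u]

# Synthetic stand-ins (hypothetical): a smooth standard spectrum and known shape.
x = list(range(200, 3204, 4))
s_clean = [1000 - 0.0002 * (xi - 1200) ** 2 for xi in x]
s0 = [1.0 + 0.0001 * xi for xi in x]
t_clean = [50 + 0.01 * xi for xi in x]

depth = 0.25
s = modulate(s_clean, x, depth)                    # modulated standard spectrum
t = modulate(t_clean, x, depth)                    # modulated test spectrum
c = [a / m for a, m in zip(s0, s)]                 # response curve, Eq. 1
s_smooth = polyfit_eval(x, s)                      # modulation-free fit of s
c_smooth = [a / m for a, m in zip(s0, s_smooth)]   # smooth response curve, Eq. 4
t_forward = [ti * ci for ti, ci in zip(t, c)]      # modulation cancels here
```

The forward correction cancels the shared modulation exactly in this idealized setting, which is the property SRIC relies on before dividing by the smooth curve.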

Results and Discussion

Figure 1 illustrates the noise magnification effect of the standard intensity correction procedure. Figures 1a and 1c show a representative spectrum of benzonitrile before and after intensity correction, respectively. Comparing Figure 1a with the library spectrum in Figure 1d, it is obvious that the response toward the higher Raman shift end is significantly lower than in the fingerprint region, as manifested by the response curve in Figure 1b. The intensity correction magnified the 3072 cm−1 peak and made the overall spectral shape similar to the library, improving the correlation coefficient from 0.59 between Figure 1a and 1d to 0.96 between Figure 1c and 1d. However, noise in this region is equally magnified. This results in a large scatter in the HQI of the 24 scans, as shown in Figure 1e. If the noisiest region between 2800 and 3200 cm−1 is excluded from the calculation, then the precision is greatly improved, as shown in Figure 1f. The standard deviations of the HQI in Figures 1e and 1f are 0.012 and 0.0004, respectively. The average HQI also sees a moderate improvement, from 0.972 to 0.989.

As library search relies on clear separation in HQI to distinguish search candidates, it can be expected that the poor precision induced by the magnified noise will degrade the specificity of the search, and this would be particularly problematic for weak Raman scattering samples. Figure 2 illustrates this effect. The spectra in Figure 2a and 2e are spectra of magnesium stearate and stearic acid powders contained in 0.057 mm thick low density polyethylene bags. Although polyethylene itself has a spectrum that is very similar to the two compounds of interest, its intensity contribution to the measured spectra is estimated to be less than 5%. Visual inspection of the uncorrected spectra in Figure 2a and 2e indicates the two compounds have similar but distinguishable Raman features.
However, their difference is diminished after the intensity correction, as shown in both test spectra in Figure 2b and 2f, as well as library spectra in Figure 2c and 2g. This is mainly due to their very similar features in the CH stretch region, which is magnified significantly along with the noise. Indeed, the HQI values of the total of 40 intensity corrected scans against the library spectra in Figure 2c and 2g are thoroughly mixed, as shown in Figure 3a. In contrast, the HQI values of the same 40 scans are clearly separated into two groups for each of the two reverse corrected library spectra in Figure 2d and 2h, as shown in Figure 3d. If the poor results in Figure 3a are due to the magnified noise, then the SEP of the standard algorithm should improve as the high end of the spectral range is successively reduced, although this would come with the risk of losing useful information for unknown samples and is therefore undesirable. This expectation is verified in Figure 3b and 3c. In contrast, the separating power of the RIC algorithm is largely invariant, as shown in Figure 3d–f, and further quantified in Figure 4.

The effect of periodic modulation in the response curve is simulated and presented in Figure 5. Figure 5a is the measured NIST SRM 2241 spectrum. The spectrum in Figure 5a is multiplied by a sine curve to obtain a simulated NIST SRM 2241 spectrum $s$ in Figure 5c with a modulation depth of 0.25. A fourth order polynomial fit of $s$ provides the modified NIST SRM 2241 spectrum $s'$, also shown in Figure 5c. The same sine curve is multiplied with a measured spectrum of stearic acid to obtain the simulated sample spectrum $t$ in Figure 5b. The response curve $c$ and the smooth response curve $c'$ are calculated according to Eqs. 1 and 4, and shown in Figure 5d. The smooth uncorrected sample spectrum $t'$ is calculated using Eq. 5, and shown in Figure 5e. The smooth reverse intensity corrected library spectrum $l'$ shown in Figure 5f is calculated using Eq. 6, where the library spectrum in Figure 2g is used as $l$. HQI with the straight RIC method is calculated using $t$ and the reverse corrected library spectrum $l/c$ (not shown). HQI with the SRIC method is calculated using $t'$ and $l'$. The resulting SEP (stearic acid, magnesium stearate) is displayed in Figure 6. Clearly, as the modulation depth increases, the performance of the RIC method degrades, while that of the SRIC remains constant and is always better than RIC. Even without the added modulation, the SRIC method yields SEP = 5.45, which is slightly better than the RIC’s 5.30. This is because the as-measured NIST SRM 2241 spectrum (Figure 5a) already contains a slight modulation. Although not shown, SEP (magnesium stearate, stearic acid) has a similar trend.
Figure 5.

Demonstration of periodic modulation in response curve and SRIC, with depth = 0.25. (a) Measured NIST SRM 2241 luminescence spectrum; (b) simulated uncorrected spectrum of stearic acid; (c) simulated NIST SRM 2241 spectrum (solid curve) and the fourth-order polynomial fit (dotted smooth curve); (d) response curve (solid curve) and modified response curve (dotted smooth curve), with the insert showing the details between 200 and 1400 cm−1; (e) smooth uncorrected sample spectrum $t'$; and (f) smooth reverse intensity corrected library spectrum $l'$.

Figure 6.

Separating power as a function of response curve modulation depth for both RIC and SRIC algorithms.

Although not presented, changing the polynomial order of fitting the spectrum from 4 to 5 or 6 made no visible difference to the result in Figure 6.

Table 1 compares the HQI values of several compounds measured with the standard intensity correction algorithm and with RIC using a spectral range of 176–2500 cm−1. These include very strong scatterers with low fluorescence such as cyclohexane and very weak scatterers with high fluorescence such as cellulose. As shown, the precision is improved for all samples using the RIC method, as demonstrated by the varying amount of reduction in the standard deviation. The average HQI also increases slightly for most samples; cellulose is a notable exception, with a large increase in average HQI (0.630 to 0.852) that is due to its high fluorescence background, which resulted in a low S/N ratio. Using the ratio of mean HQI over its standard deviation as a measure of resolving power, the improvement ranges from 37% for sucrose to a factor of 22.2 for cyclohexane.

Although the methods of RIC and SRIC are demonstrated using test spectra and library spectra acquired on the same instrument, the same can be applied to library transfer between different instruments. In this case, other factors of standardization may need to be considered. Differences in spectral frequency axis starting point and data interval can be easily resolved by interpolating either or both test and library spectra onto the same axis; Raman shift calibration errors can be treated using Raman shift correction; and differences in spectral resolution can be accommodated by resolution matching.[18] Because the response curve and the smooth response curve are both slowly varying curves, the reverse intensity correction of the library spectra can be performed either before or after these additional steps.

Conclusion

The standard intensity correction method gives more weight to spectral regions where the instrument responsivity is lower, which results in reduced SEP that varies with spectral coverage and therefore poor search performance. The performance can be improved by sacrificing the spectral region of lower responsivity. The RIC method including its modified version SRIC achieves improved HQI precision and SEP over the standard intensity correction method without sacrificing the spectral range. It does so by giving roughly equal weights to all spectral elements. The RIC method requires no extra information, can be easily implemented, and is useful for library searching using commercial spectral libraries.
  5 in total

1.  Determination of the resolution of a multichannel Raman spectrometer using Fourier transform Raman spectra.

Authors:  Bryan T Bowie; Peter R Griffiths
Journal:  Appl Spectrosc       Date:  2003-02       Impact factor: 2.388

2.  Smoothing and differentiation of data by simplified least square procedure.

Authors:  J Steinier; Y Termonia; J Deltour
Journal:  Anal Chem       Date:  1972-09-01       Impact factor: 6.986

3.  Relative intensity correction of Raman spectrometers: NIST SRMs 2241 through 2243 for 785 nm, 532 nm, and 488 nm/514.5 nm excitation.

Authors:  Steven J Choquette; Edgar S Etz; Wilbur S Hurst; Douglas H Blackburn; Stefan D Leigh
Journal:  Appl Spectrosc       Date:  2007-02       Impact factor: 2.388

4.  Novel search algorithms for a mid-infrared spectral library of cotton contaminants.

Authors:  J Brian Loudermilk; David S Himmelsbach; Franklin E Barton; James A de Haseth
Journal:  Appl Spectrosc       Date:  2008-06       Impact factor: 2.388

5.  Standardization of Raman spectra for transfer of spectral libraries across different instruments.

Authors:  Jason D Rodriguez; Benjamin J Westenberger; Lucinda F Buhse; John F Kauffman
Journal:  Analyst       Date:  2011-08-26       Impact factor: 4.616
