
The statistical fusion identification of dairy products based on extracted Raman spectroscopy.

Zheng-Yong Zhang

Abstract

At present, practical and rapid identification techniques for dairy products are still scarce. Taking different brands of pasteurized milk as an example, they are all milky white in appearance, and their Raman spectra are very similar, so it is not feasible to identify them directly using the naked eye. In the current work, a clear feature extraction and fusion strategy based on a combination of Raman spectroscopy and a support vector machine (SVM) algorithm was demonstrated. The results showed a 58% average recognition accuracy rate for dairy products based on the original Raman full spectral data and up to nearly 70% based on a single spectral interval. Data normalization processing effectively improved the recognition accuracy rate. The average recognition accuracy rate of dairy products reached 91% based on the normalized Raman full spectral data or nearly 85% based on a normalized single spectral interval. The fusion of multispectral feature regions yielded high accuracy and operation efficiency. After screening and optimizing based on the SVM algorithm, the best spectral feature intervals were determined to be 335–354 cm−1, 435–454 cm−1, 485–540 cm−1, 820–915 cm−1, 1155–1185 cm−1, 1300–1414 cm−1, and 1415–1520 cm−1 under the experimental conditions, and the average identification accuracy rate here reached 93%. The developed scheme has the advantages of clear feature extraction and fusion and a short identification time, and it provides a technical reference for food quality control. This journal is © The Royal Society of Chemistry.

Year:  2020        PMID: 35518240      PMCID: PMC9056169          DOI: 10.1039/d0ra06318e

Source DB:  PubMed          Journal:  RSC Adv        ISSN: 2046-2069            Impact factor:   3.361


Introduction

The quality and safety of dairy products, given their importance as foods, have always been major concerns of consumers, regulators and scientists. The evaluation of the safety of dairy products mainly involves the analysis of illegal additives (such as melamine and sodium thiocyanate),[1,2] pollutants (such as lead, mercury, arsenic, chromium and nitrate),[3] mycotoxins (such as aflatoxin B1 and aflatoxin M1),[4,5] microorganisms (such as Staphylococcus aureus and Salmonella),[6] etc. The evaluation of the quality of dairy products mainly involves evaluating the fluctuation and conformity of the products; that is, the products should maintain high stability.[7] Previously published reports have mainly focused on the safety factors of dairy products, and trace harmful substances can be determined qualitatively and quantitatively.[8] However, the existing research methods also face some challenges, especially in the field of quality evaluation.[9] For example, in recent years, law enforcement officers in China have found many cases of the manufacturing and selling of fake dairy products. Recently, these counterfeit products have been low-quality, low-price versions of brand products; although they meet the national standards and are harmless to the human body, counterfeiters pass them off as high-quality, high-price products in order to deceive consumers and make illegal profits.[10] There are many kinds of dairy products, and different kinds differ considerably in appearance and in the contents of their main components, so it is relatively easy to distinguish them. However, dairy products of different brands can have similar appearances and compositions, which makes identifying them efficiently a challenge.[11-13] In the work described in the current paper, different brands of pasteurized milk were taken as examples. 
Because they all appear milky white, identifying them directly by the naked eye is not feasible. Thus we set out to conduct research into the more intelligent identification of the quality of dairy products. This research included two aspects: quality characterization and statistical identification. The technologies available for characterizing the quality of dairy products include chromatography, mass spectrometry and spectroscopy. Chromatography and mass spectrometry generally require sample pretreatment, which is time consuming and laborious.[14] The spectral detection methods include ultraviolet spectroscopy, fluorescence spectroscopy, infrared spectroscopy and Raman spectroscopy. The ultraviolet spectroscopy and fluorescence spectroscopy techniques provide relatively little information about dairy products.[15] Infrared spectroscopy is sensitive to water molecules, which complicates the analysis of water-rich samples. Raman spectroscopy shows many advantages: (1) it can be used to detect the sample directly; (2) the cross section of Raman scattering by water molecules is small, and hence water does not affect the sample detection; (3) a portable Raman spectrometer can be used to realize on-site sample detection; and (4) Raman spectroscopy can be used to characterize the rich molecular vibration information of a sample, and can thus provide a “fingerprint” of the sample.[16-18] Therefore, research on the quality identification of dairy products based on Raman spectroscopy has become a major pursuit. The Raman spectrum of each sample can be formulated as a numerical matrix, which can be used as a sample quality characteristic to input into the subsequent statistical model.[19] Statistical identification is an important technology for the high-efficiency, in-depth mining of the internal relationships between characterization data and for realizing the scientific quality control of dairy products. The existing corresponding research methods mainly include principal component analysis (PCA), etc. 
Because PCA extracts features through a purely mathematical transformation, it can reduce data dimensionality, but it ignores the chemical information in the Raman spectra of the samples, and its practical application is not particularly convenient.[20-22] Therefore, on the basis of collecting and analyzing the Raman spectra of dairy products, in the work described in this paper we applied a support vector machine (SVM) algorithm to the feature extraction and fusion identification of Raman spectral characterization data, and developed a novel identification approach.

Experimental

Samples and instruments

Momchilovtsi (MS)-pasteurized heat-treated flavored yoghurt products were obtained from Bright Dairy & Food Co., Ltd. Ambpoeial (AM)-pasteurized heat-treated flavored yoghurt products were obtained from Inner Mongolia Yili Industrial Group Co., Ltd. ChunZhen (CZ)-pasteurized heat-treated flavored yoghurt products were obtained from Mengniu Dairy Group Co. Ltd. The samples of each brand were of the original flavor; that is, they did not include any additional chemicals. These samples all appeared milky white. Each dairy brand was tested 30 times. Information on the contents of the main components of these samples is shown in Table S1.† The Raman spectra of the samples were obtained by using a Prott-ezRaman-D3 portable laser Raman spectrometer (Enwave Optronics, U.S.A.). The excitation wavelength of the laser was 785 nm, the laser power was 450 mW, the fiber-coupled laser output was 100 μm, and the integration time was 150 s. The spectrometer was operated from 255 to 1974 cm−1 with a resolution of 1 cm−1. Raman spectra of these samples were obtained without any physical or chemical pretreatment, except that the samples were fully shaken.

Data analysis

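The baseline calibration of the collected Raman spectra was carried out by using SLSR Reader V8.3.9 software (Enwave Optronics, U.S.A.). The support vector machine (SVM) algorithm used for feature extraction and fusion identification of Raman spectral characterization data was run using MATLAB software (MathWorks, Natick, MA, U.S.A.). A brief description of the basic idea of the SVM classification algorithm is provided here.[23-25] First, a known training set (T) can be described using the formula

$$T = \{(x_1, y_1), \ldots, (x_m, y_m)\} \in (X \times Y)^m,$$

where $x_i \in X = R^n$, $y_i \in Y = \{1, -1\}$ ($i = 1, 2, \ldots, m$). Here $x_i$ is the Raman spectrum data, and $y_i$ is the category label. A two-classification model is described here. A multiple classification problem can be solved by constructing multiple two-classification SVMs. By selecting the appropriate kernel function $K(x, x')$ and parameter $C$, the optimization problem may be constructed and solved using the term

$$\min_{\alpha} \; \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} y_i y_j \alpha_i \alpha_j K(x_i, x_j) - \sum_{j=1}^{m} \alpha_j$$

and the constraints

$$\sum_{i=1}^{m} y_i \alpha_i = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, m,$$

to obtain the optimal solution $\alpha^* = (\alpha_1^*, \ldots, \alpha_m^*)^T$. A positive component of $\alpha^*$ is selected as $0 < \alpha_j^* < C$, and the threshold is calculated according to the equation

$$b^* = y_j - \sum_{i=1}^{m} y_i \alpha_i^* K(x_i, x_j).$$

Also constructed is the decision function

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{m} \alpha_i^* y_i K(x, x_i) + b^* \right).$$

In the current work, the radial basis function (RBF) was applied as the kernel function using the equation

$$K(x, x') = \exp\left( -\frac{\lVert x - x' \rVert^2}{\sigma^2} \right),$$

where $\sigma$ represents the kernel width of the RBF.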
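As a rough illustration of the RBF kernel and the two-class decision function described above, the following Python sketch evaluates both on toy numbers. It is not the MATLAB code used in this work. The function names, the toy support vectors, the multipliers and the threshold are all hypothetical and were chosen by hand rather than fitted, and the kernel scaling (division by sigma squared) is an assumption.

```python
import math


def rbf_kernel(x, x_prime, sigma):
    """RBF kernel exp(-||x - x'||^2 / sigma^2); the scaling is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-sq_dist / sigma ** 2)


def decision(x, support_vectors, labels, alphas, b, sigma):
    """Two-class decision: sign of sum_i alpha_i * y_i * K(x_i, x) + b."""
    score = sum(
        alpha * y * rbf_kernel(sv, x, sigma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if score + b >= 0 else -1


# Hand-picked toy values (hypothetical, not the result of any training).
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [1, -1]
alphas = [1.0, 1.0]
b = 0.0
sigma = 1.0

near_first = decision([0.1, 0.1], support_vectors, labels, alphas, b, sigma)
near_second = decision([1.9, 1.9], support_vectors, labels, alphas, b, sigma)
```

A point close to the first support vector receives the label +1 and a point close to the second receives -1, because the kernel value decays quickly with distance. For three brands, several such two-class models would have to be combined, as noted in the text.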

Results and discussion

Characterization of dairy products using Raman spectroscopy

As shown in Fig. 1, Raman spectra of three dairy products were collected, and their main Raman peaks were assigned to the corresponding molecular vibrations according to previous reports.[26-28] The Raman band at 1759 cm−1 was assigned to the ester C=O stretching of fatty acids. The strong Raman band at 1468 cm−1 was attributed to CH2 deformation of fats and sugars. The strong band at 1016 cm−1 was assigned to the ring-breathing mode, which was mainly derived from the phenylalanine of proteins. More information on Raman peak attribution is shown in Table 1. In Fig. 1, MS, AM and CZ denote the Momchilovtsi (Bright Dairy & Food Co., Ltd.), Ambpoeial (Inner Mongolia Yili Industrial Group Co., Ltd.) and ChunZhen (Mengniu Dairy Group Co. Ltd.) pasteurized heat-treated flavored yoghurt products, respectively. Nearly 20 main narrow peaks were identified in the Raman spectrum of each experimental sample. The results showed that the samples were rich in fat, sugar and protein and yielded strong Raman fingerprints. Yet the Raman spectra of the experimental samples were very similar. Fig. S1–S3† show the Raman spectra of the 30 samples of the MS, AM and CZ dairy product brands, respectively. As shown in Fig. S4,† the dairy products of the various brands were milky white in appearance.
Fig. 1

Raman spectra of the different dairy products. (MS represents the momchilovtsi-pasteurized heat-treated flavored yoghurt products from Bright Dairy & Food Co., Ltd. AM represents the ambpoeial-pasteurized heat-treated flavored yoghurt products from Inner Mongolia Yili Industrial Group Co., Ltd. CZ represents the ChunZhen-pasteurized heat-treated flavored yoghurt products from Mengniu Dairy Group Co. Ltd.).

Table 1

The wavenumbers (cm−1) of the main Raman peaks and their respective tentative assignments[26–28] (a)

| Wavenumber (cm−1) | Assignment |
|---|---|
| 1759 | ν(C=O) ester |
| 1670 | ν(C=O) amide I; ν(C=C) |
| 1615 | ν(C–C) ring |
| 1566 | δ(N–H); ν(C–N) amide II |
| 1468 | δ(CH2) |
| 1315 | τ(CH2) |
| 1279 | γ(CH2) |
| 1139 | ν(C–O) + ν(C–C) + δ(C–O–H) |
| 1085 | ν(C–O) + ν(C–C) + δ(C–O–H) |
| 1046 | ν(C–O) + ν(C–C) + δ(C–O–H) |
| 1016 | Ring-breathing (phenylalanine); ν(C–C) ring |
| 953 | δ(C–O–C) + δ(C–O–H) + ν(C–O) |
| 851 | δ(C–C–H) + δ(C–O–C) |
| 807 | δ(C–C–O) |
| 633 | δ(C–C–O) |
| 502 | Glucose |
| 445 | δ(C–C–H) + τ(C–O) |
| 389 | Lactose |

(a) ν: stretching vibration; δ: deformation vibration; τ: twisting vibration; γ: out-of-plane bending vibration.

Furthermore, a single-value moving range control chart was used to evaluate the quality fluctuation of the spectral data of each brand based on the Raman peak intensity at 1468 cm−1, as shown in Fig. S5–S7.† The results showed that the spectra of each brand exhibited statistically explainable quality fluctuations; the Raman spectral intensities of MS, AM and CZ were found to be located in the ranges 462.6–757.8, 397.4–682.2 and 482.5–739.8, respectively.[7,29] The experimental results suggested that the spectral data conformed to certain statistical rules. At the same time, the analysis revealed that the intensity ranges of the brands overlap, and therefore spectral intensity analysis alone cannot achieve brand identification. The nutritional components of the three dairy products were also similar (Table S1†), so traditional identification methods would have been difficult to carry out, and compositional analysis would have been time-consuming and laborious. Therefore, taking the identification of these products as an example, a new statistical identification method is provided.
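The control limits of a single-value (individuals) moving range chart can be sketched as follows. This is a minimal illustration that assumes the conventional constant 2.66 (3 divided by d2 = 1.128 for moving ranges of two consecutive points). The paper does not state the exact settings used, and the intensity values below are hypothetical rather than measured data.

```python
def imr_limits(values):
    """Return (LCL, center line, UCL) of an individuals control chart.

    Assumes the conventional limits: mean +/- 2.66 * average moving range,
    where the moving range is taken over consecutive pairs of points.
    """
    center = sum(values) / len(values)
    moving_ranges = [abs(values[i] - values[i - 1]) for i in range(1, len(values))]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


# Hypothetical peak intensities at 1468 cm-1 for one brand (not measured data).
intensities = [600.0, 620.0, 590.0, 610.0, 605.0]
lcl, center, ucl = imr_limits(intensities)

# A sample is flagged only if its intensity falls outside [lcl, ucl].
in_control = all(lcl <= v <= ucl for v in intensities)
```

If the limits computed for two brands overlap, as the intensity ranges reported above do, a single peak intensity cannot separate the brands, which is the motivation for the multivariate SVM approach.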

Data preprocessing and feature extraction of the Raman spectra of the dairy products

First we investigated identification based on the original Raman spectral data of the dairy products. The average recognition rate of the SVM model based on the original Raman full spectral data was 58%. For each test, two-thirds of the samples were randomly selected as the training set, and the remaining samples were employed as the validation set. Ten runs of random tests were conducted to obtain the average accuracy rate. The purpose of feature extraction and analysis in general is to reduce the interference of redundant information with the identification model, to accentuate the fine differences between samples, and thereby to identify the different samples accurately while saving running time. In the current work, the moving window method and the spectral band selection method were employed to screen Raman spectral intervals. The width of the moving window was 20 wavenumbers. The experimental results are shown in Tables S2 and S3.† The highest average recognition rate, 69%, was obtained for the 1755–1774 cm−1 interval. This recognition rate was higher than the recognition rate based on the full spectrum, indicating that feature extraction can improve recognition. Then we investigated identification based on preprocessed Raman spectral data of the dairy products. Spectral preprocessing was expected to highlight data differences and improve the recognition rate of the models. The first, second, third, and fourth derivatives and the normalization of the Raman spectral data were calculated, as shown in Fig. S8–S12,† respectively. Visual inspection of these spectra did not show any obvious difference between them. However, the average recognition rates calculated using the SVM model did show some differences, and the results are shown in Table 2. 
The recognition rates based on derivative processing decreased, while the recognition rate based on normalization processing increased significantly. These results suggested that the normalization operation had a great impact on the SVM model and effectively removed the dimensional influence of the original spectral data.[30-32] Furthermore, two-thirds of the normalized sample data were randomly selected as the training set, and the remaining data were used as the verification set. The average recognition effect of the moving window selection and spectral band feature selection was investigated in ten random experiments. The identification results are shown in Tables S4 and S5.† For the moving window method, each of the identification accuracy rates in the Raman spectral ranges 335–354 cm−1, 435–454 cm−1, and 835–854 cm−1 was greater than 70%. The average recognition rates based on the Raman ranges 485–540 cm−1, 1155–1185 cm−1, 1300–1415 cm−1 and 1415–1520 cm−1 were also greater than 70%, and the recognition rate based on the 820–915 cm−1 interval reached 83%. Compared with the corresponding original data bands mentioned above (Tables S2 and S3†), most of the recognition rates based on the feature bands improved after normalization, and the different spectral bands contributed differently to the accurate recognition of the SVM model. The recognition rate of each single band nevertheless remained lower than that of the full spectral data (91%, Table 2), which may have been due to the limited spectral data input to the SVM algorithm. Obvious differences between the samples were identified in the spectral feature intervals, as shown in Fig. 2.
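The normalization and moving window screening steps described above can be sketched in Python as follows. Two points are assumptions, since the text does not specify them: the normalization is taken to be min-max scaling of each spectrum, and the windows are taken to be consecutive, non-overlapping blocks of 20 wavenumbers starting at 255 cm−1 on a 1 cm−1 grid. The function names are hypothetical.

```python
def min_max_normalize(spectrum):
    """Scale one spectrum to the range [0, 1] (assumed normalization scheme)."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0] * len(spectrum)
    return [(v - lo) / (hi - lo) for v in spectrum]


def moving_windows(start, stop, width=20):
    """List consecutive, non-overlapping (low, high) wavenumber windows."""
    windows = []
    lo = start
    while lo + width - 1 <= stop:
        windows.append((lo, lo + width - 1))
        lo += width
    return windows


normalized = min_max_normalize([2.0, 4.0, 6.0])
windows = moving_windows(255, 1974, width=20)
```

Under these assumptions the 255–1974 cm−1 range splits into 86 windows, and the list contains the intervals 335–354, 435–454, 835–854 and 1755–1774 cm−1 reported in the text. Each window would then be passed to the classifier on its own to obtain a per-window recognition rate.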

Table 2

The identification results of dairy products based on different pretreatment methods and the use of the SVM recognition algorithm. The Raman spectroscopy range was 255–1974 cm−1

| Spectral pretreatment method | Average accuracy rate (%) | Running time (seconds) |
|---|---|---|
| Original | 58 | 820 |
| First derivative | 53 | 833 |
| Second derivative | 51 | 841 |
| Third derivative | 40 | 831 |
| Fourth derivative | 53 | 838 |
| Normalization | 91 | 823 |

Fig. 2

Extracted Raman spectral feature intervals of dairy products. (MS represents the momchilovtsi-pasteurized heat-treated flavored yoghurt products from Bright Dairy & Food Co., Ltd. AM represents the ambpoeial-pasteurized heat-treated flavored yoghurt products from Inner Mongolia Yili Industrial Group Co., Ltd. CZ represents the ChunZhen-pasteurized heat-treated flavored yoghurt products from Mengniu Dairy Group Co. Ltd., respectively).

Statistical fusion identification of dairy products based on Raman spectra

The data fusion methods reported in the literature mainly include data level fusion, feature level fusion and model level fusion. These methods are used with the expectation that they improve the efficiency of data utilization and the accuracy of model discrimination. Direct data fusion often involves redundant information. Model level fusion requires multiple classifiers, its operation is relatively complex, and the chemical information of the data tends to be fuzzy, which is not conducive to practical applications. In contrast, a data fusion strategy based on clear chemical feature extraction is simple, intuitive and practical.[33-35] The above-described research revealed that different feature regions of the Raman spectra of dairy products contribute differently to the identification of sample categories. After normalization, the recognition accuracy of the model improved, and more than one spectral interval with an accuracy rate of greater than 70% was found. These results suggested that the accuracy rate of the identification model could be improved further through the fusion of feature spectral intervals. For the tests, all spectral data were normalized, and then two-thirds of the spectral data were randomly selected as the training set, and the remaining one-third of the spectral data was employed as the validation set. Ten runs of random tests were conducted to obtain the average accuracy rate. As shown in Table 3, fusing different combinations of intervals gave average recognition rates higher than those based on the corresponding single feature spectral intervals. 
The results show that the best spectral feature fusion intervals were 335–354 cm−1, 435–454 cm−1, 485–540 cm−1, 820–915 cm−1, 1155–1185 cm−1, 1300–1414 cm−1 and 1415–1520 cm−1. The average accuracy rate increased from 58% (based on full spectral data without normalization) to 91% (based on full spectral data with normalization) and finally to 93% with the fused intervals, while the running time fell from about 820 s to 208 s. The best classification result based on the feature interval fusion reached 100%, as shown in Fig. 3.
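The feature fusion and the random two-thirds/one-third split described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: fusion is assumed to mean concatenating the intensities of the selected intervals into one feature vector, the spectrum is assumed to be sampled every 1 cm−1 starting at 255 cm−1, and the split is assumed to be drawn over all 90 spectra (3 brands, 30 measurements each). The function names and the dummy spectrum are hypothetical.

```python
import random

# Best feature intervals (cm-1) as listed in the text.
BEST_INTERVALS = [(335, 354), (435, 454), (485, 540), (820, 915),
                  (1155, 1185), (1300, 1414), (1415, 1520)]


def fuse_intervals(spectrum, intervals, first_wavenumber=255):
    """Concatenate the intensities inside each inclusive (low, high) interval."""
    fused = []
    for lo, hi in intervals:
        fused.extend(spectrum[lo - first_wavenumber:hi - first_wavenumber + 1])
    return fused


def random_split(n_samples, train_fraction=2 / 3, rng=random):
    """Shuffle sample indices and split them into training and validation sets."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# Dummy spectrum whose intensity equals its wavenumber, to make slicing visible.
spectrum = [float(w) for w in range(255, 1975)]
fused = fuse_intervals(spectrum, BEST_INTERVALS)

# One of the ten random runs; a seeded generator keeps the sketch reproducible.
train_idx, valid_idx = random_split(90, rng=random.Random(0))
```

With these intervals the fused vector has 444 points instead of the 1720 points of the full spectrum, which is consistent with the shorter running time reported for the fused model. Repeating the split ten times and averaging the validation accuracy would mirror the test protocol described in the text.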

Table 3

The identification results of dairy products based on the fusion of their Raman spectroscopy features and the use of the SVM recognition algorithm

| Raman spectroscopy interval ranges (cm−1) | Average accuracy rate (%) | Running time (seconds) |
|---|---|---|
| 335–354, 435–454 | 76 | 30 |
| 335–354, 435–454, 835–854 | 84 | 38 |
| 485–540, 820–915 | 84 | 78 |
| 485–540, 820–915, 1115–1185 | 85 | 90 |
| 485–540, 820–915, 1115–1185, 1300–1415 | 88 | 143 |
| 485–540, 820–915, 1115–1185, 1300–1414, 1415–1520 | 89 | 192 |
| 485–540, 820–915, 1115–1185, 1415–1520 | 87 | 138 |
| 335–354, 435–454, 485–540, 835–854 | 83 | 61 |
| 335–354, 435–454, 835–854, 1115–1185 | 83 | 51 |
| 335–354, 435–454, 835–854, 1115–1185, 1300–1415 | 90 | 102 |
| 335–354, 435–454, 835–854, 1115–1185, 1300–1414, 1415–1520 | 90 | 151 |
| 335–354, 435–454, 485–540, 835–854, 1115–1185, 1300–1414, 1415–1520 | 91 | 174 |
| 335–354, 435–454, 485–540, 820–915, 1115–1185, 1300–1414, 1415–1520 | 93 | 208 |

Fig. 3

Classification results for dairy products based on the use of the SVM. (MS represents momchilovtsi-pasteurized heat-treated flavored yoghurt products from Bright Dairy & Food Co., Ltd; AM represents ambpoeial-pasteurized heat-treated flavored yoghurt products from Inner Mongolia Yili Industrial Group Co., Ltd; and CZ represents ChunZhen-pasteurized heat-treated flavored yoghurt products from Mengniu Dairy Group Co. Ltd, respectively).

Conclusions

In this work, a new and convenient research strategy for the feature extraction and fusion identification of dairy products was proposed. Through the comprehensive application of Raman spectroscopy and SVM algorithm techniques, the Raman spectroscopy feature intervals could be extracted clearly and effectively, and the best spectral feature intervals in the experimental system were determined to be 335–354 cm−1, 435–454 cm−1, 485–540 cm−1, 820–915 cm−1, 1155–1185 cm−1, 1300–1414 cm−1, and 1415–1520 cm−1. The identification accuracy rate could be further improved through normalization treatment and feature fusion, and increased to 93% from 58% (based on full spectral data without normalization) and 91% (based on full spectral data with normalization). The total running time of the model was greatly reduced from about 820 s to 208 s. The scheme has many advantages, such as the simplicity of sample signal acquisition, the high speed of analysis, the portability of the equipment, etc. This approach has potential application prospects in the fields of food anti-counterfeiting and quality control.

Conflicts of interest

There are no conflicts to declare.