Xien Yang, Zhongyu Wu, Quanhong Ou, Kai Qian, Liqin Jiang, Weiye Yang, Youming Shi, Gang Liu.
Abstract
Lung cancer is a fatal tumor that threatens human health, so a diagnostic method with a wide application range, high specificity, and high sensitivity is of great significance for its detection. In this study, data fusion and wavelet transform were combined with Fourier transform infrared (FTIR) spectroscopy and Raman spectroscopy to study serum samples from patients with lung cancer and from healthy people. The Raman spectra of the serum samples provide more biological information than the FTIR spectra. After the optimal wavelet parameters were selected for wavelet threshold denoising (WTD) of the spectral data, the partial least squares-discriminant analysis (PLS-DA) model achieved 93.41% accuracy, 96.08% specificity, and 90% sensitivity in the prediction set for the fusion data processed by WTD. The results show that combining FTIR spectroscopy and Raman spectroscopy through data fusion and wavelet transform can effectively diagnose patients with lung cancer, and the approach is expected to be applied to clinical screening and diagnosis in the future.
Keywords: FTIR spectroscopy; Raman spectroscopy; data fusion; lung cancer; wavelet transform
Year: 2022 PMID: 35155366 PMCID: PMC8825776 DOI: 10.3389/fchem.2022.810837
Source DB: PubMed Journal: Front Chem ISSN: 2296-2646 Impact factor: 5.221
FIGURE 1. FTIR spectra of serum from patients with lung cancer (A) and healthy people (B). Raman spectra of serum from patients with lung cancer (C) and healthy people (D). The corresponding average spectra are shown in bold.
Peaks of FTIR and Raman spectra and their assignments.
| Wavenumber/Raman shift | Peak assignments | References |
|---|---|---|
| Wavenumber (cm−1) | — | — |
| 2,959 | C-H asymmetric stretching vibration of CH3 in lipid | — |
| 2,930 | C-H asymmetric stretching vibration of CH2 in lipid | — |
| 1,740 | C=O stretching vibration from ester carbonyl in triglycerides | — |
| 1,646 | α-helix structure in amide I protein | — |
| 1,542 | N-H functional group in amide II protein | — |
| 1,243 | P=O asymmetric stretching vibration of PO2− in nucleic acids | — |
| 1,079 | P=O symmetric stretching vibration of PO2− in nucleic acids | — |
| Raman shift (cm−1) | — | — |
| 1,005 | Symmetric ring breathing mode in phenylalanine, CHO, and protein | — |
| 1,129 | C-N stretching in protein | — |
| 1,155 | C-C stretching in glucose, CHO, and protein | — |
| 1,302 | C-H vibration in triglycerides | — |
| 1,448 | CH3-CH2 bending of phospholipids and the protein side chains | — |
| 1,520 | C=C stretching in porphyrin | — |
| 1,656 | C=C stretching in lipid and amide I protein | — |
FIGURE 2. Acccv of the PLS-DA model using FTIR spectral data processed by WTD with four threshold rules: heursure (A), minimaxi (B), rigrsure (C), and sqtwolog (D).
Calibration results for FTIR spectral data processed by different WTD algorithms. (The best threshold, wave function, and DL for the PLS-DA model are presented in bold).
| Threshold | Wave function | DL | Calibration | | |
|---|---|---|---|---|---|
| | | | Acccv (%) | Specv (%) | Sencv (%) |
| heursure | | | | | |
| minimaxi | fk4 | 7 | 95.27 | 99.05 | 88.75 |
| rigrsure | fk8 | 8 | 95.27 | 98.81 | 85.12 |
| sqtwolog | fk8 | 5 | 95.30 | 98.21 | 90.79 |
Bold values indicate the best processing method for the PLS-DA model.
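The general idea behind wavelet threshold denoising with the sqtwolog rule can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a Haar wavelet in place of the fk4/fk8 wave functions listed in the table, soft thresholding, a noise estimate taken from the finest detail band (median absolute value divided by 0.6745), and the universal threshold sigma * sqrt(2 ln n). The function names and the `levels` parameter (standing in for DL) are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar transform."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def median(values):
    v = sorted(values)
    m = len(v) // 2
    return v[m] if len(v) % 2 else 0.5 * (v[m - 1] + v[m])


def wtd_sqtwolog(signal, levels):
    """Denoise a signal with the universal ('sqtwolog') threshold.

    Assumption: Haar wavelet and soft thresholding, chosen for brevity.
    """
    n = len(signal)
    if n % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)  # details[0] is the finest band
    # Noise level estimated from the finest detail coefficients.
    sigma = median([abs(c) for c in details[0]]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(n))
    details = [soft_threshold(d, t) for d in details]
    # Rebuild the signal from the coarsest level back to the finest.
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx


# Toy "spectrum": a flat signal with small alternating noise.
noisy = [1.05, 0.95] * 4
denoised = wtd_sqtwolog(noisy, levels=3)
```

In this sketch the decomposition level plays the role of DL in the table; a real spectrum would need padding or truncation to a suitable length, and a wavelet library would be needed to try other wave functions.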
FIGURE 3. Acccv of the PLS-DA model using Raman spectral data processed by WTD with four threshold rules: heursure (A), minimaxi (B), rigrsure (C), and sqtwolog (D).
FIGURE 4. Acccv of the PLS-DA model using FTIR and Raman spectral data processed by WTD at different DLs.
Performance of the PLS-DA model using FTIR spectral data with different preprocessing methods. (The best processing method for the PLS-DA model is presented in bold).
| Method | Calibration | | | Prediction | | |
|---|---|---|---|---|---|---|
| | Acccv (%) | Specv (%) | Sencv (%) | Accp (%) | Spep (%) | Senp (%) |
| No | 93.26 | 96.78 | 86.51 | 94.95 | 98.31 | 90.00 |
| Normalization | 94.62 | 97.80 | 88.34 | 93.94 | 98.31 | 87.50 |
| SNV | 94.65 | 100.00 | 85.56 | 97.98 | 100.00 | 95.00 |
| SG | 93.91 | 96.73 | 89.09 | 94.95 | 98.31 | 90.00 |
| SG + Normalization | 93.94 | 98.98 | 84.61 | 96.97 | 96.61 | 97.50 |
| SG + Normalization + SNV | 93.91 | 99.16 | 81.30 | 94.95 | 100.00 | 84.50 |
| FD | 92.55 | 98.41 | 79.59 | 92.93 | 100.00 | 82.50 |
| FD + Normalization | 87.23 | 94.62 | 73.66 | 89.90 | 96.61 | 80.00 |
| FD + Normalization + SNV | 87.20 | 95.49 | 72.24 | 91.92 | 96.61 | 85.00 |
| SD | 77.68 | 75.17 | 85.20 | 79.80 | 72.80 | 90.00 |
| SD + Normalization | 77.74 | 81.59 | 69.99 | 81.82 | 86.44 | 75.00 |
| SD + Normalization + SNV | 78.39 | 82.44 | 74.94 | 83.84 | 88.14 | 77.50 |
| WTD | | 96.76 | | | | |
| WTD + Normalization | 93.23 | 98.81 | 82.63 | 94.95 | 100.00 | 87.50 |
| WTD + Normalization + SNV | 95.27 | 97.08 | 92.36 | 95.96 | 94.92 | 97.50 |
No: No preprocessing; SNV: standard normal variate; SG: Savitzky–Golay filter; FD: first derivative; SD: second derivative; WTD: wavelet threshold denoising.
Bold values indicate the best processing method for the PLS-DA model.
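Two of the preprocessing steps compared above are simple enough to sketch directly. The snippet below is an illustration under stated assumptions, not the authors' code: SNV is taken in its usual form (subtract the spectrum's mean, divide by its sample standard deviation), and "Normalization" is assumed to mean min-max scaling to [0, 1], which the source does not specify. Function names are hypothetical.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    if std == 0:
        raise ValueError("SNV is undefined for a flat spectrum")
    return [(x - mean) / std for x in spectrum]


def min_max_normalize(spectrum):
    """Assumed form of 'Normalization': rescale intensities to the range [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for a flat spectrum")
    return [(x - lo) / (hi - lo) for x in spectrum]


# Chained as in the "Normalization + SNV" rows: normalize first, then apply SNV.
spectrum = [0.12, 0.45, 0.80, 0.33]
processed = snv(min_max_normalize(spectrum))
```

Both functions act on one spectrum at a time, so they can be applied row by row to a matrix of spectra before model fitting.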
Performance of the PLS-DA model using Raman spectral data with different preprocessing methods. (The best processing method for the PLS-DA model is presented in bold).
| Method | Calibration | | | Prediction | | |
|---|---|---|---|---|---|---|
| | Acccv (%) | Specv (%) | Sencv (%) | Accp (%) | Spep (%) | Senp (%) |
| No | 82.38 | 58.65 | 75.09 | 69.70 | 49.02 | 91.67 |
| Normalization | 85.13 | 91.13 | 74.60 | 72.73 | 76.47 | 68.75 |
| SNV | 83.83 | 85.59 | 80.29 | 69.70 | 64.71 | 75.00 |
| SG | 83.02 | 60.99 | 68.46 | 67.68 | 47.06 | 89.58 |
| SG + Normalization | 86.43 | 89.39 | 79.76 | 76.77 | 76.47 | 77.08 |
| SG + Normalization + SNV | 85.78 | 87.47 | 81.91 | 74.75 | 76.47 | 72.92 |
| FD | 79.72 | 90.28 | 55.03 | 54.55 | 88.24 | 18.75 |
| FD + Normalization | 82.53 | 92.90 | 58.30 | 80.81 | 84.31 | 77.08 |
| FD + Normalization + SNV | 83.15 | 93.28 | 60.80 | 83.84 | 84.31 | 83.33 |
| SD | 74.40 | 68.06 | 88.21 | 65.66 | 58.82 | 72.92 |
| SD + Normalization | 81.08 | 91.23 | 58.24 | 71.72 | 76.47 | 66.67 |
| SD + Normalization + SNV | 80.43 | 90.15 | 58.00 | 70.71 | 70.59 | 70.83 |
| WTD | | 90.89 | | | | |
| WTD + Normalization | 89.18 | 93.65 | 80.44 | 69.70 | 64.71 | 75.00 |
| WTD + Normalization + SNV | 87.14 | 90.51 | 79.57 | 69.70 | 64.71 | 75.00 |
No: No preprocessing; SNV: standard normal variate; SG: Savitzky–Golay filter; FD: first derivative; SD: second derivative; WTD: wavelet threshold denoising.
Bold values indicate the best processing method for the PLS-DA model.
Performances of the PLS-DA model using data fusion combined with different preprocessing methods. (The best results for the PLS-DA model are presented in bold).
| Method | Calibration | | | Prediction | | |
|---|---|---|---|---|---|---|
| | Acccv (%) | Specv (%) | Sencv (%) | Accp (%) | Spep (%) | Senp (%) |
| No | 92.86 | 96.99 | 84.67 | 93.41 | 98.04 | 87.50 |
| Normalization | 93.57 | 98.21 | 83.96 | 91.21 | 94.12 | 87.50 |
| SNV | 94.29 | 95.75 | 86.91 | 91.21 | 94.12 | 87.50 |
| SG | 95.71 | 100.00 | 87.69 | 92.31 | 98.04 | 85.00 |
| SG + Normalization | 92.86 | 97.19 | 84.13 | 91.21 | 90.20 | 92.50 |
| SG + Normalization + SNV | 91.43 | 98.10 | 76.62 | 91.21 | 90.20 | 92.50 |
| WTD | | 97.80 | | 93.41 | 96.08 | 90.00 |
| WTD + Normalization | 92.86 | 94.73 | 87.14 | 81.32 | 80.39 | 82.50 |
| WTD + Normalization + SNV | 92.86 | 95.95 | 85.56 | 81.32 | 80.39 | 82.50 |
No: No preprocessing; SNV: standard normal variate; SG: Savitzky–Golay filter; WTD: wavelet threshold denoising.
Bold values indicate the best processing method for the PLS-DA model.
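The accuracy, specificity, and sensitivity columns follow from a binary confusion matrix. The sketch below shows that arithmetic; the counts in the usage example (36 of 40 patients and 49 of 51 healthy subjects classified correctly) are hypothetical values chosen here because they reproduce the percentages the abstract reports for the fusion data processed by WTD. The source does not state the actual counts, and the function name is an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, specificity, and sensitivity as percentages (2 decimals).

    tp/fn: patients classified correctly/incorrectly;
    tn/fp: healthy subjects classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": round(100.0 * (tp + tn) / total, 2),
        "specificity": round(100.0 * tn / (tn + fp), 2),
        "sensitivity": round(100.0 * tp / (tp + fn), 2),
    }


# Hypothetical counts consistent with the reported prediction-set percentages.
metrics = classification_metrics(tp=36, fn=4, tn=49, fp=2)
print(metrics)  # {'accuracy': 93.41, 'specificity': 96.08, 'sensitivity': 90.0}
```

Because sensitivity and specificity are computed per class, a model can score high accuracy while one of the two remains low, which is why the tables report all three.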
FIGURE 5. Score plot of the PLS-DA model using data fusion combined with WTD.