Purwana Satriyo, Agus Arip Munawar.
Abstract
This manuscript describes data analysis of near infrared spectroscopy (NIRS) used as an adopted, portable technology for cocoa farmers in Aceh Province, Indonesia. NIRS assists farmers in post-harvest handling, especially in cocoa quality evaluation. The technology was used to determine the moisture content (MC) and fat content (FC) of intact cocoa bean samples rapidly and simultaneously. Near infrared spectra were acquired as absorbance spectra in the wavelength range from 1000 to 2500 nm, with 32 co-added scans, for a total of 72 intact bulk cocoa bean samples. The spectra can be used to predict MC and FC of intact cocoa beans by establishing prediction models and validating them against actual MC and FC measured by standard laboratory procedures. Prediction performance was evaluated using several statistical indicators: correlation coefficient (r), coefficient of determination (R2), root mean square error (RMSE) and residual predictive deviation (RPD) index. The spectra can be enhanced with spectra pre-treatment methods to improve prediction performance. Moreover, prediction models can be developed using principal component regression (PCR), partial least squares regression (PLSR) and other regression approaches. Ideal prediction models should have r and R2 above 0.75, an RPD index above 2.0 and an RMSE lower than the standard deviation (SD) of the measured values. The dataset is available in raw MS Excel format and as The Unscrambler files with the *.unsb extension.
Keywords: Cocoa; NIRS; Post-harvest; Spectroscopy; Technology
Year: 2020 PMID: 32083159 PMCID: PMC7021542 DOI: 10.1016/j.dib.2020.105251
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 1. Typical near infrared absorbance spectrum of an intact cocoa bean sample.
Fig. 2. Spectra data projected onto principal component analysis (PCA) and Hotelling T2 ellipse.
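The projection described in Fig. 2 can be sketched as follows. This is a minimal illustration on synthetic spectra, not the published data or the authors' software workflow. The function names, the 95% level, and the particular form of the T2 limit, A(n-1)/(n-A) times an F quantile, are assumptions; other software may use a slightly different limit.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Scores of the mean-centred data on the first principal components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def hotelling_t2(scores):
    """Hotelling T2 of each sample: squared scores scaled by score variance."""
    return np.sum(scores**2 / scores.var(axis=0, ddof=1), axis=1)

def t2_limit_two_components(n_samples, alpha=0.05):
    """Assumed limit A(n-1)/(n-A) * F(1-alpha; A, n-A) for A = 2 components.

    With 2 numerator degrees of freedom the F quantile has a closed form,
    so no statistics library is needed. The shortcut is valid only for A = 2.
    """
    a = 2
    dof = n_samples - a
    f_crit = (dof / 2.0) * (alpha ** (-2.0 / dof) - 1.0)
    return a * (n_samples - 1) / dof * f_crit

# Synthetic stand-in for 72 absorbance spectra (NOT the published data).
rng = np.random.default_rng(0)
spectra = 0.5 + rng.normal(0.0, 0.01, size=(72, 50))
spectra[0] += 1.0  # one deliberately shifted spectrum acting as an outlier

scores = pca_scores(spectra, n_components=2)
t2 = hotelling_t2(scores)
limit = t2_limit_two_components(len(spectra))
outliers = np.flatnonzero(t2 > limit)
print("T2 limit:", round(limit, 2), "flagged samples:", outliers)
```

Samples whose T2 exceeds the limit fall outside the ellipse in the score plot and are candidates for inspection before modelling.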
Prediction performance between principal component regression (PCR) and partial least squares regression (PLSR) for moisture content prediction.
| Regression approach | Factor | R2 | r | RMSE | RPD |
|---|---|---|---|---|---|
| PCR | 7 | 0.82 | 0.90 | 0.54 | 2.37 |
| PLSR | 7 | 0.88 | 0.94 | 0.42 | 3.05 |
R2: coefficient of determination, r: correlation coefficient, RMSE: the root mean square error, RPD: residual predictive deviation.
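The four indicators can be computed from paired reference and predicted values as sketched below. The source does not state which definition of R2 was used, so the form 1 - SS_res/SS_tot is an assumption, as are the function names and the illustrative numbers. RPD is taken as the standard deviation of the reference values divided by RMSE, and the acceptance check mirrors the thresholds given in the abstract.

```python
import math

def prediction_metrics(actual, predicted):
    """Return r, R2, RMSE, RPD and SD for paired actual/predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))

    r = cov / math.sqrt(ss_a * ss_p)      # Pearson correlation coefficient
    r2 = 1.0 - ss_res / ss_a              # assumed definition of R2
    rmse = math.sqrt(ss_res / n)          # root mean square error
    sd = math.sqrt(ss_a / (n - 1))        # sample SD of the reference values
    return {"r": r, "R2": r2, "RMSE": rmse, "RPD": sd / rmse, "SD": sd}

def meets_criteria(m):
    """Thresholds from the abstract: r, R2 > 0.75, RPD > 2.0, RMSE < SD."""
    return (m["r"] > 0.75 and m["R2"] > 0.75
            and m["RPD"] > 2.0 and m["RMSE"] < m["SD"])

# Made-up moisture values (%) for illustration only, not from the dataset.
actual = [6.8, 7.9, 8.5, 9.1, 9.6, 10.4, 11.2, 12.0]
predicted = [7.1, 7.7, 8.8, 9.0, 9.9, 10.1, 11.5, 11.8]

metrics = prediction_metrics(actual, predicted)
print({k: round(v, 3) for k, v in metrics.items()})
print("meets criteria:", meets_criteria(metrics))
```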
Prediction performance between principal component regression (PCR) and partial least squares regression (PLSR) for fat content prediction.
| Regression approach | Factor | R2 | r | RMSE | RPD |
|---|---|---|---|---|---|
| PCR | 7 | 0.73 | 0.85 | 1.11 | 1.94 |
| PLSR | 7 | 0.84 | 0.91 | 0.82 | 2.62 |
R2: coefficient of determination, r: correlation coefficient, RMSE: the root mean square error, RPD: residual predictive deviation.
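The difference between the two regression approaches compared above can be illustrated with a small NumPy sketch. It is a textbook-style implementation on synthetic spectra, not the software or data used by the authors. The band positions, noise level, number of factors and all function names are assumptions, and the RMSE printed is a calibration fit rather than a validated prediction error.

```python
import numpy as np

def pcr_fit(X, y, n_factors):
    """Principal component regression: least squares on leading PCA scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    U, s, Vt = U[:, :n_factors], s[:n_factors], Vt[:n_factors]
    beta = Vt.T @ ((U.T @ (y - y_mean)) / s)
    return x_mean, y_mean, beta

def pls1_fit(X, y, n_factors):
    """PLS regression for a single response (PLS1) with NIPALS deflation."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)        # weights: direction of covariance with y
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, beta

def predict(model, X):
    x_mean, y_mean, beta = model
    return (X - x_mean) @ beta + y_mean

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def band(wavelengths, centre, width):
    """Gaussian absorption band used to build synthetic spectra."""
    return np.exp(-0.5 * ((wavelengths - centre) / width) ** 2)

# Synthetic spectra (NOT the published data): two bands plus random noise.
rng = np.random.default_rng(1)
wavelengths = np.linspace(1000, 2500, 120)
moisture = rng.normal(9.0, 1.3, 72)
fat = rng.normal(40.0, 2.2, 72)
spectra = 0.01 * (np.outer(moisture, band(wavelengths, 1940, 60))
                  + np.outer(fat, band(wavelengths, 1725, 50)))
spectra += rng.normal(0.0, 0.002, spectra.shape)

for name, fit in (("PCR", pcr_fit), ("PLSR", pls1_fit)):
    model = fit(spectra, moisture, n_factors=2)
    print(name, "calibration RMSE:", round(rmse(moisture, predict(model, spectra)), 3))
```

PCR picks its factors from the spectra alone, while PLSR chooses them using the covariance between spectra and the reference values.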
Fig. 3. Prediction performance of moisture content (a) and fat content (b) by means of the principal component regression (PCR) approach.
Fig. 4. Prediction performance of moisture content (c) and fat content (d) by means of the partial least squares regression (PLSR) approach.
Comparison among different spectra correction methods to the prediction performance of inner quality parameters in cocoa bean samples using partial least squares regression approach.
| Quality parameter | Correction method | Factor | R2 | r | RMSE | RPD |
|---|---|---|---|---|---|---|
| Moisture content | MN | 5 | 0.90 | 0.95 | 0.40 | 3.18 |
| Moisture content | Smoothing | 5 | 0.91 | 0.95 | 0.39 | 3.27 |
| Moisture content | SNV | 5 | 0.91 | 0.95 | 0.38 | 3.37 |
| Moisture content | EMSC | 5 | 0.92 | 0.95 | 0.37 | 3.46 |
| Fat content | MN | 5 | 0.95 | 0.98 | 0.45 | 4.78 |
| Fat content | Smoothing | 5 | 0.90 | 0.95 | 0.68 | 3.16 |
| Fat content | SNV | 5 | 0.97 | 0.98 | 0.38 | 5.66 |
| Fat content | EMSC | 5 | 0.98 | 0.99 | 0.27 | 7.96 |
EMSC: extended multiplicative scatter correction, MN: mean normalization, R2: coefficient of determination, r: correlation coefficient, RMSE: the root mean square error, RPD: residual predictive deviation. SNV: standard normal variate.
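The four correction methods named in the table can be sketched in NumPy as below. These are common textbook forms and may differ in detail from the options used by the authors. In particular, the moving-average smoother, the window size, the quadratic baseline in the EMSC variant and the use of the mean spectrum as reference are all assumptions, and the spectra are synthetic.

```python
import numpy as np

def mean_normalize(spectra):
    """MN: divide each spectrum by its own mean absorbance."""
    return spectra / spectra.mean(axis=1, keepdims=True)

def moving_average(spectra, window=5):
    """Smoothing with a centred moving average (edges are zero-padded)."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(np.convolve, 1, spectra, kernel, mode="same")

def snv(spectra):
    """SNV: centre and scale each spectrum by its own mean and SD."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def emsc(spectra, wavelengths, degree=2):
    """Basic EMSC: fit each spectrum as polynomial baseline + b * reference,
    then remove the baseline and divide by b. Reference = mean spectrum."""
    reference = spectra.mean(axis=0)
    span = wavelengths.max() - wavelengths.min()
    w = (wavelengths - wavelengths.mean()) / span   # scaled for conditioning
    basis = np.column_stack([w**k for k in range(degree + 1)] + [reference])
    coef, *_ = np.linalg.lstsq(basis, spectra.T, rcond=None)
    baseline = basis[:, :-1] @ coef[:-1]
    return ((spectra.T - baseline) / coef[-1]).T

# Synthetic spectra (NOT the published data): one band distorted by a
# sample-specific offset, gain and linear slope that mimic scatter effects.
rng = np.random.default_rng(2)
wavelengths = np.linspace(1000, 2500, 150)
pure = np.exp(-0.5 * ((wavelengths - 1940) / 80) ** 2)
offset = rng.uniform(0.0, 0.3, (10, 1))
gain = rng.uniform(0.8, 1.2, (10, 1))
slope = rng.uniform(-0.1, 0.1, (10, 1))
raw = offset + gain * pure + slope * (wavelengths - 1000) / 1500

corrected = emsc(raw, wavelengths)
print("max spread before:", raw.std(axis=0).max())
print("max spread after EMSC:", corrected.std(axis=0).max())
```

In this constructed case the distortions are exactly the kind EMSC models, so the corrected spectra collapse onto the reference. Real spectra will retain chemical variation after correction.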
Fig. 5. Prediction performance for moisture content (MC) determination using EMSC-corrected spectra data and the partial least squares regression approach.
Fig. 6. Prediction performance for fat content (FC) determination using EMSC-corrected spectra data and the partial least squares regression approach.
Descriptive statistics of actual measured quality parameters of cocoa bean samples.
| Statistic | Moisture Content (%) | Fat Content (%) |
|---|---|---|
| Number of samples | 72 | 72 |
| Mean | 9.04 | 40.32 |
| Max | 12.08 | 45.75 |
| Min | 6.74 | 35.26 |
| Range | 5.34 | 10.49 |
| Std. Deviation | 1.28 | 2.15 |
| Variance | 1.63 | 4.64 |
| RMS | 9.12 | 40.38 |
| Skewness | 0.84 | 0.10 |
| Kurtosis | 0.07 | −0.46 |
| Median | 8.79 | 40.13 |
| Q1 | 8.09 | 38.55 |
| Q3 | 9.56 | 41.93 |
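
As a quick consistency check, the reported RPD values can be recomputed from the standard deviations above and the RMSE values in the performance tables, assuming the conventional definition RPD = SD / RMSE. The sketch below only transcribes numbers already given in this article; the variable names and row labels are illustrative.

```python
# Standard deviations of the measured values (descriptive statistics table).
sd = {"MC": 1.28, "FC": 2.15}

# (parameter, model, RMSE, reported RPD) transcribed from the tables above.
reported = [
    ("MC", "PCR", 0.54, 2.37),
    ("MC", "PLSR", 0.42, 3.05),
    ("MC", "PLSR + MN", 0.40, 3.18),
    ("MC", "PLSR + Smoothing", 0.39, 3.27),
    ("MC", "PLSR + SNV", 0.38, 3.37),
    ("MC", "PLSR + EMSC", 0.37, 3.46),
    ("FC", "PCR", 1.11, 1.94),
    ("FC", "PLSR", 0.82, 2.62),
    ("FC", "PLSR + MN", 0.45, 4.78),
    ("FC", "PLSR + Smoothing", 0.68, 3.16),
    ("FC", "PLSR + SNV", 0.38, 5.66),
    ("FC", "PLSR + EMSC", 0.27, 7.96),
]

# Recompute RPD from the rounded SD and RMSE and compare with the listed value.
rows = [
    (param, model, round(sd[param] / rmse, 2), listed)
    for param, model, rmse, listed in reported
]
max_gap = max(abs(calc - listed) for _, _, calc, listed in rows)

for param, model, calc, listed in rows:
    print(f"{param:2s} {model:18s} recomputed={calc:.2f} reported={listed:.2f}")
print("largest difference:", round(max_gap, 2))
```

Small differences are expected because the tables list SD and RMSE rounded to two decimals.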
Specifications Table
| Subject | Agricultural and Biological Sciences |
| Specific subject area | Spectroscopy, technology adoption for farmers, non-destructive technology for cocoa quality evaluation |
| Type of data | Table |
| How data were acquired | Near infrared spectral data of intact cocoa bean samples were acquired using a self-developed portable near infrared spectroscopy instrument (FTIR PSD i15). A total of 72 bulk samples of intact cocoa beans, 50 g per bulk, were obtained from cocoa farmers in |
| Data format | Raw |
| Parameters for data collection | In cocoa trade markets, the two main quality parameters considered are moisture content (MC) and fat content (FC). Both were used as parameters for data collection and were predicted simultaneously using the adopted NIRS technology. |
| Description of data collection | Spectra data were first subjected to principal component analysis (PCA) and |
| Data source location | Spectra data and actual moisture and fat content of intact cocoa beans were collected at the Department of Agricultural Engineering, Faculty of Agriculture, Syiah Kuala University, Banda Aceh, Indonesia. |
| Data accessibility | The combined dataset is presented in MS Excel (.xlsx) and Unscrambler (.unsb) formats and is available with this article. The dataset can also be found in the Mendeley data repository: |
Spectra data of intact cocoa bean samples can be used to predict several quality parameters of cocoa beans simultaneously and rapidly, bypassing standard laboratory procedures. The provided dataset can be reanalysed and remodelled using different regression or spectra correction approaches. The resulting models can benefit cocoa manufacturers and industries in fast quality inspection of their cocoa products. The spectral dataset can be corrected or enhanced using different pre-processing methods and then used in prediction models. Data generated with the adopted NIRS technology have proven useful to cocoa farmers in evaluating quality parameters.