Products with a Protected Denomination of Origin (PDO) are vulnerable to misdescription of their true geographical origin. In this work a method has been developed that allows the authentication of La Vera paprika powder (Pimentón de la Vera), a PDO product from the central-west Spanish region, Extremadura. The mass fractions of Br, Ca, Cr, Cl, Cu, Fe, K, Mn, Ni, P, Rb, S, Sr and Zn determined by energy dispersive X-ray fluorescence (ED-XRF) are used for classification purposes by multivariate analysis using Soft Independent Modelling of Class Analogy (SIMCA) (PCA-Class) and Partial Least Square-Discriminant Analysis (PLS-DA). Sixty-seven paprika samples purchased in supermarkets around Europe and on-line via the official web-site of Pimentón de La Vera, were used to build up the models for prediction purposes. The PCA-class model of La Vera paprika powder had a sensitivity of 82%, a specificity of 100% and an accuracy of 91%, whereas the PLS-DA model had a sensitivity of 100%, a specificity of 91% and an accuracy of 96%.
The consumption of spices in the European Union has increased over the last years (Van Asselt, Banach, & van der Fels-Klerx, 2018). Their presence in processed food and ethnic ready-to-eat products, whose consumption has also increased in parallel, certainly contributes to the upward trend in consumption of spices. Asia and Europe are the main consumers (Galvin-King, Haughey, & Elliot, 2018), with paprika powder and pepper being the spices most widely consumed in Europe (CBI Market Intelligence, 2011).
Spices are one of the food commodities frequently affected by fraudulent practices. In the particular case of paprika powder, special attention has been paid to adulteration by addition of colorants, such as Sudan dyes, to enhance their colour. Many methods have been developed and validated to detect that type of fraud using different approaches such as Nuclear Magnetic Resonance (NMR) (Hu, Wang, & Lu, 2017), Infrared spectroscopy (IR) (Horn, Esslinger, Pfister, Fauhl-Hassek, & Riedl, 2018), Raman spectroscopy (Monago-Maraña et al., 2019) and UV–Vis determination (Vera, Ruisánchez, & Callao, 2018). A review on analytical methods used to determine Sudan dyes in food was published by Rebane et al. (Rebane, Leito, Yurchenko, & Herodes, 2010).
Another type of fraud affecting spices is origin masking, which refers to false declarations of the geographical origin of the product. Frequent victims of this type of fraud are particular foods that are recognised for their quality, are produced in a specific geographical region using well-defined manufacturing processes, and are therefore entitled to a Protected Denomination of Origin (PDO) designation (Council Regulation (EC) No 510/2006, 2006).
In Europe six paprika powders hold a PDO: Pimentón de la Vera (Spain), Pimentón de Murcia (Spain), Szegedi paprika (Hungary), Kalocsai Fűszerpaprika-őrlemény (Hungary), Žitavská paprika (Slovakia) and Piment d'Espelette (France) (Agriculture).
PDO paprika powders are produced using specific botanical varieties of paprika, grown in well-defined geographical regions and always processed in the same way. From that point of view, characterisation of elemental profiles seems to be an excellent approach to differentiate PDO paprika powders from non-PDO products. Several works have been published demonstrating that elemental profiles of food are linked to geographical origin and botanical varieties in different types of food commodities such as honey, tea and potatoes (Kropf et al., 2010; Latorre, Barciela García, García Martin & Peña Crecente, 2013; Rajapaksha et al., 2017).
Strontium isotope ratios (87Sr/86Sr) and multielement patterns (Rb, Sr, Y, Zr, Mo, Cd, Ba, Pb, Th, U, Mg, Ca, Sc, Ti, Cr, Mn, Fe, Co, Ni, Cu, Zn, As and rare elements) determined by Inductively Coupled Plasma-Mass Spectrometry (ICP-MS) have been used to fingerprint Szegedi paprika for authentication purposes (Brunner, Katona, Stefánka, & Prohaska, 2010). More recently a paper has been published about the characterisation of paprika samples according to their geographical origin using different parameters such as moisture, total content of ash, lipids, nitrogen, glucose, fructose, sucrose, ASTA colour value, pH of water extract and the mass fractions of Ca, K, Mg, Na, Cu, Fe, P and Zn. In this case, the elemental composition was determined by ICP coupled to Atomic Emission Spectrometry (ICP-AES) (Štursa, Diviš, & Pořizka, 2018). After multivariate evaluation of the results, the authors succeeded in classifying paprika from Hungary, Romania, and Slovakia in one cluster and Spanish paprika in another.
Both ICP-MS and ICP-AES based methods include a sample pre-treatment step based on microwave digestion with nitric acid. A paper has been published on the geographical characterisation of two Spanish PDO paprika powders, La Vera and Murcia (Palacios-Morillo, Jurado, Alcazar & de Pablo, 2014) by multivariate analysis of multielemental content obtained with ICP-AES, using strong acid digestion of the matrix with three different mixtures: H2SO4 + HNO3, H2O2 and HNO3, and HClO4 and HNO3. Only the last mixture allowed quantitative recovery of the elements analysed.
This work describes the outcome of a feasibility study carried out to demonstrate that Pimentón de la Vera (LV) can be differentiated from other paprika powders using its elemental composition determined by ED-XRF. ED-XRF, which does not require any sample pre-treatment other than preparation of pellets with the paprika powder, was used for the multielemental profiling of 67 paprika powders: 33 Pimentón de La Vera (LV), 7 Spanish not from La Vera (SNLV) and 27 either not Spanish or of unknown origin (Rest). A thorough characterisation of the macro and trace element mass fractions of the analysed samples was not the aim.
The obtained mass fractions were evaluated by multivariate analyses, using Soft Independent Modelling of Class Analogy (SIMCA) (PCA-Class) and Partial Least Square-Discriminant Analysis (PLS-DA), to assess the ability of the approach to verify the claims made on the geographical origin of the products on their respective labels, in particular of the Pimentón de La Vera samples.
Materials and methods
Paprika powder samples
Sixty-seven commercially available paprika powder samples were included in the study. Forty-five were purchased in supermarkets in Austria, Belgium, Bulgaria, France, Greece, Hungary, Italy and Spain, and 22, all of them LV, were purchased on-line via the official web site of the Regulatory Council of Pimentón de La Vera. Paprika powders were bought in the period 2014–2020. The samples were classified in three different groups: 33 LV, 7 SNLV and 27 Rest. Information about the paprika powders analysed is summarised in Table 1. The SNLV and Rest paprika powders were not combined in one single population because the SNLV paprika powders are very likely more similar in elemental composition to LV samples than to paprika powders from other parts of the world (Štursa, Diviš, & Pořizka, 2018) and could be wrongly classified (false positives) as LV by the multivariate tools used.
Table 1
Information on the paprika powders included in the study.
Sample code | Type of paprika | Smoked | Group in the study | Country of origin | Country of purchase | Year of purchase
PAPR0001 | Semisweet | Yes | LV | Spain | Spain | July 2017
PAPR0002 | Hot | Yes | LV | Spain | Spain | July 2017
PAPR0003 | Sweet | Yes | LV | Spain | Spain | July 2017
PAPR0005 | Sweet | Yes | LV | Spain | Spain | March 2018
PAPR0006 | Sweet | Yes | LV | Spain | Spain | March 2018
PAPR0007 | Sweet | Yes | LV | Spain | Spain | March 2018
PAPR0010 | Hot | Yes | LV | Spain | Spain | March 2018
PAPR0045 | Sweet | Yes | LV | Spain | Spain | August 2018
PAPR0055 | Sweet | Yes | LV | Spain | Spain | February 2019
PAPR0050 | Sweet | Yes | LV | Spain | Spain | April 2019
PAPR0058 | Sweet | Yes | LV | Spain | Spain | August 2019
PAPR0059 | Sweet | Yes | LV | Spain | On-line | March 2020
PAPR0060 | Hot | Yes | LV | Spain | On-line | March 2020
PAPR0061 | Hot | Yes | LV | Spain | On-line | March 2020
PAPR0062 | Sweet | Yes | LV | Spain | On-line | March 2020
PAPR0063 | Hot | Yes | LV | Spain | On-line | March 2020
PAPR0064 | Hot | Yes | LV | Spain | On-line | March 2020
PAPR0065 | Sweet | Yes | LV | Spain | On-line | March 2020
PAPR0066 | Hot | Yes | LV | Spain | On-line | March 2020
PAPR0067 | Sweet | Yes | LV | Spain | On-line | March 2020
PAPR0068 | Semisweet | Yes | LV | Spain | On-line | March 2020
PAPR0069 | Hot | Yes | LV | Spain | On-line | March 2020
PAPR0070 | Sweet | Yes | LV | Spain | On-line | March 2020
PAPR0071 | Hot | Yes | LV | Spain | On-line | March 2020
PAPR0072 | Semisweet | Yes | LV | Spain | On-line | March 2020
PAPR0073 | Sweet | Yes | LV | Spain | On-line | March 2020
PAPR0074 | Semisweet | Yes | LV | Spain | On-line | March 2020
PAPR0075 | Hot | Yes | LV | Spain | On-line | March 2020
PAPR0076 | Hot | Yes | LV | Spain | On-line | March 2020
PAPR0077 | Semisweet | Yes | LV | Spain | On-line | March 2020
PAPR0078 | Sweet | Yes | LV | Spain | On-line | March 2020
PAPR0079 | Semisweet | Yes | LV | Spain | On-line | March 2020
PAPR0080 | Sweet | Yes | LV | Spain | On-line | March 2020
PAPR0011 | Sweet | No | SNLV | Spain | Spain | March 2016
PAPR0008 | Sweet | No | SNLV | Spain | Spain | March 2018
PAPR0009 | Sweet | No | SNLV | Spain | Spain | March 2018
PAPR0022 | Sweet | No | SNLV | Spain | Poland | May 2018
PAPR0023 | Sweet | Yes | SNLV | Spain | Belgium | May 2018
PAPR0025 | Sweet | No | SNLV | Spain | Belgium | May 2018
PAPR0053 | Sweet | No | SNLV | Spain | Belgium | May 2018
PAPR0004 | Sweet | No | Rest | NA | Belgium | January 2014
PAPR0017 | Sweet | No | Rest | NA | France | May 2018
PAPR0018 | Sweet | No | Rest | NA | France | May 2018
PAPR0019 | Sweet | No | Rest | NA | France | May 2018
PAPR0020 | NA | NA | Rest | NA | Belgium | May 2018
PAPR0021 | NA | NA | Rest | NA | Belgium | May 2018
PAPR0024 | Sweet | No | Rest | Hungary & Spain | Belgium | May 2018
PAPR0026 | NA | NA | Rest | NA | Belgium | May 2018
PAPR0027 | NA | NA | Rest | NA | Belgium | May 2018
PAPR0028 | Sweet | No | Rest | NA | Belgium | May 2018
PAPR0030 | NA | NA | Rest | Hungary | Hungary | May 2018
PAPR0032 | Sweet | No | Rest | NA | Italy | May 2018
PAPR0033 | Hot | NA | Rest | NA | Italy | May 2018
PAPR0034 | Sweet | No | Rest | NA | Bulgaria | May 2018
PAPR0035 | Hot | No | Rest | NA | Bulgaria | May 2018
PAPR0036 | NA | No | Rest | NA | Belgium | May 2018
PAPR0037 | Sweet | No | Rest | NA | Belgium | May 2018
PAPR0038 | Sweet | No | Rest | NA | Greece | August 2018
PAPR0039 | Sweet | No | Rest | NA | Greece | August 2018
PAPR0040 | Sweet | No | Rest | NA | Greece | August 2018
PAPR0041 | NA | Yes | Rest | NA | Greece | August 2018
PAPR0042 | NA | Yes | Rest | Bulgaria | Greece | August 2018
PAPR0043 | Sweet | No | Rest | Hungary | France | August 2018
PAPR0044 | NA | Yes | Rest | NA | Greece | August 2018
PAPR0047 | Sweet | No | Rest | Israel | Belgium | April 2019
PAPR0048 | NA | Yes | Rest | NA | Belgium | April 2019
PAPR0049 | NA | No | Rest | NA | Belgium | April 2019
The LV population comprised paprika of twelve different brands, all of them listed in the official site of the Pimentón de La Vera PDO (Pimentonvera-origen), and included hot (11), sweet (16), and semi-sweet (6) paprika, covering in this way different botanical varieties. In the SNLV group all paprika powders were sweet, one of them smoked, while the Rest population was formed by 14 sweet, 2 hot and 11 paprika powders with no indication of whether they were sweet or hot; 4 were smoked paprika powders. Some La Vera paprika contained 3% sunflower oil, which is allowed by that PDO. To test the influence of different production years, the paprika powders included in this study were purchased in the period 2014–2020.
No information about geographical origin was provided for 16 paprika powders in the Rest population and so they could be totally or partially of Spanish origin. For that reason no particular emphasis was given to the correct classification of SNLV and Rest paprika powders by their respective models. Only the absence/presence of false positives in those two populations when compared to the LV models was carefully evaluated.
Instrumentation and sample preparation
An Epsilon 5 ED-XRF spectrometer (PANalytical, Almelo, The Netherlands) was used to determine the elemental mass fractions of Mg, P, Cl, S, K, Ca, Cr, Mn, Ni, Cu, Zn, As, Br, Rb, Sr, Ba, Sb, Cs and Mo in the paprika samples, using a method that had been previously validated for all the mentioned elements except Sb and Cs, for which only indicative results were obtained (see Section 3.1). A detailed description of the instrument used, the outcome of the validation study, and the performance characteristics achieved are given elsewhere (Fiamegos & de la Calle, 2018). More than half of the paprika samples had Mg, Sb, Mo and Cs mass fractions below the quantification limit of the method, and since they were randomly distributed, those elements were not used for modelling purposes. Data for Al, Si, Ti, Zr, Nb, Cd, Pb, Hg, V, Co, Se, Sn, La, Sm, Ce and Nd were also recorded, but the mass fractions of those elements were below the limit of quantification (LoQ) of the method used for all samples.
The performance of the ED-XRF was checked once a week by measuring the reference sample FLX-S13 (Fluxana, Bedburg-Hau) provided by the manufacturer. Although no systematic bias was observed for any of the measured elements, the ED-XRF was recalibrated every week with the mentioned reference sample to correct the normal drift of the instrument, as recommended by the manufacturer.
Paprika powders were dried at 70 °C for 24 h, then 4 g were accurately weighed and thoroughly mixed with 1 g of wax, CEREOX® (Fluxana GmbH, Bedburg-Hau), to avoid crumbling of the pellets during measurements, which could result in contamination of the detector. In order to check for possible contamination of the wax with some of the elements used for modelling purposes, a pellet was made exclusively with wax and measured using the same method as the paprika samples. None of the elements included in the study could be detected, thus blank corrections were not needed. Paprika powder and wax were thoroughly mixed manually with a metal-free spatula and the resulting mixture was used to make 40 mm diameter pellets as described elsewhere (Fiamegos & de la Calle, 2018). One pellet per sample was made and each pellet was measured once.
Multivariate analysis of the results
The Student's t-tests to determine which elements have significantly different mass fractions in the three studied populations were run with Statistica (TIBCO, Version 13.5.0.17).
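As an illustration of this univariate comparison, a two-sample Student's t statistic with pooled variance can be computed as sketched below. This is a minimal sketch and not the Statistica procedure used in the study: the function name `student_t`, the equal-variance assumption and the Mn values are illustrative assumptions, and the numbers are not data from this work.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a two-sample Student's t-test
    with pooled variance (equal variances are assumed)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((n_a - 1) * variance(sample_a)
              + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical Mn mass fractions (mg/kg); NOT values measured in the study.
lv_mn = [30.0, 32.0, 34.0]
other_mn = [15.0, 17.0, 19.0]

t_value, dof = student_t(lv_mn, other_mn)
print(f"t = {t_value:.2f}, degrees of freedom = {dof}")
```

The absolute value of the statistic is then compared with the two-sided critical value of the t distribution for the given degrees of freedom (about 2.776 for 4 degrees of freedom at the 95% level); a larger value indicates a significant difference between the two groups.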
The software SIMCA Version 15.0.2, Umetrics (Malmö, Sweden) was used to carry out the multivariate analysis of the data (Eriksson, Byrne, Johansson, Trygg, & Vikström, 2013).
Principal Component Analysis (PCA) was used as a non-supervised multivariate tool to visually evaluate whether LV paprika powders form a cluster separated from those of SNLV and Rest, respectively, on the basis of their elemental composition. Soft Independent Modelling of Class Analogy (SIMCA), hereafter referred to as PCA-Class to avoid confusion with the name of the software used, was applied to construct models for each of the three populations, LV, SNLV and Rest. The number of principal components was set to three for all models to avoid overfitting. Loading plots were used to select the variables (elements) to construct the models. Elements situated at or close to the origin of the loading plots were eliminated and the PCA model was built again. The Mahalanobis distance (DModX PS+), which is the distance between a point and the centroid of the distribution (https://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/matrdist.htm), was used to detect outliers in each of the three populations used for modelling purposes. When the Mahalanobis distance for a certain sample was larger than Dcrit (95% confidence interval) of that model, the sample was considered an outlier.
Resampling was used to validate the PCA-Class models for La Vera, SNLV and Rest paprikas, by leaving out one paprika sample at a time and using the resulting model for classification purposes. The Mahalanobis distance (DModX PS+) was also used to detect false positives and false negatives in the PCA-class models. Paprika powders in the SNLV and Rest populations whose DModX PS+ was smaller than Dcrit (95% confidence interval) in the LV PCA-class model were considered false positives. Paprika powders in the LV population with a DModX PS+ larger than Dcrit in the LV model were considered false negatives.
The supervised algorithm PLS-DA was used in an attempt to increase the number of satisfactory classifications, always after optimisation via the loading plots and using three principal components to avoid overfitting. Again, the PLS-DA LV-SNLV and LV-Rest models were validated by resampling, leaving out one paprika sample at a time and running classification tests for the left-out sample.
The prediction performance of the PCA-class and PLS-DA models was evaluated via sensitivity (true positive rate), specificity (true negative rate) and accuracy, calculated as described by Barbosa et al. (Barbosa et al., 2016):
Sensitivity = TP/(TP + FN)
Specificity = TN/(TN + FP)
Accuracy = (TP + TN)/(TP + TN + FP + FN)
where TP: true positive, TN: true negative, FP: false positive, FN: false negative.
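The leave-one-out validation with a distance criterion described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the computation performed by the SIMCA software: it uses a plain Mahalanobis distance on two variables instead of DModX PS+ on a three-component PCA model, and the function names, the data points and the threshold `d_crit` are all hypothetical.

```python
from math import sqrt
from statistics import mean


def mahalanobis_2d(point, reference):
    """Mahalanobis distance of `point` from the centroid of `reference`
    (a list of (x, y) pairs), using the sample covariance matrix."""
    xs = [p[0] for p in reference]
    ys = [p[1] for p in reference]
    mx, my = mean(xs), mean(ys)
    n = len(reference)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out for 2x2.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(d2)


def leave_one_out_flags(samples, d_crit):
    """For each sample, rebuild the class model without it and flag the
    sample when its distance to that model exceeds d_crit."""
    flags = []
    for i, left_out in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        flags.append(mahalanobis_2d(left_out, rest) > d_crit)
    return flags


# Hypothetical, unitless two-variable data; the last point is far from the rest.
class_samples = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0), (10.0, 10.0)]
flags = leave_one_out_flags(class_samples, d_crit=3.0)  # d_crit is an assumed value
print(flags)
```

A sample of the class that is flagged in this way plays the role of a false negative; a sample of another class that is not flagged when tested against the model plays the role of a false positive.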
Results and discussion
Elemental characterisation of paprika powders by ED-XRF
The mass fractions of the elements used to construct the PCA and PLS-DA models are shown in Fig. 1 a-d and given as supplementary material in Table S1, together with the mean, the median and the standard deviation. The median is given as a robust location estimate not affected by extreme values obtained for some elements in a particular sample. In general, there is good agreement between the mean and the median, suggesting that the mass fraction values are approximately normally distributed.
Fig. 1
Mass fraction of several elements in LV, SNLV and Rest paprika populations: a) K, b) Fe, c) P, Cl, S and Ca and d) Cr, Mn, Bi, Cu, Zn, Br, Rb, Sr and Ca. Error bars correspond to the standard deviation of all the samples in a certain population.
ED-XRF is not a suitable technique for the determination of light elements, Mg being the first element that can be quantified with the Epsilon 5 instrument used in this study, although with a high LoQ (1450 mg kg−1). More than half the paprika samples analysed had a concentration of Mg below the LoQ. Other elements, such as As, Mo, Pb and Sn, were quantifiable in only a small number of samples (fewer than 10). Sb and Cs were found in more than 50% of the samples but always with values slightly above the LoQ. In the validation of the method used in this work (Fiamegos & de la Calle, 2018), Sb and Cs were only analysed in inorganic matrices and the LoQ of the method for those two elements is derived from inorganic samples exclusively. For the reasons previously mentioned, Mg, As, Mo, Pb, Sn, Sb and Cs were not used for modelling purposes and they are included neither in Table S1 nor in Fig. 1.
Student's t-tests (95% CI) were carried out to compare the mass fractions of P, Cl, K, Ca, Cr, S, Mn, Fe, Ni, Cu, Zn, Br, Rb, Sr and Ba in the LV, SNLV and Rest paprikas. The mass fractions of K, Cr, Fe, Mn, Zn, Br, Sr and Ba were significantly different for the LV and SNLV paprika powders, and the mass fractions of P, Cl, S, K, Ca, Cr, Mn, Fe, Ni, Zn, Br, Sr and Ba were significantly different for LV and Rest. No significant difference was observed among the three different groups for the mass fractions of Cu and Rb.
As shown in Fig. 1, the Mn, Br and Zn mass fractions were systematically higher in LV paprika powders than in the other two populations. An opposite trend was observed for Fe and Sr, with lower mass fractions in LV than in SNLV and Rest.
For the Mn, Fe and Sr mass fractions there was no overlap between the LV and any of the other two groups. The median of the Mn mass fractions in LV paprika was around twice the median values for SNLV and Rest, while the median of the Fe and Sr mass fractions in LV paprikas was half those of the other two populations. The Br mass fraction in the LV population is around three times higher than in the other two populations but, because of the large dispersion of the data, the ranges in the LV and Rest populations overlap when their respective standard deviations are taken into consideration.
Strictly speaking, La Vera paprika could be properly classified on the basis of its Mn, Fe and Sr content. However, to make use of all the available information on elemental composition, multivariate analysis was carried out.
The main advantage of ED-XRF in comparison with other multielemental analytical techniques such as ICP-MS and ICP-AES is that no sample digestion with hazardous reagents, acids and/or bases, is required to achieve an accurate determination of various elements. Palacios-Morillo et al. (Palacios-Morillo, Jurado, Alcazar & de Pablo, 2014) compared different digestion methods: H2SO4 + HNO3, H2O2 and HNO3, and HClO4 and HNO3. Only the last one provided a quantitative recovery for all the elements analysed; in particular, three different mass fractions were obtained for Sr depending on the digestion method used. The determination of Cl by ED-XRF is straightforward, while a careful optimisation of the digestion is needed when ICP-based techniques are to be used (Pereira, Enders, Mello & Flores, 2018). On the other hand, as described above, some very light elements such as Na and B cannot be analysed with the instrument used in this work.
Multivariate analysis of data
In a first attempt, PCA-class was used to reduce the number of variables by concentrating the information provided by the elemental mass fractions in a reduced number of components.
The first step was to use the Mahalanobis distance to detect outliers in each of the three groups, Fig. 2. No outlier was detected in any of the three populations, LV, SNLV and Rest.
Fig. 2
Detection of outliers in: a) LV, b) SNLV and c) Rest paprika populations making use of the Mahalanobis distance (DModX).
Fig. 3 a and b show that PCA allowed the grouping of samples in clusters corresponding to the different geographical origin of the samples included in this study. The bi-plots in Fig. 3 c and d provide information about the elements that contribute most to the clustering of the samples in the PCA models, confirming the information inferred from the t-tests. Mn, Br and Zn play an important role in the clustering of samples because they are higher in LV than in SNLV and Rest. SNLV and Rest samples are particularly richer in Fe, Ba, Sr and Cr than LV paprikas. The Ni mass fraction is clearly higher in the Rest population than in LV, while, together with Rb and Cu, it is not relevant in the differentiation between LV and SNLV. A discrepancy with the results of the t-tests is that Cl and S, and to a lesser extent P, are important elements for the discrimination between the LV and SNLV clusters in PCA, while the mass fractions of those elements were not found to be significantly different between the LV and SNLV populations by the t-test.
Fig. 3
PCA score plots for a) LV-SNLV, b) LV-Rest, and bi-plots (score and loading) for c) LV-SNLV and d) LV-Rest.
The Mahalanobis distance was also used to evaluate the sensitivity (true positive rate) and the specificity (true negative rate) of the LV PCA-class model. As shown in Fig. 4 a, all SNLV and all Rest paprikas (34 in total) have a DModX PS+ > Dcrit in the LV PCA-class model and are thus outliers. This indicates that the specificity of the model used is 100%.
Fig. 4
Detection of: a) False positives and b) false negatives in LV model, making use of the Mahalanobis distance (DModX).
Six of the 33 LV paprika powders could not be classified as belonging to any of the three models and were considered false negatives, corresponding to a false negative rate of 18%. Fig. 4 b shows the result of the test for one of the LV samples flagged as a false negative. The sensitivity of the PCA-Class model for LV samples was 82%. The accuracy of the model was 91%.
Three of the six false negatives were purchased in the period August 2018–August 2019. Four samples were purchased in that period; the remaining sample was not flagged as a false negative but also had a DModX PS+ close to Dcrit. Among all La Vera paprika powders analysed, sample 8 in Fig. 4 b has the highest Mn, Sr, Rb, Zn, Ba, Ca, Cl and S contents, with a Mn content almost twice as high as the median of the LV population. The other samples purchased in that period are also characterised by a general tendency to high elemental content. Different climatological conditions during the cultivation of the paprika plants could have resulted in different element contents in the final paprika powders. The remaining three false negatives were purchased on-line in 2020 and are characterised by a Ni mass fraction around three times higher than the median of the LV population. Ni is a known contaminant and the increased levels could be due to contamination during processing, storage and/or transportation. No correlation was observed between false negatives and either brand or type of paprika (hot, sweet and semisweet). The LV PCA model, Fig. 5, shows a) no clustering as a function of the type of paprika, b) a random distribution of the false negatives among all LV samples and c) confirmation by bi-plot of the trend in mass fractions of the false negatives, for instance the relatively high content in most elements of two of them, and the link with Ni content of some others.
Fig. 5
a) PCA score plot for the 33 LV paprikas coloured on the basis of their type, b) Distribution of false negatives among the LV population, c) Bi-plot (score and loading) for the LV PCA model.
PLS-DA, a supervised classification algorithm, was used to create models with improved classification power. Fig. 6 shows the PLS-DA models constructed for a) LV versus SNLV and b) LV versus Rest paprika powders. The two models were validated as described above for the PCA-class models, leaving one paprika out each time and constructing the PLS-DA models with the remaining samples. All La Vera paprika powders were correctly classified as belonging to the LV group, meaning that the rate of false negatives was zero, corresponding to a sensitivity of 100%. One SNLV and two paprikas from the Rest population were wrongly classified as LV; thus, out of the 34 paprikas that did not belong to the LV population, three were false positives, which corresponds to a specificity of 91%. The accuracy of the PLS-DA model is 96%. Table 2 summarises the information about the validation of the PCA and PLS-DA models.
Fig. 6
PLS-DA score plots for: a) LV-SNLV and b) LV-Rest paprika populations.
Table 2
Performance characteristics of the PCA-class and PLS-DA models for classification of LV paprika powders.
Model | TP | TN | FP | FN | Sensitivity | Specificity | Accuracy
PCA | 27 | 34 | 0 | 6 | 82% | 100% | 91%
PLS-DA | 33 | 31 | 3 | 0 | 100% | 91% | 96%
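The performance figures in Table 2 follow directly from the counts of true and false classifications. The short Python check below applies the sensitivity, specificity and accuracy formulas given in the methods to those counts; the function name `classification_metrics` is an illustrative choice.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) as fractions."""
    sensitivity = tp / (tp + fn)          # share of LV samples recognised as LV
    specificity = tn / (tn + fp)          # share of non-LV samples rejected
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Counts taken from Table 2.
pca_class = classification_metrics(tp=27, tn=34, fp=0, fn=6)
pls_da = classification_metrics(tp=33, tn=31, fp=3, fn=0)

for name, values in (("PCA-class", pca_class), ("PLS-DA", pls_da)):
    print(name, [f"{v:.0%}" for v in values])
# PCA-class ['82%', '100%', '91%']
# PLS-DA ['100%', '91%', '96%']
```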
The performance of the PCA-Class and PLS-DA models is similar, the former being less sensitive and more specific than the latter. The results previously discussed indicate that PCA-class models are more sensitive to small differences among the samples than those based on PLS-DA and could be more appropriate to carry out quality control analysis in the production sites. According to Dias de Lima and Barbosa (Dias de Lima & Barbosa, 2019), when the within-class standard deviation is low PLS-DA could have better classification rates than PCA-Class. The La Vera paprika powders included in the study cover different commercial brands, several cultivation seasons, various botanical varieties and processing practices such as the addition of sunflower oil. Likely changes in the composition of the final product due to transport and storage conditions are also taken into account because samples were purchased at different retailers in different countries or on-line separately for the different brands. The main factors that could contribute to the dispersion of results are thus covered by the selection of samples. The ratio of observations (33) to variables (15) is approximately 2. An increase in the number of observations to achieve a ratio of around 10 would very likely result in an increase in the accuracy of both modelling approaches. The results obtained in this study are conclusive enough for the purpose of a feasibility study.
Conclusions
ED-XRF can be used to classify paprika powder holding the La Vera PDO. In this work, models were constructed using exclusively the information given on the labels of commercially available paprika powders, which can be the only information available to control authorities in their routine activities. Most control laboratories in the area of food fraud do not have access to a large number of genuine samples to use as reference in either multivariate or univariate analysis, and need to rely on commercially available samples. A way to have access to a large number of Pimentón de La Vera samples would be to collaborate with one or several producers and processing plants to obtain samples from different paprika types, growing places, cultivars and processing lots. However, in that way the effect of transportation and storage times and temperatures would not be covered, nor would the variations in the mass fractions of some elements due to long-term contact with the packaging material, frequently metal cans in the case of La Vera paprika powders. The use of a large number of perfectly controlled samples to construct models for classification purposes might not be the right approach; the samples might not be representative of the maximum variability that can be expected, and the model would be characterised by a high rate of false negatives when applied to products at the end of the distribution chain. The approach followed in this study would overcome that problem but introduces the risk of including adulterated samples in the models, which would bias the classification of authentic samples.
Despite the reduced number of samples and the wide variability associated with them, some clear tendencies allowing the correct classification of La Vera paprika powders were observed; the significantly different levels of Sr, Mn and Fe in LV samples are the clearest ones. Those three elements could be used as markers for classification purposes by laboratories that do not have the competence in place to carry out chemometric analysis.
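A univariate screening based on the three marker elements could look like the sketch below. It is only an illustration under stated assumptions: the acceptance rule (a sample passes when each marker falls within the range spanned by the reference LV samples), the function names and all mass fractions are hypothetical and are not taken from this study.

```python
MARKERS = ("Mn", "Fe", "Sr")


def reference_ranges(reference_samples):
    """Min-max range of each marker element over the reference samples."""
    return {
        element: (min(s[element] for s in reference_samples),
                  max(s[element] for s in reference_samples))
        for element in MARKERS
    }


def consistent_with_reference(sample, ranges):
    """True when every marker of `sample` lies inside the reference range."""
    return all(low <= sample[element] <= high
               for element, (low, high) in ranges.items())


# Hypothetical mass fractions in mg/kg; NOT measured values.
lv_reference = [
    {"Mn": 30.0, "Fe": 200.0, "Sr": 10.0},
    {"Mn": 34.0, "Fe": 230.0, "Sr": 12.0},
    {"Mn": 38.0, "Fe": 260.0, "Sr": 14.0},
]
ranges = reference_ranges(lv_reference)

candidate_a = {"Mn": 33.0, "Fe": 240.0, "Sr": 11.0}  # inside all three ranges
candidate_b = {"Mn": 16.0, "Fe": 480.0, "Sr": 25.0}  # outside all three ranges
print(consistent_with_reference(candidate_a, ranges))
print(consistent_with_reference(candidate_b, ranges))
```

In practice the acceptance limits would have to be derived from a sufficiently large set of authentic samples, and a tolerance around the observed range might be preferable to the strict min-max rule assumed here.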
The construction of a database of elemental mass fractions in samples covering all possible variations to be expected in Pimentón de La Vera, which once constructed could be used by any laboratory carrying out control analysis, was far beyond the goal of our feasibility study. Nevertheless, it could be the best way to prevent fraud on food commodities holding a Protected Denomination of Origin; it would require collaboration among PDO consortia, control bodies and EU and Member State authorities.
Perhaps the main advantage of the use of ED-XRF for the elemental characterisation of paprika powders is that digestion of the samples is not required, so that time and money are saved and the production of hazardous waste is avoided. Another advantage is that XRF measurements can be carried out with hand-held devices, which would be of great use for control authorities because analysis could be carried out on-site. Nevertheless, this possibility would have to be further evaluated in detail, for instance to make sure that the precision and the sensitivity achieved for some elements allow accurate classifications.
CRediT authorship contribution statement
Yiannis Fiamegos: Formal analysis, Investigation, Writing - review & editing. Catalina Dumitrascu: Formal analysis, Investigation, Writing - review & editing. Sergej Papoci: Formal analysis. Maria Beatriz de la Calle: Conceptualization, Data curation, Investigation, Methodology, Project administration, Supervision, Validation, Writing - original draft.
Declaration of competing interest
Please check the following as appropriate:
All authors have participated in (a) conception and design, or analysis and interpretation of the data; (b) drafting the article or revising it critically for important intellectual content; and (c) approval of the final version.
This manuscript has not been submitted to, nor is under review at, another journal or other publishing venue.
The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.
The following authors have affiliations with organizations with direct or indirect financial interest in the subject matter discussed in the manuscript: