Amorn Slosse1,2, Filip Van Durme1, Nele Samyn1, Debby Mangelings2, Yvan Vander Heyden2.
Abstract
Cannabis sativa L. is widely used as an illegal recreational drug. Profiling illicit Cannabis by comparing seized samples is challenging because of the plant's natural heterogeneity. The aim of this study was to use GC-FID and GC-MS herbal fingerprints to evaluate intra-location (within) and inter-location (between) variability, with a focus on finding an acceptable threshold for linking seized samples. From Pearson correlation coefficients calculated between intra-location samples, 'linked' thresholds were derived using 95% and 99% confidence limits. False negative (FN) and false positive (FP) error rates were calculated for different data pre-treatments, aiming at the lowest possible FP value. Fingerprint-alignment parameters were optimized using Automated Correlation-Optimized Warping (ACOW) or Design of Experiments (DoE), which gave similar results. Hence, the ACOW-aligned data were taken as the reference and showed FP values of 54% and 65% (95% and 99% confidence, respectively). An additional fourth root normalization pre-treatment provided the best results for both the GC-FID and GC-MS datasets. For GC-FID, which showed the largest improvement in FP error rate, the 54% and 65% FP of the reference data decreased to 24% and 32%, respectively, after fourth root transformation. Cross-validation gave FP values similar to those of the entire calibration set, indicating that the thresholds are representative. A noteworthy improvement in discrimination between seized Cannabis samples could be concluded.
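The comparison step described in the abstract can be illustrated with a small sketch: each fingerprint is fourth-root transformed, Pearson correlation coefficients are computed for every pair of samples, and the coefficients are split into intra-location and inter-location groups. The function names, the toy fingerprints, and the clamping of negative intensities to zero are assumptions made for illustration only; the abstract does not describe the authors' implementation.

```python
import math

def fourth_root(signal):
    """Fourth-root transform of a fingerprint.

    Assumption: intensities are baseline-corrected; any negative
    value is clamped to zero before taking the root.
    """
    return [max(x, 0.0) ** 0.25 for x in signal]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("a constant signal has no defined correlation")
    return sxy / math.sqrt(sxx * syy)

def pairwise_correlations(fingerprints, locations):
    """Split all pairwise r-values into intra- and inter-location lists."""
    intra, inter = [], []
    for i in range(len(fingerprints)):
        for j in range(i + 1, len(fingerprints)):
            r = pearson_r(fingerprints[i], fingerprints[j])
            if locations[i] == locations[j]:
                intra.append(r)
            else:
                inter.append(r)
    return intra, inter

# Hypothetical toy fingerprints (four data points each), not real data.
fingerprints = [
    [1.0, 16.0, 81.0, 0.0],
    [2.0, 15.0, 80.0, 1.0],
    [81.0, 16.0, 1.0, 0.0],
]
locations = ["site_1", "site_1", "site_2"]

transformed = [fourth_root(f) for f in fingerprints]
intra_r, inter_r = pairwise_correlations(transformed, locations)
```

The fourth root compresses large peaks relative to small ones, so minor peaks contribute more to the correlation than they would in the raw signal.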
Keywords: alignment optimization; chromatographic fingerprint; comparison intra- and inter-location samples; data pre-processing; design of experiments
Year: 2021 PMID: 34771050 PMCID: PMC8587667 DOI: 10.3390/molecules26216643
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Cannabis fingerprints derived from (A) GC–FID and (B) GC–MS (without THC peak).
Figure 2. (A) GC–FID intra-correlation coefficients colour map showing three outlying samples (6, 9, and 10) in the fourth cultivation site. (B) GC–FID fingerprints of the Cannabis samples from the fourth plantation. The outlying chromatograms are plotted in red.
An overview of the total FN% and FP% from all studied pre-treatment methods for both GC–FID and GC–MS after aligning once with ACOW. The fourth root normalization, i.e., the best pre-treatment, is marked in bold.
| Pre-Treatment Method | GC–FID, 95% CL: FN (%) | GC–FID, 95% CL: FP (%) | GC–FID, 99% CL: FN (%) | GC–FID, 99% CL: FP (%) | GC–MS, 95% CL: FN (%) | GC–MS, 95% CL: FP (%) | GC–MS, 99% CL: FN (%) | GC–MS, 99% CL: FP (%) |
|---|---|---|---|---|---|---|---|---|
| | 6 | 57 | 4 | 65 | 6 | 54 | 2 | 57 |
| | 9 | 55 | 5 | 65 | 7 | 51 | 2 | 56 |
| | 6 | 57 | 4 | 64 | 5 | 48 | 2 | 53 |
| | 7 | 52 | 4 | 65 | 5 | 48 | 2 | 53 |
| | 6 | 30 | 4 | 39 | 4 | 38 | 2 | 46 |
| | 6 | 19 | 4 | 27 | 6 | 64 | 0 | 87 |
Figure 3. Histograms of the Pearson correlation coefficients and their respective distribution overlap for (A) the reference data (GC–FID-aligned data) and (B) after fourth root normalization. The yellow bar chart corresponds to the coefficients of the inter-location samples, while the blue bar chart refers to the intra-location correlation coefficients. The two confidence limits are represented as the green vertical line for the 95% CL and the red vertical line for the 99% CL.
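The relationship between a confidence limit and the FN and FP error rates can be sketched as follows. The record does not state how the limits were computed, so this example assumes an empirical one-sided lower limit on the intra-location r-values; the function names and the toy r-values are hypothetical. A pair is treated as 'linked' when its correlation is at or above the threshold, so an FN is an intra-location pair below the threshold and an FP is an inter-location pair at or above it.

```python
def lower_limit(intra_r, confidence_percent):
    """Empirical one-sided lower limit of the intra-location r-values.

    Assumption: the threshold is the value below which roughly
    (100 - confidence_percent)% of the intra-location r-values fall.
    The source may use a different (e.g. distribution-based) limit.
    """
    ordered = sorted(intra_r)
    # Integer arithmetic avoids floating-point surprises in the index.
    k = (100 - confidence_percent) * len(ordered) // 100
    return ordered[k]

def error_rates(intra_r, inter_r, threshold):
    """Return (FN %, FP %) for a given 'linked' threshold."""
    fn_count = sum(1 for r in intra_r if r < threshold)
    fp_count = sum(1 for r in inter_r if r >= threshold)
    fn_percent = 100.0 * fn_count / len(intra_r)
    fp_percent = 100.0 * fp_count / len(inter_r)
    return fn_percent, fp_percent

# Hypothetical r-values for illustration only, not data from the study.
toy_intra = [i / 100 for i in range(80, 100)]    # 0.80 ... 0.99
toy_inter = [i / 100 for i in range(60, 90, 3)]  # 0.60 ... 0.87

threshold_95 = lower_limit(toy_intra, 95)
threshold_99 = lower_limit(toy_intra, 99)
rates_95 = error_rates(toy_intra, toy_inter, threshold_95)
rates_99 = error_rates(toy_intra, toy_inter, threshold_99)
```

Moving from the 95% to the 99% limit lowers the threshold, which reduces FN at the cost of a higher FP, the same trade-off visible in the tables of this record.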
Figure 4. ROC curves representing the accuracy of the reference data (blue line; AUC = 0.834) and after fourth root normalization (red line; AUC = 0.947). The green diagonal line corresponds to the reference AUC = 0.5.
The generated AUCs and the 95% confidence interval for both the reference data and after fourth root normalization.
| Data | AUC | 95% Confidence Interval: Lower Limit | 95% Confidence Interval: Upper Limit |
|---|---|---|---|
| Reference data | 0.834 | 0.815 | 0.853 |
| Fourth root normalization | 0.947 | 0.938 | 0.957 |
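As a reminder of what the AUC values measure, the area under the ROC curve equals the probability that a randomly chosen intra-location pair has a higher correlation than a randomly chosen inter-location pair, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the toy values are hypothetical, and the software actually used to generate the reported AUCs and confidence intervals is not stated in this record.

```python
def auc_from_scores(intra_r, inter_r):
    """AUC as the fraction of (intra, inter) pairs ranked correctly.

    Ties count as half a correct ranking (Mann-Whitney formulation).
    """
    if not intra_r or not inter_r:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for a in intra_r:
        for b in inter_r:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(intra_r) * len(inter_r))

# Hypothetical r-values for illustration only.
example_auc = auc_from_scores([0.9, 0.8], [0.7, 0.85])
```

An AUC of 0.5 corresponds to the diagonal reference line (no discrimination), and 1.0 to fully separated intra- and inter-location distributions.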
The total FN and FP error rates of the CV approaches. The entire calibration dataset consisted of 426 intra-plantation r-values with 4230 inter-plantation correlation coefficients.
| Cross-Validation Approach | FN (%), 95% CI Limit | FP (%), 95% CI Limit | FN (%), 99% CI Limit | FP (%), 99% CI Limit |
|---|---|---|---|---|
| | 6 | 24 | 4 | 32 |
| | 7 | 25 | 4 | 33 |
| | 6 | 24 | 4 | 32 |
Figure 5. Three-level full factorial design, with −1 representing the lowest level of both factors (SL and SS), 0 representing the intermediate level, and 1 representing the highest level.
SL and SS level values used in the full factorial designs.
| Design | SL (−1) | SL (0) | SL (1) | SS (−1) | SS (0) | SS (1) |
|---|---|---|---|---|---|---|
| | 15 | 58 | 100 | 1 | 6 | 10 |
| | 25 | 113 | 200 | 1 | 6 | 10 |
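A three-level full factorial design for two factors contains every combination of the coded levels, nine runs in total. The sketch below enumerates those runs using the SL and SS values from the first row of the table above; the dictionary layout and function name are assumptions for illustration, and how each run was scored during alignment optimization is not covered here.

```python
from itertools import product

CODED_LEVELS = (-1, 0, 1)

# Level values taken from the first design row of the table.
SEGMENT_LENGTH = {-1: 15, 0: 58, 1: 100}
SLACK_SIZE = {-1: 1, 0: 6, 1: 10}

def full_factorial(sl_levels, ss_levels):
    """Enumerate all 3 x 3 combinations of segment length and slack size."""
    runs = []
    for sl_code, ss_code in product(CODED_LEVELS, repeat=2):
        runs.append({
            "SL_coded": sl_code,
            "SS_coded": ss_code,
            "SL": sl_levels[sl_code],
            "SS": ss_levels[ss_code],
        })
    return runs

design = full_factorial(SEGMENT_LENGTH, SLACK_SIZE)
for run in design:
    print(run)
```

The second design row can be enumerated the same way by swapping in its segment length values (25, 113, 200).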
Figure 6. Schematic overview of the methodology used to evaluate the discriminating power of the studied pre-treatments.