
Direct and indirect alcohol biomarkers data collected in hair samples - multivariate data analysis and likelihood ratio interpretation perspectives.

Eugenio Alladio, Agnieszka Martyna, Alberto Salomone, Valentina Pirro, Marco Vincenti, Grzegorz Zadora.

Abstract

The concentrations of direct and indirect biomarkers of ethanol consumption were measured in blood (indirect) or hair (direct) samples from a pool of 125 individuals classified as either chronic (i.e. positive) or non-chronic (i.e. negative) alcohol drinkers. These experimental values form the dataset under examination (Table 1). The indirect biomarkers were aspartate transferase (AST), alanine transferase (ALT), gamma-glutamyl transferase (GGT), mean corpuscular volume of the erythrocytes (MCV) and carbohydrate-deficient transferrin (CDT). The following direct biomarkers were measured in hair: ethyl myristate (E14:0), ethyl palmitate (E16:0), ethyl stearate (E18:0), ethyl oleate (E18:1), the sum of their four concentrations (FAEEs, i.e. Fatty Acid Ethyl Esters; ng/mg) and ethyl glucuronide (EtG; pg/mg). Body mass index (BMI) was also collected as a potential influencing factor. Likelihood ratio (LR) approaches were used to build predictive models for the diagnosis of alcohol abuse, based on different combinations of direct and indirect alcohol biomarkers, as described in "Evaluation of direct and indirect ethanol biomarkers using a likelihood ratio approach to identify chronic alcohol abusers for forensic purposes" (E. Alladio, A. Martyna, A. Salomone, V. Pirro, M. Vincenti, G. Zadora, 2017) [1].

Keywords:  Alcohol; Empirical cross entropy; Ethyl glucuronide; Fatty Acid Ethyl Esters; Hair analysis; Likelihood ratio; Multivariate data analysis

Year:  2017        PMID: 28607948      PMCID: PMC5457474          DOI: 10.1016/j.dib.2017.03.026

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

The data reported here represent a valuable collection of all the common biomarkers of alcohol abuse used worldwide; the distinct populations of chronic and non-chronic alcohol consumers may be used by other researchers to develop further interpretation models.

The Empirical Cross Entropy plots provide a novel way to look at the effectiveness of alcohol biomarkers, which other researchers may use for comparison with more traditional data representations.

The detailed data report allows a clear comparison between univariate, multivariate and Bayesian approaches, where the latter is suggested as a benchmark for further developments.

The mathematical background reported in the “materials and methods” section allows other researchers to transpose the offered approach to different applications.

Data

Data relative to the monitored population of 125 individuals, previously classified as either chronic (i.e. positive) or non-chronic (i.e. negative) alcohol drinkers, are available in Table 1. Analysis of the likelihood ratio models and their performance metrics, such as Empirical Cross Entropy (ECE) plots, allowed the predictive capabilities of direct and indirect biomarkers of ethanol consumption to be compared, as described in [1].
Table 1

Data matrix (125×12) containing the concentration values of the reference populations (i.e. individuals labeled as negative or positive) for the following target analytes: the sum of ethyl myristate, ethyl palmitate, ethyl stearate and ethyl oleate concentrations (FAEEs; ng/mg), ethyl glucuronide (EtG; pg/mg), aspartate transferase (AST; IU L−1), alanine transferase (ALT; IU L−1), gamma-glutamyl transferase (GGT; IU L−1), mean corpuscular volume of the erythrocytes (MCV; fL), carbohydrate-deficient transferrin (CDT; %) and body mass index (BMI).

Subject  Class  FAEEs  EtG  AST  ALT  GGT  MCV  CDT  BMI
1Negative0.242421373997.51.127
2Negative0.032224274293.61.119
3Negative0.341824201691.42.627
4Negative0.231128183497.31.023
5Negative0.101820221487.40.722
6Negative0.071924312696.41.326
7Negative0.021719191795.21.127
8Negative0.001818131992.81.229
9Negative0.161116221690.60.926
10Negative0.2915302510295.50.722
11Negative0.172319323091.71.920
12Negative0.342018152586.91.117
13Negative0.231425221687.61.325
14Negative0.242429332691.81.421
15Negative0.272025393495.61.128
16Negative0.301524202798.41.220
17Negative0.191120171387.20.927
18Negative0.141218181873.71.023
19Negative0.191523203887.41.124
20Negative0.341220274288.01.023
21Negative0.092126242887.30.925
22Negative0.131425231788.70.924
23Negative0.041425171495.60.725
24Negative0.371329326894.01.323
25Negative0.101228211572.31.422
26Negative0.071818162196.41.227
27Negative0.112123211990.21.224
28Negative0.191133592192.01.122
29Negative0.191226341988.91.023
30Negative0.2016372412993.41.128
31Negative0.361322211690.11.328
32Negative0.361919182185.11.226
33Negative0.361222201375.61.823
34Negative0.402326482594.20.922
35Negative0.441325201794.60.921
36Negative0.052622101194.91.025
37Negative0.02139893585.61.428
38Negative0.384356611990.11.135
39Negative0.00619182684.71.125
40Negative0.38431259187.61.222
41Negative0.328411252088.90.826
42Negative0.09928596683.10.927
43Negative0.00233201783.70.825
44Negative0.26823232558.81.021
45Negative0.00124421192.61.225
46Negative0.21123211980.01.123
47Negative0.10735282993.11.123
48Negative0.11922272089.11.327
49Negative0.02951261989.50.623
50Negative0.31922263288.61.126
51Negative0.12332438789.70.927
52Negative0.03726495792.71.429
53Negative0.07625142895.31.024
54Negative0.07126352588.91.023
55Negative0.21324243088.11.220
56Negative0.04723295189.41.126
57Negative0.02125263789.91.127
58Negative0.03420161487.80.922
59Negative0.21718141695.31.225
60Negative0.31522231592.71.121
61Negative0.42835402192.71.023
62Negative0.00218111692.10.719
63Negative0.22320243387.51.229
64Negative0.06224272887.51.024
65Negative0.09432171597.81.121
66Negative0.26725211488.91.121
67Negative0.05930351799.11.021
68Negative0.09920151888.21.622
69Negative0.01149428291.70.624
70Negative0.02124212092.00.924
71Negative0.10218172388.40.921
72Negative0.16822304390.11.143
73Negative0.37331312191.10.821
74Negative0.04126241389.80.923
75Negative0.13427344584.91.131
76Negative0.25832513388.81.329
77Negative0.10631325588.51.227
78Negative0.15223323383.90.928
79Negative0.01127384393.11.526
80Negative0.25620161190.31.219
81Negative0.14624202493.11.128
82Negative0.35417181689.60.928
83Negative0.16719232192.00.719
84Negative0.161428811692.60.929
85Negative0.15223162998.91.720
86Negative0.02932493097.60.819
87Negative0.25827281898.00.823
88Negative0.00318252590.10.922
89Negative0.12525171585.51.024
90Negative0.01933282395.71.027
91Negative0.13530231591.90.819
92Negative0.20122243491.11.524
93Negative0.19224172192.50.719
94Negative0.02550863290.80.922
95Negative0.09134422293.91.327
96Negative0.02833262084.81.125
97Positive0.524323214563.41.728
98Positive0.923827231692.71.027
99Positive0.573637657898.81.326
100Positive0.935231212099.11.525
101Positive2.0535294013594.02.031
102Positive1.2256293510297.11.030
103Positive3.195240734191.61.425
104Positive1.564339301792.00.919
105Positive1.305217111893.71.518
106Positive1.353841533995.54.826
107Positive0.513619151292.01.221
108Positive0.516028423565.31.828
109Positive4.507927252588.71.426
110Positive1.42382392497.21.025
111Positive1.374122212691.10.922
112Positive2.983721192287.11.623
113Positive6.4410625282396.81.123
114Positive3.173321151498.40.824
115Positive0.985426376789.81.224
116Positive0.579322242793.60.926
117Positive2.253227311487.00.824
118Positive0.696525184293.81.125
119Positive2.456819142095.00.921
120Positive2.1095651149797.71.327
121Positive1.259026112394.54.229
122Positive1.03385239202106.51.027
123Positive5.8435284316096.81.428
124Positive2.0411925205898.52.023
125Positive2.021819132193.01.919

Experimental design, materials and methods

Ethical approval for the study was granted by the Ethical Committee of the Azienda Ospedaliero-Universitaria San Luigi Gonzaga of Orbassano (Protocol Number 0012756).

Serum activities of AST, ALT and GGT were measured by colorimetric assays with a Roche-Cobas Integra 800® auto-analyzer (Roche Diagnostic, Basel, Switzerland). MCV was measured with an ADVIA® 2120 Hematology auto-analyzer (Siemens Healthcare Diagnostic, Milan, Italy). The %CDT was determined with the HPLC reagent kit purchased from BioRad® (Munich, Germany). FAEEs were determined by HS–SPME–GC/MS, using a MultiPurpose Sampler Flex A05-FLX-0001 (Est Analytical, West Chester Township, OH, USA) equipped with a 65 μm Stableflex™ polydimethylsiloxane/divinylbenzene (PDMS/DVB) fiber from Supelco (Sigma-Aldrich, Milan, Italy), in combination with a 6890N GC 5975-inert MSD (Agilent Technologies, Milan, Italy). EtG concentrations were determined by UHPLC–MS/MS, using a Shimadzu Nexera UHPLC system (Shimadzu, Duisburg, Germany) interfaced to an AB Sciex API 5500 triple quadrupole mass spectrometer (AB Sciex, Darmstadt, Germany). The analytical methodologies used to determine both the direct and the indirect biomarkers are described in [1] and [4].

A base-10 logarithm transformation (log10 x) was applied to the data. Before the different LR models were calculated, all the variables were autoscaled, and equal prior probabilities were used. The LR evaluations, summarized by the formula LR = Pr(E|H1)/Pr(E|H2), involved two mutually exclusive hypotheses (H1: the subject is not a chronic alcohol abuser, “negative” class; H2: the subject is a chronic alcohol abuser, “positive” class), and a reference population, representing the experimental evidence (E), was used to build the model. The ECE plots relative to the indirect biomarkers detected in blood samples are reported in Fig. 1.
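The preprocessing and LR steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class-conditional densities are taken to be normal (the text here does not say which density model was used), the concentrations are hypothetical values rather than entries of Table 1, and the function names are invented for this example.

```python
import math
from statistics import mean, stdev


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution (an assumed model, not stated in the source)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def fit_univariate_lr(negatives, positives):
    """Return a function mapping a raw concentration to LR = Pr(E|H1)/Pr(E|H2).

    H1 is the "negative" class and H2 the "positive" class, as in the text.
    Values must be strictly positive because of the log10 transformation.
    """
    # Step 1: log10 transformation of the reference population.
    logs_neg = [math.log10(v) for v in negatives]
    logs_pos = [math.log10(v) for v in positives]

    # Step 2: autoscaling (centre on the overall mean, divide by the overall
    # standard deviation of the log-transformed reference data).
    pooled = logs_neg + logs_pos
    center, scale = mean(pooled), stdev(pooled)
    z_neg = [(x - center) / scale for x in logs_neg]
    z_pos = [(x - center) / scale for x in logs_pos]

    # Step 3: one density per hypothesis, estimated from the reference data.
    mu1, sd1 = mean(z_neg), stdev(z_neg)
    mu2, sd2 = mean(z_pos), stdev(z_pos)

    def lr(value):
        z = (math.log10(value) - center) / scale
        return normal_pdf(z, mu1, sd1) / normal_pdf(z, mu2, sd2)

    return lr


def posterior_h1(lr_value, prior_h1=0.5):
    """Posterior probability of H1 from the LR; equal priors by default."""
    prior_odds = prior_h1 / (1.0 - prior_h1)
    posterior_odds = lr_value * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical concentrations of a single biomarker (illustration only).
negatives = [5.0, 8.0, 12.0, 15.0, 20.0, 9.0]
positives = [35.0, 52.0, 60.0, 90.0, 120.0, 40.0]

lr = fit_univariate_lr(negatives, positives)
print(lr(6.0), posterior_h1(lr(6.0)))      # LR > 1 supports H1 (negative)
print(lr(100.0), posterior_h1(lr(100.0)))  # LR < 1 supports H2 (positive)
```

With equal priors the prior odds are 1, so the posterior odds equal the LR itself; an LR above 1 supports H1 and an LR below 1 supports H2.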
Fig. 1

The ECE plots describing the performance of the univariate LR models for the ALT (a), AST (b), CDT (c), GGT (d), MCV (e) and BMI (f) variables. The indirect biomarkers measured in blood samples prove inadequate to discriminate clearly between chronic and non-chronic alcohol consumers, as measured by both correct classification rates and ECE plots.

ECE plots relative to the sum of the four FAEEs and to EtG are reported in [1]. Further LR models combining several biomarkers were tested and provided better performance. As an example, the ECE plots of the LR models developed taking into account all the variables simultaneously (LR8, i.e. AST, ALT, GGT, CDT, MCV, BMI, FAEEs and EtG) and a shorter list of variables (LR4, i.e. CDT, GGT, FAEEs and EtG) are shown in Fig. 2a–b.
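As a rough guide to what an ECE plot displays, the sketch below computes the empirical cross entropy of a set of LR values over a range of prior odds, using the standard definition of ECE. The formulas actually employed are those reported in the Supplementary material; the LR values and function names here are hypothetical and serve only as an illustration.

```python
import math


def empirical_cross_entropy(lrs_h1_true, lrs_h2_true, prior_h1):
    """ECE (in bits) for LR = Pr(E|H1)/Pr(E|H2) at a given prior Pr(H1).

    lrs_h1_true: LR values computed for samples where H1 is actually true.
    lrs_h2_true: LR values computed for samples where H2 is actually true.
    """
    odds = prior_h1 / (1.0 - prior_h1)
    # Penalty for H1-true samples grows as their LR gets small.
    term_h1 = sum(math.log2(1.0 + 1.0 / (lr * odds)) for lr in lrs_h1_true)
    term_h1 /= len(lrs_h1_true)
    # Penalty for H2-true samples grows as their LR gets large.
    term_h2 = sum(math.log2(1.0 + lr * odds) for lr in lrs_h2_true)
    term_h2 /= len(lrs_h2_true)
    return prior_h1 * term_h1 + (1.0 - prior_h1) * term_h2


def ece_curve(lrs_h1_true, lrs_h2_true, log10_odds_grid):
    """Return (log10 prior odds, ECE) pairs, i.e. the points of an ECE plot."""
    curve = []
    for log_odds in log10_odds_grid:
        odds = 10.0 ** log_odds
        prior_h1 = odds / (1.0 + odds)
        curve.append((log_odds, empirical_cross_entropy(lrs_h1_true, lrs_h2_true, prior_h1)))
    return curve


# Hypothetical LR values (illustration only).
informative_h1 = [100.0, 50.0, 20.0]   # H1 true, LRs well above 1
informative_h2 = [0.01, 0.02, 0.05]    # H2 true, LRs well below 1
neutral = [1.0, 1.0, 1.0]              # LR = 1 carries no information

grid = [x / 2.0 for x in range(-4, 5)]  # log10 prior odds from -2 to 2
for log_odds, ece in ece_curve(informative_h1, informative_h2, grid):
    print(f"{log_odds:+.1f}  {ece:.4f}")
```

A model whose LR always equals 1 reproduces the entropy of the prior, which is the usual reference curve in an ECE plot; a useful model gives a curve below that reference, and a misleading one can rise above it.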
Fig. 2

The ECE plots describing the performance of the LR models relative to all the variables (LR8) (a) and to CDT, GGT, FAEEs and EtG only (LR4) (b).

Multivariate approaches were also applied to the collected data, considering all the variables simultaneously: Principal Components Analysis [5] (PCA, Fig. 3a) and Partial Least Squares – Discriminant Analysis [6] (PLS-DA, Fig. 3b).
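For readers unfamiliar with score plots, the following sketch shows how first-component PCA scores can be obtained from autoscaled data. It is not the authors' implementation: power iteration is used here only to keep the example free of external libraries, the data are hypothetical log-transformed values, and all function names are invented for illustration.

```python
import math
from statistics import mean, stdev


def autoscale(rows):
    """Centre each column on its mean and divide by its standard deviation."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def covariance(rows):
    """Covariance matrix of mean-centred rows (a correlation matrix after autoscaling)."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)] for i in range(p)]


def first_component(cov, iterations=500):
    """Leading eigenvector and eigenvalue by power iteration (a stand-in technique)."""
    p = len(cov)
    # Non-uniform start vector, to reduce the chance of starting orthogonal
    # to the leading eigenvector.
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue


def pc1_scores(rows):
    """Return PC1 scores, loadings and the fraction of variance explained."""
    scaled = autoscale(rows)
    cov = covariance(scaled)
    loadings, eigenvalue = first_component(cov)
    scores = [sum(x * l for x, l in zip(row, loadings)) for row in scaled]
    explained = eigenvalue / len(cov)  # trace of a correlation matrix equals p
    return scores, loadings, explained


# Hypothetical log10-transformed values for three variables (illustration only):
# the first three rows mimic one class, the last three the other.
rows = [
    [0.10, 1.00, 1.20],
    [0.20, 1.10, 1.30],
    [0.15, 0.90, 1.25],
    [1.00, 1.70, 1.60],
    [1.20, 1.90, 1.50],
    [0.90, 1.80, 1.70],
]

scores, loadings, explained = pc1_scores(rows)
print(scores)
print(loadings, explained)
```

Plotting the scores of two components against each other, colored by class, gives a score plot of the kind described for Fig. 3a.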
Fig. 3

The PCA (a) and PLS-DA (b) score plots: chronic alcohol drinkers are represented by red diamonds, while non-chronic alcohol drinkers are indicated by green squares.

The formulas employed, together with the description of the ECE plots, are reported in the Supplementary material.

Specifications Table

Subject area: Chemistry
More specific subject area: Biomarkers of ethanol consumption in biological samples
Type of data: Tables, figures
How data was acquired: Analysis by Likelihood Ratio (LR) approach of the collected concentration values of the direct and indirect biomarkers of alcohol consumption.
Data format: Analyzed
Experimental factors: Correct classification rates and Empirical Cross Entropy (ECE) plots [2], [3] were employed to evaluate LR models
Experimental features: AST, ALT and GGT were measured by means of colorimetric assays, MCV was measured with an on-purpose hematological auto-analyzer, %CDT was determined by an ad hoc High Performance Liquid Chromatography (HPLC) reagent kit, FAEEs were detected by HS-SPME-GC/MS analysis and EtG concentrations were monitored by Ultra High Performance Liquid Chromatography - Tandem Mass Spectrometry (UHPLC–MS/MS).
Data source location: Centro Regionale Antidoping e di Tossicologia “A. Bertinaria”, Regione Gonzole 10/1, 10043 Orbassano, Torino, Italy.
Data accessibility: Data are included in this paper
  3 in total

1.  Information-theoretical feature selection using data obtained by scanning electron microscopy coupled with and energy dispersive X-ray spectrometer for the classification of glass traces.

Authors:  Daniel Ramos; Grzegorz Zadora
Journal:  Anal Chim Acta       Date:  2011-05-24       Impact factor: 6.558

2.  Multivariate strategies for screening evaluation of harmful drinking.

Authors:  Valentina Pirro; Paolo Oliveri; Bruno Sciutteri; Raffaella Salvo; Alberto Salomone; Silvia Lanteri; Marco Vincenti
Journal:  Bioanalysis       Date:  2013-03       Impact factor: 2.681

3.  Evaluation of direct and indirect ethanol biomarkers using a likelihood ratio approach to identify chronic alcohol abusers for forensic purposes.

Authors:  Eugenio Alladio; Agnieszka Martyna; Alberto Salomone; Valentina Pirro; Marco Vincenti; Grzegorz Zadora
Journal:  Forensic Sci Int       Date:  2016-12-21       Impact factor: 2.395

  1 in total

1.  Evaluation of Forensic Data Using Logistic Regression-Based Classification Methods and an R Shiny Implementation.

Authors:  Giulia Biosa; Diana Giurghita; Eugenio Alladio; Marco Vincenti; Tereza Neocleous
Journal:  Front Chem       Date:  2020-10-21       Impact factor: 5.221

