
Performance of liver biomarkers, in patients at risk of nonalcoholic steato-hepatitis, according to presence of type-2 diabetes.

Thierry Poynard^1,2, Valentina Peta^2,3, Olivier Deckmyn^3, Raluca Pais^1,2, Yen Ngo^2,3, Frederic Charlotte^4, An Ngo^3, Mona Munteanu^2,3, Françoise Imbert-Bismut^5, Denis Monneret^5, Chantal Housset^2, Dominique Thabut^1,2, Dominique Valla^6, Christian Boitard^7, Laurent Castera^6, Vlad Ratziu^1.

Abstract

OBJECTIVE: There is controversy about the performance of blood tests for the diagnosis of metabolic liver disease in patients with type-2-diabetes in comparison with patients without type-2-diabetes. These indirect comparisons assumed that the gold standard is binary, whereas fibrosis stages, steatosis and nonalcoholic-steato-hepatitis (NASH) grades use an ordinal scale. The primary aim was to compare the diagnostic performances of FibroTest in type-2-diabetes vs. controls matched on gender, age, fibrosis stages and obesity, taking into account the spectrum effect by using the Obuchowski measure.
METHODS: Data were retrospectively compared among patients prospectively included, with simultaneous biopsy and blindly assessed FibroTest, SteatoTest-2 and NashTest-2. The secondary aim was to construct an index (SpectrumF3F4-Index) to predict an adjusted-area under the receiver operating curve (AUROC) for F3F4 diagnosis from the prevalences of fibrosis stages, in order to reduce the spectrum effect when the performances of FibroTest, transient elastography and magnetic resonance elastography are indirectly compared.
RESULTS: In 505 patients at risk of NASH, the Obuchowski measures [95% confidence interval (CI)] of FibroTest, SteatoTest-2 and NashTest-2 were all equivalent in 136 type-2-diabetes cases vs. 369 matched controls: 0.871 (0.837-0.905), vs. 0.880 (0.879-0.881), 0.835 (0.797-0.873) vs. 0.806 (0.780-0.832) and 0.829 (0.793-0.865) vs. 0.855 (0.829-0.869), respectively. Standard-AUROCs (95% CI) were 0.932 (0.898-0.965), 0.872 (0.837-0.907) and 0.834 (0.699-0.969) and reduced after adjustment by SpectrumF3F4-Index to 0.794 (0.749-0.838), 0.767 (0.750-0.783) and 0.773 (0.725-0.822) for transient, magnetic resonance elastography and FibroTest, respectively.
CONCLUSIONS: When compared by Obuchowski measures, the performances of tests were not different in patients with T2-diabetes vs. patients without T2-diabetes. When individual data are not available, adjusted-AUROCs reduced the spectrum effect.

Year: 2020 PMID: 31789950 PMCID: PMC7337110 DOI: 10.1097/MEG.0000000000001606

Source DB: PubMed Journal: Eur J Gastroenterol Hepatol ISSN: 0954-691X Impact factor: 2.586

Introduction

Due to the increase in liver disease-related mortality and the forthcoming treatment of nonalcoholic-steato-hepatitis (NASH), there is a need to identify well-performing noninvasive biomarkers for the diagnosis of significant metabolic liver disease [nonalcoholic fatty liver disease (NAFLD)] [1]. In patients with type-2-diabetes mellitus (T2-diabetes) [2], the performances of patented blood tests such as FibroTest, SteatoTest-2 and NashTest-2 for the diagnosis of liver lesions, despite being in the range of performances observed in patients at risk of NASH with and without diabetes [3-5], have been interpreted as lower than those usually observed in patients without T2-diabetes. These indirect comparisons, without controls and not taking into account the spectrum effect, assumed that the gold standard is binary, whereas fibrosis stages, steatosis and NASH grades use an ordinal scale. Indirect comparisons should be adjusted for the prevalence of liver lesions in the context of use and for the spectrum of stages or grades (spectrum effect). These issues have already been studied in patients with chronic hepatitis C for the performance of FibroTest for the diagnosis of fibrosis stages, and the use of the Obuchowski measure was recommended, as well as an adjusted-AUROC according to the prevalences of stages [6,7]. These recommendations have rarely been followed in patients at risk of NASH. In a recent meta-analysis of 64 studies of fibrosis biomarkers [8], only four (6.4%) used the Obuchowski measure [3,9-11]. We identified only two other studies published in the same period and not included in the meta-analysis [12,13], and two studies published later [14,15], which used the Obuchowski measure.
Therefore, the first aim was to analyze the performances of FibroTest (primary endpoint) and of the new SteatoTest-2 [4] and NashTest-2 [5] in a large group of patients with T2-diabetes, compared to patients at risk of NASH without T2-diabetes and to patients with chronic hepatitis C, with and without T2-diabetes, using Obuchowski measures and controls matched for gender, age, obesity and fibrosis stages. The second aim, for settings where the Obuchowski measure is not available, was to construct an index [index of fibrosis spectrum variability (Spectrum-Index)] in patients at risk of NASH, to predict an adjusted-AUROC for F3F4 diagnosis from the prevalences of biopsy-proven fibrosis stages, as previously published for chronic hepatitis C [7]. This index was assessed in all the possible combinations of fibrosis stages, using a database with and without patients with T2-diabetes. This index made it possible to assess the variability, due to the presence of T2-diabetes, of the standard (binary) AUROCs for the diagnosis of fibrosis stages F3F4 of three biomarkers (FibroTest, transient elastography and magnetic resonance elastography), after adjustment according to the prevalences of fibrosis stages, from an overview of 25 published studies [8].

Patients and methods

Common criteria of inclusion

The design was a noninterventional analysis of fresh serum specimens recorded in prospective subsets, two with patients at risk of NASH and one with hepatitis-C, to increase the sample size of cases and controls with biopsy, with and without T2-diabetes (Fig. 1).

Fig. 1.

Populations included.

The inclusion criteria were the same. To standardize the performance comparisons as much as possible, cases were defined by a history of T2-diabetes or treatment of T2-diabetes, together with a fasting glucose ≥ 7.0 mmol/l, with a contemporaneous biopsy and blood sample used for liver tests [16]. Patients receiving specific treatment for NAFLD before biopsy, or with another cause of diabetes, were not included. All patients had contemporaneous biopsies with centralized scoring of features, independently of any result of blood tests. Patients were excluded if the blood tests were disqualified according to the company recommendations for reliable tests [17], or if the interval between the biopsy and blood tests was greater than 4 weeks. We analyzed post hoc patients with T2-diabetes and at risk of NASH, and patients with chronic hepatitis C, prospectively included in two cohorts (FibroFrance NCT01927133, FLIP) [18-20] and a multicenter trial (EPIC3) [12,13,19,20] described in Supplementary File S2, Supplemental digital content 2, http://links.lww.com/EJGH/A490. The inclusion criteria were the presence of reliable FibroTest, ActiTest and SteatoTest, as well as the reading of the biopsy using the SAF scoring system. Ethical approvals were obtained for the interventional studies at each participating institution and all patients provided written informed consent [12,13,18-20].

Blood tests

The FibroTest, the updated NashTest-2 and the updated SteatoTest-2 are patented tests (BioPredictive, Paris, France) that have been validated extensively to assess the stages of fibrosis, the necro-inflammatory activity and the grades of steatosis, using the SAF and NASH Clinical Research Network (CRN) scoring systems for NAFLD [2-5,12,13,15,18,21], and the meta-analysis of histological data in viral hepatitis (METAVIR) scoring system for hepatitis-C [22]. Details are given in Supplementary file S2, Supplemental digital content 2, http://links.lww.com/EJGH/A490. The FibroTest includes serum α2-macroglobulin, apolipoprotein-A1, haptoglobin, total bilirubin and γ-glutamyl transpeptidase. NashTest-2 combines the seven components of FibroTest, plus cholesterol and triglycerides [5]. SteatoTest-2 combines, with different weights, 10 components of NashTest-2, without bilirubin but with fasting glucose [4]. As the presence of at least 5% of steatosis is mandatory for the definition of NASH, but with a risk of false negatives due to ‘burning steatosis’ (NASH without steatosis at biopsy, in cases without any cause of ballooning or inflammation), two NashTest-2 values were assessed by the algorithm. The standard NashTest-2 followed the CRN definition, and cases with SteatoTest-2 grade S0 (<0.40, the grade S1 cutoff) were graded as non-NASH. The second value was the NashTest-2-‘raw’ value, retained even if the SteatoTest-2 concluded as grade S0 (<0.40, the grade S1 cutoff) [4,5,21]. The scores of these biomarkers range from 0 to 1.00, the highest scores being attributed to the most severe lesions. The preanalytical and analytical procedures were those recommended by BioPredictive. Exclusion criteria were nonreliable results identified using security control algorithms [17].

Histological references

The SAF scoring system, specific for NAFLD features and permitting simplified construction of blood tests, has been described elsewhere, and detailed in Supplementary file S2, Supplemental digital content 2, http://links.lww.com/EJGH/A490 [21]. According to the combination of each semi-quantification of the three elementary features of NAFLD using the SAF score for steatosis, inflammatory activity and fibrosis, respectively, the steatosis score (S) varies from 0 to 3. Activity grade (A, from 0 to 4) is the addition of hepatocyte ballooning (0–2) and lobular inflammation (0–2). Fibrosis stage (F) varies from 0 to 4 [21,22].

Statistical methods

Comparison of Obuchowski measures

Our main hypothesis was that there was no significant difference between the Obuchowski measures of the three blood tests, for the prediction of fibrosis stages, NASH grades and steatosis grades, respectively, in T2-diabetes vs. matched non-T2-diabetes controls. The primary endpoint was the Obuchowski measure, with five, five and four SAF ordinal classes, for FibroTest, NashTest-2 and SteatoTest-2, respectively [6].
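To make the ordinal endpoint concrete, the following is a minimal sketch of an Obuchowski-type measure, written as a prevalence-weighted average of the pairwise AUROCs between every pair of histological classes. It is an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the default weights assume a uniform prevalence of classes, and the penalty for misclassifying distant classes that published versions can also apply is omitted.

```python
from itertools import combinations


def pairwise_auc(scores_low, scores_high):
    """Probability that a patient of the higher class has the higher score.

    Ties count for one half (Mann-Whitney estimate of the AUROC).
    """
    wins = 0.0
    for low in scores_low:
        for high in scores_high:
            if high > low:
                wins += 1.0
            elif high == low:
                wins += 0.5
    return wins / (len(scores_low) * len(scores_high))


def obuchowski_measure(scores, stages, prevalence=None):
    """Weighted average of pairwise AUROCs over all pairs of ordinal classes.

    `prevalence` maps each class to a reference prevalence; when omitted,
    a uniform prevalence is assumed (an assumption of this sketch).
    """
    classes = sorted(set(stages))
    by_class = {c: [s for s, g in zip(scores, stages) if g == c] for c in classes}
    if prevalence is None:
        prevalence = {c: 1.0 / len(classes) for c in classes}
    weighted_sum = 0.0
    weight_total = 0.0
    for i, j in combinations(classes, 2):  # i < j because classes are sorted
        weight = prevalence[i] * prevalence[j]
        weighted_sum += weight * pairwise_auc(by_class[i], by_class[j])
        weight_total += weight
    return weighted_sum / weight_total


# Toy data: test scores that increase strictly with the stage.
toy_scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
toy_stages = [0, 0, 1, 1, 2, 2]
perfect = obuchowski_measure(toy_scores, toy_stages)
```

Because every pair of classes contributes, the measure does not depend on the choice of a single binary cutoff such as F3F4, which is the reason it is less sensitive to the spectrum of stages than a standard AUROC.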

Case-control studies

We matched the controls according to the four main factors associated, besides the presence of T2-diabetes, with the severity of NAFLD: male gender, age ≥ 50 years, liver fibrosis stage ≥ 2 and obesity (BMI ≥ 30 kg/m²) [23-26]. To construct a case-control population with maximum power and minimum differences between the confounding factors, we classified T2-diabetes cases and controls into 16 groups according to the 16 possible combinations of the four confounding factors. A nonsignificant difference was defined as a Fisher exact test P value ≥ 0.05 (Supplementary File S3, Table S1, Supplemental digital content 3, http://links.lww.com/EJGH/A491).
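The classification into 16 groups can be sketched as follows. This is a hedged illustration only: the function name and the numbering of the groups (a 4-bit code from 0 to 15) are assumptions of the sketch and do not necessarily correspond to the group numbers used in Supplementary Table S1.

```python
from collections import Counter


def matching_stratum(male, age_years, bmi, fibrosis_stage):
    """Return a code from 0 to 15 for the four binary matching factors.

    Bit order (an arbitrary choice of this sketch): male gender,
    age >= 50 years, fibrosis stage >= 2, BMI >= 30 kg/m2.
    """
    flags = (bool(male), age_years >= 50, fibrosis_stage >= 2, bmi >= 30)
    code = 0
    for flag in flags:
        code = code * 2 + int(flag)
    return code


# Hypothetical patients: (male, age in years, BMI in kg/m2, fibrosis stage).
cases = [(True, 62, 33.0, 3), (False, 55, 31.5, 2)]
controls = [(True, 45, 25.0, 0), (True, 63, 34.0, 4), (False, 58, 30.2, 2)]

case_counts = Counter(matching_stratum(*p) for p in cases)
control_counts = Counter(matching_stratum(*p) for p in controls)

# Strata in which both a case and a control are available for matching.
shared_strata = sorted(set(case_counts) & set(control_counts))
```

Comparing the counts per stratum between cases and controls shows which combinations of confounders are over-represented among controls and therefore need to be trimmed before the performances are compared.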

Construction and comparisons of adjusted-area under the receiver operating curves in patients with and without diabetes

The second aim, for settings where the Obuchowski measure is not available, was to construct a ‘SpectrumF3F4-Index’ in patients at risk of NASH, with and without T2-diabetes, to predict an adjusted-AUROC of FibroTest for F3F4 diagnosis from the prevalences of biopsy-proven fibrosis stages, as previously published for chronic hepatitis C [7]. Two studies with biopsy and FibroTest were used: one in 600 subjects at risk of NASH, with and without diabetes [3], and the only study restricted to cases with T2-diabetes [2]. Details are described in Table 4.

Table 4.

Comparison between standard-F3F4-area under the receiver operating curves of FibroTest vs. adjusted-area under the receiver operating curves and Spectrum-index, according to presence of T2-diabetes

The adjusted-AUROC was predicted by the linear regression linking the observed standard-AUROC to the SpectrumF3F4-Index. In a prospective study including 501 patients at risk of T2-diabetes from a tertiary center [23], the prevalence of the stages had an almost uniform distribution (each stage prevalence = 20%), and the SpectrumF3F4-Index was 2.59 (Supplementary File S3, Table S2, Supplemental digital content 3, http://links.lww.com/EJGH/A491). The regression curves linking the observed standard-AUROCs to the SpectrumF3F4-Index were compared for the 43 different combinations of prevalences in subsets of patients at risk of NASH with biopsy, in patients with and without T2-diabetes [2,3]. Furthermore, it was possible to assess a possible impact of diabetes from published studies. For this purpose, we compared the regression lines of the subset of studies with a prevalence of T2-diabetes equal to or above the median vs. the subset with a prevalence lower than the median. These data were those of an overview of the performances of transient and magnetic resonance elastography [8], together with the studies of FibroTest [2,3,12,27]. We sought a minimum of 100 cases of T2-diabetes to reach the median number of subjects included in diagnostic studies [28,29]. Comparisons between Obuchowski measures used the Z-test and equivalence tests of means [21]. Comparisons of means between several groups used the Tukey-Kramer multiple comparison test. Number Crunching Statistical System (NCSS12) and numROC software were used for statistical analyses [30,31].

Results

Patients’ characteristics

A total of 1070 subjects were preincluded, 600 at risk of metabolic liver disease and 470 with chronic hepatitis-C, including 136 and 35 with T2-diabetes, respectively (Fig. 1). All 16 combinations of matching factors were represented (Supplementary File S3, Table S1, Panel A, Supplemental digital content 3, http://links.lww.com/EJGH/A491). In the control groups, in comparison with T2-diabetes, there was a predominance of men under 50 years of age, with BMI < 30 kg/m², and without clinically significant fibrosis (group 8). The exclusion of 201 controls permitted matching of the remaining controls with all T2-diabetes cases (Supplementary File S3, Table S1, Panel B, Supplemental digital content 3, http://links.lww.com/EJGH/A491), without significant differences in the prevalences of gender, age, severe obesity and advanced fibrosis. A total of 505 patients at risk of NASH were included, 369 controls and 136 T2-diabetes cases. A total of 364 hepatitis-C patients were included, 329 controls and 35 T2-diabetes cases, with similar prevalences of clinically significant fibrosis (Table 1).

Table 1.

Characteristics of 869 patients included in the case-control analyses

Comparison of Obuchowski measures in patients with and without diabetes

For the primary endpoint, the Obuchowski measure in patients at risk of NASH, the diagnostic performances of the three tests were not different in cases with T2-diabetes vs. matched controls without T2-diabetes (Table 2). The largest difference was observed for NashTest-2, mean [95% confidence interval (CI)], 0.829 (0.793–0.865) in T2-diabetes vs. 0.855 (0.829–0.869) in matched controls, a nonsignificant difference (Z = −1.0), with equivalence (tests of means, P = 0.01). For SteatoTest-2, the Obuchowski measure was even higher in T2-diabetes, 0.835 (0.797–0.873), than in matched controls, 0.806 (0.780–0.832), a nonsignificant difference (P = 0.23), with equivalence (tests of means, P = 0.01). For FibroTest, the Obuchowski measure was 0.871 (0.837–0.905) in T2-diabetes vs. 0.880 (0.879–0.881) in matched controls, a nonsignificant difference (P = 0.60), with equivalence (P = 0.01).
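For readers who only have published estimates and confidence intervals, a two-sample Z-test can be approximated as sketched below. This is not the procedure run in NCSS for the study: it assumes independent samples and symmetric, normal-theory 95% CIs, the function names are hypothetical, and it should be expected to approximate rather than reproduce the reported Z and P values.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error recovered from a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * z_crit)


def z_test_from_ci(est_a, ci_a, est_b, ci_b):
    """Approximate two-sided Z-test for the difference of two estimates."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided P value of a standard normal deviate: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# SteatoTest-2 Obuchowski measures: T2-diabetes vs. matched controls.
z_steato, p_steato = z_test_from_ci(0.835, (0.797, 0.873), 0.806, (0.780, 0.832))
```

A nonsignificant Z-test alone does not establish equivalence, which is why the comparisons in the text are also supported by equivalence tests of means.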

Table 2.

Diagnostic performances of FibroTest, NashTest-2 and SteatoTest-2 in T2-diabetes vs. controls: Obuchowski measures in 600 patients at risk of nonalcoholic-steato-hepatitis

In hepatitis-C, the results were in line with the results observed in patients at risk of NASH, for the variability of standard-AUROCs (Supplementary File S3, Table S3, Supplemental digital content 3, http://links.lww.com/EJGH/A491), for the absence of impact of diabetes on Obuchowski measures (Supplementary File S3, Table S4, Supplemental digital content 3, http://links.lww.com/EJGH/A491) and for the interest of spectrum-adjusted AUROCs in indirect comparisons (Supplementary File S3, Table S5, Supplemental digital content 3, http://links.lww.com/EJGH/A491).

Case-control studies to compare performances using standard-area under the receiver operating curves in patients with and without T2-diabetes

Depending on the histological endpoint, the standard-AUROCs of the blood tests varied in patients with T2-diabetes and suspected NASH from 0.575 to 0.801, without any significant difference from the performances in matched controls, which also varied, from 0.555 to 0.848 (Table 3).

Table 3.

Standard-area under the receiver operating curves variability according to the histological feature’s endpoint and choices of controls in 600 patients at risk of nonalcoholic-steato-hepatitis

In T2-diabetes, the greatest standard-AUROC difference vs. matched controls was for NashTest-2-raw and the diagnosis of clinically significant NASH A2-5, 0.607 (0.481–0.708) vs. 0.671 (0.601–0.731), and was not significant (P = 0.34), with equivalence (P = 0.01). For SteatoTest-2 and the diagnosis of steatosis grade S3, the standard-AUROC was even higher in T2-diabetes, 0.637 (0.506–0.739), than in matched controls, 0.555 (0.490–0.614), a nonsignificant difference (P = 0.22), with equivalence (P = 0.01). In hepatitis C, the results were also in line with the results observed in patients at risk of NASH (Supplementary File S3, Table S3, Supplemental digital content 3, http://links.lww.com/EJGH/A491).

Construction and comparisons of adjusted-area under the receiver operating curves in patients with and without diabetes

The linear regression between SpectrumF3F4-Index and standard-AUROCs was assessed among the 43 subsets combining fibrosis stages with and without T2-diabetes. The uniform standardized point is the intersection of the regression line (AUROCs vs. SpectrumF3F4-Index) and the horizontal line, assuming a SpectrumF3F4-Index of 2.5 (black dashed line), which is the index of a population of equal fibrosis stage prevalence (prevalence F0 = F1 = F2 = F3 = F4 = 20%). The adjusted-AUROC for FibroTest was 0.774. The regression line gave the adjusted F34-AUROC as 0.554 + (0.088 × Spectrum-Index), with R² = 0.628 (Fig. 2, Panel A). The red dotted line corresponds to an example of a given study in T2-diabetes [2], which had a FibroTest with standard-AUROC = 0.720, equivalent to an adjusted AUROC = 0.760 using its Spectrum-Index = 2.34 (green dotted line).
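The arithmetic of the index and of the regression can be sketched as follows, using the definition of the SpectrumF3F4-Index given in the legend of Fig. 2 (mean of the F3-F4 stages minus mean of the F0-F1-F2 stages) and the coefficients reported there (intercept 0.554, slope 0.088). The function and constant names are hypothetical, and the code is an illustration of the published formula rather than the authors' software.

```python
# Coefficients of the FibroTest regression line reported in the Fig. 2 legend.
INTERCEPT = 0.554
SLOPE = 0.088


def spectrum_f3f4_index(counts):
    """SpectrumF3F4-Index from the number of patients in each stage F0..F4.

    `counts` maps the stage (0 to 4) to a count or a prevalence.
    """
    advanced = counts.get(3, 0) + counts.get(4, 0)
    non_advanced = counts.get(0, 0) + counts.get(1, 0) + counts.get(2, 0)
    if advanced == 0 or non_advanced == 0:
        raise ValueError("both F0-F2 and F3-F4 patients are required")
    mean_advanced = (3 * counts.get(3, 0) + 4 * counts.get(4, 0)) / advanced
    mean_non_advanced = (counts.get(1, 0) + 2 * counts.get(2, 0)) / non_advanced
    return mean_advanced - mean_non_advanced


def predicted_f3f4_auroc(index):
    """F3F4-AUROC of FibroTest predicted by the regression line."""
    return INTERCEPT + SLOPE * index


# Uniform distribution of stages (20% each): the index is 3.5 - 1.0 = 2.5.
uniform_index = spectrum_f3f4_index({0: 20, 1: 20, 2: 20, 3: 20, 4: 20})
uniform_auroc = predicted_f3f4_auroc(uniform_index)   # 0.554 + 0.220 = 0.774

# Study in T2-diabetes with a SpectrumF3F4-Index of 2.34.
example_auroc = predicted_f3f4_auroc(2.34)            # 0.554 + 0.206 = 0.760
```

Only the counts of patients per stage are needed, which is what makes the index usable for indirect comparisons of published studies when individual data are not available.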

Fig. 2.

Correlation between the standard-area under the receiver operating curve (AUROC) of FibroTest for the diagnosis of stage F3F4 according to the different spectrum of stage prevalences (summarized by the SpectrumF3F4-index), in patients with or without T2-diabetes. SpectrumF3F4-index is the difference between the mean of F3-F4 stages (from 3 if all patients were F3 to 4 if all were F4) and the mean of F0-F1-F2 stages (from 0 if all patients were F0 to 2 if all were F2). The regression curve (black line) between SpectrumF3F4-index and observed FibroTest F3F4-AUROCs permitted adjustment of the standard AUROC to various distributions of stages. If there was a uniform distribution between all fibrosis stages (uniform prevalence F0 = F1 = F2 = F3 = F4 = 20%), the SpectrumF3F4-index is equal to 2.5 [(3 + 4)/2 − (0 + 1 + 2)/3 = 3.5 − 1 = 2.5]. The uniform standardized point is the intersection (black dashed line) of the regression curve (between SpectrumF3F4-index and observed FibroTest F3F4-AUROCs) and the horizontal line assuming a SpectrumF3F4-index = 2.5. Panel A: F3F4 AUROC adjusted using SpectrumF3F4-index (dashed green line) or not (dashed red line). A total of 43 combinations of fibrosis stages, with and without T2-diabetes. R² = 0.628. Regression curve (black line with 95% CI): predicted FibroTest adjusted F34-AUROC = 0.554 + (0.088 × SpectrumF3F4-index). For FibroTest, the adjusted F34-AUROC for a uniform distribution was 0.774 (black dashed line) with SpectrumF3F4-Index = 2.5 (dashed black line). The red dotted line corresponds to an example of a given study in T2-diabetes, the reference 2 (Bril et al.), which had a FibroTest with standard-AUROC = 0.720, which is equivalent to an adjusted AUROC = 0.760 (dashed green line) using its SpectrumF3F4-index, which is 2.34. Panel B: Regression curves in 22 subsets with diabetes (in red, 95% confidence interval, R² = 0.72), and in 21 without T2-diabetes (in blue, R² = 0.53) were similar. There was no significant curve-inequality between the two curves (F-ratio-test = 0.03, P = 0.97). Panel C: Regression curves in 21 different publications in patients at risk of nonalcoholic-steato-hepatitis (NASH) giving the prevalence of diabetes, permitted to assess a possible impact according to a prevalence of T2-diabetes equal to or above the median (≥31.9%, in red with 95% CI, n = 11, R² = 0.36) vs. those with prevalence < median (in blue, n = 10, R² = 0.34). There was no significant curve-inequality between these two regression lines (F-ratio-test = 0.36, P = 0.70). Panel D: Regression curves in 25 different publications assessing fibrosis stages with transient elastography (TE, n = 15, in grey), magnetic resonance elastography (MRE, n = 5, in red) or FibroTest (n = 5, in blue), permitted to assess a possible variability between these three biomarkers. Indeed, a significant correlation (R-Pearson = 0.81; P = 0.005) was reached by the 10 studies using FibroTest (blue line; R² = 0.50) or magnetic resonance elastography (red line; R² = 0.79) but not by the 15 studies using transient elastography (grey curve, R-Pearson = 0.16; R² = 0.01).

Pilot study for comparing the impact of spectrum effect on biomarkers performances

For FibroTest, the standard-AUROC was significantly lower in patients at risk of NASH (0.741; 95% CI: 0.691–0.784) than in chronic hepatitis C (CHC) (0.821; 0.766–0.864; P = 0.02). When adjusted-AUROCs were compared, the difference was no longer significant: 0.745 (0.698–0.792) vs. 0.754 (0.705–0.803; P = 0.80) (Table 4).

Standard-AUROCs were 0.932 (0.898–0.965), 0.872 (0.837–0.907) and 0.834 (0.699–0.969) and were reduced after adjustment to 0.794 (0.749–0.838), 0.767 (0.750–0.783) and 0.773 (0.725–0.822) for magnetic resonance, transient elastography and FibroTest, respectively, without significant difference (Table 5).

Table 5.

Comparisons between 25 standard-area under the receiver operating curves of transient elastography, magnetic resonance elastography, and FibroTest after adjustment on the SpectrumF3F4-Index

Regression lines in publications assessing stages with transient elastography (n = 15), magnetic resonance elastography (n = 5) or FibroTest (n = 5), permitted to assess a possible variability between these three biomarkers. Indeed, a significant correlation (R-Pearson = 0.81; P = 0.005) was reached by FibroTest (green line; R2 = 0.50) or magnetic resonance (red line R2 = 0.79) but not by transient elastography (blue line, R2 = 0.01) (Fig. 2, Panel D).

Discussion

Ninety percent of studies designed to validate noninvasive tests for the diagnosis of metabolic liver disease continue to use the standard-AUROC as a summary measure of diagnostic accuracy [1,2,8,23,24], even though the risk of spectrum effect, and the type-one error of tests comparing these measures in two samples with different distributions of stages or grades, have been identified [3,4]. In patients at risk of NASH, several articles discussed these limitations, but less than 10% used the recommended Obuchowski measure [2,8,24]. The present results demonstrate that, using appropriate methods (Obuchowski measures, adjusted AUROCs or case-control comparisons), there was no significant difference between the performances of FibroTest, NashTest-2 or SteatoTest-2 in patients with or without T2-diabetes. A simple index of spectrum effect, the SpectrumF3F4-Index, has been constructed, which makes it possible to assess adjusted-AUROCs with decreased variability in comparison with standard-AUROCs. Finally, a pilot study suggests that this SpectrumF3F4-Index could identify biomarkers more impacted than others by the spectrum effect.

Strengths

First, we reproduced, for estimating the accuracy of FibroTest for fibrosis staging, the same results in patients at risk of NASH as in hepatitis-C, and extended the analyses to the accuracy of NashTest-2 for NASH grading and of SteatoTest-2 for steatosis grading. Indeed, the Obuchowski measures were equivalent in T2-diabetes vs. non-T2-diabetes, as were the standard-AUROCs when adjusted for the prevalence of fibrosis stages and the main covariables. The strength of the present study was the consistent concordance between all the performances of FibroTest and SteatoTest-2 in the two case-control populations, at risk of NASH or with hepatitis-C, whatever the methodology used, Obuchowski measure or adjusted-AUROCs. The number of cases and controls, together with the centralized assessment of both the histological endpoint and the components of the tests, was also a strength of the study. The number of controls was sufficient to find those who matched according to the main covariables: gender, age, obesity and fibrosis. Indeed, women older than 50 years of age with severe obesity were twice as frequent among cases with diabetes (25.7%) as among the non-T2-diabetes patients of the population at risk of NASH (11.3%). Matching was mandatory, as men younger than 50 years of age, without severe obesity and without significant fibrosis, were 10 times more frequent among controls (17.2%) than among T2-diabetes cases (1.5%) (Supplementary file S4, Table S2, Supplemental digital content 3, http://links.lww.com/EJGH/A491). The SpectrumF3F4-Index will never replace the Obuchowski measure to assess the complete performance of diagnostic tests for the prediction of an ordinal endpoint such as fibrosis stages. However, the Obuchowski measure requires all the individual data. The interest of a spectrum index is that it can be simply assessed if the details of stages are given, and the adjusted-AUROCs can be directly compared to the standard-AUROCs.
Our study has revealed several other limitations of recent overviews or meta-analyses of biomarkers in patients at risk of NASH [8,32,33]. Ideally, comparisons between biomarkers should be performed by direct comparisons in the same patients and analyzed in intention to diagnose, the failures and the nonreliable results being considered as a misclassification by the biomarker [29]. Therefore, the AUROCs of transient elastography with the M-probe in patients at risk of NASH should be reduced by at least 20%. If patients with nonreliable or failed results had different prevalences of stages than those with reliable results, for example more reliable results in F4 and fewer in obese F0, a major spectrum effect is possible. Frequently, the prevalences of F0 were not given separately from F1 [8]. Here, the histological F0 prevalence was heterogeneous, varying from 3.8 to 50% in transient-elastography studies, much more than for FibroTest (26.8–59.0%).

Limitations

We acknowledge several limitations. We defined T2-diabetes as fasting glucose ≥7 mmol/l. This definition permitted to homogenize the matching with controls, but we have not analyzed the previous history of diabetes, the glycosylated hemoglobin-A1C, or the treatments received for diabetes, as well as lipid-lowering and blood pressure medications. Due to the sample size and to the four covariables already included in the matching, we were not able to assess several other factors potentially associated with variability of the tests in T2-diabetes, for example, African-American ethnicity [34]. Another limitation was the absence of recognized ‘natural’ prevalences of fibrosis stages, NASH grades and steatosis grades in large populations of T2-diabetes and matched controls, as available in hepatitis-C [3]. Ideally, the tests should be compared by Obuchowski measures in T2-diabetes vs. non-T2-diabetes, using the ‘natural’ prevalences of stages or grades in the different usual contexts of use. However, according to the largest study published, we used the prevalences assessed in diabetology open clinics, which were close to the uniform prevalence of the five fibrosis stages [23]. The regression curves and the spectrum index were assessed on a relatively small number of combinations (n = 43) and published studies (n = 33). However, the graphical concordance of the curves with and without T2-diabetes was useful for excluding a significant impact of T2-diabetes. For transient elastography, contrary to the magnetic resonance and FibroTest curves, there was no significant correlation between the standard-AUROCs and the spectrum index. In the absence of individual data, it was not possible to explain these differences, which could be due to heterogeneity between covariables or between cutoffs. We also acknowledge that the power to exclude a significant impact of diabetes was limited by the sample size of patients with T2-diabetes.
However, this is the first time that the risk of false interpretations of indirect comparisons between the performances of NASH tests has been underlined. In conclusion, when compared by Obuchowski measures, the performances of FibroTest, SteatoTest-2 and NashTest-2 were not different in patients with T2-diabetes vs. patients without T2-diabetes. When individual data are not available, adjusted-AUROCs reduced the variability due to spectrum effect.

Acknowledgements

The lists of investigators in the FLIP consortium, the FibroFrance-Pitié Salpêtrière group, the EPIC3 Study group and the QUID-NASH group appear in Supplementary File S1, Supplemental digital content 1, http://links.lww.com/EJGH/A489. T.P. participated in experiment conception and design. T.P., V.P., D.M., F.C., F.I.B., D.T. and V.R. participated in experiment performance. T.P., O.D., M.M., Y.N. and A.N. performed data analysis. T.P., C.H., C.B., D.V. and L.C. performed drafting of the paper. All authors approved the final version of the article. Trial registration number: FibroFrance: NCT01927133; FLIP: HEALTH-F2-2009-241762.

Conflicts of interest

Thierry Poynard is the inventor of FibroTest, SteatoTest-2 and NashTest-2 and the founder of BioPredictive; the patents belong to the public organization Assistance Publique-Hôpitaux de Paris. Valentina Peta, Olivier Deckmyn, Mona Munteanu, Yen Ngo and An Ngo are full employees of BioPredictive. The other authors (Denis Monneret, Frederic Charlotte, Olivier Lucidarme, Françoise Imbert-Bismut, Chantal Housset, Dominique Thabut, Dominique Valla, Christian Boitard, Laurent Castera and Vlad Ratziu) have nothing to declare.


7. Performance of biomarkers FibroTest, ActiTest, SteatoTest, and NashTest in patients with severe obesity: meta analysis of individual patient data.

Authors: Thierry Poynard; Guillaume Lassailly; Emmanuel Diaz; Karine Clement; Robert Caïazzo; Joan Tordjman; Mona Munteanu; Hugo Perazzo; Bernard Demol; Robert Callafe; François Pattou; Frederic Charlotte; Pierre Bedossa; Philippe Mathurin; Vlad Ratziu
Journal: PLoS One Date: 2012-03-14 Impact factor: 3.240

8. Diagnostic value of biochemical markers (FibroTest-FibroSURE) for the prediction of liver fibrosis in patients with non-alcoholic fatty liver disease.

Authors: Vlad Ratziu; Julien Massard; Frederic Charlotte; Djamila Messous; Françoise Imbert-Bismut; Luninita Bonyhay; Mohamed Tahiri; Mona Munteanu; Dominique Thabut; Jean François Cadranel; Brigitte Le Bail; Victor de Ledinghen; Thierry Poynard
Journal: BMC Gastroenterol Date: 2006-02-14 Impact factor: 3.067

9. The diagnostic performance of a simplified blood test (SteatoTest-2) for the prediction of liver steatosis.

Authors: Thierry Poynard; Valentina Peta; Mona Munteanu; Frederic Charlotte; Yen Ngo; An Ngo; Hugo Perazzo; Olivier Deckmyn; Raluca Pais; Philippe Mathurin; Rob Myers; Rohit Loomba; Vlad Ratziu
Journal: Eur J Gastroenterol Hepatol Date: 2019-03 Impact factor: 2.566

10. Long-term prognostic value of the FibroTest in patients with non-alcoholic fatty liver disease, compared to chronic hepatitis C, B, and alcoholic liver disease.

Authors: Mona Munteanu; Raluca Pais; Valentina Peta; Olivier Deckmyn; Joseph Moussalli; Yen Ngo; Marika Rudler; Pascal Lebray; Frederic Charlotte; Vincent Thibault; Olivier Lucidarme; An Ngo; Françoise Imbert-Bismut; Chantal Housset; Dominique Thabut; Vlad Ratziu; Thierry Poynard
Journal: Aliment Pharmacol Ther Date: 2018-10-17 Impact factor: 8.171

3 in total

Review 1. Liver macrophages and inflammation in physiology and physiopathology of non-alcoholic fatty liver disease.

Authors: Ronan Thibaut; Matthew C Gage; Inès Pineda-Torra; Gwladys Chabrier; Nicolas Venteclef; Fawaz Alzaid
Journal: FEBS J Date: 2021-05-02 Impact factor: 5.622

2. Liver Fibrosis Biomarkers Accurately Exclude Advanced Fibrosis and Are Associated with Higher Cardiovascular Risk Scores in Patients with NAFLD or Viral Chronic Liver Disease.

Authors: Stefano Ballestri; Alessandro Mantovani; Enrica Baldelli; Simonetta Lugari; Mauro Maurantonio; Fabio Nascimbeni; Alessandra Marrazzo; Dante Romagnoli; Giovanni Targher; Amedeo Lonardo
Journal: Diagnostics (Basel) Date: 2021-01-09

3. Clinical Interest of Serum Alpha-2 Macroglobulin, Apolipoprotein A1, and Haptoglobin in Patients with Non-Alcoholic Fatty Liver Disease, with and without Type 2 Diabetes, before or during COVID-19.

Authors: Olivier Deckmyn; Thierry Poynard; Pierre Bedossa; Valérie Paradis; Valentina Peta; Raluca Pais; Vlad Ratziu; Dominique Thabut; Angelique Brzustowski; Jean-François Gautier; Patrice Cacoub; Dominique Valla
Journal: Biomedicines Date: 2022-03-17

3 in total