Siamak Sabour
Safety Promotion and Injury Prevention Research Center, Department of Clinical Epidemiology, School of Health, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
I read with interest the paper by Suarez AL and colleagues published in the March 2016 issue of Journal of Neurogastroenterology and Motility, which aimed to evaluate the reproducibility of sphincter of Oddi manometry (SOM).1 The authors studied 214 subjects with post-cholecystectomy pain who were randomized, irrespective of manometric findings, into 3 arms: sham (no sphincterotomy), biliary sphincterotomy, and dual (biliary and pancreatic) sphincterotomy. Thirty-eight subjects had both biliary and pancreatic manometries performed twice, at baseline and at repeat endoscopic retrograde cholangiopancreatography (ERCP) after 1–11 months. The sham arm was examined to assess the reproducibility of manometry.1
They reported that biliary and pancreatic measurements were reproduced in 7/14 (50%) untreated subjects. All 12 patients with initially elevated biliary pressures in the biliary and dual sphincterotomy groups normalized after biliary sphincterotomy. However, 2 of 8 subjects with elevated pancreatic pressures in the dual sphincterotomy group remained abnormal after pancreatic sphincterotomy. Paradoxically, normal biliary pressures became abnormal in 1 of 15 subjects after biliary sphincterotomy, and normal pancreatic pressures became abnormal in 5 of 15 patients after biliary sphincterotomy and in 1 of 9 after pancreatic sphincterotomy.1
First, it is crucial to know that descriptive statistics alone are no simple substitute for clinical judgment in reliability analysis.2–5 Moreover, reproducibility should be assessed with a method suited to the type of data: the intraclass correlation coefficient (ICC) for quantitative variables, or weighted kappa for qualitative variables.2–5 The authors concluded that SOM measurements are poorly reproducible and questioned our ability to perform pancreatic sphincterotomy adequately. Such a conclusion may be misleading because the statistical methods used to assess reproducibility were inappropriate.
As a take-home message, reproducibility analysis requires appropriate tests and careful interpretation.
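For readers who wish to see what such an analysis might look like, the following is a minimal Python sketch, not the method of the original study. It computes a weighted kappa for ordinal gradings and a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1). The function names, the three-level grading, and all numbers are hypothetical and chosen only for illustration; they are not data from Suarez and colleagues.

```python
def weighted_kappa(first, second, categories, power=1):
    """Cohen's weighted kappa for two ordinal gradings of the same subjects.

    power=1 gives linear weights, power=2 gives quadratic weights.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(first)
    # Observed joint proportions of (first grading, second grading).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        obs[index[a]][index[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # disagreement weight
            observed += w * obs[i][j]
            expected += w * row[i] * col[j]
    return 1.0 - observed / expected


def icc_2_1(data):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    data is a list of rows, one per subject, one column per session.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(r) for r in data) / (n * k)
    row_means = [sum(r) / k for r in data]
    col_means = [sum(r[j] for r in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in data for x in r)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-session mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical example values only (NOT study data).
# Ordinal grading at baseline and at repeat manometry:
# 0 = normal, 1 = borderline, 2 = elevated (an assumed grading scheme).
baseline_grade = [0, 0, 1, 2, 2, 1, 0, 2]
repeat_grade = [0, 1, 1, 2, 0, 1, 0, 2]
print("weighted kappa:", weighted_kappa(baseline_grade, repeat_grade, [0, 1, 2]))

# Hypothetical paired pressures (baseline, repeat) for six subjects.
pressures = [[35, 38], [52, 41], [28, 30], [60, 47], [44, 45], [39, 55]]
print("ICC(2,1):", icc_2_1(pressures))
```

Which ICC form is appropriate depends on the study design; the absolute-agreement form is used here on the assumption that repeat sessions should yield the same pressure values, not merely the same ranking of subjects.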