Klaus Gottlieb1, Fez Hussain2.
Abstract
Independent central reading, or off-site reading, of imaging endpoints is increasingly used in clinical trials. Clinician-reported outcomes, such as endoscopic disease activity scores, have been shown to be subject to bias and random error. Central reading attempts to limit bias and improve the accuracy of the assessment, two factors that are critical to trial success. Whether one central reader is sufficient, and how best to integrate the input of more than one central reader into one output measure, is currently not known.
In this concept paper we develop the theoretical foundations of a reading algorithm that can achieve both objectives without jeopardizing operational efficiency.
We examine the role of expert versus competent reader, frame scoring of imaging as a classification task, and propose a voting algorithm (VISA: Voting for Image Scoring and Assessment) as the most appropriate solution, which could also be used to operationally define imaging gold standards. We propose two image readers plus an optional third reader in cases of disagreement (2 + 1) for ordinary scoring tasks. We argue that it is critical to include the score determined by the site reader, at least in trials with endoscopically determined endpoints. Juries with more than 3 readers could define a reference standard that would allow a transition from measuring reader agreement to measuring reader accuracy. We support VISA by applying concepts from engineering (triple modular redundancy) and voting theory (Condorcet's jury theorem) and illustrate our points with examples from inflammatory bowel disease trials, specifically the endoscopy component of the Mayo Clinic Score of ulcerative colitis disease activity.
Detailed flow diagrams (pseudo-code) are provided that can inform program design.
The VISA "2 + 1" reading algorithm, based on voting, can translate individual reader scores into a final score in a way that is both mathematically sound (it avoids averaging ordinal data) and consistent with the scoring task at hand (decisions about the presence or absence of features, a subjective classification task). While the VISA 2 + 1 algorithm is currently being used in clinical trials, empirical data on its performance have not yet been reported.
Year: 2015 PMID: 25880066 PMCID: PMC4349725 DOI: 10.1186/s12880-015-0049-0
Source DB: PubMed Journal: BMC Med Imaging ISSN: 1471-2342 Impact factor: 1.930
Figure 1. Increasing joint probability of being correct with additional jurors.
Absolute and relative gain in the joint probability of being correct (p-3 jurors) based on the individual probability of being correct (p-1 juror).

| p-1 juror | p-3 jurors | Absolute gain | Relative gain |
|---|---|---|---|
| 0.6 | 0.648 | 0.048 | 7.4% |
| 0.7 | 0.784 | 0.084 | 10.7% |
| 0.8 | 0.896 | 0.096 | 10.7% |
| 0.9 | 0.972 | 0.072 | 7.4% |
| 0.95 | 0.993 | 0.043 | 4.3% |
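
The joint probabilities in the table above are consistent with Condorcet's jury theorem for three jurors deciding by majority, which for an individual probability p of being correct gives 3p² − 2p³. The following is a minimal sketch under two assumptions that the table itself does not state: each juror makes a binary correct/incorrect decision, and jurors err independently. The function name `majority_correct_probability` is hypothetical. The relative gain column appears to be the absolute gain divided by the three-juror probability, and the loop below computes it that way.

```python
from math import comb


def majority_correct_probability(p: float, n_jurors: int = 3) -> float:
    """Probability that a strict majority of n independent jurors is correct.

    Assumes a binary decision and that every juror is correct with the
    same probability p, independently of the others.
    """
    if n_jurors % 2 == 0:
        raise ValueError("use an odd number of jurors so that ties cannot occur")
    majority = n_jurors // 2 + 1
    # Sum the binomial probabilities of exactly k correct jurors, k >= majority.
    return sum(
        comb(n_jurors, k) * p**k * (1 - p) ** (n_jurors - k)
        for k in range(majority, n_jurors + 1)
    )


# Recompute the table columns: p-1 juror, p-3 jurors, absolute gain, relative gain.
for p in (0.6, 0.7, 0.8, 0.9, 0.95):
    p3 = majority_correct_probability(p)
    gain = p3 - p
    print(f"{p:.2f}  {p3:.3f}  {gain:.3f}  {gain / p3:.1%}")
```

Passing a larger odd `n_jurors` shows how the joint probability keeps rising with jury size when p is above 0.5, which is the point Figure 1 illustrates.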
Figure 2. Triple modular redundancy voting applied to image analysis performed by humans (adapted from Latif-Shabgahi et al. [13]).
Figure 3. The VISA algorithm shows how adaptive voting for image scoring (2 or 3 voters) can be efficiently implemented and automated.
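
The 2 + 1 rule described in the abstract (two readers, plus a third only when they disagree) could be sketched roughly as follows. This is not the flow diagram of Figure 3, only an illustration of the voting idea. The function name `visa_2_plus_1`, the `read_third` callback, and in particular the handling of a three-way disagreement (returning `None` so the case can be escalated) are assumptions; the abstract does not say how such cases are resolved.

```python
from collections import Counter
from typing import Callable, Optional


def visa_2_plus_1(
    score_a: int,
    score_b: int,
    read_third: Callable[[], int],
) -> Optional[int]:
    """Combine ordinal reader scores by voting rather than averaging.

    score_a and score_b are the two initial reads. read_third is called
    only when they disagree, so the third read is requested on demand.
    Returns the score chosen by at least two readers, or None when all
    three readers differ (assumption: such cases need separate handling).
    """
    if score_a == score_b:
        return score_a  # agreement: no third read is needed

    score_c = read_third()
    score, count = Counter([score_a, score_b, score_c]).most_common(1)[0]
    return score if count >= 2 else None


# Hypothetical endoscopic subscores on a 0-3 scale.
print(visa_2_plus_1(2, 2, lambda: 3))  # readers agree, third read is skipped
print(visa_2_plus_1(1, 2, lambda: 2))  # third reader breaks the tie
print(visa_2_plus_1(0, 1, lambda: 3))  # no two readers agree
```

Because the final score is always one that a reader actually assigned, the sketch never produces an intermediate value such as 1.5, which is the sense in which voting avoids averaging ordinal data.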
Figure 4. VISA can be adjusted to create a pool of gold standard cases as the trial progresses to be used in proficiency testing.
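
One way to picture the step from reader agreement to reader accuracy is to let a jury of more than three readers define a reference score per case and then compare an individual reader against it. The sketch below is an illustration only and does not reproduce Figure 4. The strict-majority rule, the function names `reference_score` and `reader_accuracy`, and the case identifiers and scores are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def reference_score(jury_scores: List[int]) -> Optional[int]:
    """Score chosen by a strict majority of the jury, or None without one."""
    score, count = Counter(jury_scores).most_common(1)[0]
    return score if count > len(jury_scores) / 2 else None


def reader_accuracy(reader_scores: Dict[str, int], gold: Dict[str, int]) -> float:
    """Fraction of gold-standard cases on which the reader matches the reference."""
    scored = [case for case in gold if case in reader_scores]
    if not scored:
        raise ValueError("reader has scored no gold-standard cases")
    return sum(reader_scores[c] == gold[c] for c in scored) / len(scored)


# Made-up jury reads for three cases, five readers each.
jury_reads = {
    "case-01": [2, 2, 2, 3, 2],
    "case-02": [0, 1, 1, 0, 2],  # no strict majority, so excluded from the pool
    "case-03": [3, 3, 2, 3, 3],
}

# Only cases with a clear majority enter the gold-standard pool.
gold_pool = {
    case: ref
    for case, scores in jury_reads.items()
    if (ref := reference_score(scores)) is not None
}

print(gold_pool)
print(reader_accuracy({"case-01": 2, "case-03": 2}, gold_pool))
```

Excluding cases without a clear majority keeps ambiguous images out of proficiency testing; whether that is desirable in a given trial is a design decision the sketch does not settle.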