OBJECTIVE: To generate a classification of methods to evaluate medical tests when there is no gold standard.
METHODS: Multiple search strategies were employed to obtain an overview of the different methods described in the literature, including searches of electronic databases, contacting experts for papers in personal archives, exploring databases from previous methodological projects and cross-checking the reference lists of useful papers already identified.
RESULTS: All methods available were classified into four main groups. The first group, impute or adjust for missing data on reference standard, needs careful attention to the pattern and fraction of missing values. The second group, correct imperfect reference standard, can be useful if there is reliable information about the degree of imperfection of the reference standard and about the correlation of the errors between the index test and the reference standard. The methods in the third group, construct reference standard, all combine multiple test results to construct a reference standard outcome, using deterministic predefined rules, consensus procedures or statistical modelling (latent class analysis). In the final group, validate index test results, the diagnostic test accuracy paradigm is abandoned and research examines, using a number of different methods, whether the results of an index test are meaningful in practice, for example by relating index test results to other relevant clinical characteristics and to future clinical events.
CONCLUSIONS: The majority of methods try to impute, adjust or construct a reference standard in an effort to obtain the familiar diagnostic accuracy statistics, such as sensitivity and specificity. In situations that deviate only marginally from the classical diagnostic accuracy paradigm, these are valuable methods. However, in situations where an acceptable reference standard does not exist, applying the concept of clinical test validation can provide a significant methodological advance. All methods summarised in this report need further development. Some methods, such as the construction of a reference standard using panel consensus methods and the validation of tests outwith the accuracy paradigm, are particularly promising but lack methodological research. These methods deserve particular attention in future research.
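As an illustration of the second group (correct imperfect reference standard), the sketch below applies a standard algebraic correction to a 2x2 table of index test against reference standard. It is a minimal example and not a method taken from the report itself. It assumes that the sensitivity and specificity of the reference standard are known, and that the errors of the index test and the reference standard are conditionally independent given true disease status, which is exactly the kind of information the abstract says must be reliable. The function name, the class name and the counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CorrectedAccuracy:
    prevalence: float
    sensitivity: float
    specificity: float


def correct_for_imperfect_reference(a, b, c, d, ref_sens, ref_spec):
    """Correct index test accuracy for a reference standard with known error.

    Cell counts (hypothetical layout):
      a = index+ / reference+    b = index+ / reference-
      c = index- / reference+    d = index- / reference-

    Assumption: index test and reference standard errors are conditionally
    independent given the true disease status.
    """
    n = a + b + c + d
    if n <= 0:
        raise ValueError("table must contain at least one observation")
    if ref_sens + ref_spec - 1.0 <= 0:
        raise ValueError("reference standard must be better than chance")

    ref_pos = a + c  # subjects positive on the reference standard

    # Denominators are proportional to the corrected numbers of truly
    # diseased and truly non-diseased subjects.
    sens_den = ref_pos - n * (1.0 - ref_spec)
    spec_den = n * ref_sens - ref_pos
    if sens_den <= 0 or spec_den <= 0:
        raise ValueError("table is incompatible with the assumed reference accuracy")

    prevalence = (ref_pos / n + ref_spec - 1.0) / (ref_sens + ref_spec - 1.0)
    sensitivity = (ref_spec * a - (1.0 - ref_spec) * b) / sens_den
    specificity = (ref_sens * d - (1.0 - ref_sens) * c) / spec_den
    return CorrectedAccuracy(prevalence, sensitivity, specificity)


# Hypothetical counts, constructed for illustration only (not study data).
result = correct_for_imperfect_reference(
    a=1870, b=1530, c=830, d=5770, ref_sens=0.95, ref_spec=0.90
)
print(round(result.prevalence, 3),
      round(result.sensitivity, 3),
      round(result.specificity, 3))
```

With these illustrative counts, the naive estimates that treat the reference standard as perfect are a sensitivity of about 0.69 and a specificity of about 0.79, whereas the corrected values are about 0.90 and 0.80. If the conditional independence assumption does not hold, the correction can itself be biased, which is why the abstract stresses information about the correlation of errors.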