Conjoint measurement of disorder prevalence, test sensitivity, and test specificity: notes on Botella, Huang, and Suero's multinomial model.

Edgar Erdfelder, Morten Moshagen

Keywords: diagnostic accuracy; gold standard; imperfect reference; multinomial modeling; validity

Year: 2013 PMID: 24319439 PMCID: PMC3837240 DOI: 10.3389/fpsyg.2013.00876

Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078

Botella et al. (2013) proposed two useful multinomial models for conjoint measurement of disorder prevalence rates in different populations (e.g., prevalence rates of dementia) and both the sensitivity and the specificity of the test used to assess this disorder (e.g., the Mini Mental State Examination, MMSE; Folstein et al., 1975). Their first model requires a perfect indicator of the disorder (i.e., a gold standard, GS), whereas the second model provides for indicators not perfectly correlated with the disorder (i.e., imperfect references, IR). In line with Lazarsfeld's (1950) latent-class model, the only requirement of the latter model is local stochastic independence of the IR and the test-based classification, that is, stochastic independence of the IR and the test result within subpopulations of individuals with vs. without the disorder. The present comment addresses two shortcomings of the IR model and suggests ways to overcome them: (1) lack of global identifiability in general and (2) lack of local identifiability when prevalence rates are homogeneous across populations.

Problem (1). As acknowledged by Botella et al. (2013), the IR model is not globally identifiable. There are always two sets of sensitivity and specificity parameters for both the reference (Se_R and Sp_R, respectively) and the test (Se_T and Sp_T, respectively) that predict exactly the same outcome probabilities and therefore cannot be distinguished on grounds of model fit [see Botella et al. (2013), Table 1]. Despite the lack of uniqueness in parameter estimates, Botella et al. (2013) recommended using the unconstrained IR model and choosing the set of parameter estimates that appears more plausible. However, besides introducing an unnecessary degree of subjectivity, a model that is consistent with parameter values incongruent with common sense is obviously too flexible and overly complex.
For example, Botella et al.'s IR model allows for references and tests that are negatively correlated with the disorder under investigation, that is, for tools that measure the opposite of what they are supposed to measure. This is clearly not reasonable. In addition, their model lacks unique validity measures for both the reference and the test. A simple way to remedy these problems is to constrain the sensitivity and specificity parameters in accordance with the two-high threshold model of detection (e.g., Snodgrass and Corwin, 1988; Waubert de Puiseau et al., 2012). In this refined model, the parameters of the IR model are reparameterized as follows:

Se_R = D_R + (1 − D_R) · B_R,   Sp_R = D_R + (1 − D_R) · (1 − B_R)   (1)

The new parameters, D_R and B_R, denote validity and bias measures, respectively, for the IR [both in (0, 1)]. D_R is the probability that the IR detects the true status (disorder present vs. absent), and B_R represents the disorder-present bias (i.e., the probability of a positive diagnosis) given failure to detect the true status. Accordingly, the sensitivity and specificity parameters of the test, Se_T and Sp_T, are reparameterized as functions of test validity and bias parameters D_T and B_T, respectively:

Se_T = D_T + (1 − D_T) · B_T,   Sp_T = D_T + (1 − D_T) · (1 − B_T)   (2)

Importantly, these reparameterizations jointly imply the order constraints Se_R ≥ (1 − Sp_R) and Se_T ≥ (1 − Sp_T), so that a positive diagnosis cannot be less likely given presence than given absence of the disorder. In other words, whereas the dimensionality of the parameter space remains unchanged (as the Se and Sp parameters are replaced by D and B parameters), the refined model restricts the admissible data space. As a consequence, in contrast to Botella et al.'s IR model, the refined model excludes negative correlations of the disorder with both the IR and the test. Moreover, introducing these order constraints renders the model globally identifiable (subject to the auxiliary condition of unequal prevalence rates, see below), thereby removing any ambiguity in interpretation.
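The mapping between validity/bias and sensitivity/specificity can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual two-high threshold form Se = D + (1 − D) · B and Sp = D + (1 − D) · (1 − B); the function names are hypothetical and not taken from the original article or from any software package.

```python
def se_sp_from_validity_bias(d, b):
    """Map validity D and bias B to (sensitivity, specificity).

    Assumes the two-high threshold form: the tool detects the true status
    with probability D; otherwise it guesses "disorder present" with
    probability B.
    """
    se = d + (1 - d) * b
    sp = d + (1 - d) * (1 - b)
    return se, sp


def validity_bias_from_se_sp(se, sp):
    """Inverse mapping. B is undefined (returned as None) when D equals 1."""
    d = se + sp - 1          # because Se + Sp = 1 + D
    if d == 1:
        return d, None       # guessing never occurs, so B cannot be determined
    b = (se - d) / (1 - d)
    return d, b


# MMSE test estimates reported in Table 1: D_T = 0.736, B_T = 0.486
se_t, sp_t = se_sp_from_validity_bias(0.736, 0.486)
print(round(se_t, 3), round(sp_t, 3))
```

Under this form, Se − (1 − Sp) = D, so requiring D ≥ 0 is the same as requiring Se ≥ 1 − Sp. The `None` branch mirrors the case in which the bias parameter cannot be determined because validity sits at the boundary of the parameter space.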
As summarized in Table 1, fitting the refined model to the data sets analyzed by Botella et al. (2013, Table 2) results in the same goodness-of-fit statistics as observed for the original IR model. This shows that the order constraints are perfectly in line with the data. However, because negative correlations are excluded, model flexibility as measured by c_FIA is reduced for the refined model, resulting in better (i.e., lower) Minimum Description Length (MDL) indices of model fit than observed for the original IR model. An additional advantage of the refined model is that it provides unique validity and bias measures for both the reference and the test. For the MMSE data, for example, the test validity (0.736) is almost as large as the validity of the reference (0.876), although the difference in validities is statistically significant [ΔG²(1) = 8.60, p = 0.003]. Most importantly, unlike the original IR model, the refined model is globally identifiable so that there is only a single set of validity and bias estimates (and the corresponding sensitivity and specificity estimates) for both measurement tools involved (see Table 1).
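The reported p value for the validity difference can be checked by hand. The sketch below assumes that the likelihood-ratio difference statistic is referred to a chi-square distribution with 1 degree of freedom, as the notation ΔG²(1) suggests; the function name is hypothetical.

```python
import math


def chi2_upper_tail_df1(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom.

    Uses the identity P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    for a standard normal Z.
    """
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_upper_tail_df1(8.60)
print(round(p_value, 3))  # 0.003
```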

Table 1

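Maximum likelihood parameter estimates, goodness-of-fit (G²), model flexibility (c_FIA), and Minimum Description Length (MDL) for the original and the refined IR model.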

Statistic/Estimate	AUDIT data		MMSE data
	Original model	Refined model	Original model	Refined model
Se_R	0.996/0.000	(0.996)	0.876/0.000	(0.876)
Sp_R	1.000/0.004	(1.000)	1.000/0.124	(1.000)
D_R	–	1.000	–	0.876
B_R	–	–	–	0.000
Se_T	0.637/0.040	(0.637)	0.864/0.128	(0.864)
Sp_T	0.960/0.363	(0.960)	0.872/0.136	(0.872)
D_T	–	0.600	–	0.736
B_T	–	0.098	–	0.486
G²(4)	13.99	13.99	12.14	12.14
c_FIA	20.1	18.7	23.0	21.6
MDL	577.0	575.6	1493.6	1492.2

Parameter estimates in parentheses are derived from the corresponding validity and bias estimates using Equations (1) and (2). The two estimates for the original model correspond to the two maxima of the likelihood function. Note that the B_R parameter for the AUDIT data is not identifiable because D_R approaches the boundary of the parameter space.
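The two solutions listed for the original model in Table 1 are mirror images of each other: swapping the latent classes (prevalence π becomes 1 − π, each sensitivity becomes 1 minus the matching specificity, and vice versa) leaves every outcome probability unchanged. The sketch below illustrates this under local independence. The prevalence value 0.3 is a hypothetical choice for illustration only, and the function names are not taken from the original article.

```python
import math


def cell_probabilities(prevalence, se_r, sp_r, se_t, sp_t):
    """P(reference = r, test = t) for (r, t) = (1,1), (1,0), (0,1), (0,0),
    assuming local independence of reference and test given disorder status."""
    probs = []
    for r in (1, 0):
        for t in (1, 0):
            present = (se_r if r else 1 - se_r) * (se_t if t else 1 - se_t)
            absent = ((1 - sp_r) if r else sp_r) * ((1 - sp_t) if t else sp_t)
            probs.append(prevalence * present + (1 - prevalence) * absent)
    return probs


def mirror(prevalence, se_r, sp_r, se_t, sp_t):
    """Swap the roles of the two latent classes."""
    return (1 - prevalence, 1 - sp_r, 1 - se_r, 1 - sp_t, 1 - se_t)


# First MMSE solution from Table 1; the prevalence 0.3 is an assumed value.
first = (0.3, 0.876, 1.000, 0.864, 0.872)
second = mirror(*first)

same = all(
    math.isclose(p, q, abs_tol=1e-12)
    for p, q in zip(cell_probabilities(*first), cell_probabilities(*second))
)
print([round(x, 3) for x in second[1:]])  # mirrored Se_R, Sp_R, Se_T, Sp_T
print(same)
```

The mirrored sensitivities and specificities reproduce the second MMSE entries in Table 1 (0.000, 0.124, 0.128, 0.136), which is why model fit alone cannot separate the two solutions.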

Problem (2). To apply their models in situations where classification data are available from a single large study only, Botella et al. (2013) suggested randomly splitting this sample into k segments and treating these segments as if they were drawn from k different populations. However, apart from sampling error, random splits necessarily result in the same prevalence rate in each segment, so that the same population classification matrix must hold for each data set. In effect, there are only 3 instead of 3k independent category probabilities available, implying that neither the standard nor the refined IR model (with k + 4 parameters each) can be identifiable. Hence, random splits of a large sample will be of no help. A possible remedy is to split the sample based on a third variable that has been observed in addition to the IR and the test result (say, gender, age group, profession, or religion), provided the assumption can be made that the prevalence rates, but not the sensitivity and specificity of the test and the reference, differ between the corresponding subpopulations. Unequal prevalences in at least two subpopulations suffice to ensure local identifiability. Thus, systematic splits of a single large sample may remedy the identifiability problem whereas random splits will not.
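Local identifiability can be checked through the rank of the Jacobian of the category probabilities with respect to the parameters: the model is locally identifiable at a parameter point only if this rank equals the number of parameters. The following sketch does this in exact rational arithmetic for the original Se/Sp parameterization with k = 2 segments. All parameter values are hypothetical, and the function names are illustrative rather than taken from the original article or any software.

```python
from fractions import Fraction as F


def jacobian(prevalences, se_r, sp_r, se_t, sp_t):
    """Rows: the 4 (reference, test) cells of each segment.
    Columns: k prevalences, then Se_R, Sp_R, Se_T, Sp_T."""
    k = len(prevalences)
    rows = []
    for i, pi in enumerate(prevalences):
        for r in (1, 0):
            for t in (1, 0):
                a = se_r if r else 1 - se_r    # P(R = r | disorder present)
                c = 1 - sp_r if r else sp_r    # P(R = r | disorder absent)
                b = se_t if t else 1 - se_t    # P(T = t | disorder present)
                d = 1 - sp_t if t else sp_t    # P(T = t | disorder absent)
                sr = 1 if r else -1
                st = 1 if t else -1
                row = [F(0)] * (k + 4)
                row[i] = a * b - c * d               # d/d prevalence_i
                row[k] = pi * sr * b                 # d/d Se_R
                row[k + 1] = -(1 - pi) * sr * d      # d/d Sp_R
                row[k + 2] = pi * a * st             # d/d Se_T
                row[k + 3] = -(1 - pi) * c * st      # d/d Sp_T
                rows.append(row)
    return rows


def rank(matrix):
    """Matrix rank by exact Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    found = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(found, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[found], m[pivot] = m[pivot], m[found]
        for r in range(found + 1, len(m)):
            factor = m[r][col] / m[found][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[found])]
        found += 1
    return found


# Hypothetical values for Se_R, Sp_R, Se_T, Sp_T
PARAMS = (F(9, 10), F(19, 20), F(4, 5), F(17, 20))

systematic = rank(jacobian([F(1, 5), F(1, 2)], *PARAMS))    # unequal prevalences
random_split = rank(jacobian([F(3, 10), F(3, 10)], *PARAMS))  # equal prevalences
print(systematic, random_split)
```

With k = 2 there are k + 4 = 6 parameters. For the unequal prevalences the rank is 6, so the parameters are locally identified; for the equal prevalences the rank drops to 4, matching the argument that a random split adds no independent information.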

References

1. "Mini-mental state". A practical method for grading the cognitive state of patients for the clinician.

2. multiTree: a computer program for the analysis of multinomial processing tree models.

3. Pragmatics of measuring recognition memory: applications to dementia and amnesia.

4. Extracting the truth from conflicting eyewitness reports: a formal modeling approach.

5. Multinomial tree models for assessing the status of the reference in studies of the accuracy of tools for binary classification. Authors: Juan Botella; Huiling Huang; Manuel Suero. Journal: Front Psychol. Date: 2013-10-03.