Variable Speed Across Dimensions of Ability in the Joint Model for Responses and Response Times.

Peida Zhan, Hong Jiao, Kaiwen Man, Wen-Chung Wang, Keren He

Abstract

Working speed as a latent variable reflects a respondent's efficiency in applying a specific skill or piece of knowledge to solve a problem. This study relaxes the assumption, common to many response time models, that respondents work at a constant speed across all test items. It is more likely that respondents work at different speeds across items, particularly when the items measure different dimensions of ability in a multidimensional test. Multiple speed factors are used to model the speed process by allowing speed to vary across different domains of ability. A joint model for multidimensional abilities and multifactor speed is proposed. Real response time data are analyzed with an exploratory factor analysis as an example to uncover the complex structure of working speed. The feasibility of the proposed model is examined using simulated data. An empirical example with responses and response times is presented to illustrate the proposed model's applicability and rationality.
Copyright © 2021 Zhan, Jiao, Man, Wang and He.

Keywords:  hierarchical modeling framework; joint model; multidimensional item response theory; response times; variable speed

Year:  2021        PMID: 33854454      PMCID: PMC8039373          DOI: 10.3389/fpsyg.2021.469196

Source DB:  PubMed          Journal:  Front Psychol        ISSN: 1664-1078


Introduction

With the popularity of computer-based tests, the collection of item response times (RTs) has become routine in large- and small-scale educational assessments. For example, the Programme for International Student Assessment (PISA) has used computer-based tests and recorded RT data since 2012. RTs not only provide information about the working speed of respondents but can also be used to improve measurement accuracy, because RTs are considered to convey a fuller picture of respondents’ performance than is obtainable from correct responses alone (van der Linden et al., 2010; Bolsinova and Tijmstra, 2018). Before making inferences from RTs, it is necessary to specify an appropriate statistical model for them. Over the past few decades, various RT models have been presented based on cognitive/psychological theories and experimental research (for a review, see De Boeck and Jeon, 2019). Currently, the Bayesian hierarchical modeling framework (van der Linden, 2007) is one of the most flexible tools to explain the relationship between latent ability and working speed. This framework is gaining more recognition and is sufficiently general to integrate available measurement models for item response accuracy (RA) and RTs. Typically, in the hierarchical modeling of RTs and RA, the RT measurement model assumes that a respondent works at a constant speed throughout a test. Meanwhile, the RA measurement model assumes that a respondent puts his or her best effort forward to solve a set of items correctly by using the required knowledge. Thus, the association between latent ability and working speed is assumed to be constant for each respondent working on a test. In other words, each respondent is assumed to work at a constant pace given his or her invariant ability at that time (Fox and Marianti, 2016).
Currently, most joint models for RA and RTs only use unidimensional measurement models to capture the relationship between latent ability and working speed within a unidimensional test scenario (e.g., Klein Entink et al., 2009a,b; Wang et al., 2013; Fox et al., 2014; Molenaar et al., 2015, 2016; Wang and Xu, 2015; Fox and Marianti, 2016). In reality, however, multiple latent abilities may be needed to answer items correctly, especially in multidimensional tests (e.g., Tatsuoka, 1983; Reckase, 2009). Compared to unidimensional tests, one significant characteristic of multidimensional tests is that different test items may measure distinct latent ability dimensions. In educational and psychological measurements, working speed as a latent variable reflects a respondent’s efficiency in applying a specific skill or piece of knowledge to solve a problem. Therefore, latent speed should be discussed in connection with a particular dimension of latent ability. It is reasonable to assume that respondents could vary their working speeds across items that measure different dimensions of ability. In other words, the multidimensional structure of latent ability could be used to model the process of speed change, where the working speed is allowed to vary across dimensions of ability. For example, in a math test, the working speed on items that measure algebra problem-solving ability may differ from that on items measuring geometry problem-solving ability. With the development of psychometrics, multidimensional measurement models for RA [e.g., multidimensional item response theory (MIRT) models and diagnostic classification models (DCMs)] have been well developed and widely used (see Reckase, 2009; Rupp et al., 2010). Recently, based on hierarchical modeling, a few studies have attempted to use MIRT models or DCMs for RA to capture the multidimensional structure of the latent trait when multidimensional tests are involved.
But still, a unidimensional or single-factor RT (SRT) model is used to measure latent speed (Zhan et al., 2018; Man et al., 2019; Wang et al., 2019). Thus, in these studies, the relationships between multiple latent abilities and a single latent speed are assumed to be constant, with each respondent working at a constant speed on different items. However, assuming identical working speeds across different dimensions of ability may be too restrictive to describe intricate data and thus may lead to ambiguous conclusions. It is desirable to relax this restriction and allow each dimension of ability to be associated with a specific speed factor. As current joint models may be inappropriate for multidimensional tests, it is critical to develop a joint model that allows working speed to vary across dimensions of ability. To model varying working speeds within different domains of ability, it is possible to use multiple speed factors/dimensions to describe the speed process. Each speed factor corresponds to a specific dimension of latent ability. An individual speed process is assumed, describing the changes in speed across dimensions. Thus, respondents can work at different levels of speed on items within different dimensions of ability during multidimensional tests. Each individual speed process will be defined using a confirmatory multifactor structure, which in turn is defined by the dimensions of ability measured by items, according to the testing blueprint. Furthermore, it will be shown that the multifactor working speed model can be integrated with a MIRT model for latent ability. Under this new joint model, it is assumed that each respondent works at a unique speed corresponding to the dimension represented by an item. We first extend the most popular single-factor lognormal RT (SLRT) model (van der Linden, 2006) to a multifactor working speed model that considers changing speed across dimensions. This is called the multifactor lognormal RT (MLRT) model.
Second, a joint model of multidimensional latent ability and multifactor working speed will be proposed. Our paper starts with a brief review of the SLRT model, followed by presenting the proposed MLRT model. The proposed joint model is then presented. Next, a motivating example will be provided to demonstrate the multifactor structure of working speed and its compatibility with the multidimensional structure of latent ability. Moreover, two simulation studies will be conducted to evaluate the psychometric properties of the proposed joint model. An empirical example will also be analyzed to illustrate the application of the proposed joint model. Finally, we summarize our findings and discuss directions for future research.

Multifactor Lognormal Response Time Model

Let $T_{ni}$ be the observed RT of person n (n = 1,…, N) on item i (i = 1,…, I). In the SLRT model, the logarithmic function is used to transform the positively skewed distribution of RTs to a more symmetric shape, and the log RT is assumed to be governed by item i’s time-intensity parameter $\xi_i$ and person n’s latent speed parameter $\tau_n$ as follows:

$$\log T_{ni} = \xi_i - \tau_n + \varepsilon_{ni}, \quad \varepsilon_{ni} \sim N(0, \omega_i^{-2}), \tag{1}$$

or equivalently,

$$\log T_{ni} \sim N(\xi_i - \tau_n, \omega_i^{-2}), \tag{2}$$

where $\xi_i$ represents the time needed to complete item i, $\tau_n$ is the single-factor working speed of person n, and $\varepsilon_{ni}$ is the normally distributed residual error term, with mean zero and variance $\omega_i^{-2}$, where $\omega_i$ is the time-precision parameter. In recent years, the SLRT model has been extended in some studies. For instance, Klein Entink et al. (2009a) included a time-discrimination parameter as a slope parameter for latent speed. Klein Entink et al. (2009b) proposed the Box-Cox transformation for RT modeling. Wang et al. (2013) proposed a linear transformation model for RTs. Furthermore, Fox and Marianti (2016) proposed a variable working speed model, which allows the respondents to adjust their working speed along the sequence of items throughout the test. Although Fox and Marianti’s (2016) model relaxed the assumption of constant speed in the SLRT model, their variable speed differs from the one considered in this study: in their model, speed changes as the respondent progresses through the items, whereas here speed changes as the dimension of ability measured by the item changes. As mentioned previously, the kernel hypothesis of this study is that respondents can work with different levels of speed on items requiring different dimensions of ability during multidimensional tests. In other words, working speed has a multifactor structure, which is defined by the multidimensional structure of ability. Assume that a multidimensional test measures K dimensions of latent ability.
In the current study, only the between-item multidimensionality (Adams et al., 1997) is considered, where each item measures a single dimension but different items measure different dimensions, so the multidimensionality occurs between items. To model variable speed across dimensions, we first relaxed the assumption of the SLRT model that each respondent works at a constant speed on all items throughout the test and allowed the instantaneous speed to differ across items, that is, $\tau_n \rightarrow \tau_{ni}$. Then, a confirmatory multifactor structure was given to model the instantaneous speed at item i of person n, as

$$\tau_{ni} = \sum_{k=1}^{K} \tau_{nk} q_{ik}, \tag{3}$$

where $\tau_{ni}$ is the instantaneous speed at item i of person n, and $\tau_{nk}$ is the working speed factor of person n corresponding to the kth dimension (k = 1, 2,…, K) of ability. The Q-matrix (Tatsuoka, 1983) is an I-by-K confirmatory matrix with element $q_{ik}$ indicating whether the kth dimension of ability is required to answer item i correctly: $q_{ik} = 1$ if the dimension is required, and $q_{ik} = 0$ otherwise. For between-item multidimensionality, only one dimension is measured by an item, namely, only one element in the row $q_i$ equals 1. In such cases, the MLRT model can be expressed as

$$\log T_{ni} = \xi_i - \sum_{k=1}^{K} \tau_{nk} q_{ik} + \varepsilon_{ni}, \quad \varepsilon_{ni} \sim N(0, \omega_i^{-2}), \tag{4}$$

or equivalently,

$$\log T_{ni} \sim N\left(\xi_i - \sum_{k=1}^{K} \tau_{nk} q_{ik},\ \omega_i^{-2}\right). \tag{5}$$

If only one dimension of ability is assumed to be measured by all items, the MLRT model reduces to the SLRT model.
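As a rough illustration of the MLRT model, the following Python sketch computes the instantaneous speed from a Q-matrix row and draws log RTs for one respondent. The function names, the parameter values, and the three-item Q-matrix are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import random

def instantaneous_speed(tau, q_row):
    """Instantaneous speed at one item: sum over k of tau_nk * q_ik."""
    return sum(t * q for t, q in zip(tau, q_row))

def simulate_log_rts(xi, omega, tau, q_matrix, rng):
    """Draw log RTs for one person: log T_ni ~ N(xi_i - tau_ni, omega_i^-2)."""
    log_rts = []
    for xi_i, omega_i, q_row in zip(xi, omega, q_matrix):
        mean = xi_i - instantaneous_speed(tau, q_row)
        # omega_i is the time precision, so the residual SD is 1 / omega_i.
        log_rts.append(rng.gauss(mean, 1.0 / omega_i))
    return log_rts

# Hypothetical example: 3 items, each measuring one of K = 3 dimensions.
q_matrix = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
xi = [4.0, 4.2, 3.8]       # illustrative time intensities
omega = [2.0, 2.0, 2.0]    # illustrative time precisions
tau = [0.5, -0.2, 0.1]     # illustrative speed factors for one person
log_rts = simulate_log_rts(xi, omega, tau, q_matrix, random.Random(1))
```

With a Q-matrix that has a single column of ones, `instantaneous_speed` returns the same value for every item, which corresponds to the SLRT special case.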

Joint Model for Response Accuracy and Response Times

Model Construction

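Since both RA and RTs contain information about items and persons, it is advantageous to analyze them simultaneously. To this end, based on hierarchical modeling, we propose a new joint model called the multidimensional-multifactor joint (MMJ) model. For illustration purposes, the MMJ model in this study uses the MLRT model as the measurement model for RTs and, following the 2012 PISA mathematics assessment framework (OECD, 2013), the multidimensional Rasch (MR) model (Adams et al., 1997) as the measurement model for RA. In addition to the observed RTs, let $Y_{ni}$ be the observed RA of person n on item i. The MR model can be expressed as

$$\text{logit}(P(Y_{ni} = 1)) = \sum_{k=1}^{K} \theta_{nk} q_{ik} + d_i, \tag{6}$$

where logit(x) = log(x/(1 − x)), $P(Y_{ni} = 1)$ is the probability of a correct response by person n to item i, $\theta_{nk}$ is the latent ability of person n on dimension k, $d_i$ is the intercept or easiness of item i, and $q_{ik}$ is the element of the Q-matrix. A multivariate normal distribution was used to describe the relationships among the multidimensional abilities and multifactor speeds:

$$\begin{pmatrix} \theta_n \\ \tau_n \end{pmatrix} \sim MVN\left( \begin{pmatrix} \mu_\theta \\ \mu_\tau \end{pmatrix}, \Sigma_{person} \right), \quad \Sigma_{person} = \begin{pmatrix} \Sigma_\theta & \Sigma_{\theta\tau} \\ \Sigma_{\theta\tau}' & \Sigma_\tau \end{pmatrix},$$

where $\theta_n = (\theta_{n1},…, \theta_{nk},…, \theta_{nK})'$ is the multidimensional latent ability vector; $\tau_n = (\tau_{n1},…, \tau_{nk},…, \tau_{nK})'$ is the multifactor working speed vector; $\mu_\theta$ and $\mu_\tau$ are the population mean vector of multidimensional ability and the population mean vector of multifactor working speed, respectively; and $\Sigma_{person}$ is the variance-covariance matrix of person parameters. In this block notation, $\Sigma_\theta$ contains the variances $\sigma^2_{\theta_k}$ of $\theta_k$ and the covariances $\sigma_{\theta_k\theta_{k'}}$ of $\theta_k$ and $\theta_{k'}$; $\Sigma_\tau$ contains the variances $\sigma^2_{\tau_k}$ of $\tau_k$ and the covariances $\sigma_{\tau_k\tau_{k'}}$ of $\tau_k$ and $\tau_{k'}$; and $\Sigma_{\theta\tau}$ contains the covariances $\sigma_{\theta_k\tau_{k'}}$ of $\theta_k$ and $\tau_{k'}$. Furthermore, for the item parameters, a bivariate normal distribution was used to describe the relationship between item easiness and item time-intensity,

$$\begin{pmatrix} d_i \\ \xi_i \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_d \\ \mu_\xi \end{pmatrix}, \Sigma_{item} \right), \quad \Sigma_{item} = \begin{pmatrix} \sigma^2_d & \sigma_{d\xi} \\ \sigma_{d\xi} & \sigma^2_\xi \end{pmatrix},$$

where $\mu_d$ and $\mu_\xi$ are the mean of item easiness and the mean of item time-intensity, respectively; $\Sigma_{item}$ is the variance-covariance matrix of item parameters; $\sigma^2_d$ and $\sigma^2_\xi$ are the variance of item easiness and the variance of item time-intensity, respectively; and $\sigma_{d\xi}$ is the covariance of item easiness and item time-intensity.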
The residual error variance, $\omega_i^{-2}$, is assumed to be independently distributed. For the MMJ model, the latent scales of multidimensional ability and multifactor speed need to be identified. This can be accomplished by fixing the population means of ability and speed to zero, that is, μθ = μτ = 0.
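To make the RA part of the model concrete, the short Python sketch below evaluates the MR model's success probability for one person and one item. The function name and all numeric values are hypothetical and serve only as an illustration of the logit relation described above.

```python
import math

def mr_probability(theta, q_row, d_i):
    """P(Y_ni = 1) when logit(P) = sum_k theta_nk * q_ik + d_i."""
    eta = sum(t * q for t, q in zip(theta, q_row)) + d_i
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical person with K = 3 abilities answering two items that both
# measure dimension 1 but differ in easiness d_i.
theta = [0.8, -0.5, 0.0]
p_easy = mr_probability(theta, [1, 0, 0], d_i=1.0)
p_hard = mr_probability(theta, [1, 0, 0], d_i=-1.0)
```

Because $d_i$ enters with a positive sign, a larger value makes a correct response more likely, which is why it is read as easiness rather than difficulty.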

Parameter Estimation

Parameters in the MMJ model can be estimated via the full Bayesian approach with the Markov chain Monte Carlo (MCMC) method. In Bayesian estimation, the prior distributions of the model parameters and the likelihood of the observed data produce a joint posterior distribution for the model parameters. In this study, the Just Another Gibbs Sampler (JAGS) software (Plummer, 2015) was used to estimate parameters. JAGS uses the Gibbs sampler by default (Gelfand and Smith, 1990); the JAGS code for the proposed joint model is provided in the online Supplementary Appendix. Under the assumption of local independence, $Y_{ni}$ and $\log T_{ni}$ are independently distributed as

$$Y_{ni} \sim \text{Bernoulli}(P(Y_{ni} = 1)), \quad \log T_{ni} \sim N\left(\xi_i - \sum_{k=1}^{K} \tau_{nk} q_{ik},\ \omega_i^{-2}\right).$$

Weakly informative rather than non-informative priors are used in this study to increase the generalizability of our code by imposing vague prior beliefs on the parameters to be estimated. The prior settings follow those used by Zhan et al. (2018) and Man et al. (2019). The priors of the person parameters are set as $(\theta_n', \tau_n')' \sim MVN(\mu_{person}, \Sigma_{person})$, with a hyper prior $\Sigma_{person} \sim \text{InvWishart}(R_{person}, K^*)$, where $R_{person}$ is a K*-dimensional identity matrix, and K* indicates the degrees of freedom, which in this case is equal to the dimension of $R_{person}$. In addition, the priors of item parameters are set as $(d_i, \xi_i)' \sim MVN((\mu_d, \mu_\xi)', \Sigma_{item})$; the prior on the time-precision parameters is not legible in this copy of the text. Furthermore, the hyper priors include $\Sigma_{item} \sim \text{InvWishart}(R_{item}, 2)$, where $R_{item}$ is a two-dimensional identity matrix; the hyper priors on $\mu_d$ and $\mu_\xi$ are likewise not legible in this copy. Finally, the posterior mean is treated as the estimated value for model parameters.
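The local independence assumption means that, given the person and item parameters, the likelihood contribution of one person-item pair is the product of a Bernoulli term for the response and a normal term for the log RT. The Python sketch below writes this out on the log scale. It is only an illustration with a hypothetical function name and hypothetical inputs; it is not the JAGS code from the Supplementary Appendix.

```python
import math

def joint_log_likelihood(y, log_t, theta, tau, q_row, d_i, xi_i, omega_i):
    """Log-likelihood of one (response, log RT) pair under local independence."""
    # RA part: Bernoulli with the MR-model success probability.
    eta = sum(t * q for t, q in zip(theta, q_row)) + d_i
    p = 1.0 / (1.0 + math.exp(-eta))
    ll_ra = math.log(p) if y == 1 else math.log(1.0 - p)
    # RT part: normal density of log T with mean xi_i - tau_ni and SD 1 / omega_i.
    mean = xi_i - sum(t * q for t, q in zip(tau, q_row))
    z = (log_t - mean) * omega_i
    ll_rt = math.log(omega_i) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z
    return ll_ra + ll_rt

# Hypothetical single observation on an item measuring dimension 2 of K = 3.
ll = joint_log_likelihood(
    y=1, log_t=4.1, theta=[0.3, -0.4, 0.9], tau=[0.1, 0.2, -0.1],
    q_row=[0, 1, 0], d_i=0.5, xi_i=4.0, omega_i=2.0,
)
```

Summing this quantity over all persons and items gives the data log-likelihood that an MCMC sampler combines with the priors.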

A Motivating Example

To explore the multifactor structure of working speed, and to explore whether this structure matches the multidimensional structure of latent ability, a motivating example with the exploratory factor analysis (EFA) of RTs was presented first.

Data Description

The PISA 2012 computer-based mathematics RT data were analyzed. This data set was originally used by Zhan et al. (2018). In this study, there are N = 1,581 respondents and I = 9 items. The logarithm of RTs was computed before the analysis, and all zero RTs were treated as missing data. A Q-matrix (see Table 1) was specified based on the PISA 2012 mathematics assessment framework (OECD, 2013). Three dimensions that belong to the mathematical content knowledge were chosen, namely, change and relationships (θ1), space and shape (θ2), and uncertainty and data (θ3). However, it should be noted that this Q-matrix was originally used to link items and latent abilities or to present the multidimensional structure of latent ability. In other words, this Q-matrix does not specify the latent structure of working speed unless the structure explored by the EFA of RTs matches it.
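TABLE 1. Q-matrix for PISA 2012 released computer-based mathematics items. Columns are θ1, θ2, and θ3; each item has a single entry of 1 (the column position of each entry is not preserved in this copy).

| Item | Entry |
|---|---|
| CM015Q02D | 1 |
| CM015Q03D | 1 |
| CM020Q01 | 1 |
| CM020Q02 | 1 |
| CM020Q03 | 1 |
| CM020Q04 | 1 |
| CM038Q03T | 1 |
| CM038Q05 | 1 |
| CM038Q06 | 1 |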

Exploratory Analysis and Results

Mplus (version 8.1; Muthén and Muthén, 2019) was used here. By default, Mplus conducts the EFA within a confirmatory factor analysis framework. In this study, the number of factors to retain was set from 1 to 5, which means that 1- to 5-factor CFA models were all fitted to the RT data. Then, the Akaike Information Criterion (AIC; Akaike, 1974) and the Bayesian Information Criterion (BIC; Schwarz, 1978) were used as model-data fit indexes to help judge the number of factors/dimensions. Theoretically, correlations should exist among multiple dimensions; thus, oblique rotation was used. Other settings followed the defaults (e.g., maximum likelihood was used as the extraction method). Table 2 presents the model-data fit indexes of the EFA. According to previous studies, TLI > 0.95, CFI > 0.95, SRMR ≤ 0.08, and RMSEA < 0.05 indicate good model-data fit (Hu and Bentler, 1999; Steiger, 1990). The AIC preferred the 4-factor model, and the BIC preferred the 3-factor model after taking into account the penalty weighting of sample size. On the whole, the 3-factor model seems to fit the data better than the other models.
TABLE 2. Exploratory factor analysis model-data fit indexes for RT data.

| Model | χ² | df | TLI | CFI | AIC | BIC | SRMR | RMSEA (90% CI) |
|---|---|---|---|---|---|---|---|---|
| 1-factor | 462.79** | 27 | 0.896 | 0.922 | 24592.15 | 24737.03 | 0.045 | 0.101 (0.093, 0.109) |
| 2-factor | 225.49** | 19 | 0.930 | 0.963 | 24370.85 | 24558.65 | 0.032 | 0.083 (0.073, 0.093) |
| 3-factor | 32.66** | 12 | 0.989 | 0.996 | 24192.02 | 24417.38 | 0.010 | 0.033 (0.020, 0.047) |
| 4-factor | 5.56 | 6 | 1.000 | 1.000 | 24176.92 | 24434.48 | 0.004 | 0.000 (0.000, 0.031) |
| 5-factor | 0.09 | 1 | 1.006 | 1.000 | 24181.44 | 24465.83 | 0.000 | 0.000 (0.000, 0.045) |
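The information-criterion comparison in Table 2 can be checked with a few lines of Python. The sketch below simply stores the AIC and BIC values reported in the table and picks the model with the smallest value of each; the variable names are illustrative.

```python
# (AIC, BIC) for the 1- to 5-factor models, copied from Table 2.
fit = {
    1: (24592.15, 24737.03),
    2: (24370.85, 24558.65),
    3: (24192.02, 24417.38),
    4: (24176.92, 24434.48),
    5: (24181.44, 24465.83),
}

# Smaller values indicate better fit after the complexity penalty.
best_by_aic = min(fit, key=lambda k: fit[k][0])
best_by_bic = min(fit, key=lambda k: fit[k][1])
```

Here `best_by_aic` is the 4-factor model and `best_by_bic` is the 3-factor model, in line with the preferences stated in the text.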
Table 3 presents the rotated factor loading matrix for the 3-factor model. Compared to the theoretically constructed Q-matrix for latent ability, there is only a difference in CM038Q03T. The rotated factor loading of CM038Q03T on Factor 3 is 0.300 (p < 0.05), which also supports the theoretical structure to a certain extent. The results indicate that the latent structure of working speed might be a 3-factor structure, which is also consistent with the theoretical multidimensional structure of latent ability (i.e., the Q-matrix in Table 1).
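TABLE 3. Rotated factor loading matrix for the 3-factor model for response times data. Columns are Factor 1, Factor 2, and Factor 3; one loading is shown per item (the column position of each loading is not preserved in this copy).

| Item | Loading |
|---|---|
| CM015Q02D | 0.695* |
| CM015Q03D | 0.609* |
| CM020Q01 | 0.565* |
| CM020Q02 | 0.801* |
| CM020Q03 | 0.642* |
| CM020Q04 | 0.943* |
| CM038Q03T | 0.502* |
| CM038Q05 | 0.985* |
| CM038Q06 | 0.621* |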
Overall, the results of the EFA support the kernel hypothesis of this study. However, due to the limitations of the EFA, the estimation of parameters such as individual working speed cannot be realized. Therefore, further exploration and utilization of the proposed MMJ model are necessary.

Simulation Studies

Two simulation studies were conducted to evaluate the performance of the MMJ model under various conditions. The primary purpose of simulation study 1 was to examine whether the model parameters could be recovered accurately using the proposed Bayesian estimation algorithm; data were simulated from the MMJ model and analyzed with the same model. Man et al. (2019) have shown that, in multidimensional tests, the joint model that involves multidimensional ability and single-factor speed (denoted as the MSJ model in this study) performs better than the joint model that involves unidimensional ability and single-factor speed (e.g., van der Linden, 2007). In this study, we focus on the comparison between the MMJ model and the MSJ model. Specifically, simulation study 2 was conducted to evaluate: (a) the consequences of ignoring the multifactor structure of working speed, in which the data were simulated from the MMJ model but analyzed with the MSJ model; and (b) the consequences of misspecifying a multifactor structure of working speed, in which the data were simulated from the MSJ model but analyzed with the MMJ model. Note that the results of simulation study 2 are omitted for brevity but can be found in the online Supplementary Appendix (see Supplementary section S1).

Design and Data Generation

In simulation study 1, four factors were manipulated: (a) sample size: N = 500 and 1,000; (b) test length: I = 15 and 30; (c) the correlation coefficient between latent ability and its corresponding working speed factor: ρθτ = −0.7 and −0.4; and (d) the number of dimensions of ability: K = 3 and 5. Q-matrices are presented in Figure 1. In addition, the true values of other parameters were generated according to the results of a data analysis using real data (Zhan et al., 2018). For item parameters, item easiness, d, and item time-intensity, ξ, were generated from a bivariate normal distribution with mean vector (0, 4) and covariance matrix [1, −0.2; −0.2, 0.25]. In such a setting, ρdξ = −0.4. The reciprocal of the standard deviation of the error term, ω, was set to 2 for all items. Person parameters were generated from the multivariate normal distribution of person parameters defined in the model construction section; the full covariance matrix is not preserved in this copy, but its key entries are stated after Figure 1.
FIGURE 1. K-by-I Q′ matrix in simulation study 1. D = dimension of latent ability; items with * are used for the I = 15 conditions.

In such a case, the covariance of two latent abilities is σθθ′ = 0.8 (i.e., correlation coefficient ρθθ′ = 0.8) and the covariance of two latent speeds is σττ′ = 0.15 (i.e., correlation coefficient ρττ′ = 0.6). Thirty data sets were generated.
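The stated covariances and correlations can be cross-checked with simple arithmetic. In the Python sketch below, the ability variance of 1 and the speed variance of 0.25 are inferred from the reported covariance-correlation pairs under the assumption of equal variances across dimensions; they are not quoted from the source, and the helper names are hypothetical.

```python
import math

def correlation(cov, var_a, var_b):
    """Correlation implied by a covariance and two variances."""
    return cov / math.sqrt(var_a * var_b)

def covariance(rho, var_a, var_b):
    """Covariance implied by a correlation and two variances."""
    return rho * math.sqrt(var_a * var_b)

# Item parameters: covariance matrix [1, -0.2; -0.2, 0.25] for (d, xi).
rho_d_xi = correlation(-0.2, 1.0, 0.25)

# Person parameters: variances inferred (assumption), covariances from the text.
var_theta, var_tau = 1.0, 0.25
rho_theta = correlation(0.8, var_theta, var_theta)
rho_tau = correlation(0.15, var_tau, var_tau)

# Implied covariance between an ability and its matching speed factor
# for the two manipulated correlation levels.
cov_theta_tau = {rho: covariance(rho, var_theta, var_tau) for rho in (-0.7, -0.4)}
```

Under these inferred variances, the two manipulated correlations of −0.7 and −0.4 correspond to ability-speed covariances of about −0.35 and −0.2.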

Analysis

In simulation study 1, the MMJ model was fitted to each of the 30 replications. In each replication, two Markov chains with random starting points were used, and each chain ran 10,000 iterations with the first 5,000 iterations in each chain as burn-in. Finally, the remaining 10,000 iterations (5,000 per chain) were used for the model parameter inferences. The potential scale reduction factor (PSRF; Brooks and Gelman, 1998) was computed to assess the convergence of each parameter. A PSRF smaller than 1.2 indicates convergence. Our studies indicated that the PSRF was smaller than 1.1 for all parameters, suggesting good convergence. To evaluate parameter recovery, the bias and the root mean square error (RMSE) were computed as

$$\text{bias}(\hat{\upsilon}) = \frac{1}{R}\sum_{r=1}^{R} (\hat{\upsilon}_r - \upsilon) \quad \text{and} \quad \text{RMSE}(\hat{\upsilon}) = \sqrt{\frac{1}{R}\sum_{r=1}^{R} (\hat{\upsilon}_r - \upsilon)^2},$$

where $\hat{\upsilon}_r$ is the estimated value of the model parameter in the rth replication, υ is the true value of the corresponding model parameter, and R is the total number of replications. The correlation between estimated and true values (Cor) was also computed.
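The two recovery measures are straightforward to compute from the per-replication estimates. The following Python sketch shows one way to do so; the function names and the example estimates are hypothetical.

```python
import math

def bias(estimates, true_value):
    """Average signed deviation of the estimates from the true value."""
    return sum(e - true_value for e in estimates) / len(estimates)

def rmse(estimates, true_value):
    """Root mean square deviation of the estimates from the true value."""
    mean_sq = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mean_sq)

# Hypothetical estimates of one parameter across four replications.
estimates = [3.9, 4.1, 4.0, 4.2]
b = bias(estimates, true_value=4.0)
r = rmse(estimates, true_value=4.0)
```

Bias can be near zero even when individual estimates scatter widely, which is why the RMSE is reported alongside it.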

Results

Table 4 presents the recovery of item parameters. All item parameters were well recovered. The recovery of time-intensity was the best, followed by time-precision, and then item easiness. Increasing the sample size yielded a better recovery of item parameters. Test length, the correlation coefficient between latent ability and latent speed, and the number of dimensions appear to have a limited impact on the recovery of item parameters.
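TABLE 4. Recovery of item parameters in simulation study 1.

| I | N | ρθτ | K | Mean Bias d | Mean Bias ξ | Mean Bias ω | Mean RMSE d | Mean RMSE ξ | Mean RMSE ω | Cor d | Cor ξ | Cor ω |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 500 | −0.4 | 3 | 0.000 | 0.001 | −0.013 | 0.106 | 0.021 | 0.075 | 0.995 | 0.999 | NA |
| 15 | 500 | −0.4 | 5 | 0.011 | −0.001 | −0.024 | 0.110 | 0.023 | 0.077 | 0.995 | 0.999 | NA |
| 15 | 500 | −0.7 | 3 | −0.006 | 0.000 | −0.016 | 0.098 | 0.024 | 0.073 | 0.996 | 0.999 | NA |
| 15 | 500 | −0.7 | 5 | 0.009 | 0.001 | −0.017 | 0.114 | 0.022 | 0.085 | 0.994 | 0.999 | NA |
| 15 | 1000 | −0.4 | 3 | −0.001 | −0.001 | −0.009 | 0.076 | 0.016 | 0.051 | 0.997 | 1.000 | NA |
| 15 | 1000 | −0.4 | 5 | 0.001 | 0.001 | −0.012 | 0.074 | 0.015 | 0.056 | 0.998 | 1.000 | NA |
| 15 | 1000 | −0.7 | 3 | −0.002 | 0.000 | −0.011 | 0.077 | 0.015 | 0.052 | 0.997 | 1.000 | NA |
| 15 | 1000 | −0.7 | 5 | 0.002 | 0.000 | −0.014 | 0.077 | 0.016 | 0.053 | 0.997 | 1.000 | NA |
| 30 | 500 | −0.4 | 3 | −0.006 | 0.000 | −0.015 | 0.110 | 0.022 | 0.070 | 0.994 | 0.999 | NA |
| 30 | 500 | −0.4 | 5 | 0.003 | 0.000 | −0.018 | 0.106 | 0.022 | 0.073 | 0.995 | 0.999 | NA |
| 30 | 500 | −0.7 | 3 | −0.001 | −0.001 | −0.017 | 0.103 | 0.022 | 0.067 | 0.995 | 0.999 | NA |
| 30 | 500 | −0.7 | 5 | −0.003 | 0.000 | −0.019 | 0.106 | 0.023 | 0.074 | 0.995 | 0.999 | NA |
| 30 | 1000 | −0.4 | 3 | 0.001 | −0.001 | −0.007 | 0.075 | 0.016 | 0.047 | 0.997 | 1.000 | NA |
| 30 | 1000 | −0.4 | 5 | −0.003 | 0.000 | −0.007 | 0.076 | 0.015 | 0.051 | 0.997 | 1.000 | NA |
| 30 | 1000 | −0.7 | 3 | 0.000 | 0.000 | −0.008 | 0.077 | 0.016 | 0.050 | 0.997 | 0.999 | NA |
| 30 | 1000 | −0.7 | 5 | −0.002 | 0.000 | −0.010 | 0.076 | 0.016 | 0.051 | 0.997 | 1.000 | NA |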
Tables 5 and 6 present the recovery of ability and speed, respectively. First, the recovery of multiple speed factors was better than that of abilities. Increasing test length yielded a better recovery of person parameters; by contrast, increasing the number of dimensions yielded a worse recovery of person parameters. In addition, the stronger the correlation between ability and speed, the better the recovery of latent abilities; however, the correlation coefficient had little effect on the recovery of latent speeds.
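TABLE 5. Recovery of multidimensional ability in simulation study 1.

| I | N | ρθτ | K | Mean Bias θ1 | Mean Bias θ2 | Mean Bias θ3 | Mean Bias θ4 | Mean Bias θ5 | Mean RMSE θ1 | Mean RMSE θ2 | Mean RMSE θ3 | Mean RMSE θ4 | Mean RMSE θ5 | Cor θ1 | Cor θ2 | Cor θ3 | Cor θ4 | Cor θ5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 500 | −0.4 | 3 | 0.000 | 0.000 | 0.000 | | | 0.599 | 0.599 | 0.598 | | | 0.798 | 0.800 | 0.800 | | |
| 15 | 500 | −0.4 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.623 | 0.627 | 0.624 | 0.625 | 0.623 | 0.780 | 0.779 | 0.782 | 0.781 | 0.781 |
| 15 | 500 | −0.7 | 3 | 0.000 | 0.000 | 0.000 | | | 0.520 | 0.518 | 0.519 | | | 0.854 | 0.854 | 0.854 | | |
| 15 | 500 | −0.7 | 5 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.522 | 0.529 | 0.526 | 0.523 | 0.524 | 0.853 | 0.849 | 0.851 | 0.853 | 0.850 |
| 15 | 1000 | −0.4 | 3 | 0.000 | 0.000 | 0.000 | | | 0.592 | 0.592 | 0.594 | | | 0.803 | 0.803 | 0.802 | | |
| 15 | 1000 | −0.4 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.615 | 0.617 | 0.619 | 0.618 | 0.617 | 0.786 | 0.785 | 0.783 | 0.783 | 0.785 |
| 15 | 1000 | −0.7 | 3 | 0.000 | 0.000 | 0.000 | | | 0.515 | 0.514 | 0.514 | | | 0.856 | 0.856 | 0.856 | | |
| 15 | 1000 | −0.7 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.519 | 0.522 | 0.522 | 0.524 | 0.519 | 0.854 | 0.852 | 0.851 | 0.850 | 0.854 |
| 30 | 500 | −0.4 | 3 | 0.000 | 0.000 | 0.000 | | | 0.497 | 0.495 | 0.497 | | | 0.866 | 0.867 | 0.866 | | |
| 30 | 500 | −0.4 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.540 | 0.536 | 0.536 | 0.534 | 0.526 | 0.840 | 0.842 | 0.843 | 0.844 | 0.849 |
| 30 | 500 | −0.7 | 3 | 0.000 | 0.000 | 0.000 | | | 0.448 | 0.450 | 0.449 | | | 0.893 | 0.892 | 0.892 | | |
| 30 | 500 | −0.7 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.474 | 0.474 | 0.470 | 0.473 | 0.470 | 0.879 | 0.880 | 0.881 | 0.880 | 0.881 |
| 30 | 1000 | −0.4 | 3 | 0.000 | 0.000 | 0.000 | | | 0.491 | 0.489 | 0.490 | | | 0.869 | 0.870 | 0.869 | | |
| 30 | 1000 | −0.4 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.528 | 0.528 | 0.526 | 0.529 | 0.529 | 0.846 | 0.847 | 0.848 | 0.847 | 0.846 |
| 30 | 1000 | −0.7 | 3 | 0.000 | 0.000 | 0.000 | | | 0.447 | 0.450 | 0.448 | | | 0.892 | 0.892 | 0.892 | | |
| 30 | 1000 | −0.7 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.469 | 0.468 | 0.473 | 0.470 | 0.470 | 0.882 | 0.883 | 0.880 | 0.882 | 0.881 |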
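TABLE 6. Recovery of multifactor speed in simulation study 1.

| I | N | ρθτ | K | Mean Bias τ1 | Mean Bias τ2 | Mean Bias τ3 | Mean Bias τ4 | Mean Bias τ5 | Mean RMSE τ1 | Mean RMSE τ2 | Mean RMSE τ3 | Mean RMSE τ4 | Mean RMSE τ5 | Cor τ1 | Cor τ2 | Cor τ3 | Cor τ4 | Cor τ5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 500 | −0.4 | 3 | 0.000 | 0.000 | 0.000 | | | 0.191 | 0.194 | 0.194 | | | 0.922 | 0.920 | 0.920 | | |
| 15 | 500 | −0.4 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.225 | 0.225 | 0.227 | 0.227 | 0.226 | 0.891 | 0.888 | 0.890 | 0.889 | 0.890 |
| 15 | 500 | −0.7 | 3 | 0.000 | 0.000 | 0.000 | | | 0.191 | 0.190 | 0.188 | | | 0.922 | 0.924 | 0.925 | | |
| 15 | 500 | −0.7 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.221 | 0.221 | 0.221 | 0.224 | 0.224 | 0.895 | 0.895 | 0.895 | 0.892 | 0.893 |
| 15 | 1000 | −0.4 | 3 | 0.000 | 0.000 | 0.000 | | | 0.193 | 0.191 | 0.193 | | | 0.921 | 0.923 | 0.921 | | |
| 15 | 1000 | −0.4 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.226 | 0.226 | 0.224 | 0.227 | 0.224 | 0.890 | 0.891 | 0.892 | 0.889 | 0.892 |
| 15 | 1000 | −0.7 | 3 | 0.000 | 0.000 | 0.000 | | | 0.189 | 0.189 | 0.189 | | | 0.925 | 0.925 | 0.925 | | |
| 15 | 1000 | −0.7 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.221 | 0.221 | 0.221 | 0.223 | 0.223 | 0.895 | 0.894 | 0.895 | 0.894 | 0.894 |
| 30 | 500 | −0.4 | 3 | 0.000 | 0.000 | 0.000 | | | 0.144 | 0.145 | 0.146 | | | 0.957 | 0.956 | 0.956 | | |
| 30 | 500 | −0.4 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.177 | 0.177 | 0.177 | 0.176 | 0.177 | 0.934 | 0.935 | 0.934 | 0.935 | 0.934 |
| 30 | 500 | −0.7 | 3 | 0.000 | 0.000 | 0.000 | | | 0.143 | 0.144 | 0.144 | | | 0.957 | 0.957 | 0.957 | | |
| 30 | 500 | −0.7 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.175 | 0.175 | 0.176 | 0.173 | 0.175 | 0.935 | 0.936 | 0.935 | 0.937 | 0.936 |
| 30 | 1000 | −0.4 | 3 | 0.000 | 0.000 | 0.000 | | | 0.144 | 0.144 | 0.146 | | | 0.957 | 0.957 | 0.956 | | |
| 30 | 1000 | −0.4 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.175 | 0.175 | 0.176 | 0.177 | 0.176 | 0.936 | 0.935 | 0.935 | 0.934 | 0.935 |
| 30 | 1000 | −0.7 | 3 | 0.000 | 0.000 | 0.000 | | | 0.142 | 0.143 | 0.144 | | | 0.958 | 0.958 | 0.957 | | |
| 30 | 1000 | −0.7 | 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.174 | 0.174 | 0.173 | 0.174 | 0.174 | 0.936 | 0.936 | 0.937 | 0.937 | 0.936 |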
Table 7 presents the recovery of the item mean vector and item variance-covariance. Increasing test length and sample size yielded a better recovery. However, the correlation coefficient between ability and speed and the number of dimensions had a limited effect on the recovery. Additionally, the recovery of covariances (omitted, due to space limitations) was better than that of variances of item parameters.
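TABLE 7. Recovery of item mean vector and item variance-covariance in simulation study 1.

| I | N | ρθτ | K | Bias σd² | Bias σdξ | Bias σξ² | Bias μd | Bias μξ | RMSE σd² | RMSE σdξ | RMSE σξ² | RMSE μd | RMSE μξ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 500 | −0.4 | 3 | 0.155 | 0.021 | 0.095 | 0.020 | 0.006 | 0.169 | 0.027 | 0.096 | 0.027 | 0.007 |
| 15 | 500 | −0.4 | 5 | 0.135 | 0.013 | 0.097 | 0.023 | 0.005 | 0.162 | 0.016 | 0.097 | 0.029 | 0.007 |
| 15 | 500 | −0.7 | 3 | 0.151 | 0.016 | 0.095 | 0.024 | 0.006 | 0.164 | 0.020 | 0.095 | 0.029 | 0.007 |
| 15 | 500 | −0.7 | 5 | 0.142 | 0.021 | 0.094 | 0.026 | 0.005 | 0.163 | 0.026 | 0.094 | 0.032 | 0.007 |
| 15 | 1000 | −0.4 | 3 | 0.158 | 0.015 | 0.096 | 0.013 | 0.005 | 0.164 | 0.018 | 0.096 | 0.016 | 0.006 |
| 15 | 1000 | −0.4 | 5 | 0.124 | 0.015 | 0.095 | 0.015 | 0.004 | 0.136 | 0.020 | 0.095 | 0.018 | 0.006 |
| 15 | 1000 | −0.7 | 3 | 0.150 | 0.018 | 0.096 | 0.016 | 0.005 | 0.159 | 0.022 | 0.096 | 0.019 | 0.006 |
| 15 | 1000 | −0.7 | 5 | 0.132 | 0.016 | 0.096 | 0.018 | 0.004 | 0.144 | 0.019 | 0.096 | 0.021 | 0.004 |
| 30 | 500 | −0.4 | 3 | 0.070 | 0.012 | 0.046 | 0.024 | 0.004 | 0.083 | 0.016 | 0.046 | 0.027 | 0.005 |
| 30 | 500 | −0.4 | 5 | 0.060 | 0.012 | 0.044 | 0.018 | 0.003 | 0.068 | 0.014 | 0.045 | 0.023 | 0.004 |
| 30 | 500 | −0.7 | 3 | 0.078 | 0.010 | 0.046 | 0.013 | 0.004 | 0.090 | 0.012 | 0.046 | 0.016 | 0.005 |
| 30 | 500 | −0.7 | 5 | 0.056 | 0.012 | 0.043 | 0.019 | 0.003 | 0.066 | 0.014 | 0.043 | 0.023 | 0.004 |
| 30 | 1000 | −0.4 | 3 | 0.062 | 0.006 | 0.044 | 0.013 | 0.003 | 0.069 | 0.008 | 0.044 | 0.016 | 0.004 |
| 30 | 1000 | −0.4 | 5 | 0.053 | 0.008 | 0.045 | 0.013 | 0.002 | 0.060 | 0.010 | 0.045 | 0.017 | 0.003 |
| 30 | 1000 | −0.7 | 3 | 0.068 | 0.010 | 0.046 | 0.009 | 0.003 | 0.079 | 0.012 | 0.046 | 0.012 | 0.004 |
| 30 | 1000 | −0.7 | 5 | 0.044 | 0.007 | 0.046 | 0.012 | 0.002 | 0.052 | 0.009 | 0.046 | 0.015 | 0.003 |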
Tables 8 and 9 present the recovery of variances of person parameters. Similar to the pattern of the recovery of ability and speed, the recovery of variances of multiple speed factors was better than that of abilities. Increasing test length, sample size, and the correlation coefficient between ability and speed yielded a better parameter recovery. By contrast, more dimensions led to a worse recovery of variances of person parameters. Additionally, the recovery of covariances (omitted, due to space limitations) was better than that of variances of person parameters.
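TABLE 8. Recovery of the variance of ability in simulation study 1.

| I | N | ρθτ | K | Bias σθ1² | Bias σθ2² | Bias σθ3² | Bias σθ4² | Bias σθ5² | RMSE σθ1² | RMSE σθ2² | RMSE σθ3² | RMSE σθ4² | RMSE σθ5² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 500 | −0.4 | 3 | 0.002 | −0.047 | −0.026 | | | 0.152 | 0.154 | 0.142 | | |
| 15 | 500 | −0.4 | 5 | −0.047 | −0.088 | −0.055 | −0.056 | −0.062 | 0.139 | 0.193 | 0.195 | 0.184 | 0.168 |
| 15 | 500 | −0.7 | 3 | −0.007 | 0.000 | 0.010 | | | 0.140 | 0.121 | 0.142 | | |
| 15 | 500 | −0.7 | 5 | −0.036 | −0.066 | −0.004 | −0.016 | −0.061 | 0.157 | 0.164 | 0.148 | 0.166 | 0.135 |
| 15 | 1000 | −0.4 | 3 | −0.058 | −0.033 | −0.042 | | | 0.101 | 0.100 | 0.104 | | |
| 15 | 1000 | −0.4 | 5 | −0.072 | −0.077 | −0.023 | −0.095 | −0.092 | 0.123 | 0.147 | 0.116 | 0.139 | 0.140 |
| 15 | 1000 | −0.7 | 3 | −0.034 | −0.018 | −0.015 | | | 0.106 | 0.105 | 0.099 | | |
| 15 | 1000 | −0.7 | 5 | −0.071 | −0.088 | −0.067 | −0.056 | −0.045 | 0.148 | 0.139 | 0.117 | 0.131 | 0.118 |
| 30 | 500 | −0.4 | 3 | 0.007 | −0.035 | 0.010 | | | 0.090 | 0.099 | 0.078 | | |
| 30 | 500 | −0.4 | 5 | −0.068 | −0.085 | −0.086 | −0.054 | −0.055 | 0.127 | 0.123 | 0.136 | 0.111 | 0.112 |
| 30 | 500 | −0.7 | 3 | −0.014 | −0.019 | −0.017 | | | 0.082 | 0.097 | 0.080 | | |
| 30 | 500 | −0.7 | 5 | −0.030 | −0.075 | −0.034 | −0.070 | −0.056 | 0.100 | 0.131 | 0.099 | 0.110 | 0.117 |
| 30 | 1000 | −0.4 | 3 | −0.009 | 0.003 | −0.040 | | | 0.060 | 0.057 | 0.063 | | |
| 30 | 1000 | −0.4 | 5 | −0.070 | −0.033 | −0.070 | −0.084 | −0.042 | 0.099 | 0.097 | 0.101 | 0.107 | 0.084 |
| 30 | 1000 | −0.7 | 3 | 0.011 | −0.032 | −0.006 | | | 0.045 | 0.087 | 0.056 | | |
| 30 | 1000 | −0.7 | 5 | −0.050 | −0.060 | −0.069 | −0.069 | −0.072 | 0.100 | 0.091 | 0.110 | 0.113 | 0.101 |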
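TABLE 9. Recovery of the variance of speed factor in simulation study 1.

| I | N | ρθτ | K | Bias στ1² | Bias στ2² | Bias στ3² | Bias στ4² | Bias στ5² | RMSE στ1² | RMSE στ2² | RMSE στ3² | RMSE στ4² | RMSE στ5² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 500 | −0.4 | 3 | 0.002 | 0.001 | 0.004 | | | 0.010 | 0.010 | 0.010 | | |
| 15 | 500 | −0.4 | 5 | 0.004 | −0.003 | 0.003 | 0.001 | −0.001 | 0.012 | 0.016 | 0.017 | 0.013 | 0.013 |
| 15 | 500 | −0.7 | 3 | 0.001 | 0.002 | 0.002 | | | 0.011 | 0.009 | 0.010 | | |
| 15 | 500 | −0.7 | 5 | 0.007 | 0.002 | 0.002 | 0.003 | 0.001 | 0.015 | 0.014 | 0.013 | 0.012 | 0.015 |
| 15 | 1000 | −0.4 | 3 | 0.000 | 0.003 | 0.000 | | | 0.006 | 0.008 | 0.008 | | |
| 15 | 1000 | −0.4 | 5 | −0.003 | 0.005 | 0.000 | −0.001 | −0.001 | 0.009 | 0.013 | 0.011 | 0.011 | 0.008 |
| 15 | 1000 | −0.7 | 3 | 0.002 | 0.001 | 0.001 | | | 0.009 | 0.007 | 0.008 | | |
| 15 | 1000 | −0.7 | 5 | 0.001 | −0.002 | 0.002 | −0.002 | 0.003 | 0.010 | 0.010 | 0.009 | 0.011 | 0.009 |
| 30 | 500 | −0.4 | 3 | 0.004 | 0.002 | 0.003 | | | 0.008 | 0.008 | 0.008 | | |
| 30 | 500 | −0.4 | 5 | 0.002 | 0.003 | 0.001 | 0.002 | −0.001 | 0.009 | 0.008 | 0.009 | 0.009 | 0.009 |
| 30 | 500 | −0.7 | 3 | 0.003 | 0.002 | 0.005 | | | 0.007 | 0.008 | 0.011 | | |
| 30 | 500 | −0.7 | 5 | 0.003 | 0.001 | 0.004 | 0.002 | 0.001 | 0.010 | 0.009 | 0.010 | 0.010 | 0.010 |
| 30 | 1000 | −0.4 | 3 | 0.003 | 0.002 | 0.000 | | | 0.006 | 0.005 | 0.005 | | |
| 30 | 1000 | −0.4 | 5 | 0.000 | 0.002 | 0.003 | 0.001 | 0.000 | 0.006 | 0.008 | 0.009 | 0.008 | 0.006 |
| 30 | 1000 | −0.7 | 3 | 0.001 | 0.002 | 0.002 | | | 0.005 | 0.007 | 0.006 | | |
| 30 | 1000 | −0.7 | 5 | 0.000 | 0.000 | 0.002 | 0.001 | −0.001 | 0.008 | 0.006 | 0.006 | 0.007 | 0.006 |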
In general, the recovery of time-related parameters (e.g., item time-intensity, the covariance of item easiness and time-intensity, speed factors, and the covariance of ability and speed) was better than that of time-unrelated parameters (e.g., item easiness and latent abilities). Overall, simulation study 1 indicated that the parameters of the MMJ model could be recovered very well via the proposed full Bayesian MCMC estimation algorithm.

An Empirical Example

Data Description and Analysis

In this section, the PISA 2012 computer-based mathematics RA and RT data were analyzed by using the MMJ model and the MSJ model to explore whether the former fits the data better than the latter when the test structure is multidimensional. Details about this data set were given previously in the motivating example. The Q-matrix in Table 1 was used. For each model, in each replication, the numbers of chains, burn-in iterations, and post-burn-in iterations were the same as those set in the simulation study. Convergence was well achieved according to the PSRF < 1.1. Posterior predictive model checking (PPMC; Gelman et al., 2014) was used to evaluate model-data fit. A posterior predictive probability (ppp) value near 0.5 indicates that there are no systematic differences between the realized and predictive values, and thus an adequate fit of the model. In PPMC, the sum of the squared Pearson residuals for person n and item i (Yan et al., 2003) was used as a discrepancy measure to evaluate the overall fit of the RA model, which is written as

$$D(Y) = \sum_{n=1}^{N}\sum_{i=1}^{I} \frac{(Y_{ni} - P(Y_{ni} = 1))^2}{P(Y_{ni} = 1)(1 - P(Y_{ni} = 1))},$$

where $P(Y_{ni} = 1)$ has the same definition as that in Equation (6). The sum of the standardized error function of logT for person n and item i was employed as a discrepancy measure of the RT model:

$$D(\log T) = \sum_{n=1}^{N}\sum_{i=1}^{I} \left( \frac{\log T_{ni} - \left(\xi_i - \sum_{k=1}^{K} \tau_{nk} q_{ik}\right)}{\omega_i^{-1}} \right)^2.$$

Additionally, two information criteria that are suitable for Bayesian estimation, the deviance information criterion (DIC) and the widely applicable information criterion (WAIC) (Gelman et al., 2014, Chapter 7), were computed for model selection. A smaller value of these two criteria indicates a better model-data fit. The DIC and WAIC both identified that the MMJ model fit the data better than the MSJ model, as shown in Table 10. In the MMJ model, the ppp values of the RA model and the RT model were 0.736 and 0.578, respectively, which indicate an adequate model-data fit.
The results indicate that it is more appropriate to simultaneously consider the multidimensionality of latent ability and the multifactor structure of working speed for the multidimensional test.
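The PPMC procedure described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy data, and the fake "posterior draws" are all assumptions made for illustration, and the RT discrepancy is written in one plausible form (squared standardized errors of log RTs under a lognormal model), since the exact expression is not shown in this section.

```python
import numpy as np


def pearson_discrepancy(y, p):
    """Sum over persons and items of squared Pearson residuals."""
    return float(np.sum((y - p) ** 2 / (p * (1.0 - p))))


def rt_discrepancy(log_t, mu, sigma):
    """Sum of squared standardized errors of log RTs.
    Assumed form: mu and sigma are the model-implied mean and SD of log T."""
    return float(np.sum(((log_t - mu) / sigma) ** 2))


def ppp_responses(y_obs, p_draws, rng):
    """Share of posterior draws whose replicated discrepancy is at least
    as large as the realized discrepancy."""
    exceed = 0
    for p in p_draws:  # p: (persons x items) success probabilities, one draw
        y_rep = rng.binomial(1, p)  # replicated responses under this draw
        exceed += pearson_discrepancy(y_rep, p) >= pearson_discrepancy(y_obs, p)
    return exceed / len(p_draws)


# Toy stand-in for MCMC output (hypothetical): 200 draws of a 100 x 8 matrix.
rng = np.random.default_rng(2021)
p_true = rng.uniform(0.2, 0.8, size=(100, 8))
y_obs = rng.binomial(1, p_true)
noise = rng.normal(0.0, 0.02, size=(200, 100, 8))
p_draws = np.clip(p_true + noise, 0.05, 0.95)

ppp_ra = ppp_responses(y_obs, p_draws, rng)
print(f"ppp for the response model: {ppp_ra:.3f}")
```

In a real analysis, `p_draws` would be replaced by the response probabilities implied by each retained MCMC draw of the fitted model, and the same comparison would be repeated with `rt_discrepancy` on replicated log RTs.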
TABLE 10

Model fit for the PISA 2012 computer-based mathematics data.

| Analysis model | DIC | WAIC | ppp_RA | ppp_RT |
| --- | --- | --- | --- | --- |
| MMJ | 34853 | 34433 | 0.736 | 0.578 |
| MSJ | 35910 | 35669 | 0.608 | 0.569 |
Note that the parameter estimates of the MMJ model in the empirical example were omitted for brevity, because they are not the main concern of the empirical study, but can be found in the online Supplementary Appendix (see Supplementary section S2).
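For readers unfamiliar with the two information criteria reported in Table 10, the following minimal sketch shows how WAIC and DIC are commonly computed from posterior draws, following the general definitions in Gelman et al. (2014, Chapter 7). It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and a toy normal-mean model stands in for the joint model.

```python
import numpy as np


def waic(loglik):
    """WAIC on the deviance scale from an (S draws x N observations)
    matrix of pointwise log-likelihoods. Returns (WAIC, p_WAIC)."""
    s = loglik.shape[0]
    shift = loglik.max(axis=0)
    # Log of the posterior-mean likelihood per observation, computed stably.
    lppd_i = shift + np.log(np.exp(loglik - shift).sum(axis=0) / s)
    # Effective number of parameters: posterior variance of the log-likelihood.
    p_waic = loglik.var(axis=0, ddof=1).sum()
    return float(-2.0 * (lppd_i.sum() - p_waic)), float(p_waic)


def dic(loglik, loglik_at_mean):
    """DIC from the same matrix plus the pointwise log-likelihood evaluated
    at the posterior mean of the parameters. Returns (DIC, p_D)."""
    d_bar = -2.0 * loglik.sum(axis=1).mean()  # posterior mean deviance
    d_hat = -2.0 * loglik_at_mean.sum()       # deviance at the posterior mean
    p_d = d_bar - d_hat
    return float(d_bar + p_d), float(p_d)


def normal_loglik(y, mu):
    """Pointwise log-likelihood of y under N(mu, 1)."""
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (y - mu) ** 2


# Toy example (hypothetical data): y ~ N(mu, 1), flat prior on mu.
rng = np.random.default_rng(7)
y = rng.normal(0.0, 1.0, size=50)
mu_draws = rng.normal(y.mean(), 1.0 / np.sqrt(y.size), size=4000)
loglik = normal_loglik(y[None, :], mu_draws[:, None])  # shape (4000, 50)

waic_value, p_waic = waic(loglik)
dic_value, p_d = dic(loglik, normal_loglik(y, mu_draws.mean()))
print(f"WAIC = {waic_value:.1f}, DIC = {dic_value:.1f}")
```

When comparing two models fitted to the same data, as in Table 10, the model with the smaller value on each criterion is preferred.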

Discussion

The core hypothesis of this study is that, in a multidimensional test, respondents can work at different levels of speed on items that require different dimensions of ability. To model the varying speed across dimensions of ability, this study relaxed the assumption, made by many RT models, that respondents work at a constant rate throughout the test. As a result, a multifactor working speed model and a joint model for multidimensional ability and multifactor speed were proposed. First, a motivating example with the EFA of PISA 2012 computer-based mathematics RTs was presented. The results indicate that working speed has a multifactor structure, which is also consistent with the multidimensional structure of ability. Then, two simulation studies were used to evaluate the psychometric properties of the proposed joint model. The results indicate that (1) parameters of the proposed joint model could be well recovered using the proposed Bayesian MCMC approach, (2) misspecifying a multifactor structure of speed has a limited effect on the recovery of model parameters, and (3) ignoring the multifactor structure of speed could lead to biased and imprecise estimation, especially for time-related parameters. The PISA 2012 computer-based mathematics RA and RT data were also analyzed to illustrate the implications and applications of the proposed models. The results show that it is appropriate to consider the multidimensionality of latent ability and the multifactor structure of working speed simultaneously in multidimensional tests. Overall, considering the results of the EFA, the simulation studies, and the empirical example, there are reasons to believe that the core hypothesis of this study is valid and that the proposed model can reasonably be used to jointly analyze RA and RTs in multidimensional tests. The work presented in this article is only a first attempt to deal with variable speed across dimensions of ability.
Despite promising results, further exploration is encouraged. First, the proposed MLRT model is an extension of the classical lognormal RT model (van der Linden, 2006) and therefore inherits some of its limitations. For instance, it assumes that RA and RTs are conditionally independent given all person parameters (Meng et al., 2015; Bolsinova and Maris, 2016); that after log-transformation, the log RTs follow a normal distribution (Klein Entink et al., 2009b); and that all respondents apply the same problem-solving strategy throughout the whole test (Wang and Xu, 2015). Second, although the proposed model takes into account the differences in working speed across different dimensions of ability, it still assumes that the working speed of a respondent is constant on items within the same dimension. In future studies, this assumption can be further relaxed; that is, each respondent could be allowed to change his or her working speed across dimensions, and could also be allowed to adjust his or her working speed within the same dimension according to the order of items. Third, in the proposed joint model, a multivariate normal distribution was used to describe the relationships among multidimensional ability and multifactor speed. Consequently, the total number of latent dimensions is twice the number of dimensions measured by the test, which may increase the complexity of the model and the computational burden. If ability and speed can each be given a second-order (or bi-factor) structure, not only can the parameter estimation challenge be largely reduced, but the structures of ability and speed can also be posited and tested. Fourth, in this study, only the MR model and the MLRT model were used as measurement models for illustration. Given the “plug-and-play” nature of hierarchical modeling, various MIRT models and multifactor working speed models can be adopted in the future.
Fifth, applications of the proposed model, such as detecting aberrant responses (e.g., rapid-guessing and cheating) in multidimensional tests, need further investigation. Moreover, in Bayesian estimation, the prior distribution reflects the data analyst’s beliefs and the known information about the data. In practice, we recommend that the data analyst select appropriate prior distributions based on the actual test scenario rather than copy those given in this study. Last but not least, only the between-item multidimensional test was considered in this study. For the between-item multidimensional test, it is clear that working speed can vary across items when the items are related to different dimensions. However, within-item multidimensional tests also occur in practice. For example, when respondents, especially non-native English speakers, take a GRE® Subject Test (e.g., Mathematics), at least two abilities are needed: one for understanding the questions (e.g., English reading ability) and one for solving the questions (e.g., the subject ability). Correspondingly, two latent speed factors are at work: one reflects the working speed of reading, and the other reflects the working speed of problem-solving. The introduction of within-item multidimensionality is bound to increase the complexity of the model and the difficulty of constructing the Q-matrix. Thus, the rationality and necessity of a within-item multifactor working speed model remain an open question for future study.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found here: http://www.oecd.org/pisa/data/pisa2012database-downloadabledata.htm.

Author Contributions

PZ contributed to the conception, design, and analysis of data as well as drafting and revising the manuscript. HJ contributed to the design and to critically revising the manuscript. KM contributed to critically revising the manuscript. W-CW contributed to the conception, design, and revising the manuscript. KH contributed to the interpretation of data and to critically revising the manuscript. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1.  The linear transformation model with frailties for the analysis of item response times.

Authors:  Chun Wang; Hua-Hua Chang; Jeffrey A Douglas
Journal:  Br J Math Stat Psychol       Date:  2012-04-17       Impact factor: 3.380

2.  The Joint Multivariate Modeling of Multiple Mixed Response Sources: Relating Student Performances with Feedback Behavior.

Authors:  Jean-Paul Fox; Rinke Klein Entink; Caroline Timmers
Journal:  Multivariate Behav Res       Date:  2014 Jan-Feb       Impact factor: 5.923

3.  A Box-Cox normal model for response times.

Authors:  R H Klein Entink; W J van der Linden; J-P Fox
Journal:  Br J Math Stat Psychol       Date:  2009-01-30       Impact factor: 3.380

4.  Hidden Markov Item Response Theory Models for Responses and Response Times.

Authors:  Dylan Molenaar; Daniel Oberski; Jeroen Vermunt; Paul De Boeck
Journal:  Multivariate Behav Res       Date:  2016-08-11       Impact factor: 5.923

5.  Improving precision of ability estimation: Getting more from response times.

Authors:  Maria Bolsinova; Jesper Tijmstra
Journal:  Br J Math Stat Psychol       Date:  2017-06-21       Impact factor: 3.380

6.  A test for conditional independence between response time and accuracy.

Authors:  Maria Bolsinova; Gunter Maris
Journal:  Br J Math Stat Psychol       Date:  2015-06-08       Impact factor: 3.380

7.  Cognitive diagnosis modelling incorporating item response times.

Authors:  Peida Zhan; Hong Jiao; Dandan Liao
Journal:  Br J Math Stat Psychol       Date:  2017-09-05       Impact factor: 3.380

8.  Joint Modeling of Compensatory Multidimensional Item Responses and Response Times.

Authors:  Kaiwen Man; Jeffrey R Harring; Hong Jiao; Peida Zhan
Journal:  Appl Psychol Meas       Date:  2019-02-22

9.  A Multivariate Multilevel Approach to the Modeling of Accuracy and Speed of Test Takers.

Authors:  R H Klein Entink; J-P Fox; W J van der Linden
Journal:  Psychometrika       Date:  2008-08-23       Impact factor: 2.500

10.  An Overview of Models for Response Times and Processes in Cognitive Tests.

Authors:  Paul De Boeck; Minjeong Jeon
Journal:  Front Psychol       Date:  2019-02-06