# Sometimes More Is Better, and Sometimes Less Is Better: Task Complexity Moderates the Response Time Accuracy Correlation


## Abstract


## 1. Introduction

#### 1.1. Item Response Time and Item Success

#### 1.2. Dual Processing Theory of Response Time Effects

#### 1.3. Figural Matrices: Concept, Solution Process and RTACs

#### 1.4. Inconsistency Hypothesis

#### 1.5. Goals and Hypotheses of the Present Study

## 2. Experimental Section

#### 2.1. Data Acquisition and Sample

#### 2.2. Instrumentation

#### 2.3. Statistical Analyses

In Model (1), the probability of a correct response (π_pi) for a particular person and item is decomposed into a fixed general intercept (β_0) that corresponds to the logit of the probability of success in an average item completed by an average person. Additionally, random intercepts of person ability (b_0p) and item easiness (b_0i) are specified.
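Written out, this decomposition corresponds to a cross-classified (Rasch-type) logistic mixed model; the following is a reconstruction from the terms named in the text, not the authors' original equation:

```latex
\operatorname{logit}(\pi_{pi}) = \beta_0 + b_{0p} + b_{0i},
\qquad b_{0p} \sim \mathcal{N}(0, \sigma^2_{b_{0p}}),
\quad  b_{0i} \sim \mathcal{N}(0, \sigma^2_{b_{0i}})
```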

Model (2) additionally includes response time as a predictor of accuracy (π_pi). Following Roskam [7], log-transformed response times were entered as predictors, that is, t_pi = log(RT_pi). Again, a general fixed effect (β_0) is specified, as well as random effects for persons (b_0p) and items (b_0i). The fixed effect of the slope (β_1) is used to test whether the RTAC is positive on average (H1). Note, however, that the interpretation of β_1 is equivocal, as its magnitude is determined both by person and by item characteristics [4]. In order to test their specific contributions (H2 and H3), random effects of persons and of items are modeled simultaneously. This allows computing the RTAC by person (β_1 + b_1p) and the RTAC by item (β_1 + b_1i) as respective response time covariates [2].
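Collecting the fixed and random terms, Model (2) can be written as follows (again a reconstruction from the terms named in the text); the person-specific slope β_1 + b_1p is the RTAC by person, and the item-specific slope β_1 + b_1i is the RTAC by item:

```latex
\operatorname{logit}(\pi_{pi}) =
(\beta_0 + b_{0p} + b_{0i})
+ (\beta_1 + b_{1p} + b_{1i})\, t_{pi},
\qquad t_{pi} = \log(\mathrm{RT}_{pi})
```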

In Model (3), the number of rules of an item (r_i) is investigated as an explanatory variable of item difficulty (H4a). A possible effect of rule number as an item-level covariate would be reflected in a reduction of the random variance in item easiness. Additionally, rule number was tested as a possible moderator of the RTAC by introducing a fixed effect of rule number (β_2 r_i) and a fixed interaction effect of rule number with response time (β_3 r_i t_pi). The first captures the main effect of rule number on the probability of solving an item correctly, whereas the interaction term reflects the predicted moderation of the RTAC by rule number (H4b).
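Adding the two fixed rule-number terms to Model (2) yields Model (3) (reconstructed from the text):

```latex
\operatorname{logit}(\pi_{pi}) =
(\beta_0 + b_{0p} + b_{0i})
+ (\beta_1 + b_{1p} + b_{1i})\, t_{pi}
+ \beta_2\, r_i
+ \beta_3\, r_i\, t_{pi}
```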

Finally, Model (4) tests the joint effects of person ability (θ_p), of item easiness (σ_i), and of response time (t_pi) on accuracy using a fixed-effects model with the respective interaction terms. To this end, the partial-credit test score was used as a proxy of person ability (θ_p), and the observed frequency of solving an item was entered as a proxy of item easiness (σ_i). In a first step, we introduced both scores as covariates in Model (1), showing that they indeed accounted for the entire random variance in the random effects of person and item, respectively. Additionally, we confirmed that the scores were substantially correlated with the respective random intercepts in all other Models (1)–(3). Then, Model (4) was tested, which additionally comprised the interaction effects of the predictors.
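With the proxies θ_p and σ_i entered as observed covariates, Model (4) is the full fixed-effects factorial in response time, item easiness, and person ability (reconstructed from the coefficients reported in Section 3.5):

```latex
\operatorname{logit}(\pi_{pi}) = \beta_0
+ \beta_1 t_{pi} + \beta_2 \sigma_i + \beta_3 \theta_p
+ \beta_4 t_{pi}\sigma_i + \beta_5 t_{pi}\theta_p
+ \beta_6 \sigma_i\theta_p
+ \beta_7 t_{pi}\sigma_i\theta_p
```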

#### 2.4. Preliminary Checks and Data Preparation

Local stochastic independence of the items was examined using Yen's Q_3 statistic [47]. The latter tests the absence of meaningful pair-wise correlations of item residuals after controlling for person ability and item difficulty. If the 1PL Rasch Model (1) holds, the average Q_3 statistic is expected to be −1/(n − 1), with n = number of items in the test. In the present data, the empirical value of −0.029 (SD = 0.070) matched the expected value of −0.029. Therefore, Rasch homogeneity can be assumed for the DESIGMA items.
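The expected value used in this check follows directly from Yen's result; a minimal sketch (the item count n = 35 is an assumption, chosen here only because it reproduces the reported expectation of −0.029):

```python
def expected_q3(n_items: int) -> float:
    """Expected average Q3 correlation under local independence: -1/(n - 1)."""
    return -1.0 / (n_items - 1)

# Hypothetical item count that reproduces the reported value of -0.029:
print(round(expected_q3(35), 3))  # -0.029
```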

## 3. Results

#### 3.1. Average RTAC across Items and Persons (H1)

In Model (2), the average RTAC across persons and items was not significantly different from zero (β_1 = −0.31, p = 0.13); thus, H1 (a positive average RTAC) was not supported.

#### 3.2. Moderation of the RTAC by Person Ability (H2)

There was a random intercept effect across persons (Var(b_0p) = 6.94), indicating individual differences in ability. Additionally, there was a random effect of person in the slope (Var(b_1p) = 1.04), implying individual differences in the RTAC. Finally, both random effects across persons (intercept b_0p and slope b_1p) were negatively related (r = −0.69), indicating that more able persons had more negative RTACs. This pattern is reflected in Figure 4. While RTACs were around zero on average, the direction of the correlation depended linearly on person ability: more able persons showed a negative RTAC (i.e., they were more likely to give wrong responses at long RTs), whereas less able persons showed a positive RTAC (i.e., they were more likely to give correct responses at long RTs).

The random person slope was supported by a likelihood ratio test against a restricted model without it: χ²(2) = 63.84, p < 0.001. In line with this, lower (i.e., preferred) information criteria were obtained for the full random effects model (AIC = 12,154; BIC = 12,216) compared with the restricted model without the random person slope (AIC = 12,215; BIC = 12,261). Additionally, a model with both random effects across persons, but without a correlation between random intercept and random slope, fit significantly worse: χ²(1) = 43.64, p < 0.001; AIC = 12,197; BIC = 12,250. This confirms the observed negative relationship of person ability with the RTAC.
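The p-values of these likelihood ratio tests can be reproduced from the χ² statistics alone; for the low degrees of freedom involved, the chi-square survival function has a closed form (a stdlib-only sketch, with the statistic values taken from the text):

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Chi-square survival function (p-value) for df = 1 or 2."""
    if df == 2:
        return math.exp(-x / 2.0)             # exact for two degrees of freedom
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))  # exact for one degree of freedom
    raise ValueError("only df = 1 or 2 are handled in this sketch")

print(chi2_sf(63.84, 2) < 0.001)  # random person slope test: True
print(chi2_sf(43.64, 1) < 0.001)  # intercept-slope correlation test: True
```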

#### 3.3. Moderation of the RTAC by Item Difficulty (H3)

There was a random intercept effect across items (Var(b_0i) = 2.58), indicating differential item easiness. There was a random effect of item in the slope (Var(b_1i) = 1.22), indicating differential RTACs across items. Finally, there was a negative relationship between these two random effects (r = −0.71), which indicates that the RTAC is more negative the easier the item. Figure 5 illustrates this pattern. For easier items there is a negative RTAC (i.e., they are more likely to be answered wrongly at long RTs). Conversely, for difficult items there is a positive RTAC (i.e., they are more likely to be answered correctly at long RTs).

Again, the random item slope was supported by a likelihood ratio test: χ²(2) = 160.84, p < 0.001. In line with this, lower information criteria were obtained for the full random effects model (AIC = 12,155; BIC = 12,216) compared with the restricted model without the random item slope (AIC = 12,312; BIC = 12,358). Furthermore, a model allowing both random effects across items but no correlation of random intercept and slope fit significantly worse than the full model: χ²(1) = 21.59, p < 0.001; AIC = 12,174; BIC = 12,228. This supports the notion that item difficulty moderates the RTAC.

#### 3.4. Effect of Number of Rules (H4a and H4b)

If rule number drives item difficulty, a reduction of the variance of the random item intercept (Var(b_0i)) results as a testable prediction. In fact, the introduction of rule number as an item-level covariate reduced the variance of the random intercept across items from Var(b_0i) = 3.07 to Var(b_0i) = 2.38, which implies that rule number accounts for 22% of the variance in item difficulty. Similarly, the variance of the random item intercept in Model (2) dropped by 29%, from Var(b_0i) = 2.58 to Var(b_0i) = 1.82, when rule number was introduced as a covariate in Model (3).
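The percentages quoted above are simply the proportional reductions of the random-intercept variance:

```python
def variance_explained(var_base: float, var_with_covariate: float) -> float:
    """Proportion of random-intercept variance accounted for by a covariate."""
    return (var_base - var_with_covariate) / var_base

# Rule number as an item-level covariate: 3.07 -> 2.38, and 2.58 -> 1.82
print(round(variance_explained(3.07, 2.38), 2))  # 0.22
print(round(variance_explained(2.58, 1.82), 2))  # 0.29
```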

In Model (3), the fixed effect of response time was negative (β_1 = −0.30, p < 0.05). There were random intercept (Var(b_0p) = 6.90) and slope (Var(b_1p) = 1.05) effects across persons, and both were negatively related (r = −0.68). Additionally, there were random intercept (Var(b_0i) = 1.82) and random slope (Var(b_1i) = 0.29) effects across items, and both were negatively related (r = −0.55).

There was a negative main effect of rule number (β_2 = −0.98, p < 0.001), indicating that adding one more rule decreases the logit of a correct response by 0.98. Additionally, there was an interaction effect of rule number with response time (β_3 = 1.00, p < 0.001), which implies that rule number, and thereby item difficulty, moderates the RTAC. At an average value (i.e., zero) of the by-person and by-item adjustments, the RTAC would be −0.29, −0.09, 0.13, and 0.33 for one to four rules, respectively (i.e., when entering the corresponding values of the centered rule number variable), indicating a negative RTAC for items with low difficulty and a positive RTAC for items with high difficulty. This means that for simple items (with just one rule), fast response times were associated with correct responses, whereas for very complex items (with four rules), slow response times were associated with correct responses.
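Under Model (3), the RTAC of an item is the conditional slope of log response time: at the average person and item adjustments (b_1p = b_1i = 0), it is a linear function of the centered rule number r̃_i, which is what produces the sign change between two and three rules reported above:

```latex
\mathrm{RTAC}(\tilde{r}_i) = \beta_1 + \beta_3\, \tilde{r}_i
```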

#### 3.5. RTAC as a Function of Item Difficulty and Person Ability (H5)

In Model (4), the main effect of response time was not significant (β_1 = −0.09, p = 0.14). There were (trivial) main effects of item easiness (β_2 = 8.89, p < 0.001) and of person ability (β_3 = 7.69, p < 0.001). There were negative interaction effects of item response time with item easiness (β_4 = −1.847, p < 0.001) and with person ability (β_5 = −1.84, p < 0.001). The interaction of item easiness and person ability was positive (β_6 = 0.67, p < 0.001). Finally, there was the predicted three-way interaction of person ability, response time, and item easiness (β_7 = 2.00, p < 0.05).

## 4. Discussion

#### 4.1. Discussion of the Dual Processing Account of Response Time Effects

#### 4.2. Task Complexity and Cognitive Load

#### 4.3. An Illustrative Example of Cognitive Load

#### 4.4. Cognitive Load Theory as an Extension of the Dual Processing Account

## 5. Limitations and Outlook

## 6. Conclusions

## Author Contributions

## Conflicts of Interest

## References

- Goldhammer, F.; Naumann, J.; Stelter, A.; Tóth, K.; Rölke, H.; Klieme, E. The time on task effect in reading and problem solving is moderated by task difficulty and skill: Insights from a computer-based large-scale assessment. J. Educ. Psychol.
**2014**, 106, 608–626. [Google Scholar] [CrossRef] - Goldhammer, F.; Naumann, J.; Greiff, S. More is not Always Better: The Relation between Item Response and Item Response Time in Raven’s Matrices. J. Intell.
**2015**, 3, 21–40. [Google Scholar] [CrossRef] - Van der Linden, W.J. A hierarchical framework for modeling speed and accuracy on test items. Psychometrika
**2007**, 72, 287–308. [Google Scholar] [CrossRef] - Van der Linden, W.J. Conceptual issues in response-time modeling. J. Educ. Meas.
**2009**, 46, 247–272. [Google Scholar] [CrossRef] - Carver, R.P. Reading rate: Theory, research, and practical implications. J. Read.
**1992**, 36, 84–95. [Google Scholar] - Goldhammer, F.; Klein Entink, R.H. Speed of reasoning and its relation to reasoning ability. Intelligence
**2011**, 39, 108–119. [Google Scholar] [CrossRef] - Roskam, E.E. Models for Speed and Time-Limit Tests. In Handbook of Modern Item Response Theory; van der Linden, W.J., Hambleton, R.K., Eds.; Springer: New York, NY, USA, 1997; pp. 187–208. [Google Scholar]
- Luce, R.D. Response Times: Their Role in Inferring Elementary Mental Organization; Oxford University Press: Oxford, UK, 1986. [Google Scholar]
- Wickelgren, W.A. Speed-accuracy tradeoff and information processing dynamics. Acta Psychol. (Amst.)
**1977**, 41, 67–85. [Google Scholar] [CrossRef] - Klein Entink, R.H.; Fox, J.-P.; van der Linden, W.J. A multivariate multilevel approach to the modeling of accuracy and speed of test takers. Psychometrika
**2009**, 74, 21–48. [Google Scholar] [CrossRef] [PubMed] - Van der Linden, W.J.; Scrams, D.J.; Schnipke, D.L. Using response-time constraints to control for differential speededness in computerized adaptive testing. Appl. Psychol. Meas.
**1999**, 23, 195–210. [Google Scholar] [CrossRef] - Goldhammer, F.; Naumann, J.; Keßel, Y. Assessing individual differences in basic computer skills: psychometric characteristics of an interactive performance measure. Eur. J. Psychol. Assess.
**2013**, 29, 263–275. [Google Scholar] [CrossRef] - Schneider, W.; Chein, J.M. Controlled & automatic processing: Behavior, theory, and biological mechanisms. Cogn. Sci.
**2003**, 27, 525–559. [Google Scholar] - Schneider, W.; Shiffrin, R.M. Controlled and automatic human information processing: I. Detection, search, and attention. Psychol. Rev.
**1977**, 84, 1–66. [Google Scholar] [CrossRef] - Landerl, K.; Wimmer, H. Development of word reading fluency and spelling in a consistent orthography: An 8-year follow-up. J. Educ. Psychol.
**2008**, 100, 150–161. [Google Scholar] [CrossRef] - Wirth, J.; Klieme, E. Computer-based assessment of problem solving competence. Assess. Educ. Princ. Policy Pract.
**2003**, 10, 329–345. [Google Scholar] [CrossRef] - Greiff, S.; Wüstenberg, S.; Molnár, G.; Fischer, A.; Funke, J.; Csapó, B. Complex problem solving in educational contexts—Something beyond g: Concept, assessment, measurement invariance, and construct validity. J. Educ. Psychol.
**2013**, 105, 364–379. [Google Scholar] [CrossRef] - Organisation for Economic Co-Operation and Development. OECD Skills Outlook 2013; OECD Publishing: Paris, France, 2013. [Google Scholar]
- Carpenter, P.A.; Just, M.A.; Shell, P. What one intelligence test measures: A theoretical account of the processing in the Raven Progressive Matrices Test. Psychol. Rev.
**1990**, 97, 404–431. [Google Scholar] [CrossRef] [PubMed] - Hornke, L.F. Item response times in computerized adaptive testing. Psicol. Rev. Metodol. Psicol. Exp.
**2000**, 21, 175–190. [Google Scholar] - Hornke, L.F. Untersuchung von Itembearbeitungszeiten beim computergestützten adaptiven Testen. Diagnostica
**1997**, 43, 27–39. (In German) [Google Scholar] - Beckmann, J. Differentielle Latenzzeiteffekte bei der Bearbeitung von Reasoning-items. Diagnostica
**2000**, 46, 124–129. (In German) [Google Scholar] [CrossRef] - Beckmann, J.F.; Beckmann, N. Effects of feedback on performance and response latencies in untimed reasoning tests. Psychol. Sci.
**2005**, 47, 262–278. [Google Scholar] - Beckmann, J.; Guthke, J.; Vahle, H. Analysen zum Zeitverhalten bei computergestützten adaptiven Intelligenz-Lerntests. Diagnostica
**1997**, 43, 40–62. (In German) [Google Scholar] - Rammsayer, T.; Brandler, S. Zum Zeitverhalten beim computergestützten adaptiven Testen. Z. Differ. Diagn. Psychol.
**2003**, 24, 57–63. (In German) [Google Scholar] [CrossRef] - Neubauer, A.C. Speed of information processing in the Hick paradigm and response latencies in a psychometric intelligence test. Personal. Individ. Differ.
**1990**, 11, 147–152. [Google Scholar] [CrossRef] - Sporer, S.L. Eyewitness identification accuracy, confidence, and decision times in simultaneous and sequential lineups. J. Appl. Psychol.
**1993**, 78, 22–23. [Google Scholar] [CrossRef] - Hornke, L.F. Response time in computer-aided testing: A “Verbal Memory” test for routes and maps. Psychol. Sci.
**2005**, 47, 280–293. [Google Scholar] - Lasry, N.; Watkins, J.; Mazur, E.; Ibrahim, A. Response times to conceptual questions. Am. J. Phys.
**2013**, 81, 703–706. [Google Scholar] [CrossRef] - Raven, J.; Raven, J.C.; Court, J.H. Manual for Raven’s Progressive Matrices and Vocabulary Scales; Harcourt Assessment: San Antonio, TX, USA, 1998. [Google Scholar]
- Hornke, L.F.; Küppers, A.; Etzel, S. Konstruktion und Evaluation eines adaptiven Matrizentests. Diagnostica
**2000**, 46, 182–188. (In German) [Google Scholar] [CrossRef] - Preckel, F. Diagnostik Intellektueller Hochbegabung. Testentwicklung zur Erfassung der Fluiden Intelligenz; Hogrefe: Göttingen, Germany, 2003. (In German) [Google Scholar]
- Becker, N.; Spinath, F.M. Design a Matrix Test. Ein Distraktorfreier Matrizentest zur Erfassung der Allgemeinen Intelligenz (DESIGMA); Hogrefe: Göttingen, Germany, 2014. (In German) [Google Scholar]
- Göritz, A. WiSo-Panel. Available online: http://www.wiso-panel.net/ (accessed on 26 February 2016).
- Göritz, A.S. Determinants of the starting rate and the completion rate in online panel studies. In Online Panel Research: A Data Quality Perspective; Callegaro, M., Baker, R., Bethlehem, J., Göritz, A.S., Krosnick, J.A., Lavrakas, P.J., Eds.; John Wiley & Sons: Chichester, UK, 2014; pp. 154–170. [Google Scholar]
- UNESCO Institute for Statistics. International Standard Classification of Education: ISCED 2011; UNESCO Institute for Statistics: Montreal, QC, Canada, 2012. [Google Scholar]
- Becker, N.; Preckel, F.; Karbach, J.; Raffel, N.; Spinath, F.M. Die Matrizenkonstruktionsaufgabe: Validierung eines distraktorfreien Aufgabenformats zur Vorgabe figuraler Matrizen. Diagnostica
**2015**, 61, 22–33. (In German) [Google Scholar] [CrossRef] - Becker, N.; Schmitz, F.; Falk, A.; Feldbrügge, J.; Recktenwald, D.; Wilhelm, O.; Preckel, F.; Spinath, F. Preventing response elimination strategies improves the convergent validity of figural matrices. J. Intell.
**2016**, 4, 2. [Google Scholar] [CrossRef] - Baayen, R.H.; Davidson, D.J.; Bates, D.M. Mixed-effects modeling with crossed random effects for subjects and items. J. Mem. Lang.
**2008**, 59, 390–412. [Google Scholar] [CrossRef] - De Boeck, P.; Bakker, M.; Zwitser, R.; Nivard, M.; Hofman, A.; Tuerlinckx, F.; Partchev, I. The estimation of item response models with the lmer function from the lme4 package in R. J. Stat. Softw.
**2011**, 39, 1–28. [Google Scholar] [CrossRef] - Doran, H.; Bates, D.; Bliese, P.; Dowling, M. Estimating the multilevel Rasch model: With the lme4 package. J. Stat. Softw.
**2007**, 20, 1–18. [Google Scholar] [CrossRef] - Searle, S.R. Linear Models; Wiley: New York, NY, USA, 1971. [Google Scholar]
- Bates, D.; Mächler, M.; Bolker, B.; Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw.
**2015**, 67, 1–48. [Google Scholar] [CrossRef] - R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Wien, Austria, 2015. [Google Scholar]
- Robitzsch, A. Sirt: Supplementary Item Response Theory Models. Available online: https://cran.r-project.org/web/packages/sirt/index.html (accessed on 30 March 2016).
- Wickham, H. Ggplot2: Elegant Graphics for Data Analysis; Springer: New York, NY, USA, 2009. [Google Scholar]
- Yen, W.M. Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Appl. Psychol. Meas.
**1984**, 8, 125–145. [Google Scholar] [CrossRef] - Sweller, J. Cognitive load theory, learning difficulty, and instructional design. Learn. Instr.
**1994**, 4, 295–312. [Google Scholar] [CrossRef] - Sweller, J. Cognitive load theory. Psychol. Learn. Motiv. Cogn. Educ.
**2011**, 55, 37–76. [Google Scholar] - Paas, F.; Renkl, A.; Sweller, J. Cognitive load theory: Instructional implications of the interaction between information structures and cognitive architecture. Instr. Sci.
**2004**, 32, 1–8. [Google Scholar] [CrossRef] - Paas, F.G.W.C.; Van Merriënboer, J.J.G. Instructional control of cognitive load in the training of complex cognitive tasks. Educ. Psychol. Rev.
**1994**, 6, 351–371. [Google Scholar] [CrossRef] - Paas, F.; Tuovinen, J.E.; Tabbers, H.; Van Gerven, P.W.M. Cognitive load measurement as a means to advance cognitive load theory. Educ. Psychol.
**2003**, 38, 63–71. [Google Scholar] [CrossRef] - Beckmann, J.F. Taming a beast of burden—On some issues with the conceptualisation and operationalisation of cognitive load. Learn. Instr.
**2010**, 20, 250–264. [Google Scholar] [CrossRef] - Barrouillet, P.; Bernardin, S.; Camos, V. Time constraints and resource sharing in adults’ working memory spans. J. Exp. Psychol. Gen.
**2004**, 133, 83–100. [Google Scholar] [CrossRef] [PubMed] - Barrouillet, P.; Bernardin, S.; Portrat, S.; Vergauwe, E.; Camos, V. Time and cognitive load in working memory. J. Exp. Psychol. Learn. Mem. Cogn.
**2007**, 33, 570–585. [Google Scholar] [CrossRef] [PubMed] - Preckel, F.; Thiemann, H. Online- versus paper-pencil-version of a high potential intelligence test. Swiss J. Psychol.
**2003**, 62, 131–138. [Google Scholar] [CrossRef] - Hess, E.H.; Polt, J.M. Pupil size in relation to mental activity during simple problem-solving. Science
**1964**, 143, 1190–1192. [Google Scholar] [CrossRef] [PubMed] - Kahneman, D.; Beatty, J. Pupil diameter and load on memory. Science
**1966**, 154, 1583–1585. [Google Scholar] [CrossRef] [PubMed] - Beatty, J.; Lucero-Wagoner, B. The pupillary system. Handb. Psychophysiol.
**2000**, 2, 142–162. [Google Scholar] - Querino, E.; dos Santos, L.; Ginani, G.; Nicolau, E.; Miranda, D.; Romano-Silva, M.; Malloy-Diniz, L. Cognitive effort and pupil dilation in controlled and automatic processes. Transl. Neurosci.
**2015**, 6, 168–173. [Google Scholar] [CrossRef]

^{1} The current literature uses the term “response time effect” instead of RTAC. We use the latter term because “response time effect” carries the risk of misinterpretation: it might obscure the fact that a correlation is meant.

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Becker, N.; Schmitz, F.; Göritz, A.S.; Spinath, F.M.
Sometimes More Is Better, and Sometimes Less Is Better: Task Complexity Moderates the Response Time Accuracy Correlation. *J. Intell.* **2016**, *4*, 11.
https://doi.org/10.3390/jintelligence4030011
