1. Introduction
Cognitive abilities are widely regarded as key predictors of academic success (Watkins et al. 2007; Gustafsson and Balke 1993). Traditionally, school achievement has been assumed to follow cognitive capacity in a largely unidirectional pattern: intelligence influences learning outcomes, but not vice versa (Watkins and Styck 2017). However, more recent theoretical approaches challenge this view by proposing a reciprocal relationship between cognition and academic skills, a perspective broadly summarized under the theory of mutualism (Soares et al. 2015; Stanovich 1986).
Although the link between cognitive ability and academic achievement is well established, the directionality of this relationship remains debated. Traditional models assume a unidirectional influence from intelligence to achievement, while newer frameworks propose reciprocal effects (mutualism). However, empirical studies investigating this issue often rely on only two measurement time points, small or selective samples, or undifferentiated performance indicators (e.g., grades).
To date, there has been a lack of large-scale longitudinal research that systematically examines both unidirectional and reciprocal effects between cognitive ability and domain-specific academic performance—particularly in early school years and across more than two time points. This study addresses this gap using a four-wave cross-lagged design with standardized test instruments in a large sample of German primary school children.
The mutualism model posits that learning success and cognitive development are interdependent and mutually reinforcing over time (intelligence ↔ academic performance). This contrasts with the classic “top-down” view, in which cognitive ability precedes and determines academic achievement and sets a fixed limit for academic development (intelligence → school performance). From a mutualist perspective, engaging with demanding academic content, such as solving complex mathematics tasks, can stimulate cognitive growth and thus contribute to the development of general abilities (Wilhelm and Engle 2005; Hambrick and Engle 2002).
In practice, far-reaching decisions about students’ academic careers depend on the diagnosis of cognitive ability. This is in line with the long-established idea that the level of cognitive ability determines the level of school performance (Watkins and Styck 2017).
Soares et al. (2015) asked to what extent learning achievements are responsible for later academic success and how cognitive abilities influence later educational performance in comparison. Based on a longitudinal design with two measurement time points for school grades and a baseline assessment of cognitive ability, the authors concluded that intelligence is a powerful predictor of educational performance. The increasing effect of intelligence on school performance over time is also called the Matthew effect, a term first used in this context by Stanovich (1986). However, as Watkins et al. (2007) argued, longitudinal studies with repeated assessments of both academic performance and cognitive abilities are needed to test the theory of mutualism against the more traditional view of a unidirectional influence. To contrast the temporal precedence and mutualism hypotheses, Watkins and colleagues conducted two longitudinal studies (Watkins et al. 2007; Watkins and Styck 2017), both with two measurement time points. They examined the cross-lagged effects between cognitive performance and achievement in reading and mathematics among students assessed for special education eligibility. In both studies, the authors concluded that psychometric intelligence causally influenced later achievement, but not the other way around. Although these studies are among the rare attempts to analyze the relationship between cognitive performance and achievement in a longitudinal design, they have limitations that need to be addressed. First, the two samples are relatively small and include children with special educational needs, which limits the generalizability of the findings. Second, dropout between the measurement time points was considerable because many children left the special school and could not be tested repeatedly. Third, the two measurement occasions were almost three years apart. Such a long interval does not allow a fine-grained investigation of mutual developmental effects between cognitive ability and school achievement. Most importantly, both studies included only two measurement time points, which does not allow the robustness and invariance of cross-lagged effects to be studied over a longer period of time.
From a theoretical perspective, in addition to the Matthew effect, a compensation effect through school knowledge and prior knowledge is plausible. Empirically, it has been repeatedly shown that school education can compensate for lower cognitive abilities (Wilhelm and Engle 2005; Hambrick and Engle 2002). Compensation across time was also demonstrated for the fluid and crystallized domains of intelligence (Schroeders et al. 2016). Thus, if compensation effects and cumulative effects (the latter also referred to as ‘Matthew effects’; Stanovich 1986) are mutual, we would expect bidirectional influences between cognition and school achievement across development: intelligence influences math skills, and math skills in turn influence intelligence. Recently, Peng and Kievit (2020) emphasized that academic and cognitive development are indirectly driven by the bidirectional relationship between the two.
Although empirical evidence for a bidirectional relationship between cognitive ability and academic achievement is growing, it remains limited, especially in younger children (Schroeders et al. 2016; Peng and Kievit 2020). Previous research has often faced methodological constraints. Many longitudinal studies rely on only two measurement points, small sample sizes, or focus on special education populations (Watkins et al. 2007; Watkins and Styck 2017). As a result, findings are not easily generalizable, and developmental inferences remain tentative. In addition, broad indicators such as overall school grades have frequently been used, which may obscure the specific processes linking cognition and academic performance.
The present study addresses these limitations by drawing on a large-scale, four-wave longitudinal dataset from German primary schools (N = 1726). This design allows a fine-grained examination of the temporal dynamics between cognitive ability and mathematics achievement across grades 1 to 4. Importantly, we use standardized, domain-specific instruments: the German Cognitive Abilities Test (CAT/KFT) and the DEMAT mathematics test series. Building on theoretical frameworks such as the mutualism model (Peng and Kievit 2020), we compare unidirectional and reciprocal models to clarify the directionality of developmental effects between cognition and academic achievement.
Based on prior theory and evidence, we formulate the following hypotheses:
Stability of cognitive abilities and math achievement: Both cognitive abilities and mathematics achievement will show strong autoregressive stability across primary school years.
Unidirectional effects of cognitive abilities on math achievement: Earlier cognitive abilities will positively predict later mathematics achievement.
Reciprocal effects of mathematics achievement on cognitive abilities: Mathematics achievement will also predict subsequent cognitive performance, consistent with the mutualism model.
By evaluating these hypotheses across four time points, this study contributes new longitudinal evidence to the theoretical debate and offers practical implications for early educational support. Understanding these developmental interdependencies can inform interventions that combine cognitive stimulation with domain-specific learning strategies from the beginning of schooling.
3. Analyses
All analyses were conducted using R 4.4.1. Because the number of complete data points was not sufficient to fit a reflective latent variable model while also accounting for the hierarchical structure of the data, CAT and DEMAT scores for each school year were calculated by averaging the accuracy scores of the respective three subscales.
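The composite scoring step can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in R): the function name, the example values, the subscale labels, and the rule of returning no score when a subscale is missing are all assumptions made here for illustration.

```python
from statistics import mean


def composite_score(subscale_accuracies):
    """Average three subscale accuracy scores into one composite score.

    Returns None if any subscale is missing. This missing-data rule is an
    assumption; the text does not state how incomplete subscales were handled.
    """
    if len(subscale_accuracies) != 3:
        raise ValueError("expected exactly three subscale accuracy scores")
    if any(score is None for score in subscale_accuracies):
        return None
    return mean(subscale_accuracies)


# Hypothetical accuracy scores (proportion correct) for one child in one school year.
cat_grade1 = composite_score([0.80, 0.60, 0.70])    # three CAT subscales
demat_grade1 = composite_score([0.50, None, 0.75])  # one DEMAT subscale missing
```

Averaging observed subscale scores yields one manifest indicator per construct and school year, which is what the path models then take as input.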
Structural equation modeling (SEM) with the lavaan package (version 0.6-20) was used to assess longitudinal stability and cross-lagged effects.
Each SEM was estimated using robust maximum likelihood (MLR) with school as the cluster variable. Accordingly, cluster-robust standard errors are reported in the results.
The first set of models included autoregressive effects between consecutive school years (lag 1) and between school years two years apart (lag 2), separately for the CAT and for the DEMAT. The second model added unidirectional effects from the CAT to the DEMAT, first only to the next school year and then also to the school year after that. The third model additionally included reciprocal effects from the DEMAT to the CAT, representing a full cross-lagged panel model. We used more liberal cut-offs for the RMSEA, since this measure is less representative of overall model fit in hierarchical models (Hsu et al. 2015).
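The model sequence described above can be written compactly as a pair of regression equations. The notation is an illustrative sketch rather than the authors' own specification: the coefficient symbols are introduced here, the paths involving the "second-to-last" school year are read as lag-2 paths, and the reciprocal DEMAT-to-CAT path is shown at lag 1 only, since the text does not state whether it was also modeled at lag 2.

```latex
\begin{aligned}
\mathrm{CAT}_{t} &= \alpha^{C}_{t}
  + \beta^{C}_{1,t}\,\mathrm{CAT}_{t-1}
  + \beta^{C}_{2,t}\,\mathrm{CAT}_{t-2}
  + \gamma^{C}_{t}\,\mathrm{DEMAT}_{t-1}
  + \varepsilon^{C}_{t}, \\
\mathrm{DEMAT}_{t} &= \alpha^{D}_{t}
  + \beta^{D}_{1,t}\,\mathrm{DEMAT}_{t-1}
  + \beta^{D}_{2,t}\,\mathrm{DEMAT}_{t-2}
  + \gamma^{D}_{1,t}\,\mathrm{CAT}_{t-1}
  + \gamma^{D}_{2,t}\,\mathrm{CAT}_{t-2}
  + \varepsilon^{D}_{t},
\end{aligned}
\qquad t = 2, 3, 4,
```

where the lag-2 terms apply only for $t \geq 3$. Fixing $\gamma^{C}_{t} = 0$ yields the second (unidirectional) model, and additionally fixing $\gamma^{D}_{1,t} = \gamma^{D}_{2,t} = 0$ yields the purely autoregressive models of the first step.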
We use the term “practically relevant” to refer to effects of at least medium size, typically r or β ≥ 0.20, following Cohen’s conventions.
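This threshold can be applied mechanically to standardized estimates. The following Python sketch is illustrative only: the function name, the use of the absolute value, and the example coefficients are assumptions introduced here and are not results from the study.

```python
# Threshold for "practically relevant" effects as defined in the text (r or beta >= 0.20).
PRACTICAL_RELEVANCE_THRESHOLD = 0.20


def is_practically_relevant(coefficient, threshold=PRACTICAL_RELEVANCE_THRESHOLD):
    """Return True if a standardized coefficient meets the threshold.

    Taking the absolute value is an assumption made here so that negative
    effects of the same magnitude are treated alike.
    """
    return abs(coefficient) >= threshold


# Hypothetical standardized path coefficients, for illustration only.
example_paths = {
    "CAT_1 -> DEMAT_2": 0.31,
    "DEMAT_1 -> CAT_2": 0.08,
}
relevant_paths = [
    name for name, beta in example_paths.items() if is_practically_relevant(beta)
]
```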
5. Discussion at Detailed Level—Math Achievement
In the following, the results of the current study will be critically reviewed, starting with a look at the study design and construct consistency, followed by a discussion of the results. We structure our discussion around the three research questions: (1) stability, (2) unidirectional effects, and (3) reciprocal effects.
Stability of cognitive abilities and math achievement over time
The results regarding cognitive abilities show very high correlations overall, which indicates high stability already at primary school age. This result is consistent with Rost (2013) and Stumpf (2012), who assume high stability of intelligence scores from the age of about seven to eight years.
In terms of practical relevance, this high stability over time means that differentiated support for children must begin at an early age. In particular, individual strengths should be placed in the context of motivational aspects (Rheinberg 2009) and achievement motivation (Edelmann 2000) in order to counteract the scissors effect, also called the Matthew effect, which describes the phenomenon that children with initially higher abilities tend to benefit more from learning opportunities, resulting in a widening gap between high and low performers over time.
For children with high trait expression, ample opportunities to use and apply these skills are recommended (Cain and Oakhill 2011; Bast and Reitsma 1998). Cognitive abilities are organized domain-specifically (Heller and Perleth 2007b), potentials often become increasingly specialized within a domain, and the experience of competence in turn leads to more intensive engagement in the chosen domain, which can be trained into exceptional performance (Preckel and Vock 2021). Individual strengths in the verbal, quantitative, or figural/nonverbal domains should therefore also be identified and promoted early on. Differentiated in-school and out-of-school achievement-motivating support at an appropriate level can also counteract the risk of underachievement and dislike of school (Preckel and Vock 2021).
For the mathematics achievement tests, construct stability existed only partially. This can be explained by the fact that the test procedures were not originally constructed for longitudinal use but were aligned with the curriculum of the respective grade (curriculum-valid) and were used accordingly (Krajewski et al. 2002, 2004; Roick et al. 2004; Gölitz et al. 2006).
The mathematics achievement test scores from successive years show small to medium positive correlations. The longitudinal analysis, controlling for performance, shows that the predictive power of prior performance is moderately high over the first three years but low for performance in the fourth grade. This means that only a few children maintain the overall mathematics performance of the first three years in the fourth grade; good results in the third grade, or across the first three school years, do not necessarily mean good mathematics performance in the fourth grade. Put positively, poor third-grade performance does not necessarily translate into poor fourth-grade math performance. The reason for this may be the instructional focus in German mathematics classes on factual arithmetic, which accounts for 2/5 of instruction and thus gains a stronger influence on overall performance (Roick et al. 2004). In view of the competencies required in later school years, in which solving complex tasks comes to the fore, children could be introduced to such tasks as early as the third grade, so that they are not confronted abruptly with strongly changed requirements and do not risk a drop in overall mathematics performance in the fourth grade.
Unidirectional effects of cognitive abilities on mathematics achievement
The cross-lagged effects between cognitive ability and mathematics achievement slightly favour a unidirectional pattern rather than a mutual effect between reasoning ability and math achievement, given that associations of math achievement with cognitive ability scores from previous years were stronger than vice versa. The correlations between intelligence scores and mathematics achievement that were already clearly present in the first grade are confirmed over time, though associations drop in the fourth grade. Intelligence scores obtained in the second grade prove to be the most stable predictors of subsequent mathematics achievement.
These findings carry considerable implications for pedagogical practice, particularly with regard to the role of cognitive abilities as foundational determinants of mathematics achievement. They expand the theoretical knowledge base by providing empirical evidence that cognitive capacities exert transfer effects beyond their immediate domain. While earlier research often assumed that cognitive training effects remained domain-specific (Barnett and Ceci 2002; Hasselhorn and Hager 2008; Jacob and Parkinson 2015), more recent meta-analyses suggest that general cognitive abilities can influence a range of academic outcomes. Importantly, rather than reflecting reciprocal relationships, the present findings support a unidirectional perspective in which cognitive abilities precede and shape mathematics performance. From this standpoint, strengthening non-computation-specific cognitive features can serve as a critical lever for improving mathematics achievement.
Reciprocal effects of mathematics achievement on cognitive abilities
The reciprocal effects of mathematics achievement on cognitive abilities did not reach the threshold for practical relevance in our models. This is of particular interest because numerical skills formed an important subscale of the cognitive abilities test and hence overlap conceptually with the construct of mathematics achievement (see also the limitations paragraph below). Thus, the results point to a primarily unidirectional influence of cognitive ability on mathematics achievement, providing stronger support for the traditional top-down perspective than for the mutualism model. Consistent with earlier work by Watkins et al. (2007), Watkins and Styck (2017), and Soares et al. (2015), cognitive abilities predicted later mathematics performance across the four assessment waves. Reciprocal paths from mathematics achievement to subsequent cognitive ability were detectable but small. This pattern suggests that, at least in the early primary school years, cognitive abilities play a more central role in shaping mathematics achievement than vice versa. By extending previous studies with a large, representative sample, standardized tests, and multiple time points, our results strengthen the evidence for models emphasizing cognitive ability as a foundation for academic development, while leaving open the possibility that reciprocal effects may become more pronounced in later stages of schooling. In this respect, our findings align more closely with the notion of cumulative advantages or “Matthew effects” (Stanovich 1986), where higher cognitive abilities facilitate greater learning gains over time, than with compensation effects, which suggest that academic learning can offset lower initial cognitive capacity (Wilhelm and Engle 2005; Hambrick and Engle 2002; Schroeders et al. 2016).
The findings largely support the hypotheses formulated at the outset of the study. First, in line with our first hypothesis, both cognitive abilities and mathematics achievement demonstrated high levels of autoregressive stability across the four years of primary school, with consistently strong correlations between successive measurement points. Second, supporting our second hypothesis, earlier cognitive abilities significantly predicted later mathematics performance, particularly in the lower grades, indicating a robust unidirectional effect. Third, the results provide only weak support for our third hypothesis that mathematics achievement predicts subsequent cognitive ability. The overall pattern of cross-lagged effects is more consistent with a unidirectional than with a reciprocal relationship, although the associations of math achievement with later cognitive ability increase over time, which suggests that domain-specific learning processes in mathematics may contribute to the development of general cognitive abilities.
The results could stimulate further research on the compensatory effects (Schroeders et al. 2016; Bast and Reitsma 1998) of school-based interventions, already discussed by some researchers, and thus address the question of the extent to which academic education can reduce the negative effects of children’s unfavorable cognitive and socioeconomic conditions by positively influencing the relationship between cognitive abilities and academic education.
Strengths and limitations
A key advantage was the inclusion of four measurement time points across the four years of elementary school. This longitudinal design permits more accurate interpretations of the relationships between cognitive skills and mathematics achievement than cross-sectional analyses. The constructs were each measured with standardized procedures, which supports the quality of the results. The test procedures used make domain-specific distinctions, and the large number of items available for the respective subscales ensured more precise results and higher reliability (Knight and Zerr 2010) than results based on school grades or general-level indicators.
One methodological consideration in interpreting the cross-lagged results concerns potential item-level overlap between the quantitative subscale of the CAT and the arithmetic components of the DEMAT. Both instruments include tasks related to basic numerical reasoning, which may result in inflated correlations due to shared variance rather than true conceptual linkage. However, conceptually similar results were obtained after exploratory removal of the quantitative subscale from the CAT scores, suggesting that the reciprocal associations reported in this study were not attributable to the item similarity between the DEMAT and the CAT.
6. Conclusions
The primary goal of the present study was to determine the strength, stability, and direction of the relationships between cognitive ability and mathematics achievement over time. No previous reports of the reciprocal relationships between the two constructs were available in this form for the primary grades. Therefore, data from N = 1726 primary school children were collected and analyzed in a longitudinal study over four years at four measurement time points. This study also allows for differentiated statements on the stability of mathematics achievement and cognitive skills at the elementary school age.
The results suggest unidirectional time-lagged associations between cognitive ability and mathematics achievement across the four measurement points.
As discussed by Grube (2008) and Stern (2005), and likewise by Helmke and Weinert (1997), mathematics achievement, while highly correlated with intelligence traits, also results from prior mathematics achievement, as demonstrated here.
At the same time, the stability of mathematics performance over time can be demonstrated; it derives from the corresponding prior performance.
The correlations between intelligence scores and mathematics performance, which are already clearly present in the first grade, are confirmed over time. Thus, intelligence scores obtained in the first grade can clearly predict mathematics performance in the third grade and even allow for a good prediction of mathematics performance in the fourth grade. Intelligence scores obtained in the second grade prove to be the most stable predictors of future mathematics performance.
In addition, this study provides detailed information on the stability of cognitive skills. Looking at cognitive abilities as a whole, overall intelligence performance shows very high longitudinal correlations, suggesting a high degree of stability even at the elementary school age.
Consequently, the identified correlations correspond with previous studies demonstrating that relevant prior (mathematical) knowledge, as discussed in expertise theories (Heller 2008; Preckel et al. 2012), can predict later (mathematical) abilities. Nevertheless, cognitive ability continues to emerge as a relevant predictor of mathematics achievement, which may point to the presence of cumulative effects (Stanovich 1986; Bast and Reitsma 1998). With regard to academic achievement, the demonstrated cross-lagged effects may confirm the dominant influence of cognitive ability already demonstrated by Watkins et al. (2007).
These findings should inform pedagogical practice directly: interventions should build on (prior) knowledge in order to improve performance through the appropriate organization of knowledge acquisition processes. The importance of prior knowledge and, consequently, the targeted promotion of computation-specific skills should be considered a relevant factor in supporting mathematics achievement.
Thus, the present findings contribute to the still-limited body of research examining potential reciprocal influences between cognitive traits and mathematics performance across multiple time points during the elementary school years. Finally, replication of the present findings in secondary schools could provide valuable insights for theory and school practice.