Early Detection and Intervention of Developmental Dyscalculia Using Serious Game-Based Digital Tools: A Systematic Review

Hornos-Arias, Josep; Grau, Sergi; Serra-Grabulosa, Josep M.

doi:10.3390/info16090787

by Josep Hornos-Arias ¹, Sergi Grau ¹ and Josep M. Serra-Grabulosa ²,³,*

¹ Digital Care Research Group, University of Vic–Central University of Catalonia, 08500 Vic, Spain
² Department of Clinical Psychology and Psychobiology, University of Barcelona, Pg. Vall d’Hebron 171, 08035 Barcelona, Spain
³ Institute of Neurosciences, University of Barcelona, 08035 Barcelona, Spain
* Author to whom correspondence should be addressed.

Information 2025, 16(9), 787; https://doi.org/10.3390/info16090787

Submission received: 14 August 2025 / Revised: 2 September 2025 / Accepted: 3 September 2025 / Published: 10 September 2025

(This article belongs to the Special Issue Advances in Digital Learning Processes: Innovation, Ethics and Governance)

Abstract

Developmental dyscalculia is a neurobiologically based learning disorder that impairs numerical processing and calculation abilities. Numerous studies underscore the critical importance of early detection to enable effective intervention, highlighting the need for individualized, structured, and adaptive approaches. Digital tools, particularly those based on serious games, appear to offer a promising level of personalization. This systematic review aims to evaluate the relevance of serious game-based digital solutions as tools for the detection and remediation of developmental dyscalculia in children aged 5 to 12 years. To provide readers with a comprehensive understanding of this field, the selected solutions were analyzed and classified according to the technologies employed (including emerging ones), their thematic focus, the mathematical abilities targeted, the configuration of experimental trials, and the outcomes reported. A systematic search was conducted across Scopus, Web of Knowledge, PubMed, Eric, PsycInfo, and IEEEXplore for studies published between 2000 and March 2025, yielding 7799 records. Additional studies were identified through reference screening. A total of 21 studies met the eligibility criteria. All procedures were registered in PROSPERO and conducted in accordance with PRISMA guidelines for systematic reviews. The methodological analysis of the included studies emphasized the importance of employing both control and experimental groups with adequate sample sizes to ensure robust evaluation. In terms of remediation, the findings highlight the value of pre- and post-intervention assessments and the implementation of individualized training sessions, ideally not exceeding 20 min in duration. The review revealed a greater prevalence of remediation-focused serious games compared to screening tools, with a growing trend toward the use of mobile technologies. 
However, the substantial methodological limitations observed across studies must be addressed to enable a rigorous evaluation of the potential of serious games (SGs) to detect and support the improvement of difficulties associated with developmental dyscalculia. Moreover, despite the recognized importance of personalization and adaptability in effective interventions, relatively few studies incorporated machine learning algorithms to enable the development of fully adaptive systems.

Keywords:

dyscalculia; early detection; intervention; learning disorders; low numeracy; serious games

1. Introduction

1.1. Rationale

Developmental dyscalculia (DD) is a specific learning disorder (SLD) of neurobiological origin that impairs numerical processing and calculation abilities [1]. Individuals with DD experience persistent difficulties in acquiring and mastering fundamental mathematical concepts, such as number sense and arithmetic fact retrieval, and often struggle with performing calculations and basic mathematical reasoning both accurately and fluently.

International prevalence estimates of DD in children range from 3% to 6%, based on prior studies [2]. Another study [3] reported a relatively stable global prevalence of approximately 5% among primary school students. Similarly, findings from various international studies indicated prevalence rates between 3% and 14%, with a commonly replicated figure of 6% across different countries [4].

Over the past few decades, extensive research has sought to elucidate the cognitive and neural mechanisms underlying DD [5,6]. Due to its complex nature, DD is recognized as a heterogeneous disorder, encompassing a wide range of numeracy-related skills. These include counting, transcoding between verbal, written, and symbolic representations, numerical ordering, magnitude comparison, basic arithmetic operations (e.g., addition and subtraction), subitizing (which involves memory and attention), spatial–numerical representation (e.g., the mental number line), and mathematical problem-solving [2].

Visuospatial working memory has been consistently linked to mathematical performance [7,8,9,10,11], underscoring the importance of effectively representing and manipulating quantities during problem-solving tasks [12].

Given that many of the numeracy skills rely on broader cognitive domains—such as literacy, working memory, long-term memory, attention (both selective and sustained), and executive functions—DD frequently co-occurs with other neurodevelopmental disorders, including dyslexia and attention-deficit/hyperactivity disorder (ADHD) [13,14]. Nevertheless, isolated numeracy difficulties can also occur [15]. Importantly, DD is not directly associated with either low or high intelligence quotient (IQ) [16].

Mathematical competence at school entry is one of the strongest predictors of later academic achievement and success across various domains [17]. Early deficits in numeracy not only undermine children’s academic self-confidence but also have long-term consequences, including difficulties in daily functioning, academic failure, limited employment opportunities, and increased risk of mental health issues [16]. The societal burden of mathematical impairments is substantial, encompassing the costs of special education services, mental health treatment, and unemployment support [16].

Given these far-reaching implications, early identification and intervention for DD are critical to mitigating its impact and promoting better educational and life outcomes [18].

The application of computer-aided solutions for the detection and intervention of developmental dyscalculia (DD) has been extensively explored over the past decades. Various approaches have been proposed to develop digital screening tools aimed at identifying mathematical impairments.

A digital screening test was introduced in 2003 for children aged 6 to 14 years, aimed at assessing foundational numeracy skills such as simple reaction time, dot enumeration, number comparison, and basic arithmetic operations including addition and subtraction [19]. In a different context, [20] developed a digital screening tool targeting undergraduate students, which evaluated their understanding of fractions, decimals, negative numbers, and systems such as money and time. This tool also assessed abstract reasoning, mathematical symbolism, and graphical representation skills.

A review of the literature on dyscalculia screening tools reveals a diversity of methodologies and assessment scales used to identify individuals with mathematical difficulties. Building upon these foundations, a recent study analyzed existing screeners and proposed a web-based diagnostic tool that expanded upon Butterworth’s original framework [21]. This enhanced version, referred to as the multimodal screener, incorporated a broader range of cognitive assessments. It included tasks related to basic number processing (e.g., enumeration, symbol-to-quantity mapping, transcoding, counting, number comparison, and measurement) as well as simple mental arithmetic, using both visual and auditory stimuli to increase diagnostic sensitivity.

Similarly, digital interventions can be designed to be fully personalized through the use of adaptive algorithms. These systems allow individuals to focus on specific mathematical domains where difficulties are detected, while also providing real-time feedback, motivational rewards, and a degree of autonomy. Given the heterogeneous and multifaceted nature of DD [22,23], it is not feasible to develop a single intervention that addresses all possible deficits. Effective treatment must therefore be tailored to the individual’s specific needs, encompassing all relevant levels of difficulty.

To maximize the effectiveness of digital interventions for developmental dyscalculia, several key principles should be observed [2]:
Individualized Delivery: Training should not be conducted in group settings or traditional classroom environments. Instead, it should be tailored to the individual, allowing for personalized pacing and focus.
Hierarchical and Structured Design: The intervention should follow a hierarchical structure, beginning with foundational numerical concepts and progressively increasing in complexity as the learner achieves specific milestones.
Motivational Support: Motivation plays a critical role in the success of interventions. Incorporating reward systems can help individuals recognize and value their efforts, even when immediate results are not evident. This approach also contributes to reducing math-related anxiety.
Repetition and Practice: Effective interventions require extensive repetition and practice to reinforce learning and promote long-term retention of numerical concepts.
Comprehensive Content Coverage: The training should address both non-curricular numerical understanding (e.g., number sense, magnitude comparison) and curricular content aligned with school-based mathematics instruction.

Recent advances in computer science have significantly enhanced the development of computer-assisted evaluations and interventions, particularly through the integration of serious games (SGs). Although there is no universally accepted definition, SGs are generally understood as digital (or non-digital) games designed for purposes beyond entertainment. They aim to create engaging, immersive, and enjoyable environments to support objectives in fields such as education, training, and healthcare [24].

Providing individuals with tools that are user-friendly, promote interaction with therapists, offer a degree of autonomy, and enable gameplay in immersive settings can enhance both motivation and attention, thereby improving the overall effectiveness of treatment. While the application of SGs in healthcare remains a relatively novel area of research, emerging evidence supports their potential as reliable tools for diagnosis, training, and intervention [25]. Nonetheless, further research is required to determine which specific game elements and gamification techniques are most effective across different domains [26].

The use of SGs for the detection of specific learning disorders (SLD) has received limited but growing attention. For instance, [27] developed a dyslexia screening tool based on a language-independent web-based game combined with machine learning (ML). Similarly, [28] analyzed a dyscalculia screening tool that integrated SGs and ML, reporting promising results in terms of detection accuracy. Additional SG-based digital tools have been designed and tested for both the detection and remediation of dyscalculia [29,30], as well as for other comorbid learning disorders [31,32].

The application of SGs in the intervention of DD has also been explored in recent studies [33,34,35,36], yielding promising but inconclusive results. A recent meta-analysis [37] evaluated the effectiveness of digital training programs for DD. Although no statistically significant differences were found between game-based training and traditional drill-based techniques, improvements in number sense and arithmetic skills were observed among participants.

Recent research has explored the potential of innovative solutions based on emerging technologies—particularly virtual reality (VR) and augmented reality (AR)—in the context of learning disorders. When combined with machine learning (ML) and deep learning (DL) techniques, these technologies have enabled the development of reliable digital screening tools for the detection of developmental dyscalculia (DD). For instance, several experimental studies have proposed mobile games [28,38] and AR-based assistive learning applications [39] as effective tools for early identification of DD.

These technological innovations hold promise not only for diagnostic purposes but also for intervention and remediation. Some studies have focused on establishing a consistent framework for the use of augmented reality serious games (ARSG) as reliable detection tools [39], while others have demonstrated the effectiveness of mobile applications and games in supporting remediation efforts for children with DD [40,41,42,43,44]. Additionally, promising results have been reported in studies investigating interventions based on virtual reality serious games (VRSG) [45].

Finally, as usability is a critical factor in the design and implementation of effective digital tools, particular attention must be given to user experience (UX) and the development of intuitive and accessible user interfaces (UI) [46]. Ensuring that digital solutions are user-friendly enhances engagement, facilitates interaction, and ultimately contributes to the success of both diagnostic and intervention processes.

1.2. SGs and NDDs/SLDs: Previous Reviews and Meta-Analyses

In recent years, several systematic reviews and meta-analyses have been published concerning the application of serious games to neurodevelopmental and specific learning disorders:

The effects of digital-based interventions in children with mathematical difficulties were examined in a meta-analysis of randomized controlled trials conducted between 2003 and 2019 [37]. The findings revealed no significant advantage of serious game-based training over traditional digital approaches such as drilling and tutoring. Nevertheless, the results were consistent with previous studies, indicating improvements in numerical performance and understanding among children in preschool and primary education.
The effectiveness of digital game-based training in children aged 5 to 16 years with neurodevelopmental disorders was evaluated in a recent meta-analysis [32]. The results indicated that such interventions could enhance overall cognitive abilities, with a small to medium effect size reported across 8 of the 29 included studies.
The effectiveness of emerging technologies such as augmented reality (AR) and virtual reality (VR) in the remediation of specific learning disorders (SLDs) was examined recently [47]. Among the 34 articles included in their review, 8 focused on dyslexia and only 1 addressed dyscalculia. Not all studies targeted children, and only one employed a serious game-based approach. Given the novelty of this research area, further investigation is required to validate the promising effects observed in both educational and healthcare contexts.
Several reviews have examined the application of serious games (SGs) in the context of neurodevelopmental disorders (NDDs). One of them [48] conducted a systematic review and qualitative synthesis on the effectiveness of digital SGs for the assessment and intervention of ADHD. The study analyzed 11 screening tools and 11 intervention programs based on non-commercial video games. Results indicated improvements in cognitive functioning and/or reductions in ADHD symptoms. Intervention studies reported high engagement levels and low dropout rates, reinforcing the benefits of video games observed in previous research. Moreover, the screening tools demonstrated effectiveness not only in distinguishing ADHD cases from controls but also in differentiating between subtypes. Similarly, [49] carried out a qualitative synthesis of 24 studies on the use of video games for the treatment of autism spectrum disorder (ASD). Although SG-based interventions were found to be effective in alleviating various ASD symptoms, the reported effect sizes were modest. As with the ADHD interventions, high engagement and low dropout rates were noted, aligning with the findings of [48].
A systematic review and qualitative synthesis on the use of digital applications and video games in interventions targeting reading difficulties was conducted recently [50]. The review analyzed 55 studies involving 33 different training programs—digital tools based on serious games or applications deployed on computers or mobile devices. The findings revealed medium to large effect sizes and notable improvements in first-language reading processes.

1.3. Theoretical Background

While digital-based tools have been widely explored for learning disabilities, a focused review on serious games offers a more pedagogically coherent and practically relevant approach, especially for dyscalculia. Serious games combine motivational engagement with adaptive feedback and contextualized learning, which are key for both early detection and intervention. Prior studies have shown their superiority over traditional instruction in terms of learning outcomes and motivation [51], as well as their potential for personalized learning [52]. Moreover, their ability to simulate mathematical reasoning in interactive environments makes them uniquely suited for identifying early signs of dyscalculia [53]. Based on the existing literature, there remains a significant gap in knowledge regarding the utility of digital solutions based on serious games for the early detection and remediation of developmental dyscalculia (DD). In contrast to the broader body of research supporting the use and effectiveness of serious games in the assessment and intervention of other neurodevelopmental disorders (NDDs) [54,55,56,57] and specific learning disorders (SLDs) [58,59], no comprehensive review has been identified that specifically addresses the application of serious games to dyscalculia. Furthermore, although some studies cited in previous reviews and meta-analyses concerning the application of digital-based solutions to neurodevelopmental and specific learning disorders aim to rigorously evaluate effectiveness, they often fail to account for key moderating variables, such as specific mathematical abilities, gamification techniques, or the type of experimental design, thereby limiting the interpretability and generalizability of their findings. This highlights a clear need for further research in this emerging field.

1.4. Objective

The primary objective of this systematic review is to evaluate, for the period spanning from 2000 to March 2025, the relevance and effectiveness of serious game (SG)-based digital solutions—including immersive technologies such as virtual reality (VR) and augmented reality (AR), designed for computers or mobile devices—as reliable tools for the screening and intervention of developmental dyscalculia (DD) in children. These digital solutions will be systematically categorized according to the technologies employed and their thematic focus, with particular attention to

The specific mathematical competencies addressed;
The gamification techniques implemented;
The configuration of the experimental trials;
The outcomes reported.

This analytical framework is intended to offer a comprehensive overview of current research practices in the field and to identify promising directions for future investigations. To this end, several guiding research questions will be formulated and systematically addressed throughout the review:

Which types of serious game-based digital tools are most utilized for the detection and/or remediation of developmental dyscalculia?
What emerging technologies (e.g., virtual reality, augmented reality) have been integrated to enhance the effectiveness of these tools?
What types of experimental trials and methodological configurations have been employed to evaluate the effectiveness of these digital tools?
What are the primary outcomes reported in these trials regarding the effectiveness of the digital tools?

2. Methods

2.1. Protocol and Registration

To ensure transparency in the review process and enhance its methodological rigor, this systematic review was conducted in accordance with the guidelines and recommendations outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [60] (see ‘Supplementary Materials’ section). Additionally, a predefined review protocol was followed and registered in the PROSPERO database prior to the initiation of the review (https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42023405915).

2.2. Eligibility Criteria

Documents meeting the following criteria were considered eligible for inclusion in this study:

Population. Children aged 5 to 12 years, diagnosed with or at risk of developmental dyscalculia, were considered. Comorbidity with other specific learning disorders (SLDs) was accepted; however, studies involving other mental disorders were excluded.
Intervention. Screening tests and interventions employing serious games, video games, or applications designed for computers or mobile devices were included. Studies incorporating immersive technologies such as virtual reality (VR) or augmented reality (AR) were also considered relevant.
Comparators. Other kinds of detection/intervention.
Outcomes. For detection: reliability. For intervention: improvements in various mathematical domain areas, engagement, and positive user and administrator experience.

2.3. Information Sources

To address the primary objective of this review (the use of serious games in the detection and remediation of developmental dyscalculia), systematic searches were conducted across six major scientific databases to ensure a comprehensive selection of relevant literature:

PubMed (https://www.ncbi.nlm.nih.gov/pubmed, accessed on 7 August 2025): A free database comprising an extensive collection of scientific publications in the fields of biomedicine and health. It includes over 38 million citations and abstracts from MEDLINE, PubMed Central (PMC), life science journals, and online books.
Web of Science (http://www.isiknowledge.com): A multidisciplinary bibliographic database indexing scientific articles from more than 22,000 high-impact journals worldwide. It provides advanced search capabilities, citation analysis, and bibliometric tools. Relevant disciplines such as computer science, mathematics, and neuroscience were included.
Scopus (https://www.scopus.com/home.uri, accessed on 7 August 2025): The largest multidisciplinary database of peer-reviewed scientific literature, covering science, technology, medicine, social sciences, and the arts and humanities. With over 100 million records, including content from MEDLINE and EMBASE, Scopus offers advanced search functionality and robust analytics tools.
ERIC (https://eric.ed.gov, accessed on 7 August 2025): The Education Resources Information Center is an authoritative database of indexed and full-text education literature and resources. It includes records for a variety of source types, with the option to specify the education level.
PsycInfo (https://www.proquest.com/psycinfo/, accessed on 7 August 2025): The preeminent psychology database, with worldwide coverage and records dating from 1800 to the present. It includes four million citations and abstracts from different kinds of publications, bearing the American Psychological Association’s seal of approval. It is an indispensable starting point for any information search, compiling high-quality, peer-reviewed literature on behavioral sciences and mental health.
IEEEXplore (https://ieeexplore.ieee.org/Xplore/home.jsp, accessed on 7 August 2025): The flagship digital platform for scientific and technical content published by the IEEE (Institute of Electrical and Electronics Engineers). It contains more than 6 million documents and additional materials from some of the world’s most cited publications in electrical engineering, computer science, and related sciences.

A growing trend has been observed in the use of computer technologies to support individuals with learning difficulties, including the development of games aimed at enhancing cognitive skills [61]. This observation supports the relevance of exploring literature on serious games and developmental dyscalculia from the year 2000 to the present.

Based on this background, systematic searches were conducted to identify studies related to the detection and remediation of learning disorders in children, published between 1 January 2000 and March 2025. Only articles published in English, the primary language of scientific communication, were considered. The search strategy included filtering by title and abstract to ensure relevance to the review objectives.

2.4. Search

To conduct the database searches, search terms were organized into three conceptual categories (see Table 1) and combined accordingly. The Context category includes various specific learning disorders (SLDs), acknowledging the frequent comorbidity among these conditions. This approach avoids limiting the search exclusively to developmental dyscalculia, thereby allowing the identification of tools that may address multiple learning disorders simultaneously.

The Environment category encompasses all types of digital solutions, including those involving emerging technologies and serious games. Finally, the Purpose category specifies the intended use of the digital tools, distinguishing between detection and intervention. With all terms defined, the basic structure of the search strategy was as follows: (“learning disorder*” OR “learning disabilit*” OR “dyslexia” OR “dysgraphia” OR “dyscalculia” OR “dyspraxia” OR “mathematics disorder*” OR “reading disorder*” OR “disorder* of written expression” OR “impairment* in reading” OR “impairment* in written expression” OR “impairment* in mathematics” OR “mathematical difficult*” OR “arithmetic difficult*”) AND (“videogam*” OR “video gam*” OR “serious gam*” OR “computer gam*” OR “app” OR “application” OR “webapp” OR “vr” OR “ar” OR “virtual reality” OR “augmented reality” OR “digital solution” OR “eye-tracking” OR “mhealth” OR “mobile app*” OR “mobile gam*” OR “ICT solution” OR “computer-based”) AND (“intervention*” OR “detection” OR “remediation*” OR “training”). Full searches for each database engine are available at Appendix A.
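The way the three categories combine can be illustrated with a minimal Python sketch. It assumes abbreviated term lists (only a few of the terms quoted above) and two hypothetical helper functions, or_group and build_query; the database-specific syntax in Appendix A may differ.

```python
def or_group(terms):
    """Quote each term and join the terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Join one OR-group per conceptual category with AND."""
    return " AND ".join(or_group(group) for group in groups)


# Abbreviated term lists for illustration only; the full lists are given in the text.
context = ["learning disorder*", "dyscalculia", "mathematical difficult*"]
environment = ["serious gam*", "virtual reality", "mobile app*"]
purpose = ["intervention*", "detection", "remediation*", "training"]

query = build_query(context, environment, purpose)
print(query)
```

A record matches only if it contains at least one term from each of the Context, Environment, and Purpose categories.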

2.5. Study Selection

Studies retrieved through the search process were initially screened to eliminate duplicates. This was accomplished using Microsoft Excel functions, followed by a manual review to identify and correct any errors that may have occurred during the initial deduplication. Subsequently, documents were filtered by publication type, with only journal articles, conference papers, and book chapters retained, resulting in a refined list of studies eligible for screening.
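The deduplication in this review was carried out with Microsoft Excel functions. As a rough programmatic analogue, the following Python sketch removes duplicates using a DOI when one is present and a normalized title otherwise. The record fields (doi, title) and the sample records are hypothetical, and the matching rule is an assumption rather than the procedure actually applied.

```python
import re


def record_key(record):
    """Return a comparison key: the lowercased DOI if present, else a normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Lowercase the title and collapse punctuation and whitespace into single spaces.
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return ("title", title)


def deduplicate(records):
    """Keep the first record seen for each key, preserving the input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records exported from two different databases.
records = [
    {"title": "A Serious Game for Number Sense", "doi": "10.1000/example.1"},
    {"title": "A serious game for number sense.", "doi": "10.1000/EXAMPLE.1"},
    {"title": "Mobile Training of Arithmetic Skills", "doi": ""},
    {"title": "Mobile training of arithmetic skills", "doi": None},
]

unique_records = deduplicate(records)
print(len(unique_records))
```

A record exported with a DOI from one database and without it from another would not be matched by this rule, which is one reason a manual check after automated deduplication remains useful.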

To ensure the inclusion of high-quality, novel, and credible research, only peer-reviewed publications were selected for this review [62]. Although systematic reviews and meta-analyses were not included in the final analysis, they were considered valuable sources for identifying additional relevant studies.

Articles that did not meet the eligibility criteria based on their title and abstract were excluded during the initial screening phase. Additionally, relevant studies identified through external references were incorporated into the selection process, resulting in a final list of articles for full-text review. These articles were then downloaded and thoroughly examined to ensure that the information provided aligned with the predefined eligibility criteria.

2.6. Data Collection Process and Data Items

Given the heterogeneity of the studies in terms of implementation protocols, participant characteristics (including sample size and distribution), and disciplinary domains, this review focused primarily on providing a qualitative synthesis of the available evidence. Nevertheless, certain outcomes were identified as valuable sources of methodological insight and impact assessment.

In relation to detection outcomes, success was defined as the ability to yield a reliable diagnosis when benchmarked against other validated detection instruments. For intervention studies, the primary outcome was the demonstration of improvement in one or more domains of mathematical competence—specifically, numerical processing and calculation—assessed through measures of accuracy and/or response time. Numerical processing encompassed tasks such as counting, transcoding, estimation, number line placement, subitizing, comparison, and problem-solving. Calculation referred to arithmetic operations, including addition, subtraction, multiplication, and division.

Given the context of serious games, data pertaining to user engagement, user experience, and usability—when evaluated—were also considered relevant and informative.

The following data were extracted from the selected studies:

Bibliographic:
○ Author/s, title, publication date, journal.
Participants:
○ Sex, age, country, number of participants.
Study:
○ Type (detection/intervention/both).
○ Used tool/s (computer-assisted, mobile, AR/VR).
○ Tasks included by area of mathematical knowledge: numerical processing and calculation.
○ Indicators: response time, accuracy.
○ Groups: experimental and/or control, participants.
○ Methodology: duration, participant’s inclusion criteria, pre–post assessment.
○ Additional information about participants and tools.
○ Main outcomes: sample size, means and standard deviations, p-values, and effect size.

2.7. Risk of Bias in Individual Studies

The Physiotherapy Evidence Database (PEDro) scale [63] was employed as an assessment tool to evaluate the methodological quality of each individual study and to determine the associated risk of bias. The PEDro scale comprises 11 items (see Table 2), each rated as “yes” or “no” depending on whether the specified criterion is met. Criterion 1 assesses the external validity or applicability of the study, while criteria 2 through 9 evaluate the internal validity of the trial. Criteria 10 and 11 address whether sufficient statistical information is provided to allow for an appropriate interpretation of the results.

The PEDro scale is a validated and widely recommended instrument for assessing the quality of Randomized Controlled Trials (RCTs) and Controlled Clinical Trials (CCTs). As such, it offers valuable insights for comparing methodologies and determining the overall quality of the included studies.
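A minimal Python sketch of how a PEDro total can be tallied from the yes/no ratings is given below. It follows the standard PEDro convention, which is not spelled out in the text above, that criterion 1 is reported but excluded from the total, so scores range from 0 to 10. The function name and the example ratings are hypothetical.

```python
def pedro_score(ratings):
    """Count the criteria rated "yes" among items 2-11.

    ratings maps each criterion number (1-11) to True ("yes") or False ("no").
    Assumption: criterion 1 (external validity) is not counted in the total,
    as in the standard PEDro scoring convention.
    """
    if set(ratings) != set(range(1, 12)):
        raise ValueError("ratings must cover criteria 1 through 11")
    return sum(1 for criterion in range(2, 12) if ratings[criterion])


# Hypothetical study: every criterion met except criteria 5, 6, and 7.
example_ratings = {criterion: True for criterion in range(1, 12)}
for unmet in (5, 6, 7):
    example_ratings[unmet] = False

example_score = pedro_score(example_ratings)
print(example_score)
```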

2.8. Data Extraction and Synthesis

A total of 7799 studies were initially retrieved through the database search process. Following deduplication, 7150 unique records remained, with 649 duplicates removed. These records were subsequently filtered by publication type, resulting in 5362 studies eligible for the screening phase. After applying the screening criteria, 202 articles were selected. An additional 12 documents were incorporated based on references cited in the reviewed articles, previously published reviews, and the authors’ prior work in the field of developmental dyscalculia (DD), yielding a total of 214 articles for full-text review.

Upon full-text evaluation, 193 articles were excluded due to one or more of the following reasons: lack of relevance to video games or computer-based environments, focus on learning disabilities other than dyscalculia, or insufficient documentation of methods, tools, and outcomes. Ultimately, 21 articles met the inclusion criteria and were selected for qualitative synthesis. The complete selection process is illustrated in Figure 1.
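The record counts reported in the two preceding paragraphs can be cross-checked with simple arithmetic. The short Python sketch below only restates those figures; the variable names are illustrative.

```python
# Counts as reported in the text.
retrieved = 7799
duplicates_removed = 649
unique_records = retrieved - duplicates_removed

selected_at_screening = 202
added_from_references = 12
full_text_reviewed = selected_at_screening + added_from_references

excluded_at_full_text = 193
included_in_synthesis = full_text_reviewed - excluded_at_full_text

print(unique_records, full_text_reviewed, included_in_synthesis)
```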

The selection process—including title and abstract screening as well as full-text review—was conducted independently by three authors. Any discrepancies were resolved through discussion and consensus. During the screening phases, studies were categorized as either “included” or “excluded.”

In the first phase, 5362 articles were screened. Consensus was reached on the exclusion of 3544 articles and the inclusion of 180, leaving 63 articles requiring further deliberation. Following discussion, an additional 22 articles were included. In the second phase, 214 full-text articles were reviewed. Of these, 178 were excluded and 18 were included by direct consensus, while 18 required further analysis. After discussion, 3 additional articles were included.

Studies were excluded based on the following criteria:

The intervention did not involve video games or serious games.
The study was not related to developmental dyscalculia (DD).
The study provided insufficient detail regarding key aspects, including methodology (e.g., study design, participant groups, indicators, mathematical tasks), tools (e.g., type, technological platform), and outcomes (e.g., sample size, effect size, reliability).

To evaluate the consistency of the selection process among reviewers, inter-rater reliability was assessed. Given that more than two reviewers participated, Fleiss’ Kappa (FK) was used as the appropriate statistical measure. For the first screening phase (title and abstract), a Fleiss’ Kappa value of 0.86 was obtained, while the second phase (full-text review) yielded a value of 0.81. According to the interpretative guidelines proposed by [64], these values indicate a high level of agreement, thereby supporting the reliability and methodological rigor of the selection process.
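For readers unfamiliar with the statistic, the following is a minimal sketch of how Fleiss' Kappa can be computed from a table of rating counts. It is not the authors' analysis script: the function name and the toy ratings are hypothetical, and the example assumes three reviewers sorting each record into "included" or "excluded".

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table of category counts.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement, from the overall proportion of ratings per category.
    n_categories = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_categories)]
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy data: 3 reviewers, 4 records, columns = (included, excluded).
toy_ratings = [[3, 0], [0, 3], [2, 1], [0, 3]]
kappa = fleiss_kappa(toy_ratings)
```

In the toy data, the single record on which one reviewer disagrees is enough to bring the statistic well below 1, which illustrates why values above 0.80 are read as strong agreement.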

3. Results

3.1. Study Characteristics

This review aims to identify and analyze current trends in the detection and remediation of developmental dyscalculia (DD) in childhood through digital interventions based on serious games. The following sections present a series of classifications applied to the reviewed studies, organized to facilitate a comprehensive understanding of the methodologies, technologies, and outcomes associated with these digital approaches.

3.1.1. Participants Data

Table 3 presents the demographic characteristics of the participants included in the selected studies. In total, 1410 children participated across the various trials. Based on the available data, 53.51% of the participants were female, with ages ranging from 4 to 12.6 years. Although the target population for this review was defined as children aged 5 to 12 years, two studies [30,65] included participants outside this age range. As the proportion of such participants was not specified, these studies were retained in the analysis.

Regarding group allocation, 288 children were assigned to control groups, while 1122 participated in experimental groups. The studies were conducted in a diverse range of countries: three in Finland [18,41,66], three in Malaysia [40,44,67], two in Switzerland [33,35], two in Germany [34,36], and two in Sri Lanka [28,29]. Additionally, one study was conducted in each of the following countries: Brazil [68], China [69], Ecuador [45], France [70], India [30], Italy [71], Portugal [42], Sweden [43], and Trinidad [65].

Table 3. Demographic characteristics and group allocation across included studies.

Study	Demographic Data						Group Distribution
Authors	Year	Country	Sample (N)	Mean Age (SD)	Age Range	Sex (F/M)	CG	EG
Ariffin et al. [40]	2019	Malaysia	7	NA	6–10	NA	–	7
Aunio and Mononen [41]	2017	Finland	22	5.7 (0.425)	NA	15/7	Active: 8 Passive (reading): 7	7
Avila-Pesantez et al. [45]	2018	Ecuador	40	7.95 (0.81)	7–9	22/18	20	20
Cheng et al. [69]	2019	China	78	EG: 9.53 (0.73) CG: 9.55 (0.85)	NA	29/49	38	40
De Castro et al. [68]	2014	Brazil	26	8.12 (NA)	7–10	16/10	13	13
Ferraz et al. [42]	2017	Portugal	45	9.18 (NA)	8–10	22/23	–	22 F/23 M
Gunasekare et al. [28]	2024	Sri Lanka	420	NA	8–10	NA	–	420
Hallstedt et al. [43]	2018	Sweden	283	8.25 (0.33)	NA	142/141	52	EG1: 78 EG2: 76 EG3: 77
Kariyawasam et al. [29]	2019	Sri Lanka	50	NA	NA	NA	–	50
Käser et al. [33]	2013	Switzerland	41	EG1: 9.96 (1.35) EG2: 9.98 (1.33)	7–12	27/14	–	EG1: 13 F/7 M EG2: 14 F/7 M
Kohn et al. [34]	2020	Germany	67	CG: 8.98 (0.88) EG: 8.94 (0.77)	7–10	49/18	33	34
Kucian et al. [35]	2011	Switzerland	32	CG: 9.5 (1.1) EG: 9.5 (0.8)	8–10	19/13	9 F/7 M	10 F/6 M
Kuhn and Holling [36]	2014	Germany	59	9 (0.7)	7–9	32/27	20	EG1: 19 EG2: 20
Mohd Syah et al. [67]	2016	Malaysia	50	7 (NA)	7	NA	25	25
Mukherjee et al. [30]	2024	India	15	NA	4–6	NA	–	15
Räsänen et al. [18]	2009	Finland	59	CG: 6.56 (0.275) EG1: 6.61 (0.267) EG2: 6.48 (0.325)	6–7	27/32	16 F/13 M	EG1: 5 F/10 M EG2: 6 F/9 M
Re et al. [71]	2020	Italy	PRI: 31 SEC: 26	PRI: 8.975 (NA) SEC: 11.06 (NA)	PRI: 8.2–11.6 SEC: 10.3–12.6	PRI: 20/11 SEC: 16/10	PRI: 9 F/5 M SEC: 8 F/5 M	PRI: 11 F/6 M SEC: 8 F/5 M
Rohizan et al. [44]	2020	Malaysia	3	7.67 (2.082)	6, 7, 10	NA	–	3
Salminen et al. [66]	2015	Finland	17	EG1: 6.68 (0.38) EG2: 6.53 (0.34)	EG1: EG2	6/11	–	EG1: 2 F/7 M EG2: 4 F/4 M
Walcott and Romain [65]	2019	Trinidad	30	NA	4–6	NA	–	30
Wilson et al. [70]	2006	France	9	8.1 (NA)	7–9	–	–	9

Note. Experimental groups are denoted as EG and control groups as CG. The abbreviations PRI and SEC refer to primary and secondary school levels, respectively. When specific information is not available, it is indicated as NA (not available). A dash (–) in the group column indicates that the group was not included in the study.
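The aggregate figures quoted for Table 3 can be reproduced from its columns. The sketch below copies the "Sample (N)" and "Sex (F/M)" values from the table; studies reporting NA or a dash for sex are left out of the sex calculation, which is assumed to be how the reported percentage was obtained.

```python
# Sample sizes from the "Sample (N)" column of Table 3, in row order.
# Re et al. [71] contributes two entries (primary: 31, secondary: 26).
sample_sizes = [7, 22, 40, 78, 26, 45, 420, 283, 50, 41, 67,
                32, 59, 50, 15, 59, 31, 26, 3, 17, 30, 9]

# (female, male) pairs from the "Sex (F/M)" column; NA and dash rows omitted.
sex_counts = [
    (15, 7),     # [41]
    (22, 18),    # [45]
    (29, 49),    # [69]
    (16, 10),    # [68]
    (22, 23),    # [42]
    (142, 141),  # [43]
    (27, 14),    # [33]
    (49, 18),    # [34]
    (19, 13),    # [35]
    (32, 27),    # [36]
    (27, 32),    # [18]
    (20, 11),    # [71], primary
    (16, 10),    # [71], secondary
    (6, 11),     # [66]
]

total_participants = sum(sample_sizes)
females = sum(f for f, _ in sex_counts)
with_sex_data = sum(f + m for f, m in sex_counts)
percent_female = round(100 * females / with_sex_data, 2)

print(total_participants, percent_female)  # prints: 1410 53.51
```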

3.1.2. Trial Design and Configuration

The first classification applied to the selected studies is based on the primary purpose of the digital tools evaluated: detection, remediation, or both. According to this criterion, one study focused exclusively on detection [28], while 18 studies concentrated on intervention [18,33,34,35,36,40,41,42,43,44,45,65,66,67,68,69,70,71]. Two studies addressed both detection and remediation [29,30].

Notably, two studies [28,29] incorporated machine learning algorithms to support screening assessments. These algorithms enhanced the detection process by improving predictive accuracy and reliability, and in the case of [29], also contributed to the personalization and adaptability of the intervention.

Detailed information about each study and the corresponding digital tools is provided in Table 4.

Table 4. Studies’ characteristics.

Study	Detection/Intervention Parameters			Methodology
Authors	Type	Mathematical Knowledge *	Technology (Tool Name)	Indicator	Pre-Test	Post-Test	Sessions Number	Duration (Mins)
Ariffin et al. [40]	I	ADD, MR, SUB	Mobile Game (Calculic Kids prototype)	ACC	Y	Y	NA	NA
Aunio and Mononen [41]	I	ADD, CMP, NUM, SEQ, TRN	Mobile Game, iOS (LolaPanda)	ACC	Y	Y	3 weeks daily sessions	15
Avila-Pesantez et al. [45]	I	BC, BG, MR, SEQ	AR Serious Game (ATHYNOS)	ACC RT	NA	NA	2 weekly sessions 4 weeks	15
Cheng et al. [69]	I	NC, SUB	Web-based system	ACC	Y	Y	Once per day, 8 days	15
De Castro et al. [68]	I	BC, QUA, RQ, SV, WM, WQ	Web-based virtual environment (contains 18 computer games)	ACC	Y	Y	Twice a week for 5 weeks	60
Ferraz et al. [42]	I	DIR, LAT, MEA (HG, WG, T), ORI, QUA, SZ	Android Mobile Game (disMAT)	ACC RT	NA	NA	NA	NA
Gunasekare et al. [28]	D	NA	Android Mobile Game (CalcPal)	ACC	NA	NA	NA	NA
Hallstedt et al. [43]	I	ADD, MT, SUB	Mobile Game (iOS, Android) All tests, except Ravens, were conducted on tablet (iPad2)	ACC	Y	Y	EG1: 52.2 days EG2: 56.1 days EG3: 56.6 days	EG1: 20 EG2: 20 EG3: 30
Kariyawasam et al. [29]	D/I	ADD, CMP (NUM), CNT (NUM)	Mobile Game + ML (Pudubu)	ACC RT	NA	NA	NA	NA
Käser et al. [33]	I	ADD, NL, SBT, SEQ (ORD), SUB, TRN	Computer Video Game	ACC RT	Y	Y	5 weekly sessions EG1: 12 weeks EG2: 6 weeks	20
Kohn et al. [34]	I	ADD, CMP, DIV, EST, MUL, NL, SBT, SUB, TRN	Computer Video Game (Calcularis)	ACC RT	Y	Y	4–5 weekly sessions 12 weeks	20
Kucian et al. [35]	I	ADD, EST, NL, ORD, SUB, TRN	Computer Based (Rescue Calcularis)	ACC RT	Y	Y	5 weekly sessions 5 weeks	15
Kuhn and Holling [36]	I	BC, CMP, NL, SM (SEQ, POS), TRN	Computer based + Mobile Game (Talasia Meister Cody)	ACC RT	Y	Y	5 weekly sessions 3 weeks	20
Mohd Syah et al. [67]	I	ADD, CNT, SUB, TRN	Computer Video Game (MathACE)	ACC	Y	Y	5 days	60
Mukherjee et al. [30]	D/I	CC, CL, CNT	Computer based (CountCandy)	ACC	NA	NA	NA	NA
Räsänen et al. [18]	I	ADD, CMP, EST, NL, SBT, SUB, TRN	Computer Based (Graphogame Math & NumberRace)	ACC RT	Y	Y	5 weekly sessions 3 weeks	10–15
Re et al. [71]	I	MUL, ADD, SUB, DIV, TRN *	Web App (multiplatform) (I Bambini Contano)	ACC RT *	Y	Y	30 sessions 1 month	PSY: 60 APP: 15
Rohizan et al. [44]	I	ADD, DIV, MUL, SUB	Mobile Game (MathFun)	NA	NA	NA	NA	NA
Salminen et al. [66]	I	BC, CMP, EST, CNT, NC, SBT, TRN	Computer Based (Graphogame Math & NumberRace)	ACC RT	Y	Y	3 weeks daily sessions (12–15 sessions)	10–15
Walcott and Romain [65]	I	ADD, SUB	Video Game	ACC	Y	Y	5 weekly sessions 2 weeks	15
Wilson et al. [70]	I	ADD, CMP, CNT, SBT, SUB	Computer Video Game (The Number Race)	ACC RT	Y	Y	4 weekly sessions 4 weeks	30

Note. Abbreviations: ACC = accuracy; ADD = addition; APP = application/game intervention group; BC = basic calculation (BC = ADD + SUB); BG = basic geometry; CC = calculation; CL = coloring; CMP = comparison; CNT = counting; D = detection; DIR = direction; DIV = division; EG = experimental group; EF = executive function; EST = estimation; I = intervention; LAT = laterality; MEA = measures; MR = mathematical reasoning; MT = missing term; MUL = multiplication; NA = information not provided; NC = numerosity comparison; NL = number line; NUM = numbers; ORD = ordinality; ORI = orientation; POS = position; PRI = primary school; PS = problem solving; PSY = psychologists group; QUA = quantities; RQ = reading quantities; RT = response time; SBT = subitizing; SEC = secondary school; SEQ = sequences; SZ = size; SV = spatial visualization; SUB = subtraction; TRN = transcoding; WM = working memory; WQ = writing quantities. * Indicates that the item is not used in the game.

3.1.3. Detection

Ref. [29] developed a set of predictive methods for screening evaluations aimed at identifying various specific learning disabilities (SLDs), including developmental dyscalculia (DD), through the integration of machine learning (ML) algorithms and deep learning (DL) techniques. Their dyscalculia assessment tool collected data on response times and accuracy across three mathematical tasks: counting, number comparison, and basic arithmetic (addition). This dataset enabled the development of a predictive model using six input features for a Support Vector Machine (SVM) classifier with a radial basis function (RBF) kernel, trained on a sample of 100 children (50 diagnosed with DD). The model achieved a predictive accuracy of 90%, correctly identifying 18 out of 20 previously diagnosed cases within the test subset. Comparable performance was observed for other SLD screening tools: 85% accuracy for the letter dysgraphia screener (17/20 detected), 90% for the numeric dysgraphia screener (18/20 detected), and 65% for the dyslexia screener (13/20 detected), the latter representing the lowest performance among the evaluated tools.
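The percentages reported for [29] follow directly from the detected/total counts given above, as the sketch below shows. The function and dictionary names are illustrative. If each test subset consisted only of previously diagnosed children, which the wording suggests but does not state outright, this quantity corresponds to what is usually called sensitivity rather than overall accuracy.

```python
def detection_rate(detected: int, total: int) -> float:
    """Percentage of previously diagnosed cases flagged by a screener."""
    if total <= 0 or not 0 <= detected <= total:
        raise ValueError("need total > 0 and 0 <= detected <= total")
    return 100.0 * detected / total


# (detected, total) counts as reported in the text for each screening tool.
screeners = {
    "dyscalculia": (18, 20),
    "letter dysgraphia": (17, 20),
    "numeric dysgraphia": (18, 20),
    "dyslexia": (13, 20),
}

rates = {name: detection_rate(d, n) for name, (d, n) in screeners.items()}
```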

On the other hand, ref. [28] analyzed a dyscalculia risk screening tool that integrates serious games (SGs) with multiple ML models tailored to different DD subtypes. The tool collected accuracy data from children’s gameplay and applied distinct algorithms for each subtype. After training and testing, the models demonstrated high detection performance: operational and ideognostic subtypes were identified with 95% and 98% accuracy using SVM; graphical and practognostic subtypes with 92% and 91% accuracy using Random Forest (RF) and Extreme Gradient Boosting (XGB), respectively; verbal and lexical subtypes with 98% and 94% accuracy using RF and Gradient Boosting (GB); and sequential and visuospatial subtypes with 96% accuracy using XGB.

Recently, ref. [30] evaluated a mobile SG designed to assess DD risk, employing a fuzzy logic-based ML model. The tool included activities involving color detection, counting, and calculation, which were used not only for screening but also to monitor progression across tasks. However, no quantitative outcomes were reported in the study.

3.1.4. Intervention

a.: Main indicators of intervention

Accuracy and response time were the primary indicators used to evaluate session outcomes in 11 studies [18,29,33,34,35,36,42,45,66,70,71]. Eight studies relied exclusively on accuracy as the outcome measure [30,40,41,43,65,67,68,69], while one study did not report any performance indicators [44]. It is worth noting that although [71] employed both accuracy and response time, the intervention was divided into two components: one conducted by professionals, and another implemented at home via a web-based game. For the latter, only accuracy was considered in the analysis.

b.: Sessions’ duration

Regarding the duration of individual intervention sessions, the shortest reported durations ranged between 10 and 15 min in two studies [18,66]. The most commonly reported durations fell within the 15 to 20 min range. Specifically, sessions lasted approximately 15 min in six studies [35,41,45,65,69,71], and 20 min in four studies [33,34,36,43]. One study reported sessions of up to 30 min [70], while two studies involved sessions lasting up to 60 min [67,68]. No relevant information regarding session duration was provided in the remaining studies [29,30,40,42,44].

c.: Total duration of the intervention period

The duration of the interventions represents a relevant variable, with reported lengths ranging from 1 to 12 weeks. A predominant pattern among the studies was the implementation of sessions five times per week [18,33,34,35,36,43,65,66,67]. Other studies adopted less intensive schedules, including two sessions per week [45,68], three sessions per week [41], and four sessions per week [70]. Additionally, some interventions were designed with daily sessions over shorter periods, such as eight consecutive days [69] or a continuous one-month period [71].

d.: Mathematical domains and specific concepts evaluated across the selected studies

The interventions described in the analyzed studies encompassed a variety of activities targeting key mathematical concepts. The most frequently implemented activities included addition (17 studies), subtraction (14), transcoding (9), comparison (7), number line tasks (5), and subitizing (4). Of the 21 studies reviewed, 20 incorporated two or more of these activities, and in 15 studies, four or more were included. This reflects a multidimensional approach to mathematical skill development across the interventions.

Table 4 provides a detailed overview of each study’s configuration, including the specific methodologies and mathematical activities employed.

e.: Methods and measures used to assess the effectiveness of the training interventions

Several studies incorporated pre- and post-intervention assessments as part of their methodological design. These assessments were used to evaluate the effectiveness of the interventions in improving mathematical performance or related cognitive skills. A summary of the pre- and post-test instruments employed across the studies is presented in Table 5.

Ten studies employed pre- and post-intervention assessments and reported effect sizes to quantify the impact of the interventions. Ref. [41] calculated effect size using Pearson’s correlation coefficient (r), reporting a significant group-level improvement in the intervention group (r = 0.59). Ref. [69] reported effect sizes of 0.60 for subtraction, 0.39 for numerosity comparison, and 0.70 for figure matching between pre- and post-tests.

Ref. [43] observed significant improvements in addition (0–12), subtraction (0–12), and subtraction (0–18) tasks, particularly when data from different groups were combined. The resulting effect sizes at post-test were classified as medium. Ref. [33] used a repeated measures general linear model and reported medium (0.39) to large (0.52) effect sizes for subtraction, medium effect sizes for addition and number line tasks (range 0–10), and no significant effects for estimation or comparison.

Ref. [34] assessed the group × time interaction using partial eta squared (η²), reporting medium effect sizes for arithmetic operations (η² = 0.12), number line linearity (η² = 0.08), and one-digit comparison (η² = 0.10). Ref. [36] used Cohen’s d to evaluate both between-group and within-group differences. They found medium to large effect sizes for the number sense (NS) group (d = 0.54) and the working memory (WM) group (d = 0.57) compared to controls. Within-group improvements were also observed: d = 0.46 (WM) and d = 0.38 (NS) for core mathematics, and d = 0.35 (NS) and d = 0.51 (WM) for spatial working memory.
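As a reminder of what these two metrics measure, the sketch below computes Cohen's d with a pooled standard deviation and partial eta squared from sums of squares. These are the standard textbook definitions, not code from the cited studies, and all scores and sums of squares are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def partial_eta_squared(ss_effect, ss_error):
    """Share of effect-plus-error variance attributable to the effect."""
    return ss_effect / (ss_effect + ss_error)


# Hypothetical post-test scores for a trained group and a control group.
trained = [2, 4, 6]
control = [1, 3, 5]
d = cohens_d(trained, control)

# Hypothetical sums of squares for a group x time interaction.
eta_p2 = partial_eta_squared(12.0, 88.0)
```

With these toy numbers the mean difference is one point and the pooled standard deviation is two points, so d equals 0.5. An interaction sum of squares of 12 against an error sum of squares of 88 gives a partial eta squared of 0.12.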

Ref. [18] calculated effect sizes using Hedges’ g and reported medium effects for verbal counting (g = 0.20) and subitizing (g = 0.32), and a large effect for number comparison (g = 0.52) in the GraphoGame-Math (GG-M) group. The Number Race (NR) group showed medium effects for number comparison (g = 0.36), subitizing (g = 0.29), and arithmetic (g = 0.22), with an average effect size of 0.44 across all groups.
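Hedges' g is commonly obtained by multiplying Cohen's d by a small-sample correction factor. The sketch below uses the usual approximation J = 1 − 3/(4·df − 1), with df = n₁ + n₂ − 2. Whether [18] used exactly this approximation is an assumption, and the input values are hypothetical.

```python
def hedges_g(d: float, n_a: int, n_b: int) -> float:
    """Apply the approximate small-sample correction factor J to Cohen's d."""
    df = n_a + n_b - 2
    if df < 1:
        raise ValueError("need at least three participants in total")
    j = 1.0 - 3.0 / (4.0 * df - 1.0)
    return d * j


# Hypothetical example: d = 0.5 with two groups of three participants each.
g_small = hedges_g(0.5, 3, 3)

# The correction shrinks as groups grow (hypothetical d, larger groups).
g_larger = hedges_g(0.5, 15, 29)
```

The correction matters most for very small groups, which is relevant here because several of the included studies had fewer than 20 participants per condition.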

Ref. [71] analyzed training effects in both primary and secondary school students, though only data from the primary group were considered. Significant main effects of time were observed for arithmetic facts (ES = 0.40) and written calculation (ES = 0.26). Interaction effects (time × group) were also significant for mental calculation accuracy (ES = 0.16), arithmetic facts (ES = 0.42), and written calculation (ES = 0.12), all within the moderate range.

Ref. [66] reported significant intervention effects for verbal counting, dot counting fluency, and basic arithmetic using GraphoGame Math (GGM) and Number Race (NR), with large effect sizes at the group level (r = 0.46, 0.52, and 0.63, respectively). Ref. [65] observed improvements in addition and subtraction, reporting medium (α = 0.5) and large (α = 0.8) effect sizes, respectively, using Cronbach’s alpha.

Two additional studies reported performance improvements using percentage gains rather than standardized effect sizes. Ref. [40] found that 85.71% of participants improved post-intervention, while [67] reported a 57.9% improvement in the intervention group compared to 21% in the control group, with specific gains of 21% in addition and 37% in subtraction.

The remaining studies did not employ pre/post assessments or did not report effect size metrics.

f.: Cognitive assessment

In addition to mathematical skills, several studies assessed other cognitive domains using pre- and post-intervention evaluations. Ref. [34] included measures of spatial working memory and non-verbal intelligence, the latter assessed using the Culture Fair Intelligence Test (CFT 20-R). Estimated IQ was evaluated using Raven’s Progressive Matrices in [43,69]. Ref. [35] administered verbal and performance subtests from the Wechsler Intelligence Scale for Children—Third Edition (WISC-III), as well as the Corsi Block-Tapping Test to assess spatial working memory.

Ref. [70] also used the WISC-III to determine eligibility for inclusion in the study sample. In [18], verbal working memory was assessed using the repetition of nonsense words task from the NEPSY (Developmental Neuropsychological Assessment) battery, while visuospatial working memory was evaluated using the Corsi Block-Tapping Test, which was also employed in [66]. Additionally, ref. [66] included a Rapid Automatized Naming (RAN) task involving color naming.

g.: User experience

In addition to performance assessments using neuropsychological measures, the evaluation of usability and user experience was addressed in two studies. Ref. [40] employed two brief surveys—comprising four to five questions—administered to both teachers and participants to gather feedback on the intervention. Similarly, ref. [44] assessed usability through a combination of direct observation and structured questionnaires, consisting of eight items for children and four for teachers. These approaches provided complementary insights into the acceptability and practical implementation of the digital tools in educational settings.

h.: Trial type (RCT/nonRCT)

Eleven studies employed control groups alongside one or more experimental groups [18,34,35,36,41,43,45,67,68,69,71]. Ten of these reported random assignment of participants to each condition and were therefore classified as Randomized Controlled Trials (RCTs). Although [35] included a control group, it was not classified as an RCT due to the absence of reported information regarding the randomization procedure. The remaining studies could not be classified under this framework, as they did not provide sufficient methodological details to determine the presence or absence of random assignment.

3.1.5. Studies by Technology

This review aims to evaluate the relevance of digital solutions based on serious games as tools for the detection and remediation of developmental dyscalculia (DD) in childhood. In this subsection, particular attention is given to the technological platforms and implementations used in the selected studies.

As shown in Table 6, two primary platforms were identified for the deployment of the games: thirteen studies implemented their interventions on computers, most commonly using the Windows operating system [18,30,33,34,35,36,45,65,66,67,68,69,70], while eight studies utilized mobile platforms [28,29,36,40,41,42,43,44]. Two games were implemented as multiplatform solutions: I Bambini Contano [71], which was web-based, and The Number Race, used in two studies [18,70], developed in Java for desktop environments.

Among the various games employed, two computer-based interventions previously validated for children with dyscalculia stood out: Rescue Calcularis and The Number Race. The full version of The Number Race was used in both of the studies in which it appeared. In contrast, Rescue Calcularis was used in three studies with different configurations: the original version in [35], a reduced version including only core components in [33], and an updated version (2.0) in [34].

3.1.6. Studies’ Main Outcomes

A comprehensive summary of the main outcomes reported in each study is presented in Table 7. This synthesis is intended to provide readers with valuable insights into the results and their implications for the field of research on digital interventions for developmental dyscalculia.

3.2. Risk of Bias Within Studies

To evaluate the risk of bias in the included studies, each was assessed using the PEDro scale. Based on the scores obtained, studies were categorized as having high risk (scores below 4), moderate risk (scores of 4–5), low risk (scores of 6–8), or very low risk (scores of 9–10), following the criteria established by [72]. Given that the objective of this review was not to examine the validity of the methodologies employed in the selected studies, only items 2 through 9 of the PEDro scale were considered for determining the risk of bias.
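The score bands described above translate into a simple lookup. The following sketch encodes the thresholds exactly as stated in this paragraph; the function name is illustrative.

```python
def risk_of_bias(pedro_score: int) -> str:
    """Map a PEDro score to the risk category bands described in the text."""
    if not 0 <= pedro_score <= 10:
        raise ValueError("PEDro scores are expected to lie between 0 and 10")
    if pedro_score < 4:
        return "high"
    if pedro_score <= 5:
        return "moderate"
    if pedro_score <= 8:
        return "low"
    return "very low"


# Example: categorize every possible score.
categories = {score: risk_of_bias(score) for score in range(11)}
```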

The studies with the lowest risk of bias included ten randomized controlled trials (RCTs) [18,34,36,41,43,45,67,68,69,71] and one non-RCT [33]. Five studies were classified as having a moderate risk of bias [29,40,42,66,70], while four studies [28,30,44,65] were identified as having a high risk of bias, particularly [30] and [65], due to insufficient reporting.

Regarding individual PEDro items, none of the studies reported blinding of therapists, and only one study [67] included information on assessor blinding. All studies, except [28,30], confirmed baseline group similarity, typically by reporting whether participants were diagnosed with or at risk for dyscalculia or other learning disabilities. Outcome measures for at least one key variable were obtained from more than 85% of the originally assigned participants in all studies except three of them [30,33,65]. Additionally, participants for whom outcomes were reported received the intervention or control condition as allocated, or an intention-to-treat approach was applied, except for [65].

For all remaining items, positive responses were consistently observed in the RCTs, which aligns with the PEDro scale’s suitability for evaluating randomized trials. Table 8 presents the detailed PEDro scores for each individual study. In summary, the overall risk of bias across the selected studies ranged from low to moderate.

4. Discussion

4.1. Summary

This systematic review aims to provide a comprehensive overview of published research employing digital solutions based on serious games for the detection and remediation of developmental dyscalculia in childhood. It seeks to present the current state of the art in this field by analyzing and synthesizing existing evidence to identify key strengths, limitations, and emerging trends. The ultimate goal is to support researchers in designing and conducting future studies on this topic.

4.1.1. Trial Design and Configuration

Methodological analysis revealed considerable variability in trial designs across the included studies. A larger proportion of studies focused on intervention rather than detection of developmental dyscalculia (DD), likely due to the constraints of diagnostic tools, which require rapid, distraction-free assessment of specific cognitive functions. These conditions constrain the applicability of gamification within serious games (SGs) in screening contexts. Consequently, SG-based digital tools remain underutilized for the detection of developmental dyscalculia (DD).

Future research should explore how SGs can be adapted for early identification of mathematical learning difficulties. The predominance of intervention over screening is consistent with findings in other neurodevelopmental disorders. For instance, although video game-based assessment tools, often incorporating virtual or augmented reality (VR/AR), have been developed for ADHD [48], most studies have prioritized intervention over screening [54], supporting the role of SGs as digital therapeutic tools for children with ADHD [56].

Similarly, in developmental dyslexia, only a limited number of studies (n = 4) have examined SGs for identifying literacy difficulties [73], while the majority have focused on intervention. These trends highlight the need for further investigation into the diagnostic potential of SGs across neurodevelopmental conditions.

Average Duration and Frequency of Intervention Sessions Across Studies

Regarding the intervention protocols, most studies implemented sessions lasting between 10 and 20 min, conducted 4 to 5 times per week over a period ranging from 2 to 12 weeks. These configurations are broadly consistent with the recommendations outlined by [74] within the Response-to-Intervention (RTI) Tier 2 framework. According to this model, effective interventions for learning disabilities should be explicit and systematic, delivered 3 to 5 days per week, with each session lasting at least 20 min over a minimum duration of 5 weeks.
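To make the comparison with the RTI Tier 2 recommendations concrete, the sketch below checks a protocol against the minimums summarized in this paragraph (at least 3 days per week, at least 20 min per session, at least 5 weeks). Treating the upper bound of 5 days per week as non-binding is an assumption made here, and the class and function names are hypothetical. The two example protocols use the values listed for [34] and [35] in Table 4.

```python
from dataclasses import dataclass


@dataclass
class Protocol:
    sessions_per_week: float
    minutes_per_session: float
    weeks: float


def meets_rti_tier2(p: Protocol) -> bool:
    """Check a protocol against the minimums summarized in the text."""
    return (
        p.sessions_per_week >= 3
        and p.minutes_per_session >= 20
        and p.weeks >= 5
    )


# Values taken from Table 4 (lower bound of "4-5 weekly sessions" used for [34]).
kohn_2020 = Protocol(sessions_per_week=4, minutes_per_session=20, weeks=12)
kucian_2011 = Protocol(sessions_per_week=5, minutes_per_session=15, weeks=5)

kohn_ok = meets_rti_tier2(kohn_2020)
kucian_ok = meets_rti_tier2(kucian_2011)
```

Under these thresholds the protocol in [34] satisfies all three minimums, whereas the 15 min sessions in [35] fall short on session length only. This is why the reviewed configurations are described as broadly, rather than fully, consistent with the framework.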

Mathematical Domains and Specific Concepts Evaluated Across the Selected Studies

Developmental dyscalculia (DD) is a complex specific learning disorder that involves multiple cognitive functions and skills. The most frequently targeted activities in the reviewed interventions—addition, subtraction, transcoding, comparison, number line tasks, and subitizing—are foundational for mathematical development. These tasks address both symbolic (e.g., counting, Arabic numerals, verbal representations) and non-symbolic processing (e.g., magnitude comparison, object transformations) [6].

Given that early numeracy performance is a reliable predictor of later mathematical difficulties [75], the inclusion of these activities in intervention protocols provides a theoretically grounded basis for improving outcomes.

Methods and Measures Used to Assess the Effectiveness of the Training Interventions

Due to the heterogeneity of the selected studies—reflected in the diversity of experimental designs and the tools used to assess participants’ baseline and progress—it is particularly relevant to focus on two key aspects: the use of pre- and post-tests, and the reporting of effect sizes (statistical data), alongside the main outcomes described in each study (subjective data). This dual approach can offer valuable insights for future research.

On one hand, the use of pre- and post-tests was observed in the majority of the studies, which supports the reliability of the reported results. These tests allow researchers to quantify improvements in various cognitive domains targeted by the interventions. A robust method for assessing such improvements is through the calculation of effect size (ES) [76,77]. However, only 10 out of the 21 studies included in this review reported effect size values. Among these, the average ES was medium, suggesting a moderate impact of the interventions.

On the other hand, the main outcomes reported by the studies also provide meaningful qualitative evidence. The interventions demonstrated promising results, consistent with previous findings in the field: enhanced cognitive skills, improved understanding of mathematical concepts, and better performance in related tasks, as indicated by higher accuracy rates and faster response times. Specific improvements were noted in arithmetic operations and number sense. Nevertheless, three studies reported no significant gains in certain mathematical abilities, highlighting the variability in intervention effectiveness.

Overall, the findings of this review align with those of similar studies on the use of serious games for learning difficulties. For instance, [37] reported a moderate but statistically significant positive effect of digital interventions on mathematics achievement, with a mean effect size of approximately 0.55. Comparable trends have been observed in research on other learning disabilities. In the case of dyslexia, [50] found that 62% of the studies reviewed reported large or very large effect sizes. Although reviews on ADHD [48] and ASD [49] did not consistently report effect sizes, they did highlight cognitive improvements associated with video game-based interventions. Collectively, these findings support the growing consensus that serious games can serve as effective tools for both the detection and intervention of learning difficulties, although further research is warranted to consolidate these results.

User Experience

In addition to their cognitive and motivational benefits, serious games—particularly those developed for mobile platforms—must ensure optimal gameplay and user experience to maximize their effectiveness. When targeting pediatric populations, these applications require high levels of accessibility, simplicity, and intuitive interaction. The interface and interaction design must be tailored to the developmental characteristics of children, facilitating engagement through a user-centered approach that enhances usability and maintains motivation throughout the intervention.

Despite the critical role of User Experience (UX) and User Interface (UI) design in digital interventions, this dimension is largely underexplored in the reviewed literature. Only two of the analyzed studies explicitly addressed UX/UI considerations in the design or evaluation of their applications. This limited attention to usability factors represents a significant gap, as poor interface design can hinder user engagement and compromise the efficacy of the intervention. Future research should systematically incorporate UX/UI evaluation frameworks to ensure that digital tools are not only pedagogically sound but also ergonomically and cognitively appropriate for their intended users.

Trial Type (RCT/Non RCT)

A critical indicator of methodological quality in intervention studies is the inclusion of control groups. Randomized controlled trials (RCTs) are considered the gold standard for establishing causal relationships, and the use of control groups is essential to ensure internal validity [78]. In the context of evaluating serious game-based interventions—particularly in health and educational domains—it is recommended not only to include a control group, but to implement a dual-control design: one group receiving no treatment and another receiving a standard, evidence-based intervention. This configuration allows for a more robust comparison, enabling researchers to attribute observed effects specifically to the experimental intervention [79].

However, analysis of the selected studies reveals that nearly half (10 out of 21; 48%) did not include any control group. This omission significantly compromises the internal validity of these studies, limiting the ability to draw reliable conclusions about the effectiveness of the interventions [80]. This finding is consistent with previous systematic reviews, which have highlighted persistent methodological limitations in research on neurodevelopmental disorders, including the absence of appropriate control groups, small sample sizes, and suboptimal study designs [81].

Among the 11 studies that did include a control group, various approaches were adopted. These included:

Active control groups, receiving an alternative intervention.
Passive control groups, which encompassed:
Children with dyscalculia who received no training.
Typically developing children who continued with regular instruction.
Children with dyscalculia supported by specialized educators.
Children attending standard classroom activities during the intervention period.

Notably, in all studies employing a control group, the experimental group consistently outperformed the control group across outcome measures. While these results are promising, the lack of control groups in nearly half of the studies underscores the need for more rigorous experimental designs. Future research should prioritize the inclusion of well-defined control conditions to strengthen the evidence base regarding the efficacy of serious games in the remediation of dyscalculia.

Sample characteristics

An essential aspect of methodological rigor in intervention studies is the determination of an appropriate sample size [82]. A sufficiently large sample is necessary to ensure adequate statistical power and to allow for meaningful interpretation of the results. However, none of the studies included in this review reported how sample size was calculated. Furthermore, the majority of studies (15 out of 21) involved small samples, with fewer than 50 participants. This limitation is consistent with previous findings indicating that research on interventions for learning disorders often faces challenges related to limited sample sizes [83].
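As an illustration of what an a priori sample size calculation can look like, the following minimal sketch applies the standard normal-approximation formula for comparing two independent groups. The function name `n_per_group`, the chosen effect sizes, and the defaults (two-sided alpha of 0.05, power of 0.80) are assumptions made for this example and are not taken from any of the reviewed studies.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group for a two-sided, two-group comparison.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the expected standardized mean difference (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Illustrative effect sizes only (small-to-medium, medium, large).
for d in (0.3, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} children per group")
```

Under these assumptions, a medium expected effect (d = 0.5) already calls for roughly 63 children per group, which is more than the total sample of most of the studies reviewed here. An exact calculation based on the t distribution would give slightly larger values.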

Several factors contribute to this issue. First, the prevalence of specific learning disorders (SLDs) in the general population is relatively low, estimated at 5–10% [1], which restricts the available pool of eligible participants. Second, intervention studies typically require a high level of commitment from both children and their families, often involving multiple sessions over extended periods. This demand can hinder both recruitment and retention, further reducing sample sizes and potentially introducing bias [83].

Regarding participant demographics, the sex distribution across the studies was relatively balanced, with 54% of participants identified as female (442 out of 826). Although some studies have explored potential sex-related differences in the prevalence or manifestation of SLDs, including developmental dyscalculia (DD), there is currently no consensus on the role of sex as a determining factor [2]. The data from the reviewed studies do not allow for any definitive conclusions in this regard.

Another consistent feature across all studies was the use of the native language of the country in which the research was conducted. While the core symptoms of dyscalculia and the diagnostic criteria are well established [1], it is important to recognize that mathematical learning is partially mediated by language. Tasks such as transcoding—converting numerical information between symbolic, verbal, and written forms—are language-dependent and must be administered in the child’s native language to avoid introducing confounding variables or diagnostic inaccuracies. Accordingly, all assessments in the reviewed studies were conducted in the participants’ native language, ensuring linguistic appropriateness and minimizing potential bias.

4.1.2. Classification of Studies by Technological Approach

Over the past decade, the proliferation of mobile devices has provided a powerful new platform for the deployment of video game-based interventions. Among the 21 studies analyzed in this review, 13 utilized computer-based tools, while only 8 implemented applications specifically designed for mobile platforms. Only one study addressed both detection and remediation; the remaining studies focused exclusively on intervention, highlighting the potential of serious games to address mathematical impairments.

These findings are consistent with previous systematic reviews in related domains. For example, [50] reviewed 55 studies on video games for reading difficulties, identifying 33 distinct training programs, of which only 4 were developed for mobile devices. Similarly, [49] analyzed 24 studies on video games for ASD, with only one mobile-based intervention. [48] reviewed 22 studies on video games for ADHD, evenly split between assessment and intervention, yet only one was designed for mobile platforms. In all cases, the majority of tools were non-commercial serious games developed specifically for research purposes.

In line with these findings, the current review reveals a predominance of computer-based interventions, with relatively few mobile applications. This is somewhat unexpected given the widespread availability and advanced capabilities of modern smartphones. A plausible explanation, as suggested by [50], is the extended development cycle required for mobile applications—encompassing design, implementation, testing, and evaluation—which may delay their integration into empirical research.

The use of video games in intervention research is largely motivated by their capacity to deliver engaging, adaptive, and user-centered experiences [79]. Despite this, only five of the reviewed studies explicitly reported increased motivation or engagement among participants. While gamification strategies—such as goal-oriented challenges and character embodiment—are commonly employed to enhance user experience, the actual impact of these elements on learning outcomes remains underexplored. Further research is needed to systematically evaluate the benefits and potential drawbacks of gamification strategies in educational serious games, particularly in the context of interventions for learning disorders.

Use of artificial intelligence

The results of the review indicate that artificial intelligence is beginning to be implemented in the development of tools for the detection and intervention of developmental dyscalculia. However, only four of the studies included in this review reported the use of machine learning (ML) or deep learning (DL) algorithms, highlighting the limited adoption of these technologies within the context of serious games targeting this specific learning difficulty. Maximizing adaptability and reliability is essential for achieving accurate diagnoses and delivering personalized interventions. In this context, the integration of ML and DL algorithms holds significant promise. These techniques can enhance the precision and responsiveness of digital tools by enabling real-time adaptation to individual user profiles.

In contrast, the application of ML and DL is more prevalent in the detection and remediation of other neurodevelopmental disorders, although typically not in combination with serious games. For example, [84] highlighted substantial interest in applying ML algorithms to neuroimaging data (EEG, fMRI, MRI) for the diagnosis of autism spectrum disorder and attention-deficit/hyperactivity disorder, with 135 and 55 studies respectively. Notably, none of these studies incorporated serious games as part of the diagnostic or intervention framework.

Artificial Intelligence (AI) techniques have the potential to significantly improve the effectiveness and reliability of digital interventions by enabling the development of adaptive, data-driven systems. However, their implementation is constrained by several factors. Chief among these is the requirement for large, high-quality, and well-structured datasets to train predictive models effectively. Although some ML algorithms can operate with smaller datasets, doing so increases the risk of overfitting, particularly when the number of features is high relative to the sample size. Overfitting compromises the generalizability of the model and reduces its predictive validity [85].
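The overfitting risk described above can be made concrete with a small simulation. The sketch below is purely illustrative: the data are synthetic, the labels are deliberately unrelated to the features, and the 1-nearest-neighbour classifier, the sample sizes, and the function names are assumptions chosen for simplicity rather than methods used in the reviewed studies.

```python
import random


def nearest_label(x, train_x, train_y):
    """Return the label of the training point closest to x (1-nearest neighbour)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, train_x[i])))
    return train_y[best]


def accuracy(xs, ys, train_x, train_y):
    """Fraction of points in (xs, ys) whose label is predicted correctly."""
    hits = sum(nearest_label(x, train_x, train_y) == y for x, y in zip(xs, ys))
    return hits / len(xs)


def simulate(n_train=20, n_test=400, n_features=50, seed=0):
    """Train on few samples with many noise features, then test on unseen data."""
    rng = random.Random(seed)

    def sample(n):
        xs = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
        ys = [rng.randint(0, 1) for _ in range(n)]  # labels unrelated to the features
        return xs, ys

    train_x, train_y = sample(n_train)
    test_x, test_y = sample(n_test)
    return (accuracy(train_x, train_y, train_x, train_y),
            accuracy(test_x, test_y, train_x, train_y))


train_acc, test_acc = simulate()
print(f"training accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

Because the model memorizes the 20 training cases, it classifies them perfectly even though there is nothing to learn, while accuracy on unseen cases stays near chance. Evaluating on held-out data, as done here, is what exposes the gap.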

This limitation may explain the low adoption of ML/DL techniques in the reviewed studies. Given the typically small sample sizes and limited number of sessions in these interventions, researchers may have opted to delay the integration of AI components until more robust datasets are available. Future research should explore strategies for overcoming these barriers, such as data augmentation, transfer learning, or collaborative data sharing initiatives, to fully leverage the potential of AI in the development of intelligent, personalized educational tools.

4.1.3. Good Practices on Detection and Intervention

The timing of detection and initiation of intervention is a critical factor in the effective management of learning disorders. [17] demonstrated that early mathematical and reading skills—particularly those assessed at school entry (ages 5–6)—are strong predictors of later academic achievement. This suggests that early screening and intervention for developmental dyscalculia (DD) should ideally begin at this stage. In the studies analyzed in this review, participant ages ranged from 4 to 12 years, with mean ages clustering around 9 years. While not all participants were within the optimal early detection window, they were within the broader developmental period of childhood, during which intervention remains beneficial.

Similar findings have been reported for other learning disorders. For example, [86] emphasized the importance of early detection and intervention in dyslexia. [87] found that interventions administered to first-grade students (ages 6–7) yielded more sustained improvements in reading skills compared to those initiated in later grades. Likewise, [88] reported that fluency-related deficits in dyslexia could be mitigated through interventions implemented in kindergarten or first grade.

In addition to timing, the structure and delivery of interventions are key determinants of their effectiveness. According to [2], successful interventions should incorporate motivational elements, structured repetition, and hierarchically organized activities that can be adapted to individual profiles. This review found that 13 of the 21 studies implemented interventions with multiple levels tailored to the specific needs of each child, aligning with this framework.
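One simple way to organize activities hierarchically and adapt them to each child is a rule-based level controller. The following sketch is hypothetical: the class name `LevelController`, the number of levels, and the thresholds (three consecutive correct answers to advance, two consecutive errors to step back) are illustrative assumptions and do not describe the mechanism of any specific game in this review.

```python
from dataclasses import dataclass


@dataclass
class LevelController:
    """Hypothetical difficulty controller; all thresholds are illustrative only."""
    level: int = 1
    max_level: int = 5
    streak_to_advance: int = 3
    errors_to_step_back: int = 2
    _correct_streak: int = 0
    _error_streak: int = 0

    def record(self, correct: bool) -> int:
        """Update the streak counters with one answer and return the current level."""
        if correct:
            self._correct_streak += 1
            self._error_streak = 0
            if self._correct_streak >= self.streak_to_advance:
                self.level = min(self.level + 1, self.max_level)
                self._correct_streak = 0
        else:
            self._error_streak += 1
            self._correct_streak = 0
            if self._error_streak >= self.errors_to_step_back:
                self.level = max(self.level - 1, 1)
                self._error_streak = 0
        return self.level


controller = LevelController()
answers = [True, True, True, True, False, False, True]
levels = [controller.record(a) for a in answers]
print(levels)  # level after each answer: [1, 1, 2, 2, 2, 1, 1]
```

A fixed rule like this is transparent and needs no training data, which may suit small studies. Data-driven adaptation could replace it once larger datasets are available.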

The principle of individualization is not unique to DD. In dyslexia, [88] and [89] highlighted the effectiveness of interventions delivered in one-to-one or small-group settings, with content adapted to the learner’s level. Similarly, individualized approaches are widely recommended for other neurodevelopmental disorders, such as autism spectrum disorder and attention-deficit/hyperactivity disorder [90]. The convergence of evidence across disorders underscores the importance of early, personalized, and developmentally appropriate interventions in maximizing treatment outcomes.

4.1.4. Reliability of Analysed Studies

To better assess the reliability of the included studies, a risk of bias analysis was conducted using the PEDro scale. For the purposes of this review, items 10 and 11—related to the statistical validity of outcome measures—were excluded, as the focus was not on validating tools or measurement methods. Additionally, two studies [30,65] were excluded from the analysis due to insufficient methodological information.

Based on the adjusted PEDro scores, the average rating across the remaining studies was 4.26 out of 8, indicating a low to medium risk of bias. This suggests that, despite the heterogeneity in study designs and implementation strategies, the majority of the selected studies were methodologically sound and provided valuable insights for the present review.
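The adjusted score can be sketched as a simple count of satisfied criteria. The example below assumes the usual PEDro convention that item 1 does not contribute to the total, which together with the exclusion of items 10 and 11 leaves items 2 to 9 and a maximum of 8. The ratings shown are invented for illustration and are not data from the included studies.

```python
from statistics import mean

# Items counted in the adjusted score: 2-9.
# Assumption: item 1 is not scored (standard PEDro convention);
# items 10-11 were excluded in this review.
SCORED_ITEMS = range(2, 10)


def adjusted_pedro(ratings: dict[int, bool]) -> int:
    """Count satisfied criteria among items 2-9 (maximum 8)."""
    return sum(1 for item in SCORED_ITEMS if ratings.get(item, False))


# Hypothetical ratings for two made-up studies (not data from the review).
study_a = {1: True, 2: True, 3: False, 4: True, 5: False, 6: False,
           7: False, 8: True, 9: True, 10: True, 11: True}
study_b = {1: True, 2: False, 3: False, 4: True, 5: False, 6: False,
           7: False, 8: True, 9: False}

scores = [adjusted_pedro(study_a), adjusted_pedro(study_b)]
print(scores, mean(scores))
```

Ratings for items 1, 10, and 11 are ignored even when present, so the same rating sheets can be reused if the full scale is needed later.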

Among the evaluated studies, six achieved scores indicative of good to high methodological quality [18,33,34,36,45,71], and one study [67] was rated as excellent. Considering both the PEDro assessment and the overall quality of study design—including sample size, group allocation, and intervention structure—three studies stand out as exemplary models for future research on serious games for mathematical impairments: [34,36,67]. While these studies could still benefit from certain methodological refinements, they represent strong foundations for the development of rigorous, evidence-based interventions in this field.

4.2. Conclusions

Digital solutions based on serious games present a promising avenue for the detection and remediation of mathematical impairments. Their integration with mobile technologies and internet connectivity offers the potential to increase public awareness of learning difficulties such as developmental dyscalculia (DD), leveraging the ubiquity and accessibility of mobile devices. Furthermore, the computational capabilities of modern platforms, combined with immersive technologies—such as virtual reality (VR) and augmented reality (AR)—and artificial intelligence (AI), enable the development of engaging, adaptive, and personalized interventions. These features can support both reliable screening and individualized remediation pathways tailored to each child’s cognitive profile.

The results of the systematic review do not allow for definitive conclusions regarding the benefits of serious games (SGs) in the detection and intervention of developmental dyscalculia (DD). Only three of the included studies specifically addressed the use of SGs for DD detection. Given the potential of AI-enhanced systems to assess a broad range of math-related cognitive skills, there is a clear need for further research aimed at designing and validating effective screening instruments. These tools should be capable of delivering comprehensive, scalable, and accessible assessments that can be integrated into educational and clinical settings. By contrast, screening tools for ADHD have proved effective not only in distinguishing individuals with ADHD from controls, but also in differentiating between subtypes.

Concerning intervention, the findings reveal substantial limitations that hinder the ability to draw robust and reliable conclusions. On one hand, although most studies report improvements associated with the use of serious games (SG), only about half of them employ pre–post intervention designs and quantify the effects using effect size calculations. Additionally, just over half can be classified as randomized controlled trials (RCTs). On the other hand, only a small proportion of the studies analyze user experience (UX) and user interface (UI), which limits our ability to draw conclusions about the potential motivational impact of SGs compared to other designs. These findings highlight the need for further studies that rigorously evaluate the effects of SGs on improving mathematical difficulties.

Although the field is still developing and further empirical validation is needed, the findings from studies that meet scientific rigor support the continued investigation of serious games as effective tools for both the detection and intervention of developmental dyscalculia.

4.3. Limitations

One potential limitation of this review is the inclusion of only English-language publications. Although English is the dominant language in scientific literature, this restriction may have led to the exclusion of relevant studies published in other languages, potentially omitting valuable insights or culturally specific approaches that could have enriched the analysis.

Additional limitations are inherent to the studies included in this review. First, there is a lack of longitudinal data to assess whether the reported beneficial effects of serious games (SGs) are sustained over time. Second, the considerable methodological heterogeneity among the analyzed studies makes it difficult to draw reliable conclusions regarding the potential benefits of SGs. Finally, the very limited number of studies focused on detection not only restricts our ability to evaluate the effectiveness of SGs in identifying signs of developmental dyscalculia (DD) but also hinders the development of robust designs and methodologies for this purpose.

4.4. Implication of the Results and Future Research

Based on the methodologies and tools employed in the analyzed studies, several recommendations can be proposed to guide future research in the field of developmental dyscalculia (DD). First, future trials would benefit from larger sample sizes, the inclusion of two control groups—one receiving no treatment and another undergoing a validated alternative intervention—and the design of pre–post intervention protocols that quantify the effects of the intervention. Addressing and overcoming these limitations would enhance the robustness of the data and facilitate the validation of new tools.
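To make the recommendation on quantifying effects concrete, the sketch below computes a standardized mean difference (Cohen's d with a pooled standard deviation) on pre-to-post gain scores of an experimental and a control group. The scores are hypothetical and the choice of gain scores as the outcome is an assumption for illustration, since other effect size definitions for pre–post designs exist.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Standardized mean difference between two independent groups (pooled SD)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2 +
                  (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical pre- and post-test scores (not data from any reviewed study).
eg_pre, eg_post = [10, 12, 9, 11, 13], [13, 13, 12, 12, 17]  # experimental group
cg_pre, cg_post = [11, 10, 12, 9, 13], [12, 10, 14, 9, 15]   # control group

# Gain score = post-test minus pre-test for each child.
eg_gain = [post - pre for pre, post in zip(eg_pre, eg_post)]
cg_gain = [post - pre for pre, post in zip(cg_pre, cg_post)]

d = cohens_d(eg_gain, cg_gain)
print(f"mean gain EG = {mean(eg_gain)}, CG = {mean(cg_gain)}, d = {d:.2f}")
```

Reporting a value like this alongside significance tests lets readers compare the size of the improvement across studies that use different outcome measures.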

Second, the mathematical skills assessed in the reviewed studies did not comprehensively cover all domains potentially affected by dyscalculia. While it may not be feasible to include every possible activity, expanding the scope of cognitive and mathematical tasks within digital tools could improve both detection and remediation outcomes. A broader range of activities would also generate richer datasets, which could be leveraged to train machine learning (ML) algorithms, thereby enhancing the precision of screening tools and enabling the development of personalized intervention pathways tailored to individual learning profiles.

Although significant progress has been made in the use of serious games for DD, further research is required to develop reliable, scalable tools for both detection and remediation. One promising direction is the design and validation of multiplatform serious games that integrate both functions. Such tools, enhanced by ML and/or deep learning (DL) algorithms, could simultaneously support skill development and provide diagnostic insights. They would allow for the initial risk assessment and longitudinal monitoring of learning trajectories, adapting interventions dynamically to the evolving needs of each child.

Importantly, these tools could be deployed in school settings without requiring constant supervision by specialists. Children could engage with games independently, while educators receive actionable data regarding their students’ mathematical difficulties. This approach holds potential for early identification and support within mainstream educational environments, contributing to more inclusive and effective learning strategies.

Moreover, future research should explore the development of culturally and linguistically adaptable content. The current predominance of English-language tools may limit the generalizability and accessibility of interventions across diverse populations. Designing serious games that are sensitive to cultural and linguistic contexts would not only broaden their applicability but also ensure that detection and remediation strategies are equitable and contextually relevant.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/info16090787/s1, PRISMA Checklist. Reference [91] is cited in the Supplementary Materials.

Funding

This study was supported by the Industrial Doctorate Plan of the Generalitat de Catalunya (2019DI000088).

Conflicts of Interest

There are no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ADD	addition
ADHD	attention-deficit/hyperactivity disorder
AI/ML	AI-ML-assisted
ANOVA	Analysis of variance
AR	augmented reality
ARSG	augmented reality serious game
ASD	autism spectrum disorder
BC	calculation (ADD + SUB)
BG	basic geometry
BHM	Bonferroni–Holm method
CA	Cronbach’s alpha
CCT	clinical controlled trial
CG	control group
CMP	comparison
CNT	counting
COM	computer based
CST	Chi-square tests
D	detection (screening)
DD	developmental dyscalculia
DIR	direction
DIV	division
DL	deep learning
EG	experimental group
EGx	Experimental Group x
ENT	early numeracy tests
EST	estimation
F	female
GB	gradient boosting
GLM	repeated-measures analysis
GRDRS	graphical dyscalculia risk screening
HG	height
I	intervention
IDDRS	ideognostic dyscalculia risk screening
ITTT	independent t-tests
LAT	laterality
LD	learning disorder
LEDRS	lexical dyscalculia risk screening
M	male
MANOVA	multivariate ANOVA
MEA	measures
MEM	memory
ML	machine learning
MOB	mobile
MR	mathematical reasoning
MUL	multiplication
MVC	mean value comparison
NA	not available
NL	number line
NUM	numbers
OPDRS	operational dyscalculia risk screening
ORD	ordinality
ORI	orientation
OSTT	one sample t-test
POS	position
PRDRS	practognostic dyscalculia risk screening
PSTT	paired-samples t-tests
QUA	quantities
RBF	radial basis function
RCT	randomized controlled trial
RF	random forest
RT	response time
ACC	accuracy
SBT	subitizing
SEDRS	sequential dyscalculia risk screening
SEQ	sequences
SG	serious game
SLD	specific learning disorder
SM	spatial memory
SUB	subtraction
SVM	support vector machines
SZ	size
TRN	transcoding
VEDRS	verbal dyscalculia risk screening
VR	virtual reality
VRSG	virtual reality serious game
VSDRS	visual spatial dyscalculia risk screening
WG	weight
WILCHC	Wilcoxon’s hypothesis contrast
WILCSRT	Wilcoxon signed-rank test
WM	working memory
WWW	web-based
XGB	extreme gradient boosting

Appendix A. Full Searches Definition

a.: Scopus

Search: TITLE-ABS ((“learning disorder*” OR “learning disabilit*” OR dyslexia OR dysgraphia OR dyscalculia OR dyspraxia OR “mathematics disorder*” OR “reading disorder*” OR “disorder* of written expression” OR “impairment* in reading” OR “impairment* in written expression” OR “impairment* in mathematics” OR “mathematical difficult*” OR “arithmetic difficult*”) AND (videogam* OR “video gam*” OR “serious gam*” OR “computer gam*” OR app OR application OR webapp OR vr OR ar OR “virtual reality” OR “augmented reality” OR “digital solution” OR “eye-tracking” OR mhealth OR “mobile app*” OR “mobile gam*” OR “ICT solution” OR computer-based) AND (intervention* OR detection OR remediation* OR training)) AND PUBYEAR > 1999 AND PUBYEAR < 2026 AND (EXCLUDE (DOCTYPE, “er”) OR EXCLUDE (DOCTYPE, “sh”) OR EXCLUDE (DOCTYPE, “ed”)) AND (LIMIT-TO (LANGUAGE, “English”)).

b.: PubMed

Search: ((“learning disorder*” [Title/Abstract] OR “learning disabilit*” [Title/Abstract] OR dyslexia [Title/Abstract] OR dysgraphia [Title/Abstract] OR dyscalculia [Title/Abstract] OR dyspraxia [Title/Abstract] OR “mathematics disorder*” [Title/Abstract] OR “reading disorder*” [Title/Abstract] OR “disorder* of written expression” [Title/Abstract] OR “impairment* in reading” [Title/Abstract] OR “impairment* in written expression” [Title/Abstract] OR “impairment* in mathematics” [Title/Abstract] OR “mathematical difficult*” [Title/Abstract] OR “arithmetic difficult*” [Title/Abstract]) AND (videogam* [Title/Abstract] OR “video gam*” [Title/Abstract] OR “serious gam*” [Title/Abstract] OR “computer gam*” [Title/Abstract] OR app [Title/Abstract] OR application [Title/Abstract] OR webapp [Title/Abstract] OR vr [Title/Abstract] OR ar [Title/Abstract] OR “virtual reality” [Title/Abstract] OR “augmented reality” [Title/Abstract] OR “digital solution” [Title/Abstract] OR “eye-tracking” [Title/Abstract] OR mhealth [Title/Abstract] OR “mobile app*” [Title/Abstract] OR “mobile gam*” [Title/Abstract] OR “ICT solution” [Title/Abstract] OR computer-based [Title/Abstract])) AND (intervention* [Title/Abstract] OR detection [Title/Abstract] OR remediation* [Title/Abstract] OR training [Title/Abstract]).

Filters: Languages: English; Exclude preprints; Dates: 2000–2023.

c.: WoS

Search: (((TI = (“learning disorder*” OR “learning disabilit*” OR dyslexia OR dysgraphia OR dyscalculia OR dyspraxia OR “mathematics disorder*” OR “reading disorder*” OR “disorder* of written expression” OR “impairment* in reading” OR “impairment* in written expression” OR “impairment* in mathematics”)) AND TI = (videogam* OR “video gam*” OR “serious gam*” OR “computer gam*” OR app OR application OR webapp OR vr OR ar OR “virtual reality” OR “augmented reality” OR “digital solution” OR “eye-tracking” OR mhealth OR “mobile app*” OR “mobile gam*” OR “ICT solution” OR computer-based)) AND TI = (intervention* OR detection OR remediation* OR training)) OR (((AB = (“learning disorder*” OR “learning disabilit*” OR dyslexia OR dysgraphia OR dyscalculia OR dyspraxia OR “mathematics disorder*” OR “reading disorder*” OR “disorder* of written expression” OR “impairment* in reading” OR “impairment* in written expression” OR “impairment* in mathematics”)) AND AB = (videogam* OR “video gam*” OR “serious gam*” OR “computer gam*” OR app OR application OR webapp OR vr OR ar OR “virtual reality” OR “augmented reality” OR “digital solution” OR “eye-tracking” OR mhealth OR “mobile app*” OR “mobile gam*” OR “ICT solution” OR computer-based)) AND AB = (intervention* OR detection OR remediation* OR training)).

Filters: Languages: English; Publication types: Article, Proceeding paper, Book chapter; Dates: 2000–2025.

References

American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, 5th ed.; text rev; American Psychiatric Association: Washington, DC, USA, 2022. [Google Scholar] [CrossRef]
Kucian, K.; Von Aster, M. Developmental dyscalculia. Eur. J. Pediatr. 2014, 174, 1–13. [Google Scholar] [CrossRef] [PubMed]
Kaufmann, L.; von Aster, M. The diagnosis and management of dyscalculia. Dtsch. Arztebl. Int. 2012, 109, 767–777. [Google Scholar] [CrossRef] [PubMed]
von Aster, M.G.; Shalev, R.S. Number development and developmental dyscalculia. Dev. Med. Child. Neurol. 2007, 49, 868–873. [Google Scholar] [CrossRef] [PubMed]
Peters, L.; De Smedt, B. Arithmetic in the developing brain: A review of brain imaging studies. Dev. Cogn. Neurosci. 2018, 30, 265–279. [Google Scholar] [CrossRef]
Raghubar, K.P.; Barnes, M.A. Early numeracy skills in preschool-aged children: A review of neurocognitive findings and implications for assessment and intervention. Clin. Neuropsychol. 2017, 31, 329–351. [Google Scholar] [CrossRef]
Camos, V. Low working memory capacity impedes both efficiency and learning of number transcoding in children. J. Exp. Child Psychol. 2008, 99, 37–57. [Google Scholar] [CrossRef]
McLean, J.; Hitch, G. Working memory impairments in children with specific arithmetic learning difficulties. J. Exp. Child Psychol. 1999, 74, 240–260. [Google Scholar] [CrossRef]
Nemmi, F.; Helander, E.; Helenius, O.; Almeida, R.; Hassler, M.; Räsänen, P.; Klingberg, T. Behavior and neuroimaging at baseline predict individual response to combined mathematical and working memory training in children. Dev. Cogn. Neurosci. 2016, 20, 43–51. [Google Scholar] [CrossRef]
Rosselli, M.; Matute, E.; Pinto, N.; Ardila, A. Memory abilities in children with subtypes of dyscalculia. Dev. Neuropsychol. 2006, 30, 801–818. [Google Scholar] [CrossRef]
Siegel, L.; Ryan, E. The development of working memory in normally achieving and subtypes of learning-disabled children. Child. Dev. 1989, 60, 973–980. [Google Scholar] [CrossRef]
Menon, V. Working memory in children’s math learning and its disruption in dyscalculia. Curr. Opin. Behav. Sci. 2016, 10, 125–132. [Google Scholar] [CrossRef]
Moll, K.; De Luca, M.; Landerl, K.; Zoccolotti, P.; Banfi, C. Editorial: Interpreting the Comorbidity of Learning Disorders. Front. Hum. Neurosci. 2021, 15, 811101. [Google Scholar] [CrossRef]
Willcutt, E.G.; McGrath, L.M.; Pennington, B.F.; Keenan, J.M.; DeFries, J.C.; Olson, R.K.; Wadsworth, S.J. Understanding Comorbidity Between Specific Learning Disabilities. New Dir. Child Adolesc. Dev. 2019, 2019, 91–109. [Google Scholar] [CrossRef] [PubMed]
Moll, K.; Kunze, S.; Neuhoff, N.; Bruder, J.; Schulte-Körne, G. Specific Learning Disorder: Prevalence and Gender Differences. PLoS ONE 2014, 9, e103537. [Google Scholar] [CrossRef] [PubMed]
Butterworth, B. The implications for education of an innate numerosity-processing mechanism. Philos. Trans. R. Soc. B-Biol. Sci. 2018, 373, 20170118. [Google Scholar] [CrossRef]
Duncan, G.J.; Dowsett, C.J.; Claessens, A.; Magnuson, K.; Huston, A.C.; Klebanov, P.; Pagani, L.S.; Feinstein, L.; Engel, M.; Brooks-Gunn, J.; et al. School readiness and later achievement. Dev. Psychol. 2007, 43, 1428–1446. [Google Scholar] [CrossRef] [PubMed]
Räsänen, P.; Salminen, J.; Wilson, A.; Aunio, P.; Dehaene, S. Computer-assisted intervention for children with low numeracy skills. Cogn. Dev. 2009, 24, 450–472. [Google Scholar] [CrossRef]
Butterworth, B. Dyscalculia Screener; NferNelson Pub: London, UK, 2003. [Google Scholar]
Beacham, N.; Trott, C. Screening for dyscalculia within HE. MSOR Connect. 2005, 5, 1–4. [Google Scholar] [CrossRef]
Grigore, M. Towards a standard diagnostic tool for dyscalculia in school children. CORE Proc. 2020, 1. Available online: https://core.pubpub.org/pub/ttoew31a/release/1 (accessed on 7 August 2025).
Landerl, K.; Göbel, S.M.; Moll, K. Core deficit and individual manifestations of developmental dyscalculia (DD): The role of comorbidity. Trends Neurosci. Educ. 2013, 2, 38–42. [Google Scholar] [CrossRef]
Rubinsten, O.; Henik, A. Developmental dyscalculia: Heterogeneity might not mean different mechanisms. Trends Cogn. Sci. 2009, 13, 92–99. [Google Scholar] [CrossRef]
Susi, T.; Johannesson, M.; Backlund, P. Serious Games: An Overview. 2007. Available online: https://www.diva-portal.org/smash/get/diva2:2416/fulltext01.pdf (accessed on 7 August 2025).
Vacca, R.A.; Augello, A.; Gallo, L.; Caggianese, G.; Malizia, V.; La Grutta, S.; Murero, M.; Valenti, D.; Tullo, A.; Balech, B.; et al. Serious Games in the new era of digital-health interventions: A narrative review of their therapeutic applications to manage neurobehavior in neurodevelopmental disorders. Neurosci. Biobehav. Rev. 2023, 149, 105156. [Google Scholar] [CrossRef]
Damaševičius, R.; Maskeliūnas, R.; Blažauskas, T. Serious Games and Gamification in Healthcare: A Meta-Review. Information 2023, 14, 105. [Google Scholar] [CrossRef]
Rauschenberger, M.; Baeza-Yates, R.; Rello, L. Screening risk of dyslexia through a web-game using language-independent content and machine learning. In Proceedings of the 17th International Web for All Conference (W4A ‘20) Proceedings, Taipei, Taiwan, 20–21 April 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 1–12. [Google Scholar] [CrossRef]
Gunasekare, V.; Athukorala, S.; De Zoysa, L.; Serasinghe, V.; Thelijjagoda, S.; Krishara, J. CalcPal: Mobile Application to Comprehensively Screen and Provide Remedial Activities for Dyscalculia. In Proceedings of the 2024 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM), Surabaya, Indonesia, 19–20 November 2024; pp. 1–7.
Kariyawasam, R.; Nadeeshani, M.; Hamid, T.N.A.T.A.; Subasinghe, I.; Ratnayake, P. A Gamified Approach for Screening and Intervention of Dyslexia, Dysgraphia and Dyscalculia. In Proceedings of the 2019 International Conference on Advancements in Computing (ICAC), Malabe, Sri Lanka, 5–7 December 2019; pp. 156–161.
Mukherjee, K.; Kumar, R.; Vasishat, S.; Bhargava, N.; Upadhyay, S.Y.; Muhuri, S. Unraveling Dyscalculia: Identifying Mathematical Learning Difficulties in Early Education. In Proceedings of the 2024 International Conference on Innovations and Challenges in Emerging Technologies (ICICET), Nagpur, India, 7–8 June 2024; pp. 1–6.
Rajivsureshkumar, G.; Malarvizhi, K.; Deebanchakkarawarthi, G. Mobile application development on detection and diagnose of learning disability for children. Int. J. Innov. Technol. Explor. Eng. 2019, 8, 216–224.
Ren, X.; Wu, Q.; Cui, N.; Zhao, J.; Bi, H.Y. Effectiveness of digital game-based trainings in children with neurodevelopmental disorders: A meta-analysis. Res. Dev. Disabil. 2023, 133, 104418.
Käser, T.; Baschera, G.M.; Kohn, J.; Kucian, K.; Richtmann, V.; Grond, U.; Gross, M.; von Aster, M. Design and evaluation of the computer-based training program Calcularis for enhancing numerical cognition. Front. Psychol. 2013, 4, 489.
Kohn, J.; Rauscher, L.; Kucian, K.; Käser, T.; Wyschkon, A.; Esser, G.; von Aster, M. Efficacy of a Computer-Based Learning Program in Children with Developmental Dyscalculia. What Influences Individual Responsiveness? Front. Psychol. 2020, 11, 1115.
Kucian, K.; Grond, U.; Rotzer, S.; Henzi, B.; Schönmann, C.; Plangger, F.; Gälli, M.; Martin, E.; von Aster, M. Mental number line training in children with developmental dyscalculia. NeuroImage 2011, 57, 782–795.
Kuhn, J.T.; Holling, H. Number sense or working memory? The effect of two computer-based trainings on mathematical skills in elementary school. Adv. Cogn. Psychol. 2014, 10, 59–67.
Benavides-Varela, S.; Callegher, C.Z.; Fagiolini, B.; Leo, I.; Altoè, G.; Lucangeli, D. Effectiveness of digital-based interventions for children with mathematical learning difficulties: A meta-analysis. Comput. Educ. 2020, 157, 103953.
Ariffin, M.; Aszemi, N.M.; Ismir, N. CHECKDYSC©: Mobile Game for Early Detection of Dyscalculia Signs in Children. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 164–167.
Miundy, K.; Zaman, H.B.; Nordin, A.; Ng, K.H. Screening test on dyscalculia learners to develop a suitable augmented reality (AR) assistive learning application. Malays. J. Comput. Sci. 2019, 92–107.
Ariffin, M.M.; Halim, F.A.A.; Arshad, N.I.; Mehat, M.; Hashim, A.S. Calculic Kids© Mobile App: The Impact on Educational Effectiveness of Dyscalculia Children. Int. J. Innov. Technol. Explor. Eng. 2019, 8, 701–705.
Aunio, P.; Mononen, R. The effects of educational computer game on low-performing children’s early numeracy skills—An intervention study in a preschool setting. Eur. J. Spec. Needs Educ. 2017, 33, 677–691.
Ferraz, F.; Costa, A.; Alves, V.; Vicente, H.; Neves, J.; Neves, J. Gaming in Dyscalculia: A Review on disMAT. In Recent Advances in Information Systems and Technologies, Proceedings of the WorldCIST 2017, Advances in Intelligent Systems and Computing, Porto Santo Island, Madeira, Portugal, 11–13 April 2017; Rocha, Á., Correia, A., Adeli, H., Reis, L., Costanzo, S., Eds.; Springer: Cham, Switzerland, 2017; Volume 570.
Hallstedt, M.H.; Klingberg, T.; Ghaderi, A. Short and long-term effects of a mathematics tablet intervention for low performing second graders. J. Educ. Psychol. 2018, 110, 1127–1148.
Rohizan, R.; Soon, L.H.; Mubin, S.A. Mathfun: A mobile app for dyscalculia children. J. Phys. Conf. Ser. 2020, 1712, 012030.
Avila-Pesantez, D.F.; Vaca-Cardenas, L.A.; Delgadillo Avila, R.; Padilla Padilla, N.; Rivera, L.A. Design of an Augmented Reality Serious Game for Children with Dyscalculia: A Case Study. In Technology Trends, Proceedings of the CITT 2018. Communications in Computer and Information Science, Babahoyo, Ecuador, 29–31 August 2018; Botto-Tobar, M., Pizarro, G., Zúñiga-Prieto, M., D’Armas, M., Zúñiga Sánchez, M., Eds.; Springer: Cham, Switzerland, 2019; Volume 895.
Jerin, J.Q.; Zaki, T.; Mahmood, M.; Rochee, S.K.; Islam, M.N. Exploring Design Issues in Developing Usable Mobile Application for Dyscalculia People. In Proceedings of the Emerging Technology in Computing, Communication and Electronics (ETCCE), Dhaka, Bangladesh, 21–22 December 2020; pp. 1–6.
Lozano-Álvarez, M.; Rodríguez-Cano, S.; Delgado-Benito, V.; Mercado-Val, E.A. Systematic Review of Literature on Emerging Technologies and Specific Learning Difficulties. Educ. Sci. 2023, 13, 298.
Peñuelas-Calvo, I.; Jiang-Lin, L.K.; Girela-Serrano, B.; Delgado-Gomez, D.; Navarro-Jimenez, R.; Baca-Garcia, E.; Porras-Segovia, A. Video games for the assessment and treatment of attention-deficit/hyperactivity disorder: A systematic review. Eur. Child Adolesc. Psychiatry 2020, 31, 5–20.
Jiménez-Muñoz, L.; Peñuelas-Calvo, I.; Calvo-Rivera, P.; Díaz-Oliván, I.; Moreno, M.; Baca-García, E.; Porras-Segovia, A. Video Games for the Treatment of Autism Spectrum Disorder: A Systematic Review. J. Autism Dev. Disord. 2021, 52, 169–188.
Ostiz-Blanco, M.; Bernacer, J.; Garcia-Arbizu, I.; Diaz-Sanchez, P.; Rello, L.; Lallier, M.; Arrondo, G. Improving Reading Through Videogames and Digital Apps: A Systematic Review. Front. Psychol. 2021, 12, 652948.
Wouters, P.; van Nimwegen, C.; van Oostendorp, H.; van der Spek, E.D. A meta-analysis of the cognitive and motivational effects of serious games. J. Educ. Psychol. 2013, 105, 249–265.
Cheng, M.-T.; Lin, Y.-W.; She, H.-C.; Kuo, P.-C. Learning through playing virtual age: Exploring the interactions among student concept learning, gaming performance, in-game behaviors, and the use of in-game characters. Comput. Educ. 2015, 86, 18–29.
Ritterfeld, U. Serious Games: Mechanisms and Effects; Cody, M., Vorderer, P., Eds.; Routledge: Oxford, UK, 2016.
Cibrian, F.L.; Monteiro, E.M.; Lakes, K.D. Digital assessments for children and adolescents with ADHD: A scoping review. Front. Digit. Health 2024, 6, 1440701, Erratum in Front. Digit. Health 2024, 6, 1528500. https://doi.org/10.3389/fdgth.2024.1528500.
Kokol, P.; Vošner, H.B.; Završnik, J.; Vermeulen, J.; Shohieb, S.; Peinemann, F. Serious Game-based Intervention for Children with Developmental Disabilities. Curr. Pediatr. Rev. 2020, 16, 26–32.
Lin, J.; Chang, W.R. Effectiveness of Serious Games as Digital Therapeutics for Enhancing the Abilities of Children with Attention-Deficit/Hyperactivity Disorder (ADHD): Systematic Literature Review. JMIR Serious Games 2025, 13, e60937.
Rodríguez-Timaná, L.C.; Castillo-García, J.F.; Bastos-Filho, T.; Ocampo-González, A.A.; Hincapié-Monsalve, N.R.; Valencia-Jimenez, N.J. Use of Serious Games in Interventions of Executive Functions in Neurodiverse Children: Systematic Review. JMIR Serious Games 2024, 12, e59053.
Maassen, B.A.M.; Glatz, T.; Borleffs, E.; Martínez, C.; de Groot, B.J.A. Digital game-based learning for dynamic assessment and early intervention targeting reading difficulties: Cross-linguistic studies of GraphoLearn. Clin. Linguist. Phon. 2025, 39, 576–601.
Yildirim, O.; Surer, E. Developing Adaptive Serious Games for Children with Specific Learning Difficulties: A Two-phase Usability and Technology Acceptance Study. JMIR Serious Games 2021, 9, e25997.
Page, M.J.; Moher, D.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. PRISMA 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews. BMJ 2021, 372, n160.
Adam, T.; Tatnall, A. Using ICT to Improve the Education of Students with Learning Disabilities. In Learning to Live in the Knowledge Society, Proceedings of the IFIP WCC TC3 2008. IFIP—The International Federation for Information Processing, Milan, Italy, 7–10 September 2008; Kendall, M., Samways, B., Eds.; Springer: Boston, MA, USA, 2008; Volume 281.
Kelly, J.; Sadeghieh, T.; Adeli, K. Peer Review in Scientific Publications: Benefits, Critiques, & A Survival Guide. EJIFCC 2014, 25, 227–243.
Blobaum, P.M. Physiotherapy Evidence Database (PEDro). J. Med. Libr. Assoc. 2006, 94, 477–478.
Landis, J.R.; Koch, G.G. The Measurement of Observer Agreement for Categorical Data. Biometrics 1977, 33, 159–174.
Walcott, P.A.; Romain, N. Using Digital Games to Enhance the Mathematical Skills of Children with Dyscalculia. J. Educ. Dev. Caribb. 2019, 18, 88–110.
Salminen, J.; Koponen, T.; Räsänen, P.; Aro, M. Preventive Support for Kindergarteners Most At-Risk for Mathematics Difficulties: Computer-Assisted Intervention. Math. Think. Learn. 2015, 17, 273–295.
Mohd Syah, N.E.; Hamzaid, N.A.; Murphy, B.P.; Lim, E. Development of computer play pedagogy intervention for children with low conceptual understanding in basic mathematics operation using the dyscalculia feature approach. Interact. Learn. Environ. 2015, 24, 1477–1496.
de Castro, M.V.; Bissaco, M.A.; Panccioni, B.M.; Rodrigues, S.C.; Domingues, A.M. Effect of a virtual environment on the development of mathematical skills in children with dyscalculia. PLoS ONE 2014, 9, e103354.
Cheng, D.; Xiao, Q.; Cui, J.; Chen, C.; Zeng, J.; Chen, Q.; Zhou, X. Short-term numerosity training promotes symbolic arithmetic in children with developmental dyscalculia: The mediating role of visual form perception. Dev. Sci. 2020, 23, e12910.
Wilson, A.; Revkin, S.K.; Cohen, D.; Cohen, L.D.; Dehaene, S. An open trial assessment of “the number race”, an adaptive computer game for remediation of dyscalculia. Behav. Brain Funct. 2006, 2, 20.
Re, A.M.; Benavides-Varela, S.; Pedron, M.; De Gennaro, M.A.; Lucangeli, D. Response to a specific and digitally supported training at home for students with mathematical difficulties. Front. Psychol. 2020, 11, 2039.
Cashin, A.G.; McAuley, J.H. Clinimetrics: Physiotherapy Evidence Database (PEDro) Scale. J. Physiother. 2020, 66, 59.
Mridula, T.V.; Manivannan, M.; Albert, S. Early identification and enhanced assessment of learning disabilities: A review. Appl. Neuropsychol. Child 2025, 24, 1–24.
Fuchs, L.S.; Vaughn, S. Responsiveness-to-Intervention: A Decade Later. J. Learn. Disabil. 2012, 45, 195–203.
Aunio, P.; Korhonen, J.; Ragpot, L.; Törmänen, M.; Henning, E. An early numeracy intervention for first graders at risk for mathematical learning difficulties. Early Child. Res. Q. 2021, 55, 252–262.
Aberson, C.L. Applied Power Analysis for the Behavioral Sciences, 2nd ed.; Routledge: New York, NY, USA, 2019.
Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd ed.; Routledge: New York, NY, USA, 1998.
Galuschka, K.; Schulte-Körne, G. Randomized Controlled Trials in Dyslexia and Dyscalculia. In The Cambridge Handbook of Dyslexia and Dyscalculia; Cambridge Handbooks in Psychology; Skeide, M.A., Ed.; Cambridge University Press: Cambridge, UK, 2022; pp. 337–349.
Drummond, D.; Hadchouel, A.; Tesnière, A. Serious games for health: Three steps forwards. Adv. Simul. 2017, 2, 3.
Capili, B.; Anastasi, J.K. Efficacy Randomized Controlled Trials. Am. J. Nurs. 2023, 123, 47–51.
Valentine, A.Z.; Brown, B.J.; Groom, M.J.; Young, E.; Hollis, C.; Hall, C.L. A systematic review evaluating the implementation of technologies to assess, monitor and treat neurodevelopmental disorders: A map of the current evidence. Clin. Psychol. Rev. 2020, 80, 101870.
Althubaiti, A. Sample size determination: A practical guide for health researchers. J. Gen. Fam. Med. 2022, 24, 72–78.
Toffalini, E.; Giofrè, D.; Pastore, M.; Carretti, B.; Fraccadori, F.; Szűcs, D. Dyslexia treatment studies: A systematic review and suggestions on testing treatment efficacy with small effects and small samples. Behav. Res. Methods 2021, 53, 1954–1972.
Ribas, M.O.; Micai, M.; Caruso, A.; Fulceri, F.; Fazio, M.; Scattoni, M.L. Technologies to support the diagnosis and/or treatment of neurodevelopmental disorders: A systematic review. Neurosci. Biobehav. Rev. 2023, 145, 105021.
Rauschenberger, M.; Baeza-Yates, R. How to handle Health-Related small imbalanced data in machine Learning? I-com 2020, 19, 215–226.
Forné, S.; López-Sala, A.; Mateu-Estivill, R.; Adan, A.; Caldú, X.; Rifà-Ros, X.; Serra-Grabulosa, J.M. Improving Reading Skills Using a Computerized Phonological Training Program in Early Readers with Reading Difficulties. Int. J. Environ. Res. Public Health 2022, 19, 11526.
Lovett, M.W.; Frijters, J.C.; Wolf, M.; Steinbach, K.A.; Sevcik, R.A.; Morris, R.D. Early intervention for children at risk for reading disabilities: The impact of grade at intervention and individual differences on intervention outcomes. J. Educ. Psychol. 2017, 109, 889–914.
Peterson, R.L.; Pennington, B.F. Developmental dyslexia. Annu. Rev. Clin. Psychol. 2015, 11, 283–307.
Vaughn, S.; Linan-Thompson, S.; Kouzekanani, K.; Bryant, D.P.; Dickson, S.V.; Blozis, S.A. Reading Instruction Grouping for Students with Reading Difficulties. Remedial Spec. Educ. 2003, 24, 301–315.
Khan, K.; Hall, C.L.; Davies, E.B.; Hollis, C.; Glazebrook, C. The Effectiveness of Web-Based Interventions delivered to children and young people with Neurodevelopmental Disorders: Systematic Review and Meta-Analysis. J. Med. Internet Res. 2019, 21, e13478.
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.

Figure 1. PRISMA flow chart.

Table 1. Search terms and keyword strategy.

Context/Who	Environment/Where	Purpose/Why
learning disorder/s	videogame/s	intervention/s
learning disability/ies	video game/s	detection
dyslexia	serious game/s	remediation/s
dysgraphia	computer game/s	training
dyscalculia	app
dyspraxia	application
mathematics disorder/s	webapp
reading disorder/s	vr, virtual reality
disorder/s of written expression	ar, augmented reality
impairment/s in reading	digital solution
impairment/s in written expression	eye-tracking
impairment/s in mathematics	mhealth
mathematical difficult/ies	mobile app/s
arithmetic difficult/ies	mobile game/s
	ICT solution
	computer-based
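
As an illustration of how the three columns of Table 1 could be turned into a database query, the sketch below joins the terms of each column with OR and the three columns with AND. This combination rule is an assumption based on common systematic-review practice, not the authors' exact syntax; the function names `or_group` and `build_query` are hypothetical, and only a subset of the terms is shown.

```python
from typing import Iterable, List


def or_group(terms: Iterable[str]) -> str:
    """Quote each term and join the terms with OR inside parentheses."""
    quoted: List[str] = [f'"{term}"' for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(context: Iterable[str],
                environment: Iterable[str],
                purpose: Iterable[str]) -> str:
    """Combine the three column groups with AND (assumed rule, see lead-in)."""
    return " AND ".join(or_group(group) for group in (context, environment, purpose))


# Subset of Table 1 terms, one list per column.
context_terms = ["dyscalculia", "mathematics disorder"]
environment_terms = ["serious game", "mobile app"]
purpose_terms = ["detection", "intervention"]

print(build_query(context_terms, environment_terms, purpose_terms))
```

Each database has its own field tags and wildcard rules, so a string built this way would still need adapting per source.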

Table 2. PEDro scale.

Item	Description
1	eligibility criteria were specified
2	subjects were randomly allocated to groups (in a crossover study, subjects were randomly allocated an order in which treatments were received)
3	allocation was concealed
4	the groups were similar at baseline regarding the most important prognostic indicators
5	there was blinding of all subjects
6	there was blinding of all therapists who administered the therapy
7	there was blinding of all assessors who measured at least one key outcome
8	measures of at least one key outcome were obtained from more than 85% of the subjects initially allocated to groups
9	all subjects for whom outcome measures were available received the treatment or control condition as allocated or, where this was not the case, data for at least one key outcome was analyzed by “intention to treat”
10	the results of between-group statistical comparisons are reported for at least one key outcome
11	the study provides both point measures and measures of variability for at least one key outcome

Table 5. Tests used to evaluate participants’ mathematical skills by study.

Study	Test Description	Pre-Test	Post-Test
Ariffin et al. [40]	Own-designed. Based on simple mathematical questions (addition, subtraction).	✓	✓
Aunio and Mononen [41]	The standardised Early Numeracy Test.	✓	✓
Cheng et al. [69]	Own-designed.	✓	✓
De Castro et al. [68]	The arithmetic test contained in the Scholastic Performance Test (SPT).	✓	✓
Gunasekare et al. [28]	Own-designed.	✓	-
Hallstedt et al. [43]	Heidelberger Rechen Test 1–4 (HRT).	✓	✓
Käser et al. [33]	“Heidelberger Rechentest” HRT & AC (arithmetic test). Evaluates addition and subtraction.	✓	✓
Kohn et al. [34]	HRT, BUEGA, number line test, basic number processing test. Evaluate arithmetic performance, reading & spelling, spatial representation of numbers and basic operations respectively.	✓	✓
Kucian et al. [35]	Neuropsychological Test Battery for Number Processing and Calculation in Children ZAREKI-R. Examines the progress of basic skills in calculation and arithmetic (such as counting, subtraction, estimation...).	✓	✓
Kuhn and Holling [36]	DEMAT. 9–10 subtests. Cover core aspects of the mathematics curriculum in elementary school (basic arithmetics, word problems, geometry...).	✓	✓
Mohd Syah et al. [67]	Own-designed. Based on simple mathematical questions (counting, addition, subtraction).	✓	✓
Mukherjee et al. [30]	Own-designed. Calculation, coloring and counting tasks.	✓	-
Räsänen et al. [18]	Own-designed. Four tasks were used to assess number skills: three were computer-based (number comparison, verbal counting, and object counting, the last subdivided into subitizing and counting), while the arithmetic task was administered using a paper-and-pencil format.	✓	✓
Re et al. [71]	AC-MT (6–11 for primary, 11–14 for secondary courses). Evaluates abilities on calculation and solving problems.	✓	✓
Salminen et al. [66]	Corsi blocks (visuospatial working memory). Nonword repetition task from the Neuropsychological tests for Children (phonological working memory). 3 tasks adapted from the Early Numeracy Test (verbal counting). Object counting (own designed). Basic arithmetic (paper-and-pencil test: 3 tasks for concrete object counting and 28 tasks for symbolic calculation).	✓	✓
Walcott and Romain [65]	AC-MT (6–11 for primary, 11–14 for secondary courses). Evaluates abilities on calculation and solving problems.	✓	✓
Wilson et al. [70]	Computerized testing battery (not specified) and a subtest of TEDI-MATH Battery. The computerized testing battery included: enumeration, symbolic/non symbolic numerical comparison, addition and subtraction. The TEDI-MATH subtest (non-computerized) included: counting, number transcoding and understanding of the base-10 number system.	✓	✓

Table 6. Summary of serious game features across included studies.

Study	Platform	Characteristics
Ariffin et al. [40]	MOB	Immersion, Region-specific
Aunio and Mononen [41]	MOB (iOS)	Rewards, Customization, Challenge, Personalization
Avila-Pesantez et al. [45]	COM (Windows)	Customization, Challenge, Immersion, Personalization, Multi-knowledge
Cheng et al. [69]	COM	Challenge, Storyline, Multi-knowledge
De Castro et al. [68]	COM	Challenge, Feedback, Storyline, Multi-knowledge
Ferraz et al. [42]	MOB (Android)	Challenge, Personalization, Multi-knowledge
Gunasekare et al. [28]	MOB (Android)	Challenge, Personalization, Multi-knowledge
Hallstedt et al. [43]	MOB (iOS, Android)	Multi-knowledge
Kariyawasam et al. [29]	MOB (Android)	Multi-knowledge, Multi-LD
Käser et al. [33]	COM	Challenge, Personalization, Multi-knowledge, AI/ML
Kohn et al. [34]	COM	Rewards, Personalization, Multi-knowledge, AI/ML
Kucian et al. [35]	COM	Challenge, Feedback, Personalization, Multi-knowledge
Kuhn and Holling [36]	COM (Windows) MOB (Android, iOS)	Customization, Challenge, Storyline, Feedback, Personalization, Multi-knowledge
Mohd Syah et al. [67]	COM (Windows)	Challenge, Multi-knowledge, Region-specific
Mukherjee et al. [30]	MOB	Multi-knowledge
Räsänen et al. [18]	COM (MP)	Challenge, Feedback, Personalization, Multi-knowledge
Re et al. [71]	WWW (MP)	Challenge, Personalization, Multi-knowledge
Rohizan et al. [44]	MOB	Challenge, Feedback, Reinforcement, Personalization, Multi-knowledge
Salminen et al. [66]	COM (MP)	Challenge, Feedback, Reinforcement, Personalization, Multi-knowledge
Walcott and Romain [65]	COM	No data available
Wilson et al. [70]	COM (MP)	Challenge, Feedback, Reinforcement, Personalization, Multi-knowledge

Note. Abbreviations: AI/ML = AI/ML-assisted; Challenge = different levels and scenarios with increasing difficulty; COM = computer-based; Customization = environment/character customization; Feedback = feedback for each task; Immersion = immersive environment; MOB = mobile; MP = multiplatform; Multi-knowledge = different kinds of mathematical knowledge; Multi-LD = multiple learning difficulty assessment; Personalization = tailored levels adapted to each player profile; Region-specific = customized for the country; Reinforcement = positive reinforcement; Rewards = reward system for tasks; Storyline = engaging storyline; WWW = web-based.

Table 7. Summary of reported outcomes across included studies.

Study	Effect Size Reported	Subject Evaluated	Statistic	Affected Group	Result by Area
Ariffin et al. [40]	Significant differences (p = 0.05)	Pre–post tests (training effects)	OSTT	Experimental group	87.51% of children showed performance improvement; app positively rated (motivating, user-friendly, enjoyable).
Aunio and Mononen [41]	Significant improvements (Z = −2.226, p = 0.016, r = 0.59; Z = −2.207, p = 0.016, r = 0.59)	Group-level between pre–post test	ENT	Experimental group vs. Control group	Improvement in different areas.
Avila-Pesantez et al. [45]	Significant effects (p < 0.01)	Response time and accuracy pre/post intervention	WILCHC	Experimental group	Skills in mathematical reasoning improved, and response time decreased. Motivation and interest in mathematics also increased.
Cheng et al. [69]	-	-	-	Experimental group	There were significant improvements in arithmetic performance, Approximate Number System (ANS) acuity, and visual perception.
De Castro et al. [68]	Significant differences on experimental group (p < 0.0001)	Pre–post tests scores (training effects)	STTEST	Experimental group	Statistically significant improvements were recorded in the experimental group, with mean scores exceeding those of the control group.
Ferraz et al. [42]	-	-	-	Experimental group	Access to number sense improved significantly, with the most pronounced gains observed in children exhibiting the highest initial error rates.
Gunasekare et al. [28]	-	OPDRS, IDDRS	-	-	SVM model for both screenings. Accuracy OPDRS: 95%. Accuracy IDDRS: 98%.
		GRDRS, PRDRS			RF model for GRDRS, Accuracy: 92%. XGB model for PRDRS, Accuracy: 91%.
		VEDRS, LEDRS			RF model for VEDRS, Accuracy: 98%. GB model for LEDRS, Accuracy: 94%.
		SEDRS, VSDRS			XGB model for both screenings. Accuracy SEDRS: 96%. Accuracy VSDRS: 96%.
Hallstedt et al. [43]	Medium-sized effects	Pre–post tests (training effects)	-	Experimental group vs. Control group	Improvement in basic arithmetic skills.
Käser et al. [33]	Significant to moderate effect	Performance differences between consecutive testing periods	PSTT, GLM	Experimental group	Accuracy on the number line improved. Greater improvements in both accuracy and response times were observed in subtraction tasks compared to addition. No significant gains were found in counting or estimation. The game was rated as engaging and helpful by participants.
Kohn et al. [34]	Moderate effect sizes	Group differences	ANOVA, CST	Experimental group vs. Control group	The experimental group exhibited greater improvements than the control group. Significant gains were observed in spatial number processing within the 0–100 range, as well as in magnitude comparison tasks. In contrast, no statistically significant improvements were found in basic number processing.
Kucian et al. [35]	Significant training effects on different areas	Pre–post tests (training effects)	GLM, PSTT, ITTT	Experimental group	Significant improvements were observed in spatial number representation (0–100 range using Arabic digits), as well as in addition and subtraction tasks. No improvement was found in dot estimation. Both the experimental and control groups showed general skill enhancement, although specific gains were more pronounced in the experimental group.
Kuhn and Holling [36]	Substantial but small gains (a,b)	Group means	MANOVA + BHM	Experimental groups (WM, NS)	NS: Improvements in arithmetic skills (a). WM: Gains in word problems (b), but not in spatial working memory.
Mohd Syah et al. [67]	Significant training effects on different areas	Pre–post tests (training effects)	WILCSRT	Experimental group vs. Control group	Improvements were observed in both groups, with the experimental group showing a 57.9% greater gain. Significant progress was noted in addition, subtraction, and number orientation tasks. No improvement was found in counting or in confusion between arithmetic operations. An increase in engagement with mathematics was also reported.
		Children’s overall achievement	MVC
		Pre–post tests (training effects)	WILCSRT
Mukherjee et al. [30]	-	-	-	-	No information provided.
Räsänen et al. [18]	Statistically significant differences (GG)	Children’s overall achievement	MVC	Experimental groups (GG, NR) Control group	Improvements were observed in comparison tasks involving both GG and NR formats. More substantial gains in accuracy and response time were found in NR tasks, particularly for comparisons involving large numerical differences.
Re et al. [71]	Significant training effects on different areas	Training effects between assessment time points (experimental group)	ANOVA	Experimental group vs. Control group	Improvements were observed across evaluations in both arithmetic facts and written calculation, although the interaction effect between groups was modest. No significant effects were found in mental calculation, either in terms of accuracy or response time.
Rohizan et al. [44]	-	Between groups, Pre–post training	MANOVA	-	No information regarding improvements was reported.
Salminen et al. [66]	Large size effects at group level	Training effects between assessment time points	WILCHC	Experimental groups (GGM, NR)	Improvements were observed in verbal counting (a) and dot counting fluency (b) for the GGM condition, as well as in basic arithmetic for the NR condition.
Walcott and Romain [65]	Medium to large effect	Pre–post tests (training effects)	CA	Experimental group	Performance increased by 32% in addition and by 7% in subtraction tasks across evaluations.
Wilson et al. [70]	Significant training effects on different areas	Pre–post tests (training effects, only improved variables)	MANOVA	Experimental group	Improvements were observed in average enumeration performance. No significant gains were found in addition tasks or in knowledge of the base-10 number system. However, response times improved in subtraction problem-solving.

Notes. Abbreviations: ANOVA = One-Way ANOVA; BHM = Bonferroni–Holm Method; CA = Cronbach’s Alpha; CST = Chi-Square Tests; ENT = Early Numeracy Tests; GB = Gradient Boosting; GLM = General Linear Model (Repeated-Measures Analysis); GRDRS = Graphical Dyscalculia Risk Screening; IDDRS = Ideognostic Dyscalculia Risk Screening; ITTT = Independent t-Tests; LEDRS = Lexical Dyscalculia Risk Screening; MANOVA = Multivariate ANOVA; MVC = Mean Value Comparison; OPDRS = Operational Dyscalculia Risk Screening; OSTT = One-Sample t-Test; PRDRS = Practognostic Dyscalculia Risk Screening; PSTT = Paired-Samples t-Tests; RF = Random Forest; SEDRS = Sequential Dyscalculia Risk Screening; SVM = Support Vector Machines; VEDRS = Verbal Dyscalculia Risk Screening; VSDRS = Visual Spatial Dyscalculia Risk Screening; WILCHC = Wilcoxon’s Hypothesis Contrast; WILCSRT = Wilcoxon’s Signed-Rank Test; XGB = Extreme Gradient Boosting.

Table 8. Methodological quality assessment (PEDro scale).

Studies	I1	I2	I3	I4	I5	I6	I7	I8	I9	I10	I11	Score (I2–I11)	Score (I2–I9)
Ariffin et al. [40]	✓	-	-	✓	-	-	-	✓	✓	✓	✓	5/10	3/8
Aunio and Mononen [41]	✓	✓	-	✓	-	-	-	✓	✓	✓	✓	6/10	4/8
Avila-Pesantez et al. [45]	✓	✓	✓	✓	✓	-	-	✓	✓	✓	✓	8/10	6/8
Cheng et al. [69]	✓	✓	-	✓	-	-	-	✓	✓	✓	✓	6/10	3/8
De Castro et al. [68]	✓	✓	-	✓	-	-	-	✓	✓	✓	✓	6/10	3/8
Ferraz et al. [42]	✓	-	-	✓	-	-	-	✓	✓	✓	-	4/10	3/8
Gunasekare et al. [28]	-	-	-	-	-	-	-	✓	✓	-	✓	3/10	2/8
Hallstedt et al. [43]	✓	✓	-	✓	-	-	-	✓	✓	✓	✓	6/10	4/8
Kariyawasam et al. [29]	✓	-	-	✓	-	-	-	✓	✓	✓	✓	5/10	3/8
Käser et al. [33]	✓	✓	✓	✓	✓	-	-	-	✓	✓	✓	7/10	5/8
Kohn et al. [34]	✓	✓	✓	✓	✓	-	-	✓	✓	✓	✓	8/10	6/8
Kucian et al. [35]	✓	-	-	✓	✓	-	-	✓	✓	✓	✓	6/10	4/8
Kuhn and Holling [36]	✓	✓	✓	✓	✓	-	-	✓	✓	✓	✓	8/10	6/8
Mohd Syah et al. [67]	✓	✓	✓	✓	✓	-	✓	✓	✓	✓	✓	9/10	7/8
Mukherjee et al. [30]	✓	-	-	-	-	-	-	-	✓	-	-	1/10	1/8
Räsänen et al. [18]	✓	✓	✓	✓	✓	-	-	✓	✓	✓	✓	8/10	6/8
Re et al. [71]	✓	✓	✓	✓	-	-	-	✓	✓	✓	✓	7/10	5/8
Rohizan et al. [44]	✓	-	-	✓	-	-	-	✓	✓	-	-	3/10	3/8
Salminen et al. [66]	✓	-	-	✓	-	-	-	✓	✓	✓	✓	5/10	3/8
Walcott and Romain [65]	✓	-	-	✓	-	-	-	-	-	-	-	1/10	1/8
Wilson et al. [70]	✓	-	-	✓	-	-	-	✓	✓	✓	✓	5/10	3/8
% on item	95.2	52.4	33.3	90.5	33.3	0	4.8	85.7	95.2	80.9	80.9	5.6/10	3.9/8

Note. I1 = eligibility criteria; I2 = random allocation; I3 = concealed allocation; I4 = baseline similarity; I5 = blinding of subjects; I6 = blinding of therapists; I7 = blinding of assessors; I8 = measures of key outcomes from more than 85% of subjects; I9 = intention-to-treat analysis; I10 = between-group statistical comparisons; I11 = point measures and measures of variability.
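
The two score columns in Table 8 follow directly from the item marks: the first counts satisfied items among I2–I11 and the second among I2–I9, with I1 excluded from both. The sketch below shows that arithmetic for two rows of the table. It is a minimal illustration, not the authors' procedure; the helper names `parse_marks` and `pedro_scores` are hypothetical.

```python
from typing import List, Sequence, Tuple


def parse_marks(row: str) -> List[bool]:
    """Turn a whitespace-separated row of '✓' / '-' marks into booleans."""
    return [cell == "✓" for cell in row.split()]


def pedro_scores(items: Sequence[bool]) -> Tuple[int, int]:
    """Return (score over I2–I11, score over I2–I9) for marks I1..I11."""
    if len(items) != 11:
        raise ValueError("expected 11 item marks (I1..I11)")
    score_out_of_10 = sum(items[1:11])  # I2..I11; I1 is not counted
    score_out_of_8 = sum(items[1:9])    # I2..I9
    return score_out_of_10, score_out_of_8


# Item marks copied from the Mohd Syah et al. [67] and Kucian et al. [35] rows.
mohd_syah = parse_marks("✓ ✓ ✓ ✓ ✓ - ✓ ✓ ✓ ✓ ✓")
kucian = parse_marks("✓ - - ✓ ✓ - - ✓ ✓ ✓ ✓")

print(pedro_scores(mohd_syah))  # (9, 7)
print(pedro_scores(kucian))     # (6, 4)
```

The same count applied column-wise, divided by the 21 included studies, gives the "% on item" row.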

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
