Search Results (221)

Search Parameters:
Keywords = item response theory model

27 pages, 596 KB  
Article
Inherent Addiction Mechanisms in Video Games’ Gacha
by Sagguneswaraan Thavamuni, Mohd Nor Akmal Khalid and Hiroyuki Iida
Information 2025, 16(10), 890; https://doi.org/10.3390/info16100890 (registering DOI) - 13 Oct 2025
Abstract
Gacha games, particularly those using Free-to-Play (F2P) models, have become increasingly popular yet controversial due to their addictive mechanics, often likened to gambling. This study investigates the inherent addictive mechanisms of Gacha games, focusing on Genshin Impact, a leading title in the genre. We analyze the interplay between reward frequency, game attractiveness, and player addiction using the Game Refinement theory and the Motion in Mind framework. Our analysis identifies a critical threshold at approximately 55 pulls per rare item (N55), with a corresponding gravity-in-mind value of 7.4. Beyond this point, the system exhibits gambling-like dynamics, as indicated by Game Refinement and Motion in Mind metrics. This threshold was measured using empirical gacha data collected from Genshin Impact players and analyzed through theoretical models. While not claiming direct causal evidence of player behavior change, the results highlight a measurable boundary where structural design risks fostering addiction-like compulsion. The study contributes theoretical insights with ethical implications for game design, by identifying critical thresholds in reward frequency and game dynamics that mark the shift toward gambling-like reinforcement. The methodologies, including quantitative analysis and empirical data, ensure robust results contributing to responsible digital entertainment discourse. Full article
(This article belongs to the Special Issue Artificial Intelligence Methods for Human-Computer Interaction)
21 pages, 1410 KB  
Article
Measure Student Aptitude in Learning Programming in Higher Education—A Data Analysis
by João Pires, Ana Rosa Borges, Jorge Bernardino, Fernanda Brito Correia and Anabela Gomes
Computers 2025, 14(10), 428; https://doi.org/10.3390/computers14100428 - 9 Oct 2025
Viewed by 147
Abstract
Analyzing student performance in Introductory Programming courses in Higher Education is critical for early intervention and improved learning outcomes. This study explores the potential of a cognitive test to predict student success in an Introductory Programming course by analyzing data from 180 students, including Freshmen and Repeating Students, using descriptive statistics, correlation analysis, Categorical Principal Component Analysis, and Item Response Theory models. Analysis of the cognitive test revealed that some reasoning questions presented a statistically significant correlation, albeit of weak magnitude, with the course grades, particularly for freshmen. The development of models for predicting student performance in Introductory Programming using cognitive tests is also being explored. This study found that reasoning skills, namely logical reasoning and sequence completion, were more predictive of success in programming than general ability. The study also showed that a Programming Cognitive Test can be a useful tool for identifying students at risk of failure, particularly freshmen.
30 pages, 437 KB  
Article
Thriving from Work Questionnaire: Validation of a Measure of Worker Wellbeing Among Older U.S. Workers
by Maren Wright Voss, Cal J. Halvorsen, Kanchan Yadav, Stephanie M. Neidlinger, Gregory R. Wagner and Susan E. Peters
Int. J. Environ. Res. Public Health 2025, 22(9), 1428; https://doi.org/10.3390/ijerph22091428 - 12 Sep 2025
Viewed by 823
Abstract
As life expectancy and retirement ages rise globally, understanding how older workers thrive in the workplace is an increasingly vital measurement and wellbeing priority. In this study, we validated the Thriving from Work Questionnaire (TfWQ) for workers aged ≥50. A U.S. online panel yielded 617 older workers and 372 younger counterparts for comparison. Using item response theory alongside model-fit evaluation and correlational tests with job/life satisfaction, engagement, burnout, and turnover intent, we assessed the reliability and construct validity of the long-form (30 items, reduced to 29) and short-form (8-item) TfWQ versions. We recommend omitting one of the original items from the long form for use with older workers. Instrument reliability was high (α = 0.94 long-form; 0.90 short-form). Model fit was established for both long- and short-form versions with acceptable model fit indices. Convergent validity was supported by strong, theory-consistent correlations with the external constructs. Older workers, compared with those aged 20–49 years, scored higher on thriving from work, and differences were identified on nine items. These age-patterned differences highlight actionable levers for age-sensitive occupational-health policy, wellbeing interventions, and workforce planning. The TfWQ offers a robust, reliable, valid, and practically oriented tool for evaluating older workers’ wellbeing with utility across research, practice, and policy.
(This article belongs to the Special Issue Workplace Health and Wellbeing Research and Evaluation)
22 pages, 1067 KB  
Article
Developing and Validating an Intercultural Student Experience Scale Using Structural Equation Modeling
by Nicolás Matus, Cristian Rusu, Virginica Rusu and Federico Botella
Sustainability 2025, 17(18), 8224; https://doi.org/10.3390/su17188224 - 12 Sep 2025
Viewed by 618
Abstract
This study proposes and validates a culturally responsive instrument for assessing Student Experience (SX) in Higher Education Institutions (HEIs). Guided by Customer Experience (CX) theory and Hofstede’s cultural framework, we drafted a thirty-item scale: nine educational, seven social, three personal, and twelve cultural items spanning Indulgence–Restraint, Individualism–Collectivism, Masculinity–Femininity, Uncertainty Avoidance, Long- versus Short-Term Orientation, and Power Distance. Undergraduate respondents from universities with contrasting cultural profiles completed the survey. Confirmatory factor analyses affirmed a three-dimensional SX structure (Educational, Social, Personal) and six first-order cultural dimensions. A hierarchical second-order Structural Equation Model (SEM) linked the higher-order construct Cultural Aspects (CA) to the higher-order construct SX. The path from CA to SX emerged positive and statistically relevant, indicating that national-culture orientations systematically color students’ cognitive, affective, and behavioral evaluations of institutional touchpoints. The scale enables researchers and academic managers to pinpoint SX gaps, benchmark performance internationally, and design culturally congruent and sustainability-aligned interventions. The article deepens theoretical understanding of how culture shapes service perceptions in global HEIs by explicitly integrating cultural theory and cultural studies into SX evaluation. Full article
(This article belongs to the Special Issue Service Experience and Servicescape in Sustainable Consumption)
26 pages, 921 KB  
Article
Media Exposure and Vicarious Trauma: Italian Adaptation and Validation of the Media Vicarious Traumatization Scale and Its Impact on Young Adults’ Mental Health in Relation to Contemporary Armed Conflicts
by Giorgio Maria Regnoli, Gioia Tiano and Barbara De Rosa
Eur. J. Investig. Health Psychol. Educ. 2025, 15(9), 184; https://doi.org/10.3390/ejihpe15090184 - 12 Sep 2025
Viewed by 1098
Abstract
In recent years, psychological research has increasingly focused on the impact of media exposure on mental health, identifying young adults as particularly vulnerable due to their high levels of media engagement. To explore these effects, the construct of Media Vicarious Traumatization (MVT) has been introduced as an extension of vicarious traumatization, aimed at capturing the psychological impact of emotionally intense media content. MVT offers a relevant framework for understanding the mental health risks of media exposure, especially in relation to socially significant issues like war, now central in contemporary media discourse. This study aims to culturally adapt and psychometrically validate the Media Vicarious Traumatization Scale (MVTS) within the Italian context, and to investigate the relationship between the war-related MVT construct, generalized anxiety, and future anxiety among young adults. Study I, conducted on a sample of 250 participants (M = 22.40, SD = 2.63), explored the latent structure of the MVTS using Parallel Analysis and Exploratory Factor Analysis (EFA), yielding promising psychometric properties in terms of reliability and factorial stability. An independent sample of 553 participants (M = 22.43, SD = 2.37) was recruited for Study II to confirm the MVTS’s latent structure via Confirmatory Factor Analysis (CFA), which indicated good model fit. This study also evaluated measurement invariance across gender, internal consistency, and convergent, discriminant, and predictive validity, alongside psychometric properties assessed through Item Response Theory (IRT). The results of both studies confirm the stable and robust psychometric properties of the scale. Furthermore, Study II provides novel insights into the predictive role played not only by the war-related MVT but also by the recently introduced construct of Worry about War in exacerbating both generalized anxiety and future anxiety among Italian young adults. Full article
29 pages, 1830 KB  
Review
An Evolutionary Preamble Towards a Multilevel Framework to Understand Adolescent Mental Health: An International Delphi Study
by Federica Sancassiani, Vanessa Barrui, Fabrizio Bert, Sara Carucci, Fatma Charfi, Giulia Cossu, Arne Holte, Jutta Lindert, Simone Marchini, Alessandra Perra, Samantha Pinna, Antonio Egidio Nardi, Alessandra Scano, Cesar A. Soutullo, Massimo Tusconi and Diego Primavera
Children 2025, 12(9), 1189; https://doi.org/10.3390/children12091189 - 5 Sep 2025
Viewed by 874
Abstract
Background/Objectives: Adolescence is a sensitive developmental window shaped by both vulnerabilities and adaptive potential. From an evolutionary standpoint, mental health difficulties in this period may represent functional responses to environmental stressors rather than mere dysfunctions. Despite increasing interest, integrative models capturing the dynamic interplay of risk and protective factors in adolescent mental health remain limited. This study presents a holistic, multi-level framework grounded in ecological and evolutionary theories to improve understanding and intervention strategies. Methods: A two-round Delphi method was used to develop and validate the framework. Twelve experts in adolescent mental health evaluated a preliminary draft derived from the literature. In Round 1, 12 items were rated across five criteria (YES/NO format), with feedback provided when consensus thresholds were not met. Revisions were made using consensus index scores. In Round 2, the revised draft was assessed across eight broader dimensions. A consensus threshold of 0.75 was used in both rounds. Results: Twelve out of thirteen experts (92%) agreed to join the panel. Round 1 item scores ranged from 0.72 to 0.85, with an average consensus index of 0.78. In Round 2, ratings improved significantly, ranging from 0.82 to 1.0, with an average of 0.95. The Steering Committee incorporated expert feedback by refining the structure, deepening content, updating sources, and clarifying key components. Conclusions: The final framework allows for the clustering of indicators across macro-, medium-, and micro-level domains. It offers a robust foundation for future research and the development of targeted, evolutionarily informed mental health interventions for adolescents. Full article
(This article belongs to the Section Pediatric Mental Health)
25 pages, 663 KB  
Article
Exploring the Multifaceted Nature of Work Happiness: A Mixed-Method Study
by Rune Bjerke
Adm. Sci. 2025, 15(9), 351; https://doi.org/10.3390/admsci15090351 - 5 Sep 2025
Viewed by 637
Abstract
Work happiness is commonly described as an umbrella concept encompassing job satisfaction, engagement, and emotional attachment to the workplace. However, few studies have explored its underlying sources and emotional experiences, raising questions about its conceptual clarity and measurement. This exploratory inductive mixed-methods study investigates whether work happiness can be better understood by distinguishing between its sources (antecedents) and emotional expressions (outcomes). In the qualitative phase, 23 part-time adult students from Norway’s public and private sectors reflected on moments of work happiness and the emotions involved. Thematic analysis identified five source-related themes, which informed the development of 49 items. These items were tested in a quantitative survey distributed to 4000 employees, yielding 615 usable responses. Exploratory factor analysis (EFA) revealed six conceptually coherent source dimensions—such as autonomy, recognition, and togetherness—and one emotional dimension. Regression analysis demonstrated statistically significant associations between source factors and emotional experiences, offering initial support for a dual-structure model of work happiness. Notably, the findings revealed a dialectical interplay between individual (“I”) and collective (“We”) sources, suggesting that work happiness emerges from both personal agency and social belonging. While promising, these findings are preliminary and require further validation. The study contributes to theory by proposing a grounded, multidimensional framework for work happiness and invites future research to examine its psychometric robustness and cross-contextual applicability. Full article
17 pages, 485 KB  
Article
Harnessing Self-Control and AI: Understanding ChatGPT’s Impact on Academic Wellbeing
by Metin Besalti
Behav. Sci. 2025, 15(9), 1181; https://doi.org/10.3390/bs15091181 - 29 Aug 2025
Viewed by 989
Abstract
The rapid integration of generative AI, particularly ChatGPT, into academic settings has prompted urgent questions regarding its impact on students’ psychological and academic outcomes. Although generative AI holds considerable potential to transform educational practices, its effects on individual traits such as self-control and academic wellbeing remain insufficiently explored. This study addresses this gap through a sequential two-phase design. In the first phase, the ChatGPT Usage Scale was adapted and validated for a Turkish university student population (N = 413). Using confirmatory factor analysis and item response theory, the scale was confirmed as a psychometrically valid and reliable one-factor instrument. In the second phase, a separate sample (N = 449) was used to examine the relationships between ChatGPT usage, self-control, and academic wellbeing through a mediation model. The findings revealed that higher ChatGPT usage was significantly associated with lower levels of both self-control and academic wellbeing. Additionally, mediation analysis demonstrated that self-control partially mediates the negative relationship between ChatGPT usage and academic wellbeing. The study concludes that while generative AI tools are valuable, their integration into education presents a double-edged sword, highlighting the critical need to foster students’ self-regulatory skills to ensure they can harness these tools responsibly without compromising their academic and psychological health. Full article
(This article belongs to the Special Issue Artificial Intelligence and Educational Psychology)
13 pages, 567 KB  
Article
Correlation Between Dental Health and Aesthetic Components of Malocclusion in Junior High and High School Students: An Epidemiological Study Using Item Response Theory
by Hiromi Sato, Yudai Shimpo, Toshiko Sekiya, Haruna Rikitake, Minami Seki, Satoshi Wada, Yoshiaki Nomura and Hiroshi Tomonari
J. Clin. Med. 2025, 14(13), 4802; https://doi.org/10.3390/jcm14134802 - 7 Jul 2025
Viewed by 836
Abstract
Background: The Index of Orthodontic Treatment Need (IOTN) is widely used to assess the need for orthodontic treatment. IOTN consists of the Dental Health Component (DHC) and the Aesthetic Component (AC), evaluating malocclusion morphologically and aesthetically, respectively. However, the discriminatory power of individual DHC items and their relationship with AC grades remain unclear. Objective: This study aimed to evaluate the effectiveness of individual DHC items in school dental examinations and investigate their contribution to AC grades among junior high and high school students. Methods: A total of 726 students (443 males, 283 females; aged 12–18 years) from Tsurumi University Junior and Senior High School, excluding 168 students undergoing or having completed orthodontic treatment, were included. Nine calibrated orthodontists assessed DHC and AC using IOTN during standardized school examinations. The discriminatory power and information precision of DHC items were evaluated by Item Response Theory (IRT) analysis using three-, two-, or one-parameter logistic models depending on convergence. Correspondence analysis visualized the correlation between DHC and AC grades. Simple linear regression analyzed the contribution of each DHC item to AC grades. Results: Orthodontic treatment need was identified in 21.1% of students. Females showed a higher rate of treatment need than males. Correspondence analysis suggested that aesthetic evaluations were more lenient than morphological evaluations. IRT and regression analysis revealed that crowding (4.d), increased overjet (2.a), and increased overbite (2.f) demonstrated high discriminatory power and significant contributions to AC grades. Conclusions: Among the DHC items, crowding, increased overjet, and increased overbite had higher discriminatory power for malocclusion and contributed more significantly to AC evaluations compared to other items. Full article
(This article belongs to the Section Dentistry, Oral Surgery and Oral Medicine)
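
The abstract above refers to one-, two-, and three-parameter logistic models without stating their form. As a rough illustration only, the standard textbook item response function can be sketched as below. The function name `irf` and the parameter names `a` (discrimination), `b` (difficulty), and `c` (lower asymptote) are assumptions made for this sketch and are not taken from the study.

```python
import math


def irf(theta: float, a: float = 1.0, b: float = 0.0, c: float = 0.0) -> float:
    """Textbook three-parameter logistic (3PL) item response function.

    theta: latent trait level of the respondent
    a:     item discrimination (a = 1 for every item gives a 1PL-style model)
    b:     item difficulty (location on the theta scale)
    c:     lower asymptote, often called the guessing parameter (c = 0 gives 2PL)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: the probability of endorsing the item rises with theta.
p_low = irf(-1.0, a=1.5, b=0.0)
p_high = irf(1.0, a=1.5, b=0.0)
```

Setting `c = 0` reduces the function to the 2PL form, and additionally fixing `a` across items gives the 1PL form, which may be why such studies can fall back to simpler models when the richer one does not converge.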
17 pages, 246 KB  
Article
The Impact of Information Acquisition on Farmers’ Drought Responses: Evidence from China
by Huiqing Han, Jianqiang Yang and Yingjia Zhang
Information 2025, 16(7), 576; https://doi.org/10.3390/info16070576 - 4 Jul 2025
Viewed by 459
Abstract
Climate change presents major challenges to agriculture, especially in economically underdeveloped regions. In these areas, farmers often lack access to resources and timely information, which limits their ability to respond effectively to drought and threatens agricultural sustainability. This study uses survey data from farmers in underdeveloped regions of China to examine the association between their ability to acquire information and their drought response behaviors. The results indicate that better information acquisition ability is significantly correlated with more effective and scientifically informed decision-making in drought adaptation strategies. To explore the underlying mechanism, we introduce value perception—that is, farmers’ beliefs about the usefulness and benefits of drought adaptation strategies—as a mediating variable. A mechanism model is constructed to test how information acquisition ability relates to behavior indirectly through this perception. We apply a threshold regression model to identify potential nonlinear associations, finding that the relationship between information acquisition ability and drought response behaviors becomes stronger once a certain threshold is surpassed. Additionally, we employ the Item Response Theory (IRT) model to measure the intensity and quality of farmers’ adaptation behaviors more accurately. These findings provide theoretical insights and empirical evidence for enhancing agricultural resilience, while acknowledging that causality cannot be definitively established due to the cross-sectional nature of the data. The study also offers useful guidance for policymakers seeking to strengthen farmers’ access to information, improve value recognition of adaptive actions, and promote sustainable agricultural development in underdeveloped areas. Full article
(This article belongs to the Special Issue Information Technology in Society)
17 pages, 498 KB  
Article
Assessing Standard Error Estimation Approaches for Robust Mean-Geometric Mean Linking
by Alexander Robitzsch
AppliedMath 2025, 5(3), 86; https://doi.org/10.3390/appliedmath5030086 - 4 Jul 2025
Viewed by 400
Abstract
Robust mean-geometric mean (MGM) linking methods enable reliable group comparisons in item response theory models under fixed and sparse differential item functioning. This article evaluates six alternative standard error and confidence interval (CI) estimation methods across four MGM linking approaches. Our simulation study demonstrates that CIs based on the delta method or bootstrap procedures using the normal distribution or empirical quantiles exhibit highly inflated coverage rates. In contrast, CIs derived from a weighted least squares estimation problem, as well as basic and bias-corrected bootstrap methods, yield satisfactory coverage rates in most simulation conditions for robust MGM linking.
15 pages, 575 KB  
Article
Psychometric Properties of the Science Self-Efficacy Scale for STEMM Undergraduates
by Jayashri Srinivasan, Krystle P. Cobian and Minjeong Jeon
Eur. J. Investig. Health Psychol. Educ. 2025, 15(7), 124; https://doi.org/10.3390/ejihpe15070124 - 4 Jul 2025
Viewed by 723
Abstract
Biomedical research training initiatives need rigorous evaluation to achieve national goals of supporting a robust workforce in the biomedical sciences. Higher science self-efficacy is associated with the likelihood of pursuing a science-related research career, but we know little about the psychometric properties of this construct. In this study, we report on a comprehensive validation study of the Science Self-Efficacy Scale using a robust sample of 10,029 undergraduates enrolled across 11 higher education institutions that were part of a biomedical training initiative funded by the National Institutes of Health in the United States. We found the scale to be unidimensional with an Omega hierarchical (ωh) reliability coefficient of 0.86 and a marginal reliability of 0.91. Within the item response theory framework, we did not detect variation in item parameters across undergraduates’ race/ethnicity; however, one item had parameters that varied across gender identity. We determined that the Science Self-Efficacy Scale can be employed across undergraduates enrolled in science, and researchers can use the scale across a diverse group of students. Implications include ensuring that the scale functions consistently across diverse populations, enhancing the validity of conclusions that can be drawn from survey data analysis. Validating this construct with item response theory models strengthens its use for future research. Full article
24 pages, 1258 KB  
Article
Enhancing Ability Estimation with Time-Sensitive IRT Models in Computerized Adaptive Testing
by Ahmet Hakan İnce and Serkan Özbay
Appl. Sci. 2025, 15(13), 6999; https://doi.org/10.3390/app15136999 - 21 Jun 2025
Viewed by 1220
Abstract
This study investigates the impact of response time on ability estimation within an Item Response Theory (IRT) framework, introducing time-sensitive formulations to enhance student assessment accuracy. Seven models were evaluated, including standard 1PL-IRT and six response-time-adjusted variants: TP-IRT, STP-IRT, TWD-IRT, NRT-IRT, DTA-IRT, and ART-IRT. Three optimization techniques—Maximum Likelihood Estimation (MLE), full parameter optimization, and K-fold Cross-Validation (CV)—were employed to assess model performance. Empirical validation was conducted using data from 150 students solving 30 mathematics items on the “TestYourself” platform, integrating response accuracy and timing metrics. Student abilities (θ), item difficulties (b), and time–effect parameters (λ) were estimated using the L-BFGS-B algorithm to ensure numerical stability. The results indicate that subtractive models, particularly DTA-IRT, achieved the lowest AIC/BIC values, highest AUC, and improved parameter stability, confirming their effectiveness in penalizing excessive response times without disproportionately affecting moderate-speed students. In contrast, multiplicative models (TWD-IRT, ART-IRT) exhibited higher variability, weaker generalizability, and increased instability, raising concerns about their applicability in adaptive testing. K-fold CV further validated the robustness of subtractive models, emphasizing their suitability for real-world assessments. These findings highlight the importance of incorporating response time as an additive factor to improve ability estimation while maintaining fairness and interpretability. Future research should explore multidimensional IRT extensions, behavioral response–time analysis, and adaptive testing environments that dynamically adjust item difficulty based on response behavior. Full article
(This article belongs to the Special Issue Applications of Smart Learning in Education)
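
The abstract above distinguishes subtractive from multiplicative time adjustments but does not give the model formulas. The sketch below shows only what a generic subtractive adjustment to a 1PL logit might look like. It is not the DTA-IRT model or any other variant from the study; the function names, the standardized response time `t`, and the form `theta - b - lam * t` are all assumptions made for illustration.

```python
import math


def p_correct_subtractive(theta: float, b: float, t: float, lam: float) -> float:
    """Hypothetical time-adjusted 1PL: logit = theta - b - lam * t.

    theta: student ability
    b:     item difficulty
    t:     response time on an assumed standardized scale (0 = typical time)
    lam:   assumed time-effect parameter; lam = 0 recovers the plain 1PL model
    """
    return 1.0 / (1.0 + math.exp(-(theta - b - lam * t)))


def neg_log_likelihood(responses, theta: float, lam: float) -> float:
    """Negative log-likelihood for one student under the sketch above.

    responses: iterable of (x, b, t) with x in {0, 1}, item difficulty b,
               and standardized response time t.
    """
    total = 0.0
    for x, b, t in responses:
        p = p_correct_subtractive(theta, b, t, lam)
        total -= x * math.log(p) + (1 - x) * math.log(1.0 - p)
    return total


# Hypothetical data: three items answered by one student.
data = [(1, -0.5, 0.2), (1, 0.0, -0.1), (0, 1.0, 1.5)]
nll = neg_log_likelihood(data, theta=0.3, lam=0.4)
```

A function of this kind could be passed to a bounded optimizer such as L-BFGS-B, the algorithm named in the abstract, but the bounds and starting values would have to come from the original article.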
19 pages, 1815 KB  
Article
Controlling Rater Effects in Divergent Thinking Assessment: An Item Response Theory Approach to Individual Response and Snapshot Scoring
by Gerardo Pellegrino, Janika Saretzki and Mathias Benedek
J. Intell. 2025, 13(6), 69; https://doi.org/10.3390/jintelligence13060069 - 17 Jun 2025
Viewed by 833
Abstract
Scoring divergent thinking (DT) tasks poses significant challenges as differences between raters affect the resulting scores. Item Response Theory (IRT) offers a statistical framework to handle differences in rater severity and discrimination. We applied the IRT framework by re-analysing an open access dataset including three scored DT tasks from 202 participants. First, after comparing different IRT models, we examined rater severity and discrimination parameters for individual response scoring and snapshot scoring using the best-fitting model, the Graded Response Model. Second, we compared IRT-adjusted scores with non-adjusted average and max-scoring scores in terms of reliability and fluency confound effect. Additionally, we simulated missing data to assess the robustness of these approaches. Our results showed that IRT models can be applied to both individual response scoring and snapshot scoring. IRT-adjusted and unadjusted scores were highly correlated, indicating that, under conditions of high inter-rater agreement, rater variability in severity and discrimination does not substantially impact scores. Overall, our study confirms that IRT is a valuable statistical framework for modeling rater severity and discrimination for different DT scores, although further research is needed to clarify the conditions under which it offers the greatest practical benefit.
(This article belongs to the Special Issue Analysis of a Divergent Thinking Dataset)
18 pages, 506 KB  
Article
Comparing Different Specifications of Mean–Geometric Mean Linking
by Alexander Robitzsch
Foundations 2025, 5(2), 20; https://doi.org/10.3390/foundations5020020 - 6 Jun 2025
Viewed by 902
Abstract
Mean–geometric mean (MGM) linking compares group differences on a latent variable θ within the two-parameter logistic (2PL) item response theory model. This article investigates three specifications of MGM linking that differ in the weighting of item difficulty differences: unweighted (UW), discrimination-weighted (DW), and precision-weighted (PW). These methods are evaluated under conditions where random DIF effects are present in either item difficulties or item intercepts. The three estimators are analyzed both analytically and through a simulation study. The PW method outperforms the other two only in the absence of random DIF or in small samples when DIF is present. In larger samples, the UW method performs best when random DIF with homogeneous variances affects item difficulties, while the DW method achieves superior performance when such DIF is present in item intercepts. The analytical results and simulation findings consistently show that the PW method introduces bias in the estimated group mean when random DIF is present. Given that the effectiveness of MGM methods depends on the type of random DIF, the distribution of DIF effects was further examined using PISA 2006 reading data. The model comparisons indicate that random DIF with homogeneous variances in item intercepts provides a better fit than random DIF in item difficulties in the PISA 2006 reading dataset. Full article
(This article belongs to the Section Mathematical Sciences)
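
To make the linking idea in the abstract above concrete, here is a minimal sketch of an unweighted mean-geometric mean computation under assumptions the abstract does not spell out. It assumes a 2PL model with logit `a * (theta - b)`, separate calibrations of both groups with a standard normal ability distribution, group 1 as the reference, and no DIF. The function name `mgm_linking` is hypothetical, and the article's exact specifications, including the discrimination-weighted and precision-weighted variants, may differ.

```python
import math


def mgm_linking(a1, b1, a2, b2):
    """Unweighted mean-geometric mean linking (sketch under stated assumptions).

    a1, b1: item discriminations and difficulties from the group 1 calibration
    a2, b2: the same items' parameters from the group 2 calibration
    Returns (mu, sigma): estimated mean and SD of group 2 on the group 1 scale.
    """
    n = len(a1)
    # The geometric mean of the discrimination ratios estimates sigma.
    log_sigma = sum(math.log(y) - math.log(x) for x, y in zip(a1, a2)) / n
    sigma = math.exp(log_sigma)
    # The mean of the rescaled difficulty differences estimates mu.
    mu = sum(x - sigma * y for x, y in zip(b1, b2)) / n
    return mu, sigma


# Hypothetical check: build group 2 parameters from known mu and sigma.
true_mu, true_sigma = 0.3, 1.2
a_ref = [1.0, 1.5, 0.8]
b_ref = [-1.0, 0.0, 1.0]
a_grp2 = [a * true_sigma for a in a_ref]
b_grp2 = [(b - true_mu) / true_sigma for b in b_ref]
est_mu, est_sigma = mgm_linking(a_ref, b_ref, a_grp2, b_grp2)
```

With error-free parameters and no DIF, the sketch recovers the values used to construct the example. The abstract's point is that once random DIF enters, the choice of weighting starts to matter.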