Article

Evaluating Inferential Statistics Filtering in High-Dimensional Item Feature Spaces for Predicting IRT Parameters

Educational Measurement and Statistics, Psychological and Quantitative Foundations, University of Iowa, Iowa City, IA 52242, USA
* Author to whom correspondence should be addressed.
Mathematics 2026, 14(10), 1662; https://doi.org/10.3390/math14101662
Submission received: 2 April 2026 / Revised: 21 April 2026 / Accepted: 11 May 2026 / Published: 13 May 2026

Abstract

Predicting parameter estimates under item response theory (IRT) from expert-coded item features offers a scalable alternative to resource-intensive field testing. This study evaluates whether inferential feature selection can improve predictive accuracy for item difficulty and item discrimination using five filter methods: the Analysis of Variance (ANOVA) F-test, Kendall’s Tau, the Kolmogorov–Smirnov test, the Anderson–Darling test, and the Energy Distance test. Models were trained using K-Nearest Neighbors (KNN) and Support Vector Regression (SVR) under random split and fixed-form cold-start partitioning strategies. Results show that the distributional properties of item features, rather than train–test splitting alone, drive predictive gains: distribution-based filter approaches, particularly the Kolmogorov–Smirnov test, consistently outperformed mean-based approaches by better capturing the full probability structure of the feature-parameter relationship. KNN benefited substantially from feature selection given its reliance on Euclidean distance, while SVR showed smaller gains due to its inherent regularization. Item discrimination generalized well to previously unseen test forms that share no calibration data with the training set, whereas item difficulty prediction was considerably more sensitive to distributional shifts when predicting entirely new, operationally administered forms. The main finding is that the distributional properties of item features are more important than the quantity of features for obtaining robust IRT parameter predictions.
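The Kolmogorov–Smirnov filter highlighted in the abstract can be illustrated with a minimal, stdlib-only sketch. This is not the authors' pipeline: the feature names (`word_count`, `item_position`), the median split used to adapt the two-sample test to a continuous difficulty target, and the top-k selection rule are all illustrative assumptions. The idea it demonstrates is the one the abstract describes: ranking item features by how strongly their full distributions (not just their means) differ across low- versus high-difficulty items.

```python
import bisect
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical gap
    between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:  # the ECDF gap can only be maximal at a sample point
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_filter(features, target, k):
    """Rank each feature (dict: name -> list of values) by the KS distance
    between its distribution in the low- vs. high-target halves of the
    items (median split), and return the k top-ranked feature names."""
    median = sorted(target)[len(target) // 2]
    low = [i for i, y in enumerate(target) if y < median]
    high = [i for i, y in enumerate(target) if y >= median]
    scores = {
        name: ks_statistic([col[i] for i in low], [col[i] for i in high])
        for name, col in features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 200
    difficulty = [rng.gauss(0, 1) for _ in range(n)]  # stand-in for IRT b-values
    features = {
        # hypothetical feature that shifts with difficulty -> large KS distance
        "word_count": [50 + 20 * b + rng.gauss(0, 5) for b in difficulty],
        # hypothetical pure-noise feature -> small KS distance
        "item_position": [rng.uniform(0, 1) for _ in range(n)],
    }
    print(ks_filter(features, difficulty, 1))  # → ['word_count']
```

Because the KS statistic compares entire empirical CDFs, a filter built on it can reward features whose shape or spread tracks the parameter even when group means barely differ, which is one plausible reading of why distribution-based filters outperformed the mean-based ANOVA F-test in this study.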
Keywords: feature selection; machine learning; item response theory; item parameter prediction; inferential statistics; high-dimensional data

Share and Cite

MDPI and ACS Style

Jung, J.; Lee, Y.; Jung, A.K.; Shin, S.; Lee, W.-C. Evaluating Inferential Statistics Filtering in High-Dimensional Item Feature Spaces for Predicting IRT Parameters. Mathematics 2026, 14, 1662. https://doi.org/10.3390/math14101662

