Information-Weighted and Normal Density-Weighted Haebara Linking
Abstract
1. Introduction
2. Haebara Linking
2.1. Haebara Linking Weighted by Normal Density
2.2. Haebara Linking Weighted by Item Information
2.3. Standard Error Estimation
3. Research Purpose
4. Simulation Study
4.1. Method
4.2. Results
4.2.1. Bias
4.2.2. RMSE
4.2.3. Coverage Rates
4.2.4. Summary
5. Discussion
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
2PL | Two-parameter logistic |
3PL | Three-parameter logistic |
DIF | Differential item functioning |
IIF | Item information function |
IRF | Item response function |
IRT | Item response theory |
MML | Marginal maximum likelihood |
RMSE | Root mean square error |
SD | Standard deviation |
| Par | I | N | HA | IHA | HA2 | IHA2 | HA1 | IHA1 | HA0.5 | IHA0.5 |
|---|---|---|---|---|---|---|---|---|---|---|
|  | 20 | 500 | 0.008 | −0.004 | 0.003 | −0.005 | 0.000 | −0.005 | −0.002 | −0.005 |
|  | 20 | 1000 | 0.003 | −0.004 | 0.000 | −0.004 | −0.002 | −0.004 | −0.003 | −0.004 |
|  | 20 | 2000 | 0.001 | −0.002 | 0.000 | −0.002 | 0.000 | −0.002 | −0.001 | −0.001 |
|  | 20 | 5000 | 0.001 | −0.001 | 0.000 | −0.001 | 0.000 | −0.001 | −0.001 | −0.001 |
|  | 40 | 500 | 0.008 | −0.001 | 0.005 | −0.002 | 0.003 | −0.001 | 0.002 | −0.001 |
|  | 40 | 1000 | 0.003 | −0.003 | 0.000 | −0.004 | −0.001 | −0.004 | −0.002 | −0.004 |
|  | 40 | 2000 | −0.001 | −0.002 | −0.001 | −0.002 | −0.001 | −0.001 | −0.001 | −0.001 |
|  | 40 | 5000 | 0.000 | −0.002 | −0.001 | −0.002 | −0.001 | −0.002 | −0.002 | −0.002 |
|  | 20 | 500 | 0.025 | −0.011 | 0.006 | −0.020 | −0.006 | −0.032 | −0.014 | −0.061 |
|  | 20 | 1000 | 0.013 | −0.006 | 0.003 | −0.010 | −0.004 | −0.017 | −0.008 | −0.032 |
|  | 20 | 2000 | 0.006 | −0.003 | 0.002 | −0.005 | −0.002 | −0.008 | −0.003 | −0.016 |
|  | 20 | 5000 | 0.002 | −0.001 | 0.001 | −0.002 | −0.001 | −0.003 | −0.002 | −0.006 |
|  | 40 | 500 | 0.019 | −0.012 | 0.004 | −0.019 | −0.007 | −0.031 | −0.013 | −0.060 |
|  | 40 | 1000 | 0.013 | −0.004 | 0.005 | −0.008 | −0.001 | −0.014 | −0.004 | −0.029 |
|  | 40 | 2000 | 0.005 | −0.003 | 0.001 | −0.005 | −0.002 | −0.008 | −0.003 | −0.016 |
|  | 40 | 5000 | 0.003 | 0.000 | 0.001 | −0.001 | 0.000 | −0.003 | −0.001 | −0.006 |

| Par | I | N | HA | IHA | HA2 | IHA2 | HA1 | IHA1 | HA0.5 | IHA0.5 |
|---|---|---|---|---|---|---|---|---|---|---|
|  | 20 | 500 | 100 | 98.0 | 93.6 | 95.5 | 89.7 | 92.3 | 89.0 | 91.5 |
|  | 20 | 1000 | 100 | 98.9 | 94.4 | 96.4 | 91.0 | 93.2 | 90.6 | 92.5 |
|  | 20 | 2000 | 100 | 100.2 | 95.1 | 97.7 | 91.9 | 94.4 | 91.7 | 93.6 |
|  | 20 | 5000 | 100 | 100.3 | 94.6 | 97.6 | 91.0 | 93.9 | 90.2 | 92.5 |
|  | 40 | 500 | 100 | 97.1 | 96.3 | 95.8 | 94.3 | 94.4 | 94.1 | 94.6 |
|  | 40 | 1000 | 100 | 98.2 | 96.4 | 96.8 | 94.3 | 95.1 | 94.1 | 94.8 |
|  | 40 | 2000 | 100 | 100.0 | 97.6 | 98.8 | 96.3 | 97.5 | 96.5 | 97.6 |
|  | 40 | 5000 | 100 | 99.1 | 97.2 | 97.8 | 95.6 | 96.5 | 95.7 | 96.7 |
|  | 20 | 500 | 100 | 88.6 | 87.3 | 89.9 | 83.5 | 93.9 | 86.2 | 115.1 |
|  | 20 | 1000 | 100 | 92.4 | 90.2 | 93.0 | 86.8 | 94.9 | 89.0 | 109.0 |
|  | 20 | 2000 | 100 | 96.4 | 92.7 | 96.9 | 90.1 | 97.9 | 92.5 | 107.9 |
|  | 20 | 5000 | 100 | 95.3 | 93.1 | 95.3 | 90.1 | 95.4 | 92.5 | 102.9 |
|  | 40 | 500 | 100 | 90.8 | 90.4 | 92.5 | 87.7 | 97.6 | 89.5 | 122.3 |
|  | 40 | 1000 | 100 | 92.1 | 91.9 | 92.8 | 89.0 | 95.1 | 90.2 | 109.8 |
|  | 40 | 2000 | 100 | 97.1 | 95.4 | 97.9 | 94.0 | 99.7 | 95.8 | 110.6 |
|  | 40 | 5000 | 100 | 96.3 | 95.0 | 96.2 | 92.8 | 96.5 | 93.8 | 102.3 |

| Par | I | N | HA | IHA | HA2 | IHA2 | HA1 | IHA1 | HA0.5 | IHA0.5 |
|---|---|---|---|---|---|---|---|---|---|---|
|  | 20 | 500 | 95.6 | 95.4 | 95.3 | 95.5 | 95.5 | 95.6 | 95.6 | 95.5 |
|  | 20 | 1000 | 94.4 | 94.8 | 94.5 | 94.8 | 94.8 | 95.1 | 94.9 | 95.1 |
|  | 20 | 2000 | 95.3 | 95.2 | 95.2 | 95.2 | 95.1 | 95.1 | 95.1 | 95.1 |
|  | 20 | 5000 | 95.3 | 95.6 | 95.6 | 95.5 | 95.7 | 95.6 | 95.7 | 95.5 |
|  | 40 | 500 | 94.8 | 95.2 | 94.8 | 95.1 | 94.8 | 95.0 | 94.7 | 95.0 |
|  | 40 | 1000 | 95.0 | 95.3 | 95.1 | 95.3 | 95.3 | 95.6 | 95.5 | 95.7 |
|  | 40 | 2000 | 95.4 | 95.3 | 95.3 | 95.3 | 95.2 | 95.2 | 95.1 | 95.1 |
|  | 40 | 5000 | 95.2 | 95.5 | 95.3 | 95.5 | 95.3 | 95.4 | 95.3 | 95.1 |
|  | 20 | 500 | 94.9 | 95.1 | 94.8 | 95.2 | 95.0 | 95.3 | 95.2 | 95.7 |
|  | 20 | 1000 | 95.0 | 95.0 | 94.9 | 95.2 | 95.0 | 95.3 | 95.0 | 95.6 |
|  | 20 | 2000 | 95.2 | 94.9 | 95.3 | 95.0 | 95.3 | 95.3 | 95.2 | 95.5 |
|  | 20 | 5000 | 94.8 | 94.9 | 95.1 | 95.3 | 95.0 | 95.2 | 95.0 | 95.3 |
|  | 40 | 500 | 94.2 | 94.7 | 94.4 | 94.9 | 94.6 | 95.2 | 95.1 | 95.9 |
|  | 40 | 1000 | 94.7 | 95.0 | 95.0 | 95.0 | 95.1 | 95.1 | 95.0 | 95.5 |
|  | 40 | 2000 | 94.9 | 94.8 | 94.7 | 94.8 | 94.8 | 94.9 | 94.9 | 95.1 |
|  | 40 | 5000 | 94.7 | 94.8 | 94.7 | 94.8 | 94.9 | 94.7 | 94.8 | 95.0 |
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Robitzsch, A. Information-Weighted and Normal Density-Weighted Haebara Linking. Information 2025, 16, 273. https://doi.org/10.3390/info16040273