Factors Associated with the Equivalence of the Scores of Computer-Based Test and Paper-and-Pencil Test: Presentation Type, Item Difficulty and Administration Order
Abstract
1. Introduction
- What are the effects of test format (CBT and PPT), computerised presentation type, difficulty of item group, and administration order of item groups of different difficulty levels on students’ answering performance in CBT and PPT?
2. Methodology
2.1. Participants
2.2. Instruments
2.2.1. Achievement Test
2.2.2. Computer-Based Test Environment
2.3. Research Design
2.4. Data Collection and Analysis
3. Results
3.1. Test Item Analysis
3.2. Analysis of Answering Performance of Simple-Item Group and Difficult-Item Group by Test Format and Administration Order of Item Groups of Different Difficulty Levels
3.2.1. Simple-Item Group
3.2.2. Difficult-Item Group
3.3. Analysis of Answering Performance of the Achievement Test by Difficulty of Item Group and Computerised Presentation Type
4. Concluding Remarks
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
| Treatment | First Part of the Achievement Test | Second Part of the Achievement Test |
|---|---|---|
| 1 | Simple-item group in PPT | Difficult-item group in CS |
| 2 | Simple-item group in PPT | Difficult-item group in CM |
| 3 | Difficult-item group in PPT | Simple-item group in CS |
| 4 | Difficult-item group in PPT | Simple-item group in CM |
| 5 | Simple-item group in CS | Difficult-item group in PPT |
| 6 | Simple-item group in CM | Difficult-item group in PPT |
| 7 | Difficult-item group in CS | Simple-item group in PPT |
| 8 | Difficult-item group in CM | Simple-item group in PPT |
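The eight treatments form a fully crossed counterbalancing scheme: each participant takes one item group on paper (PPT) and the other on computer (CS or CM), with the PPT part either first or second. A minimal sketch (hypothetical helper, not the authors' code) that enumerates the same eight first/second-part combinations:

```python
from itertools import product

def treatments():
    """Enumerate the 8 counterbalanced treatments of Appendix A:
    2 (PPT first/second) x 2 (PPT gets simple/difficult group) x 2 (CS/CM)."""
    rows = []
    for ppt_first, ppt_group, comp in product((True, False),
                                              ("Simple-item", "Difficult-item"),
                                              ("CS", "CM")):
        other = "Difficult-item" if ppt_group == "Simple-item" else "Simple-item"
        paper = f"{ppt_group} group in PPT"
        computer = f"{other} group in {comp}"
        # The part not taken on paper is taken on computer, in the other slot.
        rows.append((paper, computer) if ppt_first else (computer, paper))
    return rows

print(len(treatments()))  # 8 distinct (first part, second part) pairs
```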
References
- Elsalem, L.; Al-Azzam, N.; Jum’ah, A.A.; Obeidat, N.; Sindiani, A.M.; Kheirallah, K.A. Stress and behavioral changes with remote E-exams during the Covid-19 pandemic: A cross-sectional study among undergraduates of medical sciences. Ann. Med. Surg. 2020, 60, 271–279.
- Gamage, K.A.; Silva, E.K.D.; Gunawardhana, N. Online delivery and assessment during COVID-19: Safeguarding academic integrity. Educ. Sci. 2020, 10, 301.
- Guangul, F.M.; Suhail, A.H.; Khalit, M.I.; Khidhir, B.A. Challenges of remote assessment in higher education in the context of COVID-19: A case study of Middle East College. Educ. Assess. Eval. Account. 2020, 32, 519–535.
- Parshall, C.G.; Spray, J.A.; Kalohn, J.C.; Davey, T. Practical Considerations in Computer-Based Testing; Springer: New York, NY, USA, 2002; Available online: https://link.springer.com/book/10.1007%2F978-1-4613-0083-0 (accessed on 18 July 2021).
- Wang, T.H. Developing a web-based assessment system for evaluating examinee’s understanding of the procedure of scientific experiments. Eurasia J. Math. Sci. Technol. Educ. 2018, 14, 1791–1801.
- Wang, T.H. Developing web-based assessment strategies for facilitating junior high school students to perform self-regulated learning in an e-learning environment. Comput. Educ. 2011, 57, 1801–1812.
- Wang, T.H.; Kao, C.H.; Dai, Y.L. Developing a web-based multimedia assessment system for facilitating science laboratory instruction. J. Comput. Assist. Learn. 2019, 35, 529–539.
- Zou, X.L.; Ou, L. EFL reading test on mobile versus on paper: A study from metacognitive strategy use to test-media impacts. Educ. Assess. Eval. Account. 2020, 32, 373–394.
- Association of Test Publishers. ATP Computer-Based Testing Guidelines. 2002. Available online: http://www.testpublishers.org (accessed on 18 July 2021).
- International Test Commission (ITC). International Guidelines on Computer-Based and Internet Delivered Testing. 2004. Available online: http://www.intestcom.org (accessed on 18 July 2021).
- American Educational Research Association; American Psychological Association; National Council on Measurement in Education. Standards for Educational and Psychological Testing; American Educational Research Association: Washington, DC, USA, 2014.
- Leeson, H.V. The mode effect: A literature review of human and technological issues in computerized testing. Int. J. Test. 2006, 6, 1–24.
- Wang, S.; Jiao, H.; Young, M.J.; Brooks, T.; Olson, J. A meta-analysis of testing mode effects in grade K-12 mathematics tests. Educ. Psychol. Meas. 2007, 67, 219–238.
- Dadey, N.; Lyons, S.; DePascale, C. The comparability of scores from different digital devices: A literature review and synthesis with recommendations for practice. Appl. Meas. Educ. 2018, 31, 30–50.
- Pommerich, M. Developing computerized versions of paper-and-pencil tests: Mode effects for passage-based tests. J. Tech. Learn. Assess. 2004, 2. Available online: https://ejournals.bc.edu/index.php/jtla/article/view/1666 (accessed on 18 July 2021).
- Pommerich, M. The effect of using item parameters calibrated from paper administrations in computer adaptive test administrations. J. Tech. Learn. Assess. 2007, 5. Available online: https://ejournals.bc.edu/index.php/jtla/article/view/1646 (accessed on 18 July 2021).
- Russell, M.; Plati, T. Does it matter with what I write? Comparing performance on paper, computer and portable writing devices. Curr. Issues Educ. 2002, 5. Available online: http://cie.ed.asu.edu/volume5/number4/ (accessed on 18 July 2021).
- Wang, S.; Young, M.J.; Brooks, T.E. Administration Mode Comparability Study for Stanford Diagnostic Reading and Mathematics Tests (Research Report); Harcourt Assessment: San Antonio, TX, USA, 2004.
- Kingston, N.M. Comparability of computer-and-paper-administered multiple-choice tests for K-12 populations: A synthesis. Appl. Meas. Educ. 2009, 22, 22–37.
- Hensley, K.K. Examining the Effects of Paper-Based and Computer-Based Modes of Assessment of Mathematics Curriculum-Based Measurement. Ph.D. Thesis, University of Iowa, Iowa City, IA, USA, 2015.
- Logan, T. The influence of test mode and visuospatial ability on mathematics assessment performance. Math. Educ. Res. J. 2015, 27, 423–441.
- Hosseini, M.; Abidin, M.J.Z.; Baghdarnia, M. Comparability of test results of computer based tests (CBT) and paper and pencil tests (PPT) among English language learners in Iran. Procedia Soc. Behav. Sci. 2014, 98, 659–667.
- Hamhuis, E.; Glas, C.; Meelissen, M. Tablet assessment in primary education: Are there performance differences between TIMSS’ paper-and-pencil test and tablet test among Dutch grade-four students? Br. J. Educ. Technol. 2020, 51, 2340–2358.
- Retnawati, H. The comparison of accuracy scores on the paper and pencil testing vs. computer-based testing. Turk. Online J. Educ. Technol. 2015, 14, 135–142.
- Khoshsima, H.; Hashemi Toroujeni, S.M.; Thompson, N.; Reza Ebrahimi, M. Computer-based (CBT) vs. paper-based (PBT) testing: Mode effect, relationship between computer familiarity, attitudes, aversion and mode preference with CBT test scores in an Asian private EFL context. Teach. Engl. Technol. 2019, 19, 86–101.
- Miller, M.D.; Linn, R.L.; Gronlund, N.E. Measurement and Assessment in Teaching, 11th ed.; Pearson: New York, NY, USA, 2012.
- Ollennu, S.N.N.; Etsey, Y.K.A. The impact of item position in multiple-choice test on student performance at the basic education certificate examination (BECE) level. Univers. J. Educ. Res. 2015, 3, 718–723.
- Nie, Y.; Lau, S.; Liau, A.K. Role of academic self-efficacy in moderating the relation between task importance and test anxiety. Learn. Individ. Differ. 2011, 21, 736–741.
- Camara, W. Never let a crisis go to waste: Large-scale assessment and the response to COVID-19. Educ. Meas. 2020, 39, 10–18.
- Nardi, A.; Ranieri, M. Comparing paper-based and electronic multiple-choice examinations with personal devices: Impact on students’ performance, self-efficacy and satisfaction. Br. J. Educ. Technol. 2019, 50, 1495–1506.
- Sweller, J.; Ayres, P.; Kalyuga, S. Measuring cognitive load. In Cognitive Load Theory; Springer: New York, NY, USA, 2011; pp. 71–85. Available online: https://link.springer.com/chapter/10.1007/978-1-4419-8126-4_6 (accessed on 18 July 2021).
- Sweller, J. Element interactivity and intrinsic, extraneous, and germane cognitive load. Educ. Psychol. Rev. 2010, 22, 123–138.
- Mayer, R.E. Using multimedia for e-Learning. J. Comput. Assist. Learn. 2017, 33, 403–423.
- Singh, A.M.; Marcus, N.; Ayres, P. The transient information effect: Investigating the impact of segmentation on spoken and written text. Appl. Cogn. Psychol. 2012, 26, 848–853.
- Raje, S.; Stitzel, S. Strategies for effective assessments while ensuring academic integrity in general chemistry courses during COVID-19. J. Chem. Educ. 2020, 97, 3436–3440.
| Category | Factor | Explanation | Significant Influence | Study |
|---|---|---|---|---|
| Presentation factors | Display | Screen size, font size and style, resolution of graphics and screen, multiscreen, graphical or complex displays, line length, number of lines, interline spacing, whitespace, scrolling | Yes | Wang et al. (2007); Leeson (2006); Dadey et al. (2018) |
| Presentation factors | Answering strategy | Reviewing and revising previous responses | Yes | Wang et al. (2007); Leeson (2006) |
| Content factors | Subjects | English language, arts, reading, social studies, mathematics | No | Kingston (2009) |
| Content factors | Reading comprehension | | No | Pommerich (2007); Hosseini et al. (2014) |
| Content factors | English proficiency | | No | Retnawati (2015); Khoshsima et al. (2019) |
| Content factors | Mathematics | | No | Wang et al. (2007); Hamhuis et al. (2020) |
| Content factors | Science | | No | Hamhuis et al. (2020) |
| Content factors | Mathematics | | Yes | Hensley (2015) |
| Content factors | Science reasoning | | Yes | Pommerich (2007) |
| Test Format | N (SI) | ACAR (SI) | SD (SI) | N (DI) | ACAR (DI) | SD (DI) | N (Total) | ACAR (Total) | SD (Total) |
|---|---|---|---|---|---|---|---|---|---|
| PPT | 8 | 0.726 | 0.163 | 8 | 0.291 | 0.130 | 16 | 0.509 | 0.266 |
| CS | 8 | 0.709 | 0.163 | 8 | 0.348 | 0.130 | 16 | 0.528 | 0.235 |
| PPT | 8 | 0.726 | 0.166 | 8 | 0.299 | 0.114 | 16 | 0.530 | 0.276 |
| CM | 8 | 0.734 | 0.135 | 8 | 0.353 | 0.139 | 16 | 0.543 | 0.237 |

SI = simple-item group; DI = difficult-item group.
| Test Format | N (First) | ACAR (First) | SD (First) | N (Second) | ACAR (Second) | SD (Second) | N (Total) | ACAR (Total) | SD (Total) |
|---|---|---|---|---|---|---|---|---|---|
| PPT | 48 | 0.711 | 0.197 | 49 | 0.742 | 0.237 | 97 | 0.727 | 0.218 |
| CS | 49 | 0.689 | 0.224 | 49 | 0.725 | 0.227 | 98 | 0.707 | 0.225 |
| PPT | 42 | 0.789 | 0.145 | 49 | 0.740 | 0.187 | 91 | 0.762 | 0.170 |
| CM | 50 | 0.750 | 0.219 | 49 | 0.717 | 0.185 | 99 | 0.734 | 0.203 |

First/Second = administration order of the simple-item group.
| Item Group | Comparison | Source | SS | df | MS | F | p | η² |
|---|---|---|---|---|---|---|---|---|
| SI | PPT & CS | Test format | 0.020 | 1 | 0.020 | 0.397 | 0.530 | 0.002 |
| SI | PPT & CS | Administration order | 0.055 | 1 | 0.055 | 1.116 | 0.292 | 0.006 |
| SI | PPT & CS | Test format × Administration order | 0.000 | 1 | 0.000 | 0.005 | 0.946 | 0.000 |
| SI | PPT & CS | Error | 9.396 | 191 | 0.049 | | | |
| SI | PPT & CM | Test format | 0.045 | 1 | 0.045 | 1.278 | 0.260 | 0.007 |
| SI | PPT & CM | Administration order | 0.080 | 1 | 0.080 | 2.265 | 0.134 | 0.012 |
| SI | PPT & CM | Test format × Administration order | 0.003 | 1 | 0.003 | 0.083 | 0.773 | 0.000 |
| SI | PPT & CM | Error | 6.534 | 186 | 0.035 | | | |
| DI | PPT & CS | Test format | 0.156 | 1 | 0.156 | 4.917 | 0.028 * | 0.025 |
| DI | PPT & CS | Administration order | 0.061 | 1 | 0.061 | 1.924 | 0.167 | 0.010 |
| DI | PPT & CS | Test format × Administration order | 0.181 | 1 | 0.181 | 5.705 | 0.018 * | 0.029 |
| DI | PPT & CS | Error | 6.043 | 191 | 0.032 | | | |
| DI | PPT & CM | Test format | 0.159 | 1 | 0.159 | 4.633 | 0.033 * | 0.024 |
| DI | PPT & CM | Administration order | 0.001 | 1 | 0.001 | 0.036 | 0.850 | 0.000 |
| DI | PPT & CM | Test format × Administration order | 0.000 | 1 | 0.000 | 0.004 | 0.950 | 0.000 |
| DI | PPT & CM | Error | 6.408 | 187 | 0.034 | | | |

All effects are between-subjects. SI = simple-item group; DI = difficult-item group.
| Test Format | N (First) | ACAR (First) | SD (First) | N (Second) | ACAR (Second) | SD (Second) | N (Total) | ACAR (Total) | SD (Total) |
|---|---|---|---|---|---|---|---|---|---|
| PPT | 48 | 0.339 | 0.177 | 49 | 0.242 | 0.145 | 97 | 0.290 | 0.168 |
| CS | 49 | 0.334 | 0.192 | 49 | 0.360 | 0.194 | 98 | 0.347 | 0.192 |
| PPT | 49 | 0.299 | 0.165 | 51 | 0.292 | 0.138 | 100 | 0.295 | 0.151 |
| CM | 49 | 0.355 | 0.214 | 42 | 0.351 | 0.218 | 91 | 0.353 | 0.215 |

First/Second = administration order of the difficult-item group.
| Difficulty of Item Group | N (CS) | ACAR (CS) | SD (CS) | N (CM) | ACAR (CM) | SD (CM) | N (Total) | ACAR (Total) | SD (Total) |
|---|---|---|---|---|---|---|---|---|---|
| SI | 98 | 0.707 | 0.225 | 91 | 0.739 | 0.203 | 189 | 0.722 | 0 |
| DI | 98 | 0.347 | 0.192 | 91 | 0.353 | 0.215 | 189 | 0.350 | 0.203 |

SI = simple-item group; DI = difficult-item group.
| Source | SS | df | MS | F | p | η² |
|---|---|---|---|---|---|---|
| Between: Computerised presentation type | 0.035 | 1 | 0.035 | 0.931 | 0.336 | 0.005 |
| Between: Error | 7.013 | 187 | 0.038 | | | |
| Within: Difficulty of item group | 13.118 | 1 | 13.118 | 263.023 | 0.000 ** | 0.584 |
| Within: Difficulty of item group × Computerised presentation type | 0.016 | 1 | 0.016 | 0.327 | 0.568 | 0.002 |
| Within: Error | 9.327 | 187 | 0.050 | | | |
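As a sanity check, the F and η² columns of the ANOVA tables can be recovered from the reported sums of squares. A brief sketch (not the authors' code), assuming the η² column reports partial eta squared, i.e. SS_effect / (SS_effect + SS_error):

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = MS_effect / MS_error, with MS = SS / df."""
    return (ss_effect / df_effect) / (ss_error / df_error)

def partial_eta_sq(ss_effect, ss_error):
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)

# Within-subjects main effect of item-group difficulty (values from the table):
F = f_ratio(13.118, 1, 9.327, 187)    # ~263.0, matching the reported 263.023
eta = partial_eta_sq(13.118, 9.327)   # ~0.584, matching the reported value
print(round(F, 1), round(eta, 3))
```

The small discrepancies between recomputed and reported F values elsewhere in the tables are consistent with rounding of SS to three decimals.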
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, T.-H.; Kao, C.-H.; Chen, H.-C. Factors Associated with the Equivalence of the Scores of Computer-Based Test and Paper-and-Pencil Test: Presentation Type, Item Difficulty and Administration Order. Sustainability 2021, 13, 9548. https://doi.org/10.3390/su13179548