To support novice learners, the
Java programming learning assistant system (JPLAS) has been developed with various features. Among them, the
code writing problem (CWP) asks learners to write an answer code that passes a given
test code. The correctness of an answer code is validated by running it on
JUnit. In previous works, we implemented a
code plagiarism checking function that calculates the
similarity score for each pair of answer codes based on the
Levenshtein distance. When the score is higher than a given threshold, the pair is regarded as
plagiarism. However, a method for finding a proper threshold has not been studied. In addition, as generative AI has grown in popularity,
AI-generated codes have become a
plagiarism threat that should be investigated. In this paper, we propose a
threshold selection method based on Tukey’s
IQR fences. It uses a
custom upper threshold derived from the statistical distribution of
similarity scores for each assignment. To better accommodate skewed similarity distributions, the method introduces a simple
percentile-based adjustment for determining the upper threshold. We also design prompts to generate answer codes using
generative AI and apply them to four AI models. For evaluation, we used a total of 745 source codes from two datasets. The first dataset consists of 420 answer codes across 12 CWP instances from 35 first-year undergraduate students at the State Polytechnic of Malang, Indonesia (POLINEMA). The second dataset includes 325 answer codes across five CWP assignments from 65 third-year undergraduate students at Okayama University, Japan. Applying our proposals showed the following: (1) any pair of student codes whose score is higher than the selected threshold shows some evidence of
plagiarism, (2) some student codes have a higher similarity than the threshold with
AI-generated codes, indicating the use of
generative AI, and (3) multiple
AI models can generate code that resembles student-written code, despite adopting different implementations. These results confirm the validity of our proposal.
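
The pipeline summarized above (pairwise Levenshtein similarity, then an upper threshold from Tukey's IQR fences) can be illustrated with a minimal Python sketch. Several details are assumptions, since the abstract does not state them: the similarity score is taken here as one minus the distance divided by the longer code's length, the fence uses Tukey's conventional multiplier k = 1.5, and the function names (`levenshtein`, `similarity`, `upper_fence`, `flag_pairs`) are hypothetical. The percentile-based adjustment for skewed distributions is not reproduced, because its definition is not given here.

```python
from itertools import combinations
from statistics import quantiles


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer code."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def upper_fence(scores: list[float], k: float = 1.5) -> float:
    """Tukey's upper fence Q3 + k * IQR over one assignment's scores."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def flag_pairs(codes: dict[str, str], k: float = 1.5):
    """Score every pair of answer codes and return the threshold and flagged pairs."""
    scored = {
        (x, y): similarity(codes[x], codes[y])
        for x, y in combinations(sorted(codes), 2)
    }
    threshold = upper_fence(list(scored.values()), k)
    flagged = [pair for pair, score in scored.items() if score > threshold]
    return threshold, flagged
```

Because the fence is computed from each assignment's own score distribution, the threshold adapts per assignment rather than being fixed globally. Note that `quantiles` needs at least two scores, so `flag_pairs` expects three or more answer codes.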