3.1. Student Questionnaire
The survey findings offer an overview of students’ experiences, perceptions, and perceived challenges associated with peer assessment practices. Descriptive statistical analyses were conducted on participants’ responses across all survey items. Specifically, mean scores were calculated to determine the central tendency of responses, while standard deviations were used to assess the degree of variability among participants. Mean values approaching 5 indicate stronger agreement and more positive perceptions of peer assessment, whereas values approaching 1 reflect stronger disagreement and less favorable attitudes. A total of 60 students participated in this phase of the study. In addition, Pearson correlation analysis was used to examine the strength and direction of relationships among selected perception variables related to feedback literacy, confidence, trust, and perceived learning benefits (Table 4).
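As an illustration of the descriptive procedure described above, the following minimal Python sketch computes the mean and standard deviation of a single 5-point Likert item. The function name `describe_item` and the response values are hypothetical and are not taken from the study data; the use of the sample standard deviation (n − 1 denominator) is also an assumption, since the text does not state which formula was applied.

```python
import statistics


def describe_item(responses):
    """Return (mean, sample SD) for one Likert item, rounded to 2 decimals."""
    for value in responses:
        if not 1 <= value <= 5:
            raise ValueError(f"response {value} is outside the 1-5 scale")
    mean = statistics.mean(responses)
    # Assumption: sample SD (n - 1 denominator); the paper does not specify.
    sd = statistics.stdev(responses)
    return round(mean, 2), round(sd, 2)


# Hypothetical responses from ten students to one survey item.
hypothetical_item = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4]
item_mean, item_sd = describe_item(hypothetical_item)
print(f"M = {item_mean:.2f}, SD = {item_sd:.2f}")
```

A mean closer to 5 would be read as stronger agreement, and a smaller SD as greater consistency among respondents, in line with the interpretation rule stated above.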
The results presented in Table 1 indicate that students generally had positive experiences with peer assessment activities. Students reported a moderate frequency of participation in peer assessment activities (Item 1: M = 2.75, SD = 0.78). The findings further show that students perceived the assessment criteria and rubrics as relatively clear (M = 3.90, SD = 0.78) and found the peer assessment rubric reasonably easy to use (M = 3.48, SD = 0.73). Item 4, which asked whether students had received formal guidance on how to provide feedback, is a dichotomous variable (Yes/No) and is therefore reported using frequencies and percentages; not all students had received such guidance. Among those who had, the usefulness of the training was rated moderately positively (M = 3.19, SD = 1.01).
Overall, the relatively low standard deviation values suggest moderate consistency in students’ responses, indicating that participants shared similar views regarding their experiences with peer assessment.
The findings in Table 2 indicate that students generally held positive perceptions of peer assessment. The highest mean score was reported for the statement that peer assessment encouraged students to reflect on their own work (M = 3.87, SD = 0.89), suggesting that reflective learning was one of the most valued outcomes of the process. Students also agreed that receiving feedback from classmates improved their work (M = 3.83, SD = 0.83) and that peer assessment enhanced collaboration among students (M = 3.83, SD = 0.81). Similarly, students perceived peer assessment as beneficial for improving critical thinking skills (M = 3.67, SD = 0.88) and increasing confidence in evaluating academic work (M = 3.71, SD = 0.95).
Although students generally viewed peer assessment positively, trust in the scores assigned by classmates received the lowest mean score (M = 3.34, SD = 0.84), indicating some reservations about the fairness or accuracy of peer evaluations. Nevertheless, the relatively high mean score for including peer assessment in future courses (M = 3.69, SD = 0.88) suggests that students largely supported its continued use in higher education settings.
Table 3 summarizes students’ responses regarding the challenges of peer assessment. Students reported moderate levels of nervousness when evaluating classmates (M = 2.90, SD = 1.11), indicating that some felt uncomfortable giving feedback to peers. Language differences were not viewed as a major barrier (M = 2.90, SD = 0.96), suggesting that communication issues were relatively limited in the peer assessment process. Students moderately agreed that some peers provided limited or unclear feedback (M = 3.27, SD = 0.98), which indicates a need for clearer guidance and stronger scaffolding on how to give constructive comments. There was also some preference for teacher assessment over peer assessment (M = 3.36, SD = 0.89), implying that students may still perceive teacher feedback as more reliable. Students also believed that some classmates were too generous in their scoring (M = 3.32, SD = 0.78), while fewer believed that classmates were too strict (M = 2.81, SD = 0.73). This suggests that leniency may be a more common issue than harshness in peer assessment practices.
The survey results indicate that students generally held positive attitudes toward peer assessment and recognized its value in supporting learning. Participants reported that peer assessment helped them better understand assessment criteria, reflect on their own work, improve collaboration, and develop critical thinking skills. Students also perceived peer feedback as beneficial for improving the quality of their work and expressed support for the inclusion of peer assessment in future courses. At the same time, the findings reveal several challenges that may limit the effectiveness of peer assessment. Some students experienced nervousness when evaluating their classmates, questioned the fairness and reliability of peer scores, and continued to prefer teacher assessment. Concerns were also raised regarding unclear feedback and overly generous scoring by some peers. Overall, the findings suggest that peer assessment can be an effective pedagogical practice when supported by clear rubrics, training, and structured guidance. Strengthening students’ feedback skills and increasing transparency in assessment procedures may further enhance trust in the process and improve the overall effectiveness of peer assessment activities.
In addition to descriptive statistics, a Pearson correlation analysis was conducted to explore relationships among key perceived dimensions of peer assessment rather than to test causality or a structural model.
Table 4 presents the Pearson correlation coefficients among the key dimensions of students’ peer assessment experiences. The results indicate generally positive associations between all variables, although the strength of these relationships varies. Strong correlations are observed between understanding assessment criteria, critical thinking improvement, and self-reflection (r = 0.75–0.79), suggesting that these aspects of feedback literacy tend to co-occur in students’ perceptions. Moderate correlations are also evident between these core evaluative dimensions and confidence in evaluating academic work, indicating that students who report better understanding and reflective engagement also tend to feel more confident in assessment tasks.
In contrast, collaboration, trust in classmates’ scores, and support for future inclusion of peer assessment show weaker but still positive relationships with the other constructs (r = 0.21–0.58). These findings suggest that while affective and social dimensions such as collaboration and trust are related to cognitive aspects of peer assessment, they are more loosely connected.
To conclude, the correlations should be interpreted as associations among perceived constructs rather than evidence of causality. They provide an exploratory overview of how different dimensions of peer assessment experience relate to one another within the present sample.
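The correlation analysis described above can be sketched as follows. This is a minimal, illustrative implementation of the Pearson coefficient for every pair of perception variables; the variable names and response values are hypothetical and do not reproduce the study data, and the helper names `pearson_r` and `correlation_matrix` are introduced here for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(items):
    """Return {(name_a, name_b): r} for every unordered pair of variables."""
    names = list(items)
    return {
        (a, b): pearson_r(items[a], items[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical Likert responses from six students on three dimensions.
hypothetical_items = {
    "criteria_understanding": [4, 5, 3, 4, 2, 5],
    "self_reflection": [4, 5, 3, 5, 2, 4],
    "trust_in_peer_scores": [3, 4, 3, 3, 3, 4],
}
matrix = correlation_matrix(hypothetical_items)
for pair, r in matrix.items():
    print(pair, round(r, 2))
```

As in the text, each coefficient describes only the strength and direction of a linear association between two self-reported variables; it carries no information about causal direction.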
3.2. Peer Assessment Rubric Results
The second instrument used in this study was a peer assessment form, completed by both the teacher and the students. A total of 22 forms were used with students at SEE University. This instrument helped to better understand the effectiveness of peer assessment. Comparing students’ evaluations with instructor evaluations helped determine whether students were able to assess their classmates fairly and accurately. In this study, both peer assessment and teacher-assessment rubrics were used to evaluate similar criteria, including content quality, organization, language use, critical thinking, participation, and task completion. By comparing the two sets of scores, it became possible to identify patterns, similarities, and differences in how students and instructors perceived academic performance.
Students assessed their classmates using the following criteria on a 1–5 scale:
Table 5 presents the student assessment rubric used in the study. The rubric includes five criteria: content quality, organization, language accuracy, critical thinking, and participation. Each criterion was rated on a scale from 1 to 5. This rubric provided students with a clear framework for evaluating their classmates’ work fairly and consistently.
Table 6 presents the instructor evaluation of students’ work using the following criteria on a 1–5 scale.
The teacher assessment rubric (Table 6) rates student performance across five key criteria, each on a 1–5 scale: content knowledge, task completion, language use, presentation/organization, and engagement. Overall, the rubric provides a balanced evaluation of students’ understanding, skills, and active participation in learning tasks.
3.3. Results for 22 Students
The comparison of peer and teacher assessment in Table 7 showed that the scores were generally close across most students. In many cases, the difference between peer and teacher scores ranged from only 0.10 to 0.40 points, indicating that students were relatively objective when evaluating their classmates. Teacher-assessment scores were slightly higher overall, particularly in the areas of content knowledge, task completion, and language accuracy. This may suggest that instructors recognized strengths that peers did not always notice. On the other hand, peer assessment scores were sometimes lower because students may have been more cautious or uncertain when assigning marks to classmates.
The largest differences appeared in criteria such as critical thinking and organization. Teachers often awarded higher scores in these areas because they may have had a clearer understanding of the expected academic standards. Students, however, may have focused more on visible aspects such as participation and presentation style. In contrast, the smallest differences were found in participation and engagement. Both peers and instructors tended to agree on which students actively contributed to discussions, group work, and classroom activities.
Overall, the findings suggest that peer assessment can provide results similar to teacher assessment, especially when students are given clear rubrics and guidance. Although teacher scores remain slightly higher and more consistent, peer assessment appears to be a reliable method that can complement instructor evaluation and encourage greater student reflection and responsibility.
Paired t-Test Results
To examine whether peer assessment and instructor-assessment scores differed for the same 22 students, a paired-samples t-test was conducted on the mean scores assigned by peers and by the instructor. Prior to analysis, the normality of the difference scores was assessed using the Shapiro–Wilk test and inspection of Q–Q plots. The distribution of difference scores did not deviate significantly from normality, W(22) = 0.97, p = 0.58, supporting the use of a parametric test. The analysis revealed a statistically significant difference between peer assessment scores (M = 3.78, SD = 0.36) and instructor-assessment scores (M = 4.32, SD = 0.42), t(21) = −4.87, p < 0.001. The mean difference was −0.54 points (95% CI [−0.77, −0.31]), indicating that instructors consistently awarded higher scores than peer assessors. According to Cohen’s (1988) guidelines, the observed effect size (d = 1.04) indicates a large practical difference between peer and instructor assessment in this sample (Figure 2).
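The reported statistics can be cross-checked from the summary values alone. The sketch below is illustrative only: it assumes that the effect size was computed as the mean difference divided by the standard deviation of the difference scores (often written d_z, equal to |t| / √n), which the text does not state explicitly, and it hard-codes the two-tailed 95% critical value of t for 21 degrees of freedom (approximately 2.080). The function name `paired_summary` is hypothetical.

```python
import math


def paired_summary(mean_diff, t_stat, n, t_crit):
    """Recover SE, SD of differences, d_z, and the CI from reported values."""
    se = abs(mean_diff / t_stat)         # standard error of the mean difference
    sd_diff = se * math.sqrt(n)          # SD of the paired difference scores
    d_z = abs(mean_diff) / sd_diff       # assumption: d = mean_diff / sd_diff
    ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
    return {"se": se, "sd_diff": sd_diff, "d_z": d_z, "ci": ci}


# Values as reported in the text: mean difference, t(21), and n = 22.
# t_crit = 2.080 is the two-tailed 95% critical value for df = 21.
result = paired_summary(mean_diff=-0.54, t_stat=-4.87, n=22, t_crit=2.080)
print(f"d_z = {result['d_z']:.2f}")
print(f"95% CI = [{result['ci'][0]:.2f}, {result['ci'][1]:.2f}]")
```

Under these assumptions, the recomputed effect size and confidence interval agree with the reported d = 1.04 and 95% CI [−0.77, −0.31], which suggests that the reported t statistic, mean difference, and effect size are internally consistent.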
These findings indicate a systematic difference in scoring patterns between peer and instructor assessment. In practical terms, peer assessors tended to assign lower scores than instructors, suggesting variation in how assessment criteria were interpreted and applied. This pattern may reflect differences in evaluative experience and familiarity with assessment standards, as instructor judgments typically represent more established calibration with disciplinary expectations.
Rather than indicating a deficiency in peer assessment, this discrepancy may be understood as part of the developmental nature of students’ assessment practices, in which evaluative judgment becomes more refined through experience and exposure to feedback. Peer assessment in this context appears to reflect an emerging stage of assessment literacy, in which students are still developing consistency in applying criteria in a manner aligned with instructor benchmarks.
Although the present study is situated within an EMI context, it does not directly examine the cognitive or linguistic processes underlying assessment decisions. Therefore, no causal claims can be made regarding the role of language in shaping assessment judgments. Future research using qualitative or mixed-method designs would be needed to explore how language proficiency and cognitive processing may interact with assessment practices in greater depth.
3.4. Reflective Interviews Results
After completing the peer assessment activity, 37 students responded individually to a set of reflective interview questions. The questions encouraged them to describe their experiences of giving and receiving feedback, explain what they learned from reviewing peers’ work, discuss any challenges they faced, and reflect on the impact of peer assessment on their confidence, communication, and academic performance. The qualitative data were analyzed using thematic analysis (Braun & Clarke, 2006). Responses were read several times to achieve familiarity with the data, after which initial codes were generated and grouped into broader categories. These categories were reviewed and refined to identify recurring themes across participants’ reflections. To enhance the credibility and consistency of the findings, the coding and thematic interpretations were discussed and reviewed by the research team until agreement was reached.
The thematic frequency analysis of reflective interviews (n = 37) presented in Table 8 reveals a highly consistent pattern of student engagement with peer assessment across cognitive, affective, linguistic, and contextual dimensions. The distribution of responses indicates that peer assessment was not perceived as a single-dimensional activity but rather as a multidimensional learning experience integrating emotional regulation, evaluative judgment, and language-mediated interaction.
The most frequently reported theme was Learning through Peer Evaluation (91.9%), suggesting that students overwhelmingly perceived peer assessment as a cognitively productive activity. This high frequency indicates that the primary value of peer assessment lies in its capacity to foster analytical awareness, particularly in recognizing writing structures, identifying errors, and developing evaluative judgment. This finding supports the view that peer assessment functions as a form of learning-by-assessing, consistent with feedback literacy frameworks (Boud & Molloy, 2013; Carless, 2015), where learners develop understanding through active engagement with assessment criteria. One participant explained:
“When I assessed my classmates, I started noticing mistakes and good ideas that I had not paid attention to before. It also made me think more carefully about my own work.”
Another student stated:
“Giving feedback helped me understand what makes an assignment strong and what I should improve in future tasks.”
Closely related to this, Perceived Impact on Learning (89.2%) further confirms that students experienced peer assessment as a meaningful contributor to skill development, particularly in confidence, communication, and critical thinking. The proximity of these two high-frequency themes suggests a strong alignment between cognitive engagement and perceived academic growth, indicating that evaluative participation translates into self-reported learning benefits.
Students commonly associated peer assessment with increased confidence and greater awareness of learning processes.
“After several peer assessment activities, I felt more confident in judging academic work and understanding what teachers expect.”
Another participant reflected:
“Peer assessment helped me become more reflective and think about how I can improve my own performance.”
A third prominent theme, Perceived Value of Feedback Received (83.8%), highlights the importance of feedback quality in shaping student learning experiences. Students emphasized the usefulness of specific, structured, and actionable feedback, indicating that the pedagogical effectiveness of peer assessment is strongly dependent on the clarity and depth of peer comments. This finding reinforces prior research suggesting that feedback effectiveness is closely linked to its specificity and usability rather than its source alone. One student stated:
“Some comments from my classmates helped me see problems in my presentation that I did not notice myself.”
Another participant noted:
“I liked receiving different opinions because they gave me ideas about how I could improve my work.”
The Multilingual Classroom Influence (81.1%) theme underscores the contextual role of linguistic diversity in shaping peer assessment experiences. Students generally viewed multilingual environments as enriching, particularly in terms of exposure to different perspectives and communication styles. One student commented that:
“Students from different language backgrounds sometimes noticed things that I would not have considered, which made the feedback more interesting and useful.”
Similarly, another student commented:
“Working with classmates who speak different languages helped me see different perspectives on the same task.”
However, this positive perception coexisted with reported Language-Related Challenges (73.0%), indicating that linguistic limitations and difficulties in expressing feedback in academic English remain a persistent constraint. This dual pattern reflects a key characteristic of EMI environments: multilingualism operates simultaneously as a resource for meaning-making and as a source of communicative tension. One student stated that:
“Sometimes I knew what I wanted to say, but it was difficult to explain my ideas clearly in English.”
Another participant observed:
“I was worried that my feedback would not be understood because of my language mistakes.”
Finally, the Emotional Experience of Giving Feedback (78.4%) reveals that peer assessment is also an affectively charged activity. Students frequently reported nervousness, hesitation, and concern about politeness when delivering feedback. This emotional dimension highlights the social sensitivity of peer evaluation, where interpersonal relationships and cultural norms influence academic judgment. The presence of this theme indicates that evaluative learning is not purely cognitive but deeply embedded in affective and relational dynamics.
Overall, the distribution of thematic frequencies suggests a coherent structure in which cognitive benefits (learning and reflection) are most dominant, while emotional and linguistic challenges remain secondary but still substantial. Importantly, the co-occurrence of high-frequency cognitive and contextual themes indicates that peer assessment effectiveness in multilingual EMI settings is shaped by an interplay between learning engagement, feedback quality, linguistic mediation, and emotional regulation.
From a theoretical perspective, these findings are consistent with sociocultural and feedback literacy frameworks, suggesting that peer assessment functions as a socially mediated learning practice rather than a purely evaluative mechanism. In multilingual EMI classrooms, this process is further shaped by linguistic diversity, which both enables richer interaction and introduces communicative complexity.
In sum, the frequency distribution demonstrates that students primarily perceive peer assessment as a learning-enhancing activity, but its effectiveness is contingent upon linguistic accessibility, emotional comfort, and the clarity of peer-generated feedback.
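For readers who wish to relate the reported theme percentages to participant counts, the following sketch recovers the number of students implied by each percentage. It assumes that every percentage was computed as the number of participants mentioning the theme divided by n = 37 and rounded to one decimal place; the resulting counts are inferred from the percentages reported above rather than taken from Table 8, and the function name `implied_count` is hypothetical.

```python
def implied_count(percent, n):
    """Return the unique count k whose share of n matches `percent` to one
    decimal place, or None if no single count matches."""
    matches = [k for k in range(n + 1) if abs(100 * k / n - percent) < 0.05]
    return matches[0] if len(matches) == 1 else None


# Theme percentages as reported in the text (n = 37 interviewees).
reported_percentages = {
    "Learning through Peer Evaluation": 91.9,
    "Perceived Impact on Learning": 89.2,
    "Perceived Value of Feedback Received": 83.8,
    "Multilingual Classroom Influence": 81.1,
    "Emotional Experience of Giving Feedback": 78.4,
    "Language-Related Challenges": 73.0,
}

# Inferred counts, under the assumption stated above.
implied_counts = {
    theme: implied_count(pct, 37) for theme, pct in reported_percentages.items()
}
for theme, count in implied_counts.items():
    print(f"{theme}: {count} of 37")
```

Each reported percentage corresponds to exactly one whole number of participants out of 37, so the frequencies are arithmetically consistent with the stated sample size.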