Article

A Toulmin Model Analysis of Student Argumentation on Artificial Intelligence

by Mátyás Turós 1,*, Attila Zoltán Kenyeres 2, Georgina Balla 3, Emma Gazdag 3, Emília Szabó 3 and Zoltán Szűts 1

1 Department of Educational Innovation, Eszterhazy Karoly Catholic University, 3300 Eger, Hungary
2 Department of Andragogy and Community Education, Eszterhazy Karoly Catholic University, 3300 Eger, Hungary
3 Doctoral School of Education, Eszterhazy Karoly Catholic University, 3300 Eger, Hungary
* Author to whom correspondence should be addressed.
Educ. Sci. 2025, 15(9), 1226; https://doi.org/10.3390/educsci15091226
Submission received: 10 July 2025 / Revised: 9 September 2025 / Accepted: 15 September 2025 / Published: 16 September 2025

Abstract

This study examines the structure of student argumentation on artificial intelligence (AI) within the framework of the Toulmin model. We analyzed essays on AI written by 452 Hungarian secondary school students, coding for the presence of the six Toulmin components (claim, data, warrant, backing, qualifier, rebuttal). The results show that students frequently use fundamental argumentation components such as claim, data, and rebuttal. However, elements that provide deeper, more nuanced argumentation, such as backing and qualifiers, appear rarely. Using hierarchical cluster analysis, we identified three distinct argumentation profiles: Critical Arguers, who construct complex structures that also reflect on counterarguments; Minimal Arguers, who follow a simplified, primarily claim-based strategy; and Direct Rebutters, who employ a confrontational style of argumentation that omits the warrant but focuses on rebuttal. Based on our findings, we propose differentiated pedagogical strategies to foster the development of critical thinking in students with different argumentation styles.

1. Introduction

This study undertakes a Toulmin model-based analysis of student argumentation on artificial intelligence. The topic is timely, as the rapid development and proliferation of AI have brought about a radical transformation in all areas of life, including domains previously considered exclusively human. Although the technology has long been embedded in widely used applications, for most of society the presence of artificial intelligence became a conscious reality only in 2022, with the debut of ChatGPT (version 3.5). This has sparked a lively debate in both public and academic discourse about AI and its positive and negative consequences. When the phenomenon is narrowed to the dimension of education, it is a common assertion that education can be made more effective through interactive simulations and personalized learning (Cserkó et al., 2024; Kővári et al., 2024; Rajcsányi-Molnár et al., 2024a, 2024b, 2024c). It is also frequently claimed that students use the tool effectively to support their learning (Lindbäck et al., 2025; Zulkarnain & Mansor, 2024). At the same time, student use of AI raises serious ethical concerns (Montenegro et al., 2024), for instance, in determining the true author of school papers (Krašna & Gartner, 2024). The technology is therefore ubiquitous, affecting students’ lives both inside and outside of school. Consequently, it is particularly important for educators to know what students think about AI. This knowledge is necessary to develop students’ understanding of and critical attitude towards the technology, and also to adapt the technology itself to their needs. Analyzing student arguments and argumentation techniques provides insight into students’ knowledge, perceptions, and attitudes regarding AI. A knowledge gap exists in the analysis of student opinions through the lens of argumentative consistency, as claims made without backing and evidence are merely beliefs that can be revised through education. The Toulmin model is an effective framework for investigating such argumentation.
Given the rapid proliferation of AI, particularly generative models like chatbots, the central pedagogical challenge has shifted from merely integrating these tools to ensuring their responsible and critical use. As the Special Issue’s call highlights, improper use can lead to the uncritical acceptance of erroneous data, biases, and ethical issues. Therefore, training students in the appropriate use of AI is paramount. This training, however, cannot begin without a foundational understanding of students’ existing argumentative skills. Before educators can design effective AI-enhanced learning environments or develop curricula for AI literacy, they must first answer a fundamental question: How do students reason and argue without the aid of these tools when confronted with a complex, technology-related topic?
This study addresses this foundational gap. By examining the spontaneous argumentative structures students employ when writing about AI, we provide a crucial diagnostic baseline. Our research is not about the application of an AI tool but rather about understanding the cognitive prerequisites for its critical use. The identification of students’ inherent argumentation profiles serves as a necessary first step for developing the targeted didactic sequences and instructional scaffolding that future AI-based tutoring systems will require. In this sense, our work contributes to the application of AI in its “broadest sense” by providing the empirical groundwork needed to build effective, engaging, and equitable educational environments in the age of AI.

2. Theoretical Framework

2.1. The Concept and Significance of Argumentation

The importance of forming persuasive arguments was recognized as early as Aristotle, who distinguished three interconnected principles of persuasion: logos (logical reasoning), ethos (the speaker’s authority and credibility), and pathos (emotional impact and expressiveness). Persuasive argumentation is an integral part of everyday communication and thinking. It appears in decision-making, negotiations, and constructive civic dialogue alike. Argumentation promotes a deeper understanding of problems and develops the critical analytical skills necessary for effective decision-making (Guo et al., 2023). Argumentation is a fundamental tool of communication and thought that enables the examination of various topics from multiple perspectives (Darmawansah et al., 2025). Its central element is the claim, the acceptance of which is the goal (Wambsganss et al., 2025). The process of argumentation is complex, meaning it is composed of several elements: an individual formulates claims based on evidence and facts, evaluates the validity of the evidence (Ayalon & Hershkowitz, 2018), justifies the connection between the claim and the evidence, and critically analyzes the constructed arguments (Martin et al., 2024). Moreover, effective argumentation involves not only the well-founded development of one’s own arguments but also the knowledge, presentation, and addressing of counterarguments (Noroozi et al., 2023a; Qin et al., 2025).
Scientific argumentation—a specialized form of general argumentation—is also characterized by rational thinking within a specific scientific discipline. Its elements include a scientific claim (a statement or conclusion), scientific data (evidence) supporting the claim, and a warrant, which explains why the data serve as evidence for the claim. The purpose of scientific argumentation is to support or refute scientific claims with evidence and reasoning (Lin & Hung, 2025). The argumentation process involves constructing connections between scientific claims and evidence, their critical analysis (Haudek & Zhai, 2024), and developing scientific explanations for empirical evidence and phenomena. Other elements of scientific argumentation, such as posing questions, analyzing and interpreting data, developing and applying various models, and using evidence, also appear in classroom practice.
In school settings, argumentation performance can be developed through methods such as debates or AI-based learning approaches (Loyens et al., 2023). The significance of argumentation skills is evident in foreign language learning (Darmawansah et al., 2025; Nusivera et al., 2025), in chemistry and other science subjects (Martin et al., 2024), and also in academic writing (M. K. Kim et al., 2022). Students acquire the methods of constructing and critically evaluating argumentation by formulating claims, evidence, and reasoning (García-Carmona, 2025). However, a lack of critical thinking skills presents a challenge for teachers in developing analytical reasoning (Annamalai, 2025). The development of well-founded arguments can be hindered by several factors: insufficient knowledge of argumentation construction techniques and elements (Latifi et al., 2023), a lack of the cognitive and social skills necessary for effectively applying existing argumentation knowledge (Kerman et al., 2024), and a lack of domain-specific knowledge in a given field (Valero Haro et al., 2019).

2.2. Perceptions of Artificial Intelligence in Education

Based on our scoping review of research examining opinions and experiences related to artificial intelligence in teaching and learning, we have categorized the findings into two main groups. The first group analyzes the attitudes and perceptions of students—primarily those at the university level—regarding artificial intelligence.
The vast majority of research exploring student opinions about AI targets university students, with less focus on younger cohorts. In this area, our literature review identified three major themes: how AI is used, its perceived benefits, and the difficulties associated with its use, along with common misconceptions among students. A significant portion of the studies examining the use of AI in learning showed that students generally have a positive but critical attitude toward the technology. For example, in a classroom debate setting, the majority of students prefer learning with the help of AI over traditional educational methods (Zulkarnain & Mansor, 2024). Higher education students generally view AI as an inspiring tool, seeing it as helpful for mastering course material, developing problem-solving skills, and enabling personalized learning (Lindbäck et al., 2025). As for the intention to use ChatGPT, it is influenced by the technology’s perceived usefulness and credibility, as well as by hedonic motivation (Abdi et al., 2025; Hussain & Anwar, 2025). Students typically perceive the algorithms’ operation as fast and efficient (Aguiar et al., 2025), but they rate ChatGPT’s performance as only moderately positive in terms of argumentation, accuracy, evidence gathering, openness, and systematicity (Oliveira et al., 2025). They view the integration of the technology into classroom education positively (Maqbool et al., 2025) and find working with virtual learning agents to be humorous and authentic (Dai et al., 2024). Among the challenges cited, students deem a critical approach necessary when using the technology (Lindbäck et al., 2025) and worry that AI might create distance between them and their teachers (Montenegro et al., 2024).
Belghith et al. (2024) analyzed the free-form communication of 24 secondary school students with ChatGPT. The results indicated that students mainly discussed personal topics (school, study-related subjects, hobbies) and social issues (such as AI itself, current world events, and problems). Regarding misconceptions among students, the authors identified erroneous ideas in three categories: what AI is, what knowledge it possesses, and how it works. The findings revealed that common misconceptions included anthropomorphizing the technology, making flawed comparisons to other tools or entities, and holding false assumptions about its intelligence, capabilities, and modes of interaction.
Rong et al. (2024) tested a five-day theoretical and practical AI-based course designed to enhance children’s thinking skills among elementary school students (N = 36). According to the results, 75% of respondents said AI helped them think from new perspectives, enrich their opinions on various subjects, and think “outside the box”. 56% said it enhanced their creativity, encouraged them to try new techniques and methods, deepened their understanding of concepts, and enabled “live” learning. 53% felt that incorporating AI into the course had a novelty effect and aided comprehension. According to the authors, 27% of students felt that using AI departed from the serious atmosphere of traditional classroom lessons and that the technology tended to distract them from the curriculum (Rong et al., 2024).
Valeri et al. (2025) examined students’ (N = 453) experiences with ChatGPT in the context of STEM subjects. The results showed that 84% of students believe the school use of AI tools will become more common in the coming years, and 69% consider it important to discuss the technology’s use in school. 65.3% believe AI helps in understanding complex concepts in STEM subjects, and 64.7% would like to learn more about how to use AI to support their studies. However, only 20.4% thought AI could replace traditional teaching materials. The authors also pointed out that the effective use of AI requires certain knowledge about how the technology works, and one must be aware that it is prone to inaccuracies and hallucinations. The interviewed students believed that AI is less effective in certain types of tasks (e.g., calculations). They were unaware of how machine learning works or how AI generates responses, and they primarily described the technology as a search engine. The young people stated that they were initially skeptical of ChatGPT, a skepticism that has since eased, but they still treat the content generated by the platform with reservations (Valeri et al., 2025).
To gain a more complete understanding of the topic, it is instructive to briefly review teacher attitudes alongside student perceptions, as educators play an important role in the educational integration of the technology. A significant portion of teachers already actively use AI tools like ChatGPT (Krašna & Gartner, 2024) and generally have a positive view of its potential: they primarily see it as a tool for innovating pedagogical methods, engaging students more effectively (Annamalai, 2025), and creating content (Clos & Chen, 2024). Alongside these positive attitudes, however, they also express serious concerns that are closely related to students’ learning processes. These include the difficulty of identifying plagiarism and authorship (Krašna & Gartner, 2024), maintaining teacher credibility, and, most importantly, students’ uncritical reliance on generated content (Del A. Mundo et al., 2024). These concerns highlight that for the responsible and effective use of the technology, it is essential to develop students’ critical thinking and argumentation skills, which, in our view, underscores the importance of our research.

2.3. Theoretical Rationale for Examining Background Variables

In educational research, it is a well-established assumption that students’ academic skills are significantly influenced by their sociocultural background. Numerous studies have consistently demonstrated that socio-economic status (SES)—of which parental education is a key component—plays a significant role in shaping academic achievement outcomes (Munir et al., 2023). Higher parental education is often associated with greater access to educational resources and a more supportive home learning environment, which can foster the development of complex cognitive skills such as argumentation. Therefore, one of the aims of our study was to investigate whether this well-documented relationship holds true in the context of argumentation about a novel and rapidly evolving topic like artificial intelligence. Similarly, the role of gender in argumentation is a topic of ongoing debate. While some studies suggest that female students may outperform males in writing tasks, the literature on argumentative writing shows conflicting and complex findings, with differences often depending on the context and the specific skills being measured (Noroozi et al., 2023b). Given these inconsistent findings, our second research question (RQ2) also aimed to explore the potential influence of gender on students’ argumentation profiles. Our study thus contributes to this ongoing discussion by examining these demographic factors in a novel content domain.

2.4. Statement of Contribution

A common feature of the aforementioned studies is that while they discuss opinions and arguments about artificial intelligence, they do not address the mechanics of argumentation itself, nor its pedagogical and psychological implications. Therefore, our research focuses on the argumentative structure of student essays on artificial intelligence. Our study contributes to a deeper understanding of the relationship between argumentation skills and background variables in the context of attitudes toward digital technologies. The study seeks to identify which argumentative components represent student strengths and which require further development. Our research questions are as follows: What patterns can be identified in student argumentation regarding artificial intelligence? (RQ1) How do demographic variables (gender, parental education level, age, settlement type, planned occupation) influence students’ argumentation skills? (RQ2).

2.5. The Toulmin Model of Argumentation

In his 1958 work The Uses of Argument, Stephen Toulmin criticized traditional formal logic for being, in his view, too abstract and detached from the logic of practical reasoning. Toulmin proposed an approach to the study of argumentation that considers its context-dependent and procedural nature. A key element of his concept is field-dependency, which posits that the evaluation criteria for an argument are not universal but vary depending on the subject area—a concept closely related to Wittgenstein’s theory of language-games (Zarębski, 2024). The Toulminian approach thus stands in contrast to the geometric model of formal logic, which considers the form of arguments to be primary, regardless of their content or context (van Eemeren et al., 2021).
The Toulmin model distinguishes six fundamental components in the analysis of argumentation: claim (C), data/grounds (D), warrant (W), backing (B), qualifier (Q), and rebuttal (R). The claim is the position or conclusion that the arguer seeks to prove or have accepted. The data/grounds are the specific evidence, information, or observations on which the claim is based. The warrant is the implicit or explicit principle, rule, or connection that explains why the data lead to the claim. The backing consists of additional information, sources, or authorities that support the credibility or applicability of the warrant. The qualifier indicates the strength, certainty, or scope of applicability of the claim (e.g., “probably”, “generally”). The rebuttal comprises counterarguments, exceptions, or circumstances that could weaken or limit the validity of the claim.
These components function as a system, mutually supporting or weakening each other. In other words, the Toulmin model goes beyond examining simple pro and con arguments and also considers the rational and non-rational aspects of argumentation. For example, elements such as backing often rely on values and linguistic constructs that lack a strictly rational basis but are nonetheless fundamental to constructing persuasive arguments: both rational and non-rational factors (such as the values and language use of a scientific community) play a role in accepting or rejecting theories. Today, the Toulmin model also serves as a general framework (Bhardwaj, 2025): it is widely recognized and applied in the fields of rhetoric and communication theory (Al Fraidan, 2025; Duran, 2024; Liu & Xiong, 2024; Nera et al., 2024; Rapanta et al., 2025; Yu & Chen, 2023). Toulmin’s model has also influenced educational research (Gómez-Blancarte & Tobías-Lara, 2023; el Majidi et al., 2021; Zhang & Browne, 2023), and consequently, it provides not only a theoretical framework but also a sound methodology and a practical tool for the complex analysis of student argumentation on artificial intelligence.

3. Materials and Methods

3.1. Procedure and Ethical Considerations

Data were collected via an online questionnaire. After obtaining consent from the schools, parents were informed about the data collection and the option to refuse participation. Students completed the questionnaire during class under standardized conditions, supervised by a teacher. They had no contact with the researchers. Students were also given prior information and were aware that they could withdraw from the process at any time. Participation was not associated with any compensation, benefit, or disadvantage. Participation was anonymous and voluntary; by completing the questionnaire, students consented to their answers being used for research. All present students consented to participate; thus, the only students missing from the sample were those absent on the day of data collection. The data collection method prevented multiple submissions. No personally identifiable data were collected. The research ethics approval number is RK/1560/2024.

3.2. Participants and Sampling

A total of 15 secondary schools were selected from three of Hungary’s 19 counties. From each participating school, one class was randomly selected. This sample was expanded through the assistance of student teachers, who distributed the questionnaire to their classes at their respective internship sites, resulting in responses from five additional institutions. The final sample therefore comprised 20 classes from 20 different secondary schools across various counties in the country. The number of respondents was 463. During data cleaning, one respondent was removed (due to direct contact with the researchers). The dataset suitable for analysis contained responses from 462 individuals. From this, an additional 10 responses were excluded during coding due to being exceptionally short or ambiguous. Thus, the final analysis sample consisted of 452 participants. The demographic characteristics of the sample were as follows: the mean age was 16.1 years (SD = 1.18; range: 14–19 years). Of the participants, 63.1% (N = 285) were female and 36.9% (N = 167) were male. The distribution by place of residence was: 176 individuals (38.9%) residing in villages and 276 (61.1%) in cities.

3.3. Measures

The online questionnaire collected data on the respondent’s gender, parents’ (or guardians’) educational attainment, age, settlement type (village, city, or capital), and planned occupation. Parents’ educational attainment was measured on a ten-point ordinal scale corresponding to the levels of the Hungarian education system. Given the scale’s resolution and the two data points (for parents or guardians), the values were averaged and treated as a scale-level variable in the dataset. Regarding the type of planned occupation, the following options were provided (in a mixed order to mitigate potential biases, such as social desirability, that could arise from an ascending sequence): (1) manual or skilled labor not necessarily requiring a high school diploma (e.g., roofer, maintenance worker, kitchen/garden work, healthcare worker other than a doctor or pharmacist); (2) work requiring a secondary education degree (e.g., salesperson, hospitality/tourism worker, law enforcement officer, administrator, organizer, office work, accountant); (3) non-graduate managerial work requiring some form of post-secondary training/self-education (e.g., owning a business, team leader); (4) employee with a university degree (e.g., teacher, IT specialist, pharmacist, doctor); and (5) manager with a university degree and postgraduate qualifications (e.g., senior manager in a company, leadership position in public administration, scientist, etc.). Finally, the questionnaire prompted students to provide an open-ended response detailing their opinion on AI-generated content (e.g., articles, videos, music, news, school essays).

3.4. Data Analysis

The six main components of the Toulmin model were identified by three independent coders—all PhD students in educational sciences—using a detailed coding guide. The coding was conducted in two rounds. In the first round, after familiarizing themselves with the guide and the Toulmin model, the coders independently coded an initial set of 50 essays. They then met to discuss and resolve any discrepancies. This discussion served as the coders’ training. In the second round, the coders coded the entire corpus. Subsequently, Fleiss’ kappa was calculated to measure inter-rater reliability. As three coders were involved, a fourth was not required to resolve disagreements; instead, the majority code was used for the final dataset. Inter-rater reliability was excellent, with Fleiss’ kappa values for the 12 coded items (κ_C, κ_Cexplicit, κ_D, κ_Dexplicit, κ_W, κ_Wexplicit, κ_B, κ_Bexplicit, κ_Q, κ_Qexplicit, κ_R, κ_Rexplicit) ranging from κ = 0.817 to 1, 95% CI [0.764, 1]. Even the lowest value (κ_Bexplicit = 0.817, 95% CI [0.764, 0.869]) surpassed the conventional 0.70 threshold.
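Although all statistics were computed in IBM SPSS (see below), the reliability workflow is straightforward to reproduce in open-source tools. The following Python sketch, using synthetic codes in place of the real data, illustrates the two steps described above: Fleiss’ kappa across three coders for one coded item, and majority-vote resolution of disagreements. The array shapes and variable names are our assumptions, not the study’s materials.

```python
# A minimal sketch (not the authors' code) of the reliability workflow:
# three coders assigning binary presence codes for one Toulmin component.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(42)
codes = rng.integers(0, 2, size=(452, 3))  # hypothetical: 452 essays x 3 coders

# Fleiss' kappa expects an (essays x categories) count table.
table, _ = aggregate_raters(codes)
kappa = fleiss_kappa(table, method="fleiss")
print(f"Fleiss' kappa = {kappa:.3f}")

# With three coders, each disagreement resolves by majority (at least 2 of 3).
final_codes = (codes.sum(axis=1) >= 2).astype(int)
```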
As the Toulmin model has been successfully adapted for examining the argumentation of children and adolescents, we adopted the approach of Rapanta et al. (2025) in developing our analytical framework. For each component, we examined its presence (0 = not present, 1 = present) and explicitness (0 = implicitly present, 1 = explicitly present). An element was coded as explicit when its function was signaled by specific linguistic markers (e.g., conjunctions like ‘because’ or ‘therefore’) or its logical role was overtly stated. In contrast, an element was considered implicit when it was not directly stated but could be clearly and necessarily inferred from the context or the juxtaposition of other statements. For example, in the sequence “AI will take many jobs (Claim). It can already write university-level essays (Data)”, the warrant connecting the data to the claim is implicit. An explicit warrant would be “AI will take many jobs because its ability to write essays proves it can perform high-level tasks.” The absence of a component naturally precluded the analysis of its explicitness.
Furthermore, an argument was considered present if the student’s essay contained at least a claim or data. If neither was present, the response was coded as not containing argumentation. This was the case even if a qualifier was present, as in the statement, “I don’t think it’s a good idea.” A total of ten such responses, deemed too short or ambiguous, were excluded from the analysis. All other responses were of appropriate length, typically 5–25 sentences. The final dataset for analysis contained data from 452 individuals.
Based on the coding of the six main components, two quality dimensions were created. First, argumentation complexity was rated on a three-point scale (1 = low, 2 = medium, 3 = high) (the “complexity” variable). An argument was considered low complexity if only a claim and/or data were present. It was classified as medium complexity if, in addition to claim and data, a warrant, backing, or qualifier also appeared. It was rated as high complexity if a rebuttal was present, alongside at least one of the core elements (claim or data). Second, argumentation explicitness was also rated on a three-point scale (1 = low, 2 = medium, 3 = high), based on the number of argumentative elements that appeared explicitly (the “explicitness” variable). It was considered low explicitness if at most one element was explicit, medium if two elements were explicit, and high if three or more elements were explicit. The values for complexity and explicitness were not coded directly but were derived automatically from the coding of the six base categories (see examples in Table 1), ensuring the consistency and objectivity of this process. The coding manual, which details the coding criteria with detailed examples, and the raw data are available in a public repository.
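Because both quality dimensions were derived mechanically from the six base codes, the rules can be stated as short functions. The sketch below is our illustrative re-implementation of the stated rules; the dictionary keys and example values are hypothetical and not drawn from the published coding manual.

```python
# Illustrative re-implementation of the derivation rules described above;
# p and e map component labels (C, D, W, B, Q, R) to 0/1 codes.
def complexity(p: dict) -> int:
    core = p["C"] or p["D"]          # an argument requires at least a claim or data
    if p["R"] and core:
        return 3                     # high: rebuttal alongside a core element
    if core and (p["W"] or p["B"] or p["Q"]):
        return 2                     # medium: warrant, backing, or qualifier added
    return 1                         # low: claim and/or data only

def explicitness(e: dict) -> int:
    n = sum(e.values())              # number of explicitly marked elements
    return 3 if n >= 3 else 2 if n == 2 else 1

essay_presence = {"C": 1, "D": 1, "W": 1, "B": 0, "Q": 0, "R": 1}
essay_explicit = {"C": 1, "D": 1, "W": 0, "B": 0, "Q": 0, "R": 1}
print(complexity(essay_presence), explicitness(essay_explicit))  # -> 3 3
```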
The analysis strategy was designed to answer the two research questions. To address the first research question (RQ1: What patterns can be identified in student argumentation regarding artificial intelligence?), we proceeded in several steps. First, descriptive statistics were calculated for the presence and explicitness of the six core Toulmin components. Additionally, we examined the distribution of the two derived quality dimensions: Argumentation Complexity and Explicitness. To identify the combinations of components actually occurring in student argumentation, a new variable was created to represent the unique set of explicit components present in each essay. The frequency distribution of this variable was then used to identify the most common argumentation patterns. Finally, to group individual students based on the similarity of their overall argumentation profile, a hierarchical cluster analysis, based on the six binary Toulmin component variables, was performed. It is important to note that our aim with this analysis was exploratory: we sought to create an empirical typology of students by grouping them into distinct argumentation profiles. This approach allowed us to reduce the complexity of the 15 observed argumentation patterns into a smaller number of meaningful categories for subsequent analysis with demographic variables. Given this exploratory goal of typology creation, hierarchical cluster analysis was deemed an appropriate method. The goal was not to identify latent, unobservable psychological constructs, for which other methods like Latent Class Analysis would be more suitable. To answer the second research question (RQ2: How do demographic variables (gender, parental education level, age, settlement type, planned occupation) influence students’ argumentation skills?), we employed Chi-square tests of independence. For all statistical analyses, the significance level was set at p < 0.05. Data analysis was performed using IBM SPSS Statistics 27.

4. Results

The first research question sought to determine what patterns could be identified in student argumentation about artificial intelligence (see Table 2).
The results indicate that students frequently employed fundamental argumentation components. Claims and their supporting data/grounds appeared in nearly every essay (100% and 89.8%, respectively), and the vast majority were presented in an explicit form (92.5% and 91.9%, respectively). Warrants were present in more than half of the texts (56.2%) and were explicit in 85.4% of the cases where they were used, demonstrating students’ ability to establish logical connections. However, while rebuttals constituted a significant part of student argumentation (67.5% presence, with 74.1% explicitness within this group), the elements that add depth and sophistication to an argument, such as backing and qualifiers, appeared far less frequently. Qualifiers were present in only 7.3% of cases, though when they did appear, they were typically explicit (90.9%). Backing was the least frequently identified component (2.4%), and in the rare instances it appeared, it was predominantly (72.7%) implicit.
More than two-thirds of the essays (67.5%) fell into the high complexity category, meaning that the majority of students did not just employ basic argumentative structures (claim, data) but were also able to integrate rebuttals into their reasoning. The proportion of medium (13.7%) and low complexity (18.8%) arguments was lower. In terms of argumentation explicitness, the high level was also dominant, with 60.8% of essays in this category. This indicates that the majority of students explicitly articulated three or more argumentative elements. The proportion of medium-explicitness arguments was 22.6%, while low explicitness (where only one element was explicit, or only implicit elements were used) characterized 16.6% of the texts.
For a more in-depth exploration of argumentation patterns, we created a variable that uniquely identifies the combination of explicitly appearing Toulmin components (C, D, W, B, Q, R) for each essay. Table 3 presents the frequency distribution of the identified argumentation patterns across the 452 student essays.
The most prevalent argumentation pattern was C + D + W + R (Claim + Data + Warrant + Rebuttal), which appeared in over a third of the essays (36.9%). This was followed by C + D + R (Claim + Data + Rebuttal) at 20.1%, C + D + W (Claim + Data + Warrant) at 12.2%, and C + D (Claim + Data) at 11.9%. Each of the four most common patterns contains a claim and supporting data, and three of them also include a rebuttal. This underscores the central role of rebuttal in students’ arguments about AI, as well as their preference for both basic (C + D, C + D + W) and critically oriented (C + D + R, C + D + W + R) structures. In contrast, patterns that would have added only backing (B) or backing and a qualifier (B + Q) to the core C + D + W structure (e.g., C + D + W + B or C + D + W + B + Q) did not appear at all in the data. The most complex structure, containing all six Toulmin elements (C + D + W + B + Q + R), proved to be extremely rare, identified in only 0.7% of cases. The results in Table 3 show that while theoretically numerous (2⁶ = 64) component combinations are possible, in practice only 15 different patterns emerged in the 452 essays. This concentration suggests that students employ a limited set of typical argumentative schemas. Based on the five most common patterns (C + D + W + R, C + D + R, C + D + W, C + D, C) and an “Other” category consolidating the rarer patterns, we created a categorical variable to serve as the basis for subsequent analyses involving background variables (RQ2).
However, to move beyond examining only the most frequent, predefined patterns and instead group students based on the similarity of their entire component profile (the presence/absence of all six Toulmin elements), we performed a hierarchical cluster analysis. This exploratory, data-driven (bottom-up) approach allows for the identification of empirically derived clusters (i.e., argumentation styles) that may not align with the most common individual patterns but rather comprise students who, despite exhibiting rarer patterns, share a similar overall argumentative strategy. Our goal, therefore, was to explore similarities and differences among individual cases (students) based on their six-component argumentation profile, without making a priori assumptions about specific patterns. For the analysis, we applied Ward’s hierarchical clustering method with a squared Euclidean distance measure, based on the six binary (0 = not present, 1 = present) Toulmin component variables. The selection of the optimal number of clusters was based on a multi-step evaluation. First, a visual inspection of the dendrogram indicated that two-, three-, and four-cluster solutions were the most viable options. We then assessed each of these solutions based on their theoretical coherence and pedagogical relevance. The two-cluster solution offered a simple dichotomy, separating students into ‘low-complexity’ and ‘higher-complexity’ arguers. While defensible, this solution lacked the nuance needed to capture the qualitative distinctions between different argumentation styles. In contrast, the three-cluster solution provided the most distinct and meaningful result. The resulting profiles—‘Critical Arguers’, ‘Minimal Arguers’, and ‘Direct Rebutters’—were qualitatively different not just in the quantity of argumentative elements but in their underlying strategic approach, making them highly relevant from a pedagogical standpoint. Proceeding to a four-cluster solution did not yield an additional, theoretically interpretable group; instead, the fourth cluster emerged as a small subgroup that appeared to be a variant of the ‘Critical Arguers’ rather than a distinct argumentative style. Therefore, the three-cluster solution was selected, as it provided the most informative and robust framework for addressing our research questions (see Table 4). To quantitatively validate this decision, we performed a post hoc internal validation using silhouette analysis, which yielded an average silhouette coefficient of 0.57, indicating a reasonable structure in the data. The individual clusters also showed adequate cohesion and separation (Cluster 1: 0.49; Cluster 2: 0.40; Cluster 3: 1.00), providing further empirical support for the validity of the three-cluster solution. This approach further nuances the answer to the first research question by not only identifying frequent patterns but also distinguishing groups of students who follow characteristically similar argumentation strategies.
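For readers wishing to replicate the procedure outside SPSS, the following Python sketch outlines the Ward linkage, the three-cluster cut, and the post hoc silhouette validation described above. The component matrix here is a synthetic placeholder (the real coded data are in the public repository), and the library choices are our assumptions rather than the authors’ toolchain.

```python
# A sketch of the clustering and validation steps; X stands in for the
# real 452 x 6 binary Toulmin component matrix, which is not reproduced here.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(7)
X = rng.integers(0, 2, size=(452, 6)).astype(float)  # synthetic placeholder

# Ward's method; SciPy minimizes the same variance criterion that SPSS
# applies to squared Euclidean distances, so assignments are comparable.
Z = linkage(X, method="ward")
labels = fcluster(Z, t=3, criterion="maxclust")      # three-cluster solution

# Post hoc internal validation via the average silhouette coefficient.
print(f"mean silhouette = {silhouette_score(X, labels):.2f}")
```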
The total sample (N = 452) was distributed among the three clusters as follows: Cluster 1 contained 57.5% of the students (N = 260), Cluster 2 contained 22.4% (N = 101), and Cluster 3 contained 20.1% (N = 91).
The arguments of students in the first cluster almost always featured a claim, data, and a warrant (98–100% presence), and they frequently employed rebuttals as well (77%). In contrast, the use of backing (3%) and qualifiers (13%) was negligible in this group. This profile suggests a thorough, logically constructed argumentation style that reflects on counterarguments but places less emphasis on deeper justification or nuance. We labeled this cluster “Critical Arguers”.
Members of the second cluster consistently formulated claims, but only about half (56%) used data. Remarkably, they used no explicit warrants or qualifiers (0%), and rebuttals appeared only rarely (14%) in their arguments. This pattern reflects an extremely simplified argumentation strategy based primarily on assertions and occasional evidence. We labeled this cluster “Minimal Arguers”.
The third group, with its distinct profile, was distinguished by the consistent use of claims, data, and rebuttals (100% presence). However, similar to the second cluster, its members used no explicit warrants, backing, or qualifiers (0%). This suggests a very direct, confrontational argumentation style where the emphasis is on supporting the claim and immediately refuting counterarguments, without articulating the underlying logical connections (warrants). We labeled this cluster “Direct Rebutters”.
To determine whether the three identified argumentation styles were associated with students’ demographic backgrounds, we conducted Chi-square tests of independence. To ensure the validity of the Chi-square tests by avoiding cells with low expected frequencies, several demographic variables were collapsed into fewer categories. Specifically, parental education level, originally measured on a ten-point scale, was collapsed into three categories: Low (parents did not complete secondary education), Medium (parents completed secondary education), and High (parents hold a higher education degree). Planned occupation was collapsed into three broader categories: Non-graduate jobs (combining manual/skilled labor and work requiring a secondary degree), Graduate employees, and Graduate managers. Age was grouped into three categories: Early adolescence (14–15 years), Mid-adolescence (16–17 years), and Late adolescence (18–19 years). Settlement type was collapsed into two categories as originally measured: Village and City. The results indicated no statistically significant association between cluster membership and students’ gender (χ²(2) = 0.037, p = 0.982, V = 0.009), parental education level (χ²(4) = 5.151, p = 0.272, V = 0.075), age (χ²(4) = 5.319, p = 0.256, V = 0.077), or settlement type (χ²(2) = 0.402, p = 0.818, V = 0.030). The Pearson Chi-square test for the association between planned occupation and cluster membership reached the conventional significance threshold (χ²(4) = 9.469, p = 0.050); however, the Cramér’s V value indicating the strength of the association was low (V = 0.102), suggesting little practical significance. In sum, the very low Cramér’s V values and non-significant p-values (with one marginal exception) indicate that the demographic factors examined were not meaningful predictors of cluster membership. The assumptions for the Chi-square tests were met in all cases.
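The test statistic and effect size reported here follow the standard formulas. The Python sketch below demonstrates the computation on a hypothetical 3 × 2 contingency table (clusters by settlement type) whose row totals match the reported cluster sizes; the individual cell values are invented purely for illustration.

```python
# A sketch of the chi-square test of independence with Cramér's V;
# the cell counts below are hypothetical, not the study's crosstab.
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[160, 100],   # Cluster 1 (N = 260)
                  [ 58,  43],   # Cluster 2 (N = 101)
                  [ 53,  38]])  # Cluster 3 (N = 91)

chi2, p, dof, expected = chi2_contingency(table)
n = table.sum()
cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
print(f"chi2({dof}) = {chi2:.3f}, p = {p:.3f}, V = {cramers_v:.3f}")
```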
A detailed examination of the crosstabulation percentages also revealed no distinct patterns suggesting that students with different career aspirations prefer markedly different argumentation styles, although the proportion of “Critical Arguers” (Cluster 1) was somewhat lower among those interested in “Non-graduate jobs” (42.9%) compared to “Graduate employees” (61.4%) or “Graduate managers” (58.5%). Overall, the demographic variables examined—with the exception of a weak trend related to planned occupation—did not show a strong or clear association with which of the component-based clusters the students belonged to.
The second research question (RQ2) aimed to explore how students’ demographic background and the general qualitative characteristics of their arguments (complexity, explicitness) influence the argumentative structures they employ.
To examine the relationships between demographic variables and argumentation characteristics, we conducted Chi-square tests of independence and correlation analyses. To ensure the applicability of the Chi-square tests, the categories of several demographic variables (parental education, planned occupation, age, settlement type) were collapsed to avoid cells with low expected frequencies. Parental education, planned occupation, and age were organized into three categories, while settlement type was organized into two, based on their distribution and interpretability.
Subsequently, we directly examined the association between the demographic variables and the six-category variable representing the most common argumentation patterns. The Chi-square tests of independence revealed that neither gender (χ²(5) = 2.843, p = 0.724; V = 0.079), parental education level (χ²(10) = 13.231, p = 0.211; γ = 0.048), planned occupation (χ²(10) = 12.081, p = 0.280; γ = 0.037), age (χ²(10) = 10.920, p = 0.364; γ = 0.079), nor settlement type (χ²(5) = 2.579, p = 0.765; γ = 0.032) had a significant association with the distribution of argumentation patterns. (Where applicable, Gamma values also indicated a weak or non-existent association; the assumptions for the Chi-square tests were met in all cases after collapsing the categories.) These results collectively suggest that, in our study, the primary demographic factors did not play a decisive role in the specific argumentative structures students preferred.
Finally, to assess the relationship between the general qualitative characteristics of the arguments, we calculated Spearman’s rank correlation between structural complexity and argumentation explicitness. The analysis revealed a moderately strong, positive, and statistically highly significant correlation (rs = 0.577, p < 0.001). This indicates that structurally more complex arguments were also typically more explicit, and vice versa. The two quality indicators are thus intertwined, collectively characterizing the sophistication of an argument.
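This correlation can be computed with a single call. In the sketch below the two ordinal variables are simulated and deliberately made to covary, purely to illustrate the procedure; the real values come from the derived three-point scales described in the Methods.

```python
# A sketch of the Spearman rank correlation between the two 3-point scales;
# synthetic, correlated values stand in for the real derived variables.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(3)
complexity = rng.integers(1, 4, size=452)                      # 1 = low ... 3 = high
explicitness = np.clip(complexity + rng.integers(-1, 2, size=452), 1, 3)

rs, p = spearmanr(complexity, explicitness)
print(f"rs = {rs:.3f}, p = {p:.3g}")
```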

5. Discussion

Regarding the components used in the arguments, our analysis shows that the vast majority of arguments contained four of the six components. All arguments included a claim, even if 7.5% were implicit, and a significant proportion (89.8%) also featured data/grounds. In terms of the high prevalence of claims and data/grounds, our findings are consistent with previous research using the Toulmin model to examine young people’s argumentation on other topics (Jackson, 2024; Santoso et al., 2022; Wallander, 2022). The student essays also contained a significant proportion of rebuttals (67.5%) and warrants (56.2%)—a finding that diverges from the much lower rates reported in previous studies (Alcock & Attridge, 2025; Härmä et al., 2021; Jackson, 2024; de Waard et al., 2020). Thus, the majority of students argued both for and against a position. However, the essays rarely contained qualifiers or backing, which aligns with previous research findings (Santoso et al., 2022; de Waard et al., 2020; Wallander, 2022). This could be due to a lack of familiarity with more sophisticated argumentation techniques, but the complexity of the topic and its limited connection to the school curriculum may also be contributing factors (Alcock & Attridge, 2025). The strikingly low occurrence of backing and qualifier elements suggests a pattern rather than a random occurrence. This phenomenon can likely be attributed to several mutually reinforcing factors with deep pedagogical, thematic, and developmental roots. First, this pattern may reflect pedagogical influences. Secondary education often prioritizes the development of debate skills, which are built upon the dynamic triad of claim, data, and rebuttal. In contrast, scientific or source-based argumentation, which demands the conscious use of backing and qualifiers, represents a much more complex cognitive task, the mastery of which typically becomes an expectation only in higher education (Hirvela, 2017; el Majidi et al., 2023; Yang & Pan, 2023). Second, the nature of the topic itself—artificial intelligence—likely contributes to this pattern. As a rapidly evolving field that is relatively new to students, AI offers few established, “authoritative” sources of knowledge they can confidently cite as backing. Their knowledge is often derived from popular media, personal experience, and common misconceptions rather than from formal scientific discourse, making source-based backing difficult to construct (Belghith et al., 2024). Finally, the phenomenon may have roots in developmental psychology. The critical evaluation of sources and the self-reflective limitation of a claim’s validity require a higher-order epistemological understanding that develops gradually during adolescence and often does not reach a mature level by the end of secondary school (Barzilai & Weinstock, 2015; Kuhn et al., 2000). These three factors—pedagogical focus, the novelty of the topic, and the level of cognitive maturity—collectively explain why students tend to rely on a more direct argumentative schema, and why deeper, more nuanced logical structures remain underdeveloped.
The overwhelming majority (72.7%) of the few instances of backing were implicit, meaning that even most students who provided backing did not consciously employ this argumentative component. There was a stark difference in how students used various argumentative components. They employed rebuttals frequently and explicitly. In contrast, backing and qualifiers appeared far less often and were generally less developed. This pattern suggests that when arguing about AI, students tend to focus more on confronting counterarguments. They place less emphasis on providing deeper justification for their warrants (backing) or fine-tuning the scope of their claims (qualifiers). This trend raises the question of the extent to which the topic of AI elicits argumentative strategies from students that are debate-provoking yet lack depth.
The majority of the essays we analyzed (67.5%) were classified as having high complexity, a significant proportion when compared to previous research where low-complexity arguments dominated among 5–15-year-olds (Rapanta et al., 2025) and medium-complexity arguments were prevalent among secondary school students (el Majidi et al., 2021). Since higher-quality texts tend to contain more counterarguments, conflicting data, and rebuttals (Qin et al., 2025), our results indicate that most of the students in our sample were capable of discussing AI from multiple perspectives by applying complex argumentative structures that utilized several components. Both deeper knowledge about AI and age likely play a role in this complexity. Indeed, Rapanta et al. (2025) demonstrated that argumentation complexity increases with age, making it inherently higher in the secondary school cohort compared to younger students. Overall, it can be concluded that, compared to the few similar studies available, the student arguments we analyzed were remarkably complex, a finding that may be linked to the age of the participants.
Regarding component combinations, the analyzed arguments exhibited only 15 patterns out of a theoretically possible 64. Among these, only four reached a prevalence of at least 10%, and a single four-component pattern (C + D + W + R) was dominant. While Rapanta et al. (2025) found a prevalence of one- or two-component patterns in younger age groups, a four-component pattern (C + D + W + R) was the most common among the secondary school students we studied. Variations in the component patterns used by individual students might be explained by factors such as academic performance; for instance, Ho et al. (2019) showed that higher-achieving students used a wider variety of components than their lower-achieving peers. Our results suggest that students’ argumentative strategies are not randomly or evenly distributed across all possible structures but are concentrated around a few typical schemas. These arguments often contain a critical element (rebuttal) but less frequently incorporate deeper justification (backing) or nuance (qualifier). Real-world argumentation appears to favor a few dominant combinations, often integrating rebuttals, rather than a systematic and comprehensive application of all elements of the Toulmin model.
A particularly noteworthy finding of our study is the lack of a significant association between the examined demographic variables—such as gender and parental education level—and students’ argumentation styles. This result is surprising. It contrasts with extensive research showing that sociocultural background strongly predicts academic skills (Munir et al., 2023). While the literature on gender differences in argumentation presents a more complex and often contradictory picture (Noroozi et al., 2023b), the absence of any discernible effect in our sample is still remarkable.
This finding does not necessarily refute previous research but suggests a compelling alternative interpretation: argumentation about novel, contemporary topics like AI may be a more “democratic” skill. Unlike traditional academic subjects where family background and accumulated cultural capital can create significant advantages, AI is a domain where knowledge is new, rapidly evolving, and often acquired through informal, digital channels accessible to students from all backgrounds. This could level the playing field, making individual factors like cognitive abilities, personal interest, and digital literacy more influential than traditional sociodemographic markers. This result has important pedagogical implications, suggesting that topics rooted in current digital culture may provide an equitable context for developing and assessing critical thinking skills. Furthermore, the moderately strong positive correlation between structural complexity and explicitness indicates that more elaborate argumentation goes hand-in-hand with clearer formulation. Students who are able to use more argumentative components (such as rebuttals) also tend to express these elements clearly and unambiguously, which is an important hallmark of mature argumentative ability.
We classified the arguments into three clusters, each representing a different argumentative philosophy: a balanced-critical, a minimal, and a direct-confrontational approach. The majority of arguments (57.5%) belong to the first cluster, representing the most complex and balanced style. These are the “Critical Arguers”, who use nearly all core components: claim, data, warrant, and rebuttal. This indicates that the vast majority of students we studied are capable of constructing thorough, logically sound arguments that reflect on counterarguments regarding AI. However, this does not necessarily imply deep, knowledge-based justification, as their use of backing and qualifiers was low. The second cluster (22.4%) consists of “Minimal Arguers”, who use simpler structures—mainly claims and sometimes data, but little else. Although they always formulate a claim, just over half use data, and they employ no explicit warrants or qualifiers, with rebuttals being rare. These cluster members primarily rely on assertions and occasional evidence. The third cluster comprises one-fifth of the arguments (20.1%). These are the “Direct Rebutters”, who follow a very specific, narrow style: they always use a claim, data, and a rebuttal, but never a warrant or other refining elements. This is a confrontational, more direct approach that can be effective in certain contexts. The results reveal that students think differently, applying distinct approaches and strategies to construct arguments of varying complexity about AI. The effectiveness of these different argument types may be context-dependent.

5.1. Pedagogical Implications

The three distinct argumentation profiles identified in this research require targeted, differentiated pedagogical intervention. For the Critical Arguers, who are at the highest level, the goal is fine-tuning: enhancing the sophistication of their reasoning through the more conscious application of backing and the integration of qualifiers. This can be facilitated by tasks specifically designed to develop these skills. For example, instructors can assign activities that require students to find and integrate scholarly sources to support their warrants (backing), moving beyond personal opinion. Another effective strategy involves exercises where students must reformulate a claim using a range of qualifiers (e.g., from ‘always’ to ‘often’ or ‘in some cases’) and then debate how these modifiers alter the strength and scope of the argument. In contrast, the group of Minimal Arguers requires the development of foundational argumentation skills, with a focus on incorporating the warrant to connect data and claims. Effective tools for this include targeted exercises such as using sentence starters (e.g., ‘This evidence supports the claim because…’) or employing graphic organizers that visually map the logical path from data, through the warrant, to the claim. These strategies make the implicit connection explicit and provide a clear structure for students to follow. The third group, the Direct Rebutters, already possesses a critical mindset but requires development in constructive argumentation, specifically in strengthening the warrant. The goal is for them to not only rebut but also to provide clear explanations for their own positions. This can be best served by specific cooperative debate formats. For instance, employing a ‘constructive controversy’ activity, where pairs must first argue for one position, then switch sides to argue for the opposing view, and finally synthesize a shared solution from the strongest arguments of both sides. This process compels them to construct warrants for positions they may have initially disagreed with, strengthening their ability to provide explanations rather than just rebuttals.
Beyond specific argumentation development, our broader pedagogical recommendation is the systematic integration of the topic of AI into the curriculum, either as a standalone course or as an integrated part of IT education. In line with research identifying students’ knowledge gaps (Belghith et al., 2024; Del A. Mundo et al., 2024; Valeri et al., 2025), such a course should cover the operational principles of the technology, the basics of machine learning, an understanding of AI’s capabilities and limitations, and the ethics of its use. Special emphasis should be placed on the critical appraisal of AI-generated content. This is crucial because, as other research points out (Oliveira et al., 2025), students who rely too heavily on AI may experience a reduction in cognitive effort, which can lead to a decline in critical thinking and academic performance. Similar AI literacy courses targeting teacher competencies are also needed for educators and pre-service teachers (J. Kim, 2024). Finally, the theoretical framework used in our research, the Toulmin model, can itself serve as a valuable pedagogical tool. It can be used for the formative assessment of students’ knowledge and argumentation structures regarding AI, as it offers a clear and unambiguous set of criteria. It can also be effectively used in teacher education to develop the argumentation skills of future educators, preparing them to consciously shape the critical thinking of their own students.

5.2. Methodological Implications

The Toulmin model proved to be an effective and robust framework for analyzing student argumentation on artificial intelligence. Its advantage is that it goes beyond identifying superficial “pro and con” opinions and allows the underlying logical structure of an argument, and the relationships between its components, to be explored. However, we also encountered certain limitations in its application. The greatest methodological challenge was the reliable identification of implicit components, those that appear only allusively or latently in the text. While coding explicit elements is straightforward, coding implicit ones requires a higher degree of researcher interpretation, which can reduce the objectivity of the coding. A further consideration is that the model is rooted in the Western, Anglo-Saxon tradition of argumentation, which raises the question of the extent to which argumentation norms, such as the role and acceptability of backing or rebuttal, are culture-specific. We therefore propose the following recommendations for future research on similar topics. To ensure replicability and the comparability of results, it is essential to develop and publish a detailed coding manual that illustrates coding decisions with examples; this is particularly important for establishing the identification criteria for implicit components. To gain a deeper understanding of implicit argumentative elements, text analysis could be combined with other methods, such as think-aloud protocols or in-depth interviews conducted after the text has been written, which could shed light on students’ unstated thoughts and reasoning steps. For international applications of the model, a preliminary cultural adaptation of the framework is recommended: it would be worth exploring which argumentation norms and styles are dominant in the target culture and, if necessary, refining the interpretation of the components accordingly.
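As an illustration of the reliability checks such a coding manual should support, the following minimal sketch computes inter-coder agreement (Cohen’s kappa) for one Toulmin component. The two coders’ binary decisions are invented toy data, not codes from this study.

```python
# Toy reliability check for binary Toulmin-component codes.
# The two coders' decisions below are invented for illustration.
from sklearn.metrics import cohen_kappa_score

# 1 = warrant judged present in an essay, 0 = absent (one entry per essay)
coder_a = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
coder_b = [1, 0, 1, 0, 0, 1, 1, 1, 1, 0]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Warrant coding: Cohen's kappa = {kappa:.2f}")  # agreement beyond chance
```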

6. Limitations and Future Directions

It is important to highlight four main limitations of our research. First, the cross-sectional design captured students’ argumentation skills at a single point in time. While this approach revealed existing patterns, it does not allow the development of argumentation skills to be tracked over time; understanding this would require longitudinal studies. Second, the analysis was limited to a single topic, artificial intelligence. Although this is a current and debate-provoking topic, students might employ different argumentative structures in other subject areas. Third, the study was conducted exclusively among Hungarian secondary school students, so the results are primarily valid within the Hungarian educational and cultural context; generalizing them to an international level requires caution. Fourth, our data have a nested structure (students within schools) that our analysis did not account for. Because the anonymous data collection protocol designed to protect participant privacy did not record school-level identifiers, multilevel modeling and the calculation of intraclass correlation coefficients (ICC) were precluded. This could increase the risk of Type I errors in statistical tests that assume independence; however, as our key findings for RQ2 were non-significant, this issue is unlikely to have altered our main conclusions regarding the lack of association between argumentation styles and demographic variables. Despite these limitations, we believe our study provides valuable insights into the argumentation patterns of Hungarian students and serves as a foundation for future, more extensive, and comparative research. Future research should address the nested structure by employing multilevel modeling (MLM), which would enable a more nuanced analysis of how student-level characteristics and school-level factors (e.g., specific instructional practices) contribute to the development of argumentation skills. The student profiles identified in our study could serve as outcome variables in such models, offering deeper insight into the interplay between individual and contextual influences.
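To make this recommendation concrete, the following minimal sketch (in Python, using statsmodels) fits an intercept-only random-effects model with a hypothetical school identifier and derives the ICC from its variance components. The data frame, column names, and school codes are invented for illustration; moreover, a logistic multilevel model would be more appropriate for binary component codes, so this linear version is a simplification.

```python
# Simplified intercept-only multilevel model with a hypothetical school ID.
# Illustrates only how the ICC is derived from the variance components.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "warrant": [1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0],
    "school":  ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"],
})

result = smf.mixedlm("warrant ~ 1", df, groups=df["school"]).fit()
school_var = result.cov_re.iloc[0, 0]  # between-school variance
resid_var = result.scale               # within-school (residual) variance
icc = school_var / (school_var + resid_var)
print(f"ICC = {icc:.3f}")  # share of variance attributable to schools
```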
Several promising avenues for future research emerge. A noteworthy finding is that our analysis found no significant association between the examined demographic variables and argumentation styles, which may suggest that this skill is distributed broadly, independent of demographic background. However, based on previous research (Ho et al., 2019), it can be assumed that other factors not measured in our study, such as academic performance, field of interest, prior knowledge of the topic, or attitude, might influence the quality and structure of argumentation; exploring these factors clearly warrants further investigation. A particularly interesting direction would be to examine how a given student’s argumentative structure changes or remains consistent when they are asked to form opinions on entirely different topics.

7. Conclusions

By applying the Toulmin model, our study revealed that the argumentation of Hungarian secondary school students about artificial intelligence is not random but organized into well-defined structures with internal logic. Our primary finding is the identification of three student groups representing different argumentative philosophies: the Critical Arguers, the Minimal Arguers, and the Direct Rebutters. This typology demonstrates that students are active thinkers who employ different strategies as they grapple with the interpretation of a complex technological and social phenomenon. While this study did not involve a direct AI intervention, its results offer a foundation for the design of future AI-based educational tools and curricula focused on critical thinking. The identified profiles can serve as data-driven student personas for developing personalized learning paths within intelligent tutoring systems. For instance, an AI-powered writing assistant could provide tailored scaffolding based on a student’s profile: it might prompt a Minimal Arguer to develop a warrant by asking “Why is this evidence relevant to your claim?”, while encouraging a Critical Arguer to nuance their position through qualifiers. By understanding the typical strengths and weaknesses of students’ unaided reasoning, we can build more effective AI tools that deliver targeted interventions and foster the critical thinking needed to navigate an information landscape increasingly shaped by AI. Our research also showed that students’ argumentation styles are not significantly associated with demographic background variables, suggesting the relative independence of this competence from sociocultural background. The results further confirm that most students are capable of critical reflection (evidenced by the frequent use of rebuttal), yet the deeper justification (backing) and fine-tuning (qualifier) of their arguments require development. The contribution of this research is threefold. Empirically, it provides a detailed picture of Hungarian students’ argumentation patterns regarding digital technologies. Methodologically, it validates the effectiveness of the Toulmin model while highlighting its challenges (e.g., coding implicit elements), thereby charting a course for future research. Pedagogically, the most important takeaway is that the identified argumentation profiles are not merely descriptive categories but diagnostic tools that enable targeted, differentiated development. Our study confirms that in the age of new technologies, developing argumentation skills is pedagogically crucial: understanding how students think and argue is an essential prerequisite for fostering a generation that is both critically minded and digitally competent.
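Purely as an illustration of the scaffolding logic sketched above, and not as a system developed in this study, a profile-to-prompt mapping could be prototyped as follows; the profile labels follow our typology, while the prompt texts, dictionary, and function names are hypothetical.

```python
# Illustrative only: mapping detected argumentation profiles to scaffolds.
SCAFFOLDS = {
    "critical_arguer": "How certain is your claim? Consider adding a qualifier "
                       "such as 'often' or 'in some cases'.",
    "minimal_arguer": "Why is this evidence relevant to your claim? "
                      "State the warrant explicitly.",
    "direct_rebutter": "You have rebutted the opposing view; now explain "
                       "the reasoning behind your own position.",
}

def scaffold_for(profile: str) -> str:
    """Return a tailored prompt for a detected argumentation profile."""
    return SCAFFOLDS.get(profile, "State your claim and the evidence behind it.")

print(scaffold_for("minimal_arguer"))
```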

Author Contributions

Conceptualization: M.T.; Methodology: M.T.; Software: M.T.; Validation: M.T.; Formal analysis: M.T.; Investigation: G.B., E.G., E.S.; Resources: Z.S.; Data Curation: M.T.; Writing—Original Draft: M.T., A.Z.K.; Writing—Review & Editing: M.T.; Visualization: M.T.; Project administration: Z.S.; Funding acquisition: Z.S. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by Eszterházy Károly Catholic University.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of Eszterházy Károly Catholic University (Project identification code: RK/1560/2024, date: 15 June 2024).

Informed Consent Statement

In line with the approved ethical protocol, the participating schools provided prior information to the parents/legal guardians, who retained the right to refuse their child’s participation. The students’ assent to participate was obtained electronically. At the beginning of the survey, participants received comprehensive information about the research objectives, the voluntary and anonymous nature of their participation, and how the data would be used. They were explicitly informed that their participation was without remuneration and would result in no advantages or disadvantages. The voluntary completion of the questionnaire constituted their informed assent to participate.

Data Availability Statement

The data presented in this study are openly available in Harvard Dataverse at https://doi.org/10.7910/DVN/MMQNAB (accessed on 9 July 2025).

Acknowledgments

During the preparation of this manuscript, the authors used artificial intelligence to check certain issues of linguistic correctness. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Abdi, A.-N. M., Omar, A. M., Ahmed, M. H., & Ahmed, A. A. (2025). The predictors of behavioral intention to use ChatGPT for academic purposes: Evidence from higher education in Somalia. Cogent Education, 12(1), 2460250. [Google Scholar] [CrossRef]
  2. Aguiar, C., Carpenter, D., Vandenberg, J., Goslen, A., Min, W., Cateté, V., & Mott, B. (2025). “Like a GPS”: Analyzing middle school student responses to an interactive pathfinding activity. In J. A. Stone, T. Yuen, L. Shoop, S. A. Rebelsky, & J. Prather (Eds.), SIGCSETS 2025: Proceedings of the 56th ACM technical symposium on computer science education V. 2 (pp. 1353–1354). ACM. [Google Scholar] [CrossRef]
  3. Alcock, L., & Attridge, N. (2025). Refutations and reasoning in undergraduate mathematics. International Journal of Research in Undergraduate Mathematics Education, 11(1), 25–54. [Google Scholar] [CrossRef]
  4. Al Fraidan, A. (2025). AI and uncertain motivation: Hidden allies that impact EFL argumentative essays using the Toulmin model. Acta Psychologica, 252, 104684. [Google Scholar] [CrossRef]
  5. Annamalai, N. (2025). Factors affecting English language high school teachers switching intention to ChatGPT: A Push-Pull-Mooring theory perspective. Interactive Learning Environments, 33(2), 1367–1384. [Google Scholar] [CrossRef]
  6. Ayalon, M., & Hershkowitz, R. (2018). Mathematics teachers’ attention to potential classroom situations of argumentation. The Journal of Mathematical Behavior, 49, 163–173. [Google Scholar] [CrossRef]
  7. Barzilai, S., & Weinstock, M. (2015). Measuring epistemic thinking within and across topics: A scenario-based approach. Contemporary Educational Psychology, 42, 141–158. [Google Scholar] [CrossRef]
  8. Belghith, Y., Mahdavi Goloujeh, A., Magerko, B., Long, D., Mcklin, T., & Roberts, J. (2024). Testing, socializing, exploring: Characterizing middle schoolers’ approaches to and conceptions of ChatGPT. In F. F. Mueller, P. Kyburz, J. R. Williamson, C. Sas, M. L. Wilson, P. T. Dugas, & I. Shklovski (Eds.), CHI’24: Proceedings of the CHI conference on human factors in computing systems (pp. 1–17). ACM. [Google Scholar] [CrossRef]
  9. Bhardwaj, A. (2025). The art of persuasion: Theorizing as argumentation. Strategic Organization. advance online publication. [Google Scholar] [CrossRef]
  10. Clos, J., & Chen, Y. Y. (2024). Investigating the impact of generative AI on students and educators: Evidence and insights from the literature. In TAS’24: Proceedings of the second international symposium on trustworthy autonomous systems (pp. 1–6). ACM. [Google Scholar] [CrossRef]
  11. Cserkó, J., Rajcsányi-Molnár, M., András, I., Benyák, A., Pongrácz, A., & Molnár, G. (2024, October 17–18). The transformation of digital culture and learning habits in higher education, digital methods and tools. 2024 IEEE 7th International Conference and Workshop Óbuda on Electrical and Power Engineering (CANDO-EPE) (pp. 109–114), Budapest, Hungary. [Google Scholar] [CrossRef]
  12. Dai, C.-P., Ke, F., Zhang, N., Barrett, A., West, L., Bhowmik, S., Southerland, S. A., & Yuan, X. (2024). Designing conversational agents to support student teacher learning in virtual reality simulation: A case study. In F. F. Mueller, P. Kyburz, J. R. Williamson, & C. Sas (Eds.), CHI EA’24: Extended abstracts of the CHI conference on human factors in computing systems (pp. 1–8). ACM. [Google Scholar] [CrossRef]
  13. Darmawansah, D., Rachman, D., Febiyani, F., & Hwang, G.-J. (2025). ChatGPT-supported collaborative argumentation: Integrating collaboration script and argument mapping to enhance EFL students’ argumentation skills. Education and Information Technologies, 30(3), 3803–3827. [Google Scholar] [CrossRef]
  14. Del A. Mundo, M., Delos Reyes, E. F., Gervacio, E. M., Manalo, R. B., Jervy A. Book, R., Chavez, J. V., Espartero, M. M., & Sayadi, D. S. (2024). Discourse analysis on experience-based position of science, mathematics, and Tech-Voc educators on generative AI and academic integrity. Environment and Social Psychology, 9(8). [Google Scholar] [CrossRef]
  15. de Waard, E. F., Prins, G. T., & van Joolingen, W. R. (2020). Pre-university students’ perceptions about the life cycle of bioplastics and fossil-based plastics. Chemistry Education Research and Practice, 21(3), 908–921. [Google Scholar] [CrossRef]
  16. Duran, V. (2024). Analyzing teacher candidates’ arguments on AI integration in education via different chatbots. Digital Education Review, 45, 68–83. [Google Scholar] [CrossRef]
  17. el Majidi, A., de Graaff, R., & Janssen, D. (2023). Debate pedagogy as a conducive environment for L2 argumentative essay writing. Language Teaching Research. advance online publication. [Google Scholar] [CrossRef]
  18. el Majidi, A., Janssen, D., & de Graaff, R. (2021). The effects of in-class debates on argumentation skills in second language education. System, 101, 102576. [Google Scholar] [CrossRef]
  19. García-Carmona, A. (2025). Scientific thinking and critical thinking in science education. Science & Education, 34(1), 227–245. [Google Scholar] [CrossRef]
  20. Gómez-Blancarte, A. L., & Tobías-Lara, M. G. (2023). The integration of undergraduate students’ informal and formal inferential reasoning. Educational Studies in Mathematics, 113(2), 251–269. [Google Scholar] [CrossRef]
  21. Guo, K., Zhong, Y., Li, D., & Chu, S. K. W. (2023). Effects of chatbot-assisted in-class debates on students’ argumentation skills and task motivation. Computers & Education, 203, 104862. [Google Scholar] [CrossRef]
  22. Haudek, K. C., & Zhai, X. (2024). Examining the effect of assessment construct characteristics on machine learning scoring of scientific argumentation. International Journal of Artificial Intelligence in Education, 34(4), 1482–1509. [Google Scholar] [CrossRef]
  23. Härmä, K., Kärkkäinen, S., & Jeronen, E. (2021). The dramatic arc in the development of argumentation skills of upper secondary school students in geography education. Education Sciences, 11(11), 734. [Google Scholar] [CrossRef]
  24. Hirvela, A. (2017). Argumentation & second language writing: Are we missing the boat? Journal of Second Language Writing, 36, 69–74. [Google Scholar] [CrossRef]
  25. Ho, H.-Y., Chang, T.-L., Lee, T.-N., Chou, C.-C., Hsiao, S.-H., Chen, Y.-H., & Lu, Y.-L. (2019). Above- and below-average students think differently: Their scientific argumentation patterns. Thinking Skills and Creativity, 34, 100607. [Google Scholar] [CrossRef]
  26. Hussain, F., & Anwar, M. A. (2025). Towards informed policy decisions: Assessing student perceptions and intentions to use ChatGPT for academic performance in higher education. Journal of Asian Public Policy, 18(2), 377–404. [Google Scholar] [CrossRef]
  27. Jackson, D. O. (2024). The longitudinal development of argumentative writing in an English for academic purposes course in Japan. System, 126, 103482. [Google Scholar] [CrossRef]
  28. Kerman, N. T., Noroozi, O., Banihashem, S. K., Karami, M., & Biemans, H. J. (2024). Online peer feedback patterns of success and failure in argumentative essay writing. Interactive Learning Environments, 32(2), 614–626. [Google Scholar] [CrossRef]
  29. Kim, J. (2024). Leading teachers’ perspective on teacher-AI collaboration in education. Education and Information Technologies, 29(7), 8693–8724. [Google Scholar] [CrossRef]
  30. Kim, M. K., Kim, N. J., & Heidari, A. (2022). Learner experience in artificial intelligence-scaffolded argumentation. Assessment & Evaluation in Higher Education, 47(8), 1301–1316. [Google Scholar] [CrossRef]
  31. Kővári, A., András, I., & Rajcsányi-Molnár, M. (2024). The role of AI in ecohumanistic education. Journal of Ecohumanism, 3(3), 1361–1370. [Google Scholar] [CrossRef]
  32. Krašna, M., & Gartner, S. (2024, May 20–24). The effects of AI services to the educational processes—Survey analysis. 2024 47th MIPRO ICT and Electronics Convention (MIPRO) (pp. 496–501), Opatija, Croatia. [Google Scholar] [CrossRef]
  33. Kuhn, D., Cheney, R., & Weinstock, M. (2000). The development of epistemological understanding. Cognitive Development, 15(3), 309–328. [Google Scholar] [CrossRef]
  34. Latifi, S., Noroozi, O., & Talaee, E. (2023). Worked example or scripting? Fostering students’ online argumentative peer feedback, essay writing and learning. Interactive Learning Environments, 31(2), 655–669. [Google Scholar] [CrossRef]
  35. Lin, Y.-R., & Hung, C.-Y. (2025). The synergistic effects in an AI-supported online scientific argumentation learning environment. Computers & Education, 229, 105251. [Google Scholar] [CrossRef]
  36. Lindbäck, Y., Schröder, K., Engström, T., Valeskog, K., & Sonesson, S. (2025). Generative artificial intelligence in physiotherapy education: Great potential amidst challenges—A qualitative interview study. BMC Medical Education, 25(1), 603. [Google Scholar] [CrossRef]
  37. Liu, D., & Xiong, M. (2024). Keeping balance between loyalty and modification: A Toulminian model as analytical framework. Humanities and Social Sciences Communications, 11(1), 639. [Google Scholar] [CrossRef]
  38. Loyens, S. M. M., van Meerten, J. E., Schaap, L., & Wijnia, L. (2023). Situating higher-order, critical, and critical-analytic thinking in problem- and project-based learning environments: A systematic review. Educational Psychology Review, 35(2), 39. [Google Scholar] [CrossRef]
  39. Maqbool, T., Ishaq, H., Shakeel, S., Zaib Un Nisa, A., Rehman, H., Kashif, S., Sadia, H., Naveed, S., Mumtaz, N., Siddiqui, S., & Jamshed, S. (2025). Future pharmacy practitioners’ insights towards integration of artificial intelligence in healthcare education: Preliminary findings from Karachi, Pakistan. PLoS ONE, 20(2), e0314045. [Google Scholar] [CrossRef]
  40. Martin, P. P., Kranz, D., Wulff, P., & Graulich, N. (2024). Exploring new depths: Applying machine learning for the analysis of student argumentation in chemistry. Journal of Research in Science Teaching, 61(8), 1757–1792. [Google Scholar] [CrossRef]
  41. Montenegro, S., Loor, J., Posligua, K., Mendoza, K., Chancay, C., & Suastegui, S. (2024). The impact of artificial intelligence on the educational process of university students. In N. Callaos, N. Callaos, J. Horne, B. Sánchez, & M. Savoie (Eds.), Proceedings of the international multi-conference on society, cybernetics and informatics, proceedings of the 18th international multi-conference on society, cybernetics and informatics: IMSCI 2024 (pp. 86–90). International Institute of Informatics and Cybernetics. [Google Scholar] [CrossRef]
  42. Munir, J., Faiza, M., Jamal, B., Daud, S., & Iqbal, K. (2023). Spring 2023. Journal of Social Sciences Review, 3(2), 695–705. [Google Scholar] [CrossRef]
  43. Nera, K., Frenay, M., & Paquot, M. (2024). Improving argumentation quality on MOOC discussion forums: Does learning to identify components of arguments help? Research and Practice in Technology Enhanced Learning, 20, 17. [Google Scholar] [CrossRef]
  44. Noroozi, O., Banihashem, S. K., Biemans, H. J. A., Smits, M., Vervoort, M. T. W., & Verbaan, C.-L. (2023a). Design, implementation, and evaluation of an online supported peer feedback module to enhance students’ argumentative essay quality. Education and Information Technologies, 28, 12757–12784. [Google Scholar] [CrossRef]
  45. Noroozi, O., Banihashem, S. K., Taghizadeh Kerman, N., Parvaneh Akhteh Khaneh, M., Babaee, M., Ashrafi, H., & Biemans, H. J. (2023b). Gender differences in students’ argumentative essay writing, peer review performance and uptake in online learning environments. Interactive Learning Environments, 31(10), 6302–6316. [Google Scholar] [CrossRef]
  46. Nusivera, E., Hikmat, A., & Ghani, A. R. A. (2025). Integration of Chat-GPT usage in language learning model to improve argumentation skills, complex comprehension skills, and critical thinking skills. International Journal of Learning, Teaching and Educational Research, 24(2), 375–390. [Google Scholar] [CrossRef]
  47. Oliveira, L., Tavares, C., Strzelecki, A., & Silva, M. (2025). Prompting minds: Evaluating how students perceive generative AI’s critical thinking dispositions. Electronic Journal of E-Learning, 23(2), 1–18. [Google Scholar] [CrossRef]
  48. Qin, W., Wang, W., Yang, Y., & Gui, T. (2025). Machine-assisted writing evaluation: Exploring pre-trained language models in analyzing argumentative moves. Computer Assisted Language Learning. [Google Scholar] [CrossRef]
  49. Rajcsányi-Molnár, M., Balázs, L., & András, I. (2024a). Online leadership training in higher education environment. Acta Polytechnica Hungarica, 21(3), 39–52. [Google Scholar] [CrossRef]
  50. Rajcsányi-Molnár, M., Balázs, L., András, I., & Czifra, S. (2024b, October 17–18). AVATAR—A digital tool for reducing dropout rates at a higher education institution. 2024 IEEE 7th International Conference and Workshop Óbuda on Electrical and Power Engineering (CANDO-EPE) (pp. 185–190), Budapest, Hungary. [Google Scholar] [CrossRef]
  51. Rajcsányi-Molnár, M., Balázs, L., András, I., & Czifra, S. (2024c, April 4–6). Competition as an effective motivational tool in online education. 2024 IEEE 11th International Conference on Computational Cybernetics and Cyber-Medical Systems (ICCC) (pp. 83–88), Hanoi, Vietnam. [Google Scholar] [CrossRef]
  52. Rapanta, C., Macagno, F., & Jenset, G. (2025). A close look at children’s and adolescents’ arguments: Combining a developmental, educational, and philosophical perspective. European Journal of Psychology of Education, 40(1), 3. [Google Scholar] [CrossRef]
  53. Rong, J., Terzidis, K., & Ding, J. (2024). Kids AI design thinking education for creativity development. Archives of Design Research, 37(3), 119–133. [Google Scholar] [CrossRef]
  54. Santoso, A. M., Primandiri, P. R., Susantini, E., Zubaidah, S., & Amin, M. (2022). Revealing the effect of ASICC learning model on scientific argumentation skills of low academic students. In AIP conference proceedings, proceeding of international conference on frontiers of science and technology 2021 (p. 30010). AIP Publishing. [Google Scholar] [CrossRef]
  55. Valeri, F., Nilsson, P., & Cederqvist, A.-M. (2025). Exploring students’ experience of ChatGPT in STEM education. Computers and Education: Artificial Intelligence, 8, 100360. [Google Scholar] [CrossRef]
  56. Valero Haro, A., Noroozi, O., Biemans, H., & Mulder, M. (2019). First- and second-order scaffolding of argumentation competence and domain-specific knowledge acquisition: A systematic review. Technology, Pedagogy and Education, 28(3), 329–345. [Google Scholar] [CrossRef]
  57. van Eemeren, F. H., Garssen, B., Krabbe, E. C. W., Henkemans, A. F. S., Verheij, B., & Wagemans, J. H. M. (2021). Toulmin’s model of argumentation. In F. H. van Eemeren, B. Garssen, B. Verheij, E. C. W. Krabbe, A. F. Snoeck Henkemans, & J. H. M. Wagemans (Eds.), Handbook of argumentation theory (pp. 1–47). Springer. [Google Scholar] [CrossRef]
  58. Wallander, L. (2022). Uncovering social workers’ knowledge use. Social Work and Social Sciences Review, 22(3). [Google Scholar] [CrossRef]
  59. Wambsganss, T., Janson, A., Söllner, M., Koedinger, K., & Leimeister, J. M. (2025). Improving students’ argumentation skills using dynamic machine-learning-based modeling. Information Systems Research, 36(1), 474–507. [Google Scholar] [CrossRef]
  60. Yang, R., & Pan, H. (2023). Whole-to-part argumentation instruction: An action research study aimed at improving Chinese college students’ English argumentative writing based on the Toulmin model. Sage Open, 13(4), 21582440231207738. [Google Scholar] [CrossRef]
  61. Yu, S., & Chen, X. (2023). How to justify a backing’s eligibility for a warrant: The justification of a legal interpretation in a hard case. Artificial Intelligence and Law, 31(2), 239–268. [Google Scholar] [CrossRef]
  62. Zarębski, T. (2024). Wittgenstein and Toulmin’s model of argument: The riddle explained away. Argumentation, 38(4), 435–455. [Google Scholar] [CrossRef]
  63. Zhang, J., & Browne, W. J. (2023). Exploring Chinese high school students’ performance and perceptions of scientific argumentation by understanding it as a three-component progression of competencies. Journal of Research in Science Teaching, 60(4), 847–884. [Google Scholar] [CrossRef]
  64. Zulkarnain, N. Z., & Mansor, M. (2024). Introducing artificial intelligence through classroom debates: A student-centric approach. In M. F. b. Romlie, S. H. Shaikh Ali, Z. B. Hari, & M. C. Leow (Eds.), Lecture notes in educational technology, Proceedings of the international conference on advancing and redesigning education 2023 (pp. 615–622). Springer Nature. [Google Scholar] [CrossRef]
Table 1. Examples of Coded Toulmin Components from Student Essays.
Toulmin Component | Illustrative Excerpt from a Student Essay
Claim | Artificial intelligence is a useful tool for humanity.
Data | ChatGPT can already write essays.
Warrant | Since ChatGPT can imitate human text, we can assume that AI will soon replace some human jobs.
Backing | According to scientists, the development of AI is exponential.
Qualifier | AI will probably change many aspects of our lives.
Rebuttal | Although AI is useful, it also hides dangers.
Table 2. Occurrence and Explicitness of Toulmin Elements in Student Argumentation.
Component | Presence (%) | Explicitness (%)
Claim | 100 | 92.5
Data/grounds | 89.8 | 91.9
Warrant | 56.2 | 85.4
Backing | 2.4 | 27.3
Qualifier | 7.3 | 90.9
Rebuttal | 67.5 | 74.1
N = 452. The Explicitness (%) column indicates the proportion of present components that were coded as explicit (implicit components account for the remainder).
Table 3. Frequency Distribution of Component Combinations (Argumentation Patterns) Identified in Student Argumentation.
Pattern | N | %
C | 31 | 6.9
C + D | 54 | 11.9
C + D + B | 2 | 0.4
C + D + B + R | 1 | 0.2
C + D + Q | 1 | 0.2
C + D + Q + R | 3 | 0.7
C + D + R | 91 | 20.1
C + D + W | 55 | 12.2
C + D + W + B + Q + R | 3 | 0.7
C + D + W + B + R | 5 | 1.1
C + D + W + Q | 2 | 0.4
C + D + W + Q + R | 22 | 4.9
C + D + W + R | 167 | 36.9
C + Q | 2 | 0.4
C + R | 13 | 2.9
N = 452. C = claim; D = data; W = warrant; B = backing; Q = qualifier; R = rebuttal.
Table 4. Characterization of Student Clusters from Hierarchical Cluster Analysis Based on the Average Presence Rate of Toulmin Components.
Cluster | Claim | Data/Grounds | Warrant | Backing | Qualifier | Rebuttal | N
1 (Critical Arguers) | 1.00 | 0.99 | 0.98 | 0.03 | 0.13 | 0.77 | 260
2 (Minimal Arguers) | 1.00 | 0.56 | 0.00 | 0.03 | 0.00 | 0.14 | 101
3 (Direct Rebutters) | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 1.00 | 91
N = 452. The table displays the average presence rate of the six Toulmin components within each cluster. As the components are binary (0 = not present, 1 = present), the mean values correspond to the percentage of presence (e.g., a mean of 0.98 indicates 98% presence).
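For readers wishing to reproduce an analysis of this kind, the following minimal sketch clusters binary six-component vectors hierarchically with SciPy. The toy matrix, the Ward linkage, and the three-cluster cut are illustrative assumptions, not the exact settings of our analysis.

```python
# Reproducibility sketch: hierarchical clustering of binary Toulmin vectors.
# Component order: claim, data, warrant, backing, qualifier, rebuttal.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([
    [1, 1, 1, 0, 0, 1],  # e.g., C + D + W + R
    [1, 1, 1, 0, 1, 1],  # C + D + W + Q + R
    [1, 1, 0, 0, 0, 0],  # C + D
    [1, 0, 0, 0, 0, 0],  # C only
    [1, 1, 0, 0, 0, 1],  # C + D + R
    [1, 0, 0, 0, 0, 1],  # C + R
])

Z = linkage(X, method="ward")                    # agglomerative merge tree
labels = fcluster(Z, t=3, criterion="maxclust")  # cut into three clusters
for row, lab in zip(X, labels):
    print(row, "-> cluster", lab)
```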
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
