Toward Sustainable Education: Generative AI-Powered Argument Mining in Student Writing
Abstract
1. Introduction
- RQ1. How effectively do LLMs perform in identifying argument components and argument strategies in students’ argumentative writing?
- RQ2. What are the relationships among argument components, argument strategies, and writing quality?
2. Related Work
2.1. Argument Components and Strategies in Student Essays
2.2. Automated Argumentation Analysis Techniques
2.3. Large Language Models for Sustainable Education
3. Methods
3.1. Research Design
3.2. Research Data
3.3. Coding Scheme
3.3.1. Argument Component
3.3.2. Argument Strategy
3.3.3. Coding Process and Result
3.4. Automated Classification of Argument Component and Strategy Using LLMs
3.5. Data Analysis
4. Results
4.1. Empirical Comparison of Leading LLMs in Identifying Argument Components and Strategies
4.2. The Relationships Among Argument Components, Strategies, and Writing Quality
4.3. Case Study of LLM Prediction Results
5. Discussion
5.1. Automated Classification of Argument Components and Strategies Using LLMs
5.2. Variations in Argumentation Between Different Writing Qualities
5.3. Theoretical and Practical Implications
5.4. Limitations and Future Directions
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. More Details of Research Data
| Level | Scoring Criteria |
|---|---|
| I (63–70) | Accurately comprehends the material, appropriate perspective, profound thesis, prominent central idea, substantial content, genuine emotion, rigorous structure, originality, and literary merit. |
| II (52–62) | Basically correct understanding of the material, relatively appropriate perspective, relatively profound thesis, clear central idea, relatively substantial content, authentic emotion, complete structure, and fluent language. |
| III (39–51) | Fairly comprehends the material, fairly appropriate perspective, general thesis, fairly clear central idea, fairly substantial content, fairly authentic emotion, basically complete structure, basically fluent language, with occasional grammatical errors. |
| IV (21–38) | Deviates from the material, inappropriate thesis or perspective, unclear central idea, thin content, incomplete structure, inadequate language fluency, and frequent grammatical errors. |
| V (0–20) | Completely off-topic from the material, incoherent writing, and total word count less than 400 words. |
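The score bands in the rubric above can be expressed as a small lookup. This is a minimal sketch only: the names `LEVEL_BANDS` and `score_to_level` are hypothetical, and it assumes essay scores are integers on the 0–70 scale shown in the table.

```python
# Score bands copied from the rubric table (assumed integer scores, 0-70).
LEVEL_BANDS = [
    ("I", 63, 70),
    ("II", 52, 62),
    ("III", 39, 51),
    ("IV", 21, 38),
    ("V", 0, 20),
]


def score_to_level(score: int) -> str:
    """Return the rubric level (I-V) whose band contains the score."""
    for level, low, high in LEVEL_BANDS:
        if low <= score <= high:
            return level
    raise ValueError(f"score {score} is outside the 0-70 rubric range")
```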
Appendix B. More Details of Coding Scheme
Appendix B.1. Argument Component
Appendix B.2. Argument Strategy
| Coarse | Fine | Definition | Example |
|---|---|---|---|
| Assertion | Major Claim | The theme or thesis of an article, i.e., the most significant point that the author aims to convey and argue. | Life needs a sense of ritual because it can counter mediocrity. |
| | Claim | Supporting ideas or subsidiary claims articulated around the major claim. | In my opinion, life needs a sense of ritual, but not blindly pursued. |
| | Restated Claim | A restatement or rephrasing of an already stated Major Claim or Claim, for the purpose of emphasis or clarification. | Life needs a sense of ritual, but can not blindly pursue, the continuous pursuit and progress, lively and vivid, this is life. |
| Evidence | Fact | Specific cases, generalized facts, and reliable historical events, etc. | Regrettably, in today’s society, many have fallen into the trap of exaggerating their sense of ritual to fulfill short-lived material satisfactions and the envy of others, leading to chaos in their personal lives. In pursuit of luxury, they spare no expense, ultimately trading for nothing but emptiness and stress. |
| | Anecdote | Experiences from oneself or from friends and family. | And on our own part, we may have let our nerves get in the way of our performance in the exam or put ourselves under a lot of unnecessary stress. |
| | Quotation | Citing others’ writings, research, ideas or theories. | The ground is all sixpence, there is always someone to look up to see the moon. |
| | Proverb | Sentences or phrases that are widely circulated among the populace, carrying educational value or reflecting social experience. | Without rules, nothing can be accomplished. |
| | Axiom | Recognized common sense or scientific axioms or laws. | In addition to this, the theoretical knowledge of science has become synonymous with authority in most cases, a simple example, no one would argue that 1 + 1 does not equal 2. |
| Elaboration | - | Explanation, analysis, or discussion of the assertion or evidence, providing detailed clarification or establishing the connection between arguments. | Life needs to be down-to-earth, but if you always keep your head down to earn that tiny “sixpence”, and forget to look up to appreciate the bright “moon”, just in the mediocrity of the numbness of the self, to become a zombie, what is the meaning of life? |
| Others | - | None of the above, i.e., non-argument components within argumentative essays. | May the wind guide our path. |
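The coarse-to-fine taxonomy in the table above (4 coarse-grained and 10 fine-grained component types) can be held in a simple mapping, for example to collapse fine-grained predictions to coarse-grained ones before comparing them. This is an illustrative sketch; `COMPONENT_TAXONOMY`, `FINE_TO_COARSE`, and `to_coarse` are hypothetical names, not part of the study's code.

```python
# Coarse -> fine component labels, as listed in the coding-scheme table.
COMPONENT_TAXONOMY = {
    "Assertion": ["Major Claim", "Claim", "Restated Claim"],
    "Evidence": ["Fact", "Anecdote", "Quotation", "Proverb", "Axiom"],
    "Elaboration": ["Elaboration"],  # no fine-grained subdivision
    "Others": ["Others"],            # no fine-grained subdivision
}

# Inverted lookup: fine label -> coarse label.
FINE_TO_COARSE = {
    fine: coarse
    for coarse, fines in COMPONENT_TAXONOMY.items()
    for fine in fines
}


def to_coarse(fine_label: str) -> str:
    """Map a fine-grained component label to its coarse-grained class."""
    key = fine_label.strip()
    if key not in FINE_TO_COARSE:
        raise ValueError(f"unknown component label: {fine_label!r}")
    return FINE_TO_COARSE[key]
```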
| Aspect | Label | Definition | Example |
|---|---|---|---|
| Stance-Based | Positive | A method that directly validates the correctness of a viewpoint by using elaboration or evidence consistent with the viewpoint to support it, emphasizing direct affirmation of the viewpoint. | Quotation: Nietzsche once said, “Every day that you do not dance is a betrayal of life.” → Claim: Exploring the spiritual world is an individual’s journey of self-awareness—a process of exercising subjective initiative to recognize one’s own uniqueness. |
| | Negative | A method that indirectly proves the correctness of a viewpoint through elaboration or evidence that is contrary to the viewpoint. It emphasizes the negation of opposing viewpoints, thereby achieving the purpose of the argumentation. | Quotation: As Shakespeare said, “Without surprises, life would have no luster.” → Claim: Under a certain sense of ceremony, people can become more passionate about life, helping them cherish the moment and look forward to the future. |
| | Comparative | Shorthand for positive and negative argumentation: an argumentative approach that contrasts and compares two items to highlight their differences, thereby making the conclusion more evident and persuasive. | Fact: Take the recent marathon as an example: many contestants did not finish the race, some even quitting midway. This occurred because one runner started accelerating early on, prompting others not to fall behind, a manifestation of tension. Conversely, those who maintained their composure and were undisturbed ended up securing better positions, illustrating the benefits brought by a sense of relaxation. → Claim: In real life, we need a sense of relaxation more than tension. |
| Evidence-Based | Example | An argumentation method that proves a thesis through concrete or typical examples. | Fact: The flourishing Tang Dynasty, despite its grandeur, is reduced to fleeting pages in historical records. Without ritualistic significance and the poetic brilliance of Li Bai, Du Fu, and others, how could we today appreciate the splendor of ancient Chang’an or comprehend the complex emotions embedded in phrases like ‘returning to Chang’an as one’s homeland’? → Claim: Ritualistic significance adds brilliance to mundane life, liberating individuals from mediocrity in that moment and infusing dull emotions with romantic yearning for beauty. |
| | Citation | An argumentation method that proves a thesis by using quotations or axioms. | Quotation: Nietzsche once said, “Every day that you do not dance is a betrayal of life.” → Claim: Exploring the spiritual world is an individual’s journey of self-awareness—a process of exercising subjective initiative to recognize one’s own uniqueness. |
| Discourse-Based | Metaphorical | By employing metaphorical rhetoric, familiar things are used as metaphors to argue the correctness of a viewpoint. In drawing parallels between two items with similar characteristics, the artful use of metaphors often serves to better elucidate concepts, making the argument more vivid and interesting. | Elaboration: If understanding objects is likened to baking a cake, then the method of comprehension is the mold. Those who only heed the words of authoritative experts apply others’ molds; thus, no matter how sweet the resulting cake is, it will not be in a shape that suits them. → Claim: A deep-rooted reliance on authoritative experts also reflects a more profound issue—a lack of fundamental methods for understanding things oneself. |
| | Hypothetical | Analyzing evidence from the opposite side based on hypothesis to infer its authenticity and reliability, thus robustly supporting a thesis. | Fact: The grandeur and brilliance of the Tang Dynasty, though but a fleeting mention in the annals of history, would be lost to us without the ceremonial gravitas and the exquisite verses of poets like Li Bai and Du Fu. How else could we, in the present day, glimpse the golden splendor of ancient Chang’an or grasp the myriad emotions encapsulated in the phrase “Returning to Chang’an, my homeland”? → Claim: Ceremony adds a luster to the mundane, lifting those numbed by the monotony of daily life out of their mediocrity, infusing their arid emotions with a romantic yearning for the beautiful. |
| | Restatement | For argument of the type restated claim, its relation with the target argument (major claim or claim) is defined as restatement relation. | Restated Claim: Rituals are never unnecessary or superfluous. → Major claim: In life, rituals are just so indispensable. |
| | Detail | When an argument (elaboration type) primarily aims to further explain or analyze other content, it establishes a detail relation with the corresponding argument (assertion or evidence type). | Elaboration: Nietzsche’s words actually tell us to know thyself and become thyself, which all but maps out the exploration of the spiritual world of self. → Quotation: Nietzsche once said, “Every day that you don’t dance is a failure of life.” |
| | Background | When an argument (elaboration type) primarily serves the function of introducing background, it constructs a background relation with the corresponding argument (assertion or evidence type). | Elaboration: It’s just that is such a mode of exploration really beneficial to people’s perceptions? → Claim: This process of transformation essentially reflects the expansion of instrumental rationality and people’s active abandonment of “thinking”. |
| Label | Definition | Example |
|---|---|---|
| Coherence | Describing several aspects of the same event, related events, or contrasting situations that coexist, co-occur, or oppose in meaning. These aspects can be reordered without altering the overall significance. | Fact: The idea of a commonwealth of nations, as proposed by Confucius, is also what we aspire to nowadays. → Fact: Another example is Wang Mang’s seizure of power and his promulgation of a series of new measures, which were denied at the time, but in fact he referred to Western countries for these initiatives. |
| Progression | The subsequent argument represents an advance in scope or meaning over the preceding one, intended to emphasize a deepening, expansion, or reinforcement of logic, and the order of the arguments is usually non-interchangeable. | Claim: However, the negative impacts caused by the pursuit of rituals are not few. → Claim: Only by getting rid of the solidified idea that a sense of ritual is necessary in life can they focus on the abundance of the spiritual world and climb higher. |
| Contrast | Comparison and selection are made by examining the similarities or differences between two or more things, situations, or viewpoints, emphasizing the contrast between them. | Fact: We all know that Wei Liangfu improved the Kunqu opera, leaving brilliant cultural treasures for future generations, we all know that Yuan Longping broke through a technical barrier to solve the food problem in many areas, they are not precisely in the ancients and the authority of the forefathers under the influence of their own chapter? → Fact: There are great men, naturally, there are also small people, those so-called good learning in fact, “thick ancient and thin” academic molecules, those who listen to the authority of the scientific molecules do not understand the development of adaptability, which one has made achievements? |
| Concession | An argument posits a certain situation or viewpoint, followed by a shift where the subsequent argument presents an opposing or contrasting perspective, emphasizing the content of the latter argument. | Claim: Therefore, while inheritance is important, breakthroughs and development are also indispensable. → Claim: However, should those ideas and factors that have been tested be recognized in their entirety? No. |
Appendix C. More Details of Methods
Appendix C.1. Concept of BIO

Appendix C.2. STL Prompt Data
Appendix C.3. CoT Prompt Data
Appendix C.4. MTL Prompt Data
Appendix C.5. Concept of ENA and Two-Mode Network
| Instruction | Input | Output |
|---|---|---|
| You are an experienced high school Chinese language teacher. Please analyze the following argumentative essay and determine the argument component type for each sentence. The argument component types include Major Claim, Claim, Restated Claim, Fact, Anecdote, Quotation, Proverb, Axiom, Elaboration, and Others. Multiple consecutive sentences of the same type may form a argument unit. Combine BIO tagging to indicate the boundaries of the corresponding argument units. Please note that only output the sentence number “#ID” and the corresponding B/I-component type. | [Essay Title]: Seeking Paths Through Flowers, Straight to the Depths of White Clouds [Essay Content]: #1 I believe that in the pursuit of noble aspirations and grandeur, one must pay attention to the sense of ritual, though overindulgence in this can lead to a distorted self-perception, letting clouds obscure one’s vision and covering the present with a veil of emptiness. #2 Rituals give a sense of ceremony, making ordinary actions more refined and precise, undertaken with greater care and seriousness, presenting an air of dignity and extraordinariness to others. #3 This is not inherently a derogatory term, as rituals can bring happiness to those who don’t normally engage in such practices. However, how much of this happiness stems from external validation, and how much is rooted in genuine self-discipline, requires further exploration. #4 Certainly, rituals can bring one short or lasting joy, but Camus once warned, “If noble actions are overly exaggerated, they may ultimately become an indirect yet powerful ode to sin.” …… | #1: B-Major Claim, #2: B-Elaboration, #3: I-Elaboration, #4: B-Quotation, …… |
| Instruction: You are an experienced high school Chinese language teacher. Please analyze the following argumentative essay and determine the types of relations between the argument components. The argument component types include Major Claim, Claim, Restated Claim, Fact, Anecdote, Quotation, Proverb, Axiom, Elaboration, and Others. Multiple consecutive sentences of the same type may form a argument unit. The relations between argument components can be categorized as follows: based on whether the content directly supports the assertion or indirectly strengthens it by addressing opposing views, the relations can be divided into “Positive Argumentation”, “Negative Argumentation”, and “Comparative Argumentation”. These commonly occur between the Assertion (Major Claim, Claim, and Restated Claim) and Evidence (Fact, Anecdote, Quotation, Proverb, and Axiom), Claim and Major Claim, as well as between Elaborations and Assertions. Based on the type of evidence, the relation can be classified into “Example Argumentation” and “Citation Argumentation”, which appear between Evidences and Assertions. Based on the rhetorical methods used in elaboration components, they can be categorized into “Metaphorical Argumentation” and “Hypothetical Argumentation”, which typically occur between Elaborations and Assertions. When the Elaboration further elaborates on the Assertion or Evidence, it forms a “Detail Relation”. When the Elaboration precedes other types of units to provide background information or serve structural purposes, it forms an “Background Relation”. Restatement component and its corresponding Major Claim or Claim form a “Restatement Relation”. Additionally, it is necessary to identify the logical relation between adjacent claims, including “Coherence Relation”, “Progression Relation”, “Contrast Relation”, and “Concession Relation”. When there is a clear hierarchical logical relation between the units of the same type of argument component, it is also necessary to indicate it. Note: Multiple consecutive sentences of the same type may form a argument unit, and the relations between argument units may involve multiple types. The input argumentative essay is divided into sentences and numbered. Only output the sentence numbers corresponding to the argument units and the relation types between the argument unit pairs; do not output any extra details. |
| Output: #2, #3 → #1: [“Detail Relation”] [SEP] #4 → #1: [“Positive Argumentation”, “Citation Argumentation”] [SEP] …… |
| Instruction: You are an experienced high school Chinese language teacher. Please analyze the following argumentative essay, identify its argument components, and determine the types of relations between the argument components. The argument components consist of 4 coarse-grained and 10 fine-grained types (i.e., Assertion: major claim, claim and restated claim; Evidence: fact, anecdote, quotation, proverb, and axiom; Elaboration; and Others). The relations between argument components can be categorized as follows: based on whether the content directly supports the assertion or indirectly strengthens it by addressing opposing views, the relations can be divided into “Positive Argumentation”, “Negative Argumentation”, and “Comparative Argumentation”. These commonly occur between the Assertion (Major Claim, Claim, and Restated Claim) and Evidence (Fact, Anecdote, Quotation, Proverb, and Axiom), Claim and Major Claim, as well as between Elaborations and Assertions. Based on the type of evidence, the relation can be classified into “Example Argumentation” and “Citation Argumentation”, which appear between Evidences and Assertions. Based on the rhetorical methods used in elaboration components, they can be categorized into “Metaphorical Argumentation” and “Hypothetical Argumentation”, which typically occur between Elaborations and Assertions. When the Elaboration further elaborates on the Assertion or Evidence, it forms a “Detail Relation”. When the Elaboration precedes other types of units to provide background information or serve structural purposes, it forms an “Background Relation”. Restatement component and its corresponding Major Claim or Claim form a “Restatement Relation”. Additionally, it is necessary to identify the logical relation between adjacent claims, including “Coherence Relation”, “Progression Relation”, “Contrast Relation”, and “Concession Relation”. When there is a clear hierarchical logical relation between the units of the same type of argument component, it is also necessary to indicate it. Note: Multiple consecutive sentences of the same type may form a argument unit, and the relations between argument units may involve multiple types. Please follow the two steps below to complete the argumentative essay analysis: Step 1: Argument component detection and classification. Combine BIO tagging to indicate the boundaries of the corresponding argument units. Please note that only output the sentence number “#ID” and the corresponding B/I-component type. Step 2: Relation detection and classification. Identify and classify the relations between the argument components. The input argumentative essay is divided into sentences and numbered. Only output the sentence numbers corresponding to the argument units and the relation types between the argument unit pairs; do not output any extra details. |
| Output: Step 1: Argument component detection and classification Result: #1: B-Major Claim, #2: B-Elaboration, #3: I-Elaboration, #4: B-Quotation, …… Step 2: Relation detection and classification Result: #2, #3 → #1: [“Detail Relation”] [SEP] #4 → #1: [“Positive Argumentation”, “Citation Argumentation”] [SEP] …… |
| Instruction | Output |
|---|---|
| You are an experienced high school Chinese language teacher. Please analyze the following argumentative essay and determine the argument component type for each sentence. Task1: Argument Component Prediction. The argument component types include Major Claim, Claim, Restated Claim, Fact, Anecdote, Quotation, Proverb, Axiom, Elaboration, and Others. Multiple consecutive sentences of the same type may form a argument unit. Combine BIO tagging to indicate the boundaries of the corresponding argument units. Please note that only output the sentence number “#ID” and the corresponding B/I-component type. | Task 1: Argument Component Prediction Results. #1: B-Major Claim, #2: B-Elaboration, #3: I-Elaboration, #4: B-Quotation, …… |
| You are an experienced high school Chinese language teacher. Please analyze the following argumentative essay and determine the types of relations between the argument components. Task2: Argument Strategy Prediction. The argument component types include Major Claim, Claim, Restated Claim, Fact, Anecdote, Quotation, Proverb, Axiom, Elaboration, and Others. Multiple consecutive sentences of the same type may form a argument unit. The relations between argument components can be categorized as follows: based on whether the content directly supports the assertion or indirectly strengthens it by addressing opposing views, the relations can be divided into “Positive Argumentation”, “Negative Argumentation”, and “Comparative Argumentation”. These commonly occur between the Assertion (Major Claim, Claim, and Restated Claim) and Evidence (Fact, Anecdote, Quotation, Proverb, and Axiom), Claim and Major Claim, as well as between Elaborations and Assertions. Based on the type of evidence, the relation can be classified into “Example Argumentation” and “Citation Argumentation”, which appear between Evidences and Assertions. Based on the rhetorical methods used in elaboration components, they can be categorized into “Metaphorical Argumentation” and “Hypothetical Argumentation”, which typically occur between Elaborations and Assertions. When the Elaboration further elaborates on the Assertion or Evidence, it forms a “Detail Relation”. When the Elaboration precedes other types of units to provide background information or serve structural purposes, it forms an “Background Relation”. Restatement component and its corresponding Major Claim or Claim form a “Restatement Relation”. Additionally, it is necessary to identify the logical relation between adjacent claims, including “Coherence Relation”, “Progression Relation”, “Contrast Relation”, and “Concession Relation”. When there is a clear hierarchical logical relation between the units of the same type of argument component, it is also necessary to indicate it. Note: Multiple consecutive sentences of the same type may form a argument unit, and the relations between argument units may involve multiple types. The input argumentative essay is divided into sentences and numbered. Only output the sentence numbers corresponding to the argument units and the relation types between the argument unit pairs; do not output any extra details. | Task2: Argument Strategy Prediction Results. #2, #3 → #1: [“Detail Relation”][SEP] #4 → #1: [“Positive Argumentation”, “Citation Argumentation”] [SEP] …… |
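The output formats requested in the prompts above (BIO tags such as `#2: B-Elaboration, #3: I-Elaboration`, and relation lists separated by `[SEP]`) lend themselves to simple post-processing. The following is a rough sketch of how such outputs might be decoded, not the authors' pipeline: `parse_bio` and `parse_relations` are hypothetical helper names, and the regular expressions assume the model follows the example formats shown in the tables, with each task's output passed in separately.

```python
import re


def parse_bio(output: str):
    """Group '#ID: B/I-Label' tags into (label, [sentence ids]) units."""
    units = []
    for sid, tag, label in re.findall(r"#(\d+):\s*([BI])-([^,#]+)", output):
        label = label.strip()
        # An I- tag extends the previous unit only when the label matches;
        # a B- tag (or a mismatched I- tag) starts a new unit.
        if tag == "I" and units and units[-1][0] == label:
            units[-1][1].append(int(sid))
        else:
            units.append((label, [int(sid)]))
    return units


def parse_relations(output: str):
    """Parse '#2, #3 → #1: [“Detail Relation”] [SEP] ...' into tuples of
    (source sentence ids, target sentence ids, relation labels)."""
    relations = []
    for chunk in output.split("[SEP]"):
        match = re.match(r"\s*(.*?)→(.*?):\s*\[(.*)\]", chunk)
        if match is None:
            continue  # skip fragments such as a trailing "……" placeholder
        sources = [int(i) for i in re.findall(r"#(\d+)", match.group(1))]
        targets = [int(i) for i in re.findall(r"#(\d+)", match.group(2))]
        # Accept both curly and straight double quotes around labels.
        labels = re.findall(r"[“\"]([^”\"]+)[”\"]", match.group(3))
        relations.append((sources, targets, labels))
    return relations
```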
- Lu, Y. Comparative analysis of teaching methods: A cross-cultural study of Chinese and American educational systems. Trans. Soc. Sci. Educ. Humanit. Res. 2024, 4, 59–64. [Google Scholar] [CrossRef]
- Jurišević, N.; Nikolić, N.; Nemś, A.; Gordić, D.; Rakić, N.; Končalović, D.; Kocsis, D. Bridging LLMs, Education, and Sustainability: Guiding Students in Local Community Initiatives. Sustainability 2025, 17, 10148. [Google Scholar] [CrossRef]
- Park, B.; Seo, K. Assessing critical thinking through a multi-agent llm-based debate chatbot. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 26 April–1 May 2025; pp. 1–13. [Google Scholar]
- Reid, J.W.; Parrish, J.; Syed, S.B.; Couch, B. Finding the connections: A scoping review of epistemic network analysis in science education. J. Sci. Educ. Technol. 2025, 34, 937–955. [Google Scholar] [CrossRef]
- Singh, S.S.; Muhuri, S.; Mishra, S.; Srivastava, D.; Shakya, H.K.; Kumar, N. Social Network Analysis: A Survey on Process, Tools, and Application. Acm Comput. Surv. 2024, 56, 1–39. [Google Scholar]
- Mann, W.C.; Thompson, S.A. Rhetorical structure theory: Toward a functional theory of text organization. Text-Interdiscip. J. Study Discourse 1988, 8, 243–281. [Google Scholar] [CrossRef]
- Walton, D. Using argumentation schemes to find motives and intentions of a rational agent. Argum. Comput. 2020, 10, 233–275. [Google Scholar] [CrossRef]
- Kennard, N.N.; O’Gorman, T.; Das, R.; Sharma, A.; Bagchi, C.; Clinton, M.; Yelugam, P.K.; Zamani, H.; McCallum, A. DISAPERE: A dataset for discourse structure in peer review discussions. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; Association for Computational Linguistics: Seattle, WA, USA, 2022; pp. 1234–1249. [Google Scholar]
- Cheng, L.; Bing, L.; He, R.; Yu, Q.; Zhang, Y.; Si, L. IAM: A comprehensive and large-scale dataset for integrated argument mining tasks. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers); Association for Computational Linguistics: Dublin, Ireland, 2022; pp. 2277–2287. [Google Scholar]
- Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. In Proceedings of the International Conference on Learning Representations, Virtual Conference, 25–29 April 2022. [Google Scholar]
- Bahri, Y.; Dyer, E.; Kaplan, J.; Lee, J.; Sharma, U. Explaining neural scaling laws. Proc. Natl. Acad. Sci. USA 2024, 121, e2311878121. [Google Scholar] [CrossRef]
- Ding, Y.; Kashefi, O.; Somasundaran, S.; Horbach, A. When argumentation meets cohesion: Enhancing automatic feedback in student writing. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italy, 20–25 May 2024; pp. 17513–17524. [Google Scholar]
- Chen, S.; Zhang, Y.; Yang, Q. Multi-task learning in natural language processing: An overview. Acm Comput. Surv. 2024, 56, 1–32. [Google Scholar] [CrossRef]
- Elmoazen, R.; Saqr, M.; Tedre, M.; Hirsto, L. A systematic literature review of empirical research on epistemic network analysis in education. IEEE Access 2022, 10, 17330–17348. [Google Scholar] [CrossRef]
- Wu, Y.; Lan, W.; Fan, X.; Fang, K. Bipartite network influence analysis of a two-mode network. J. Econom. 2024, 239, 105562. [Google Scholar] [CrossRef]
- Zhang, Y.; Liu, Y.; Yu, R.; Zuo, J.; Dong, N. Managing the high capital cost of prefabricated construction through stakeholder collaboration: A two-mode network analysis. Eng. Constr. Archit. Manag. 2025, 32, 556–577. [Google Scholar] [CrossRef]









| Model | Method | Micro-Precision | Micro-Recall | Micro-F1 | Macro-F1 |
|---|---|---|---|---|---|
| DeepSeek | Zero-shot | | | | |
| DeepSeek | STL | | | | |
| DeepSeek | CoT | | | | |
| DeepSeek | MTL | | | | |
| Qwen | Zero-shot | | | | |
| Qwen | STL | | | | |
| Qwen | CoT | | | | |
| Qwen | MTL | | | | |
| ChatGLM | Zero-shot | | | | |
| ChatGLM | STL | | | | |
| ChatGLM | CoT | | | | |
| ChatGLM | MTL | | | | |

| Model | Method | Micro-Precision | Micro-Recall | Chunk-F1 | Sentence-F1 |
|---|---|---|---|---|---|
| DeepSeek | Zero-shot | | | | |
| DeepSeek | STL | | | | |
| DeepSeek | CoT | | | | |
| DeepSeek | MTL | | | | |
| Qwen | Zero-shot | | | | |
| Qwen | STL | | | | |
| Qwen | CoT | | | | |
| Qwen | MTL | | | | |
| ChatGLM | Zero-shot | | | | |
| ChatGLM | STL | | | | |
| ChatGLM | CoT | | | | |
| ChatGLM | MTL | | | | |

| Gold | STL | CoT | MTL |
|---|---|---|---|
| Argument Component Prediction | | | |
| #1 B-Elaboration | #1 B-Quotation | ✓ | ✓ |
| #2 I-Elaboration | #2 I-Quotation | ✓ | ✓ |
| #3 B-Claim | ✓ | ✓ | ✓ |
| #10 B-Claim | #10 B-Major Claim | ✓ | ✓ |
| #13 B-Fact | ✓ | ✓ | ✓ |
| #14 B-Elaboration | ✓ | ✓ | ✓ |
| #15 I-Elaboration | ✓ | ✓ | ✓ |
| #16 B-Claim | ✓ | ✓ | ✓ |
| Argument Strategy Prediction | | | |
| #1, #2 -> #3: ['Background'] | ✓ | ✓ | ✓ |
| #14, #15 -> #13: ['Detail'] | #15 -> #14: ['Detail'] | #14, #15 -> #16: ['Background'] | ✓ |
| #3 -> #10: ['Coherence'] | None | ✓ | #3 -> #10: ['Progression'] |
| #10 -> #16: ['Progression'] | None | #10 -> #16: ['Coherence'] | ✓ |
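
The case-study table above can be read mechanically. The sketch below groups the gold B-/I- tags into multi-sentence chunks and counts how many gold strategy relations each method reproduces exactly. It rests on assumptions that are not confirmed by the source: a ✓ is taken to mean the prediction equals the gold entry, a relation counts as correct only if its sources, target, and label all match, and the helper names `bio_to_chunks` and `parse_relation` are hypothetical.

```python
import re


def bio_to_chunks(tags):
    """Group {sentence_id: 'B-Label' | 'I-Label'} into (label, [sentence_ids]) chunks."""
    chunks = []
    for sid in sorted(tags):
        prefix, _, label = tags[sid].partition("-")
        continues = (
            prefix == "I"
            and chunks
            and chunks[-1][0] == label          # same label as the open chunk
            and chunks[-1][1][-1] == sid - 1    # directly follows the previous sentence
        )
        if continues:
            chunks[-1][1].append(sid)
        else:
            chunks.append((label, [sid]))
    return chunks


RELATION = re.compile(r"((?:#\d+\s*,\s*)*#\d+)\s*->\s*#(\d+):\s*\['([^']+)'\]")


def parse_relation(text):
    """Parse "#1, #2 -> #3: ['Background']" into (sources, target, label)."""
    match = RELATION.fullmatch(text.replace("\u2019", "'").strip())
    if match is None:
        raise ValueError(f"unrecognised relation: {text!r}")
    sources = frozenset(int(n) for n in re.findall(r"\d+", match.group(1)))
    return sources, int(match.group(2)), match.group(3)


# Gold component tags, transcribed from the table.
gold_tags = {1: "B-Elaboration", 2: "I-Elaboration", 3: "B-Claim", 10: "B-Claim",
             13: "B-Fact", 14: "B-Elaboration", 15: "I-Elaboration", 16: "B-Claim"}
gold_chunks = bio_to_chunks(gold_tags)

# Strategy relations, transcribed from the table (✓ expanded to the gold entry).
gold_relations = ["#1, #2 -> #3: ['Background']", "#14, #15 -> #13: ['Detail']",
                  "#3 -> #10: ['Coherence']", "#10 -> #16: ['Progression']"]
predicted = {
    "STL": ["#1, #2 -> #3: ['Background']", "#15 -> #14: ['Detail']"],
    "CoT": ["#1, #2 -> #3: ['Background']", "#14, #15 -> #16: ['Background']",
            "#3 -> #10: ['Coherence']", "#10 -> #16: ['Coherence']"],
    "MTL": ["#1, #2 -> #3: ['Background']", "#14, #15 -> #13: ['Detail']",
            "#3 -> #10: ['Progression']", "#10 -> #16: ['Progression']"],
}
gold_set = {parse_relation(r) for r in gold_relations}
exact_matches = {
    method: len(gold_set & {parse_relation(r) for r in relations})
    for method, relations in predicted.items()
}
```

Under these assumptions, `exact_matches` gives 1, 2, and 3 of the four gold relations for STL, CoT, and MTL respectively, which mirrors the pattern of ✓ marks in the strategy rows. A looser criterion, such as crediting a correct source-target pair with a wrong label, would change these counts.
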

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Ren, Y.; Zhang, N.; Li, X.; Zhang, Y.; Chen, Y.; Lan, M. Toward Sustainable Education: Generative AI-Powered Argument Mining in Student Writing. Sustainability 2026, 18, 3338. https://doi.org/10.3390/su18073338

