1. Introduction
Gen-AI tools built on large language models (LLMs), led by ChatGPT, have rapidly emerged over the last couple of years [1]. These tools present both opportunities and challenges for the entire academic community, including students and teachers. For students, the main opportunities are improving their ability to write high-quality reports and becoming more efficient at writing tasks. The challenge is to ensure that students develop their own writing capabilities rather than simply relying on a tool, and that their writing reflects their own voices. A secondary challenge is using Gen-AI resources ethically.
For teachers, the challenge is to develop new ways of teaching while preserving academic integrity. Teachers need to be aware of the capabilities of Gen-AI tools and learn to create assignments that develop students' skills. There have been some initial methods for recognising when writing comes from a Gen-AI tool, such as [2], but no definitive method exists for determining whether a student's submitted work involved writing by a Gen-AI tool. Teachers also need to be able to advise students on how to use Gen-AI tools effectively. All stakeholders, including students, teachers and educational institutions, need to ensure that the use of Gen-AI tools is transparent and governed by clear and reasonable policies. In general, academics should teach communication skills as well as discipline-specific skills, and students need to learn research methods and writing.
If used well, Gen-AI tools can help increase the quality of students' work and improve the efficiency of assessments, enabling students to complete more complex tasks in less time. If used poorly, Gen-AI tools can lead to an explosion of mediocrity and a loss of skills for students [3]. Many papers have discussed the strengths and limitations of ChatGPT and other Gen-AI tools; examples are [1,2,4]. In this paper, we advocate learning how to use such systems to improve student performance and learning.
The primary objective of this paper is to help international students with academic writing, though we believe that the guidance Gen-AI tools can provide on nuances of expression is also valuable for native English-speaking students. We use the term international student to denote students whose native language is not English. Such students are sometimes called non-native English-speaking students or ESL (English as a Second Language) students. For simplicity, we use the term international students throughout the paper.
Writing is a key communication skill needed by students who are embarking on academic study and/or tackling university life. It is also important for professional careers [5]. Speaking is another essential communication skill, but discussing it is beyond the scope of this paper. We believe that Gen-AI tools offer valuable assistance to international students with writing tasks, because they can quickly generate first drafts and help students overcome the intimidating Blank Page Syndrome [6]. These drafts can provide a starting point or a skeleton for their work.
Students may benefit from using Gen-AI tools for the following reasons. International students in academic settings often face significant challenges (especially in project-based subjects) in both comprehending and expressing complex ideas [7]. Before the advent of generative AI models like ChatGPT, international students relied on tools such as Google Translate and DeepL to help them navigate complex academic content. Because people feel more comfortable processing information in their native language, a common approach is to translate reading materials or assignment specifications into the native language, so that all the thinking and written work is carried out in that language; the finished work is then translated back into English for submission. However, traditional translation tools have limitations: they typically perform mechanical translations based solely on dictionary definitions, failing to account for abstract concepts or contextual nuances [8]. This may result in students struggling not only to understand assignment requirements but also to articulate their own ideas clearly and accurately.
With generative AI tools, international students now have access to a more dynamic and efficient way of engaging with academic content. Instead of relying solely on traditional translation tools, students can use generative AI to first summarise complex texts, helping them grasp the core ideas more quickly. Through interactive conversations with an AI tool, students can clarify points of confusion, such as contextual ambiguities or highly specialised academic terminology and vocabulary, which can be particularly challenging for international students. As a result, students can save time otherwise spent searching the web or asking questions on forums that are unfriendly to beginners, and gain a deeper understanding of academic materials. In this way, the Gen-AI tool acts not only as a means of summarising information but also as a sophisticated search engine, capable of offering tailored explanations and context-driven insights.
Unlike traditional search engines, generative AI models potentially offer ongoing, context-aware conversations, retaining the context of previous interactions to create a more coherent and iterative learning experience. This is particularly valuable when students encounter complex, technical topics, as the AI can provide detailed explanations and simplify difficult concepts. Essentially, generative AI functions as a personalised, interactive learning assistant, helping students bridge the gap between understanding abstract academic material and expressing their ideas effectively. It significantly reduces the time spent searching for answers online, where relevant information may not be immediately available or may require waiting for others to respond, and it delivers customised, real-time answers to questions.
This paper is organised as follows. The next section discusses our approach to helping international students find their voice. We then describe the exercises we developed.
Section 4 contains a discussion where we look at previous work on helping international students with writing. We include recommendations for students on how to use Gen-AI tools more effectively. The final section concludes the paper.
2. Method
There are three interconnected research questions underlying this paper:
What are the actual capabilities of Gen-AI tools, setting aside the common hype around them? This is especially relevant given the variety of Gen-AI tools; with the mixture of free and paid tools, there may also be financial factors determining which Gen-AI tools students should use.
How can educators guide students to use Gen-AI tools most effectively?
How can educators and students share insights about using Gen-AI tools?
Overall, this paper aims to explore effective strategies for helping international students utilise current AI technologies to improve their academic writing while maintaining their unique voice and adhering to academic integrity standards.
This paper takes a non-standard methodological approach to answering the research questions; it is primarily qualitative. A series of writing exercises was undertaken by students, whose experiences were reported and then analysed for insight. No thematic analysis was undertaken, as the goal was to share insights rather than formally analyse the reported experiences.
There was also a degree of opportunism involved: performing one writing exercise often suggested a new one, and comparisons were hard to make as the tools kept evolving. Discussing how to use Gen-AI tools is a moving target, as there are continual efforts to improve the quality of their output; indeed, Gen-AI tools have improved significantly since we started our research in the second half of 2024. We believe the insights gleaned are valuable more broadly, and it is a matter for future research to repeat the exercises we developed with a wider cohort of students. When we started, we were unsure what we would discover. Because of what we learned, we persisted, and we describe our results as a series of insights, as laid out in the Discussion section.
We addressed the research questions by conducting a series of writing exercises using Gen-AI tools for specific writing tasks. The results were discussed and analysed over a period of four months; meetings were initially weekly but became more sporadic after the teaching semester finished. Specific insights are shared in Section 3.
More discussion of the validity of our approach is given in Section 4.
We repeated two of the exercises with a cohort of Chinese computer science students in the final year of an undergraduate computing degree who were on a study tour hosted by the first author. The insights from the original exercises were useful in designing exercises for the visiting students, and the insights from the writing exercises for the two cohorts of students were remarkably similar. Due to the nature of the experiences, it is appropriate to report the results qualitatively. It would be possible to perform a quantitative study with a precise question, but such a study is beyond the scope of the current paper. We believe it is important to share qualitative insights in the fast-moving landscape of Gen-AI tools.
Let us explain the background of this paper. The rationale for the project is rooted in the academic experience of the first author. He has read numerous reports from international students over the past thirty years, and it is often a struggle to understand what is being said. More importantly, from a teaching perspective, it is difficult to determine whether a student understands a concept but lacks the ability to express it in English, or misunderstands it. This is consistent with many studies on teaching English to international students [7,9,10,11]. As grammar checkers such as Grammarly have improved, some difficulties of English expression can be mitigated. Students should be expected to write effectively; indeed, good communication skills are a requirement for most professional jobs.
The emergence of Gen-AI tools has changed the experience of creating and assessing reports. ChatGPT was the first Gen-AI tool used extensively for writing reports. Other tools that can also be used for writing tasks, such as Claude and Llama 3, quickly followed. It makes sense for international students to use such tools to write reports. However, this creates a new problem for teachers to navigate, namely determining whether the student has independently completed the work.
In an undergraduate capstone project subject (IT Project, University of Melbourne; available at: https://handbook.unimelb.edu.au/2023/subjects/comp30022, accessed on 1 March 2025) taught in 2023, the first author created an exercise where students needed to reflect on their project experience. They were explicitly allowed to use Gen-AI tools as long as they were transparent about their use. He read over 300 reflection reports, most of which were from non-native English speakers. Reading the reports was insightful, and the experience directly influenced this paper.
Overall, the reports written with the assistance of Gen-AI tools were easier and quicker to read. However, there were two connected major problems. The first was that the reports largely sounded the same, and it was hard to distil students' unique experiences, a phenomenon that is increasingly recognised. Prakhar Mehrotra's article "ChatGPT and the Magnet of Mediocrity" [3] discusses how AI-generated content, such as that produced by ChatGPT, tends to be derivative and mediocre. Mehrotra illustrates this by stating that while ChatGPT can generate competent blog posts or articles, it lacks the originality and creative flair that human writers can bring to their work, such as in the creation of new narrative structures or interesting ideas. The second problem concerns the authenticity of the experience being reported. There were at least ten reports that compared the experience of leading a software team to being the conductor of an orchestra. The gist of the comparison was that it was necessary to get diverse instruments/people to work together and coordinate them effectively. Clearly, ten people did not come up with the analogy independently. Furthermore, why was the analogy even appropriate in the first place? Did the students who used it have any experience of being in an orchestra? Could the reader relate to such an experience? On reflection, it was not an appropriate analogy, and it did not reflect the student's voice. There were also several analogies comparing leading a software team to being the captain of a boat on the seas; the same issue of authenticity exists.
In reading the reflections, the first author wanted to hear students’ voices, which were largely getting lost in the use of Gen-AI tools. He proposed a research project topic on how students can find their voice when writing reflective reports using Gen-AI tools. The second and third authors volunteered to undertake the project and conducted the initial exercises, the results of which are described here. The fourth author joined the project for later stages of analysis. Consequently, the exercises are skewed towards computing students, but we believe the insights are more generally applicable.
The second, third and fourth authors are international students studying computing at the University of Melbourne who wish to improve both their writing skills and their use of Gen-AI tools. Communication can be an issue for international students [12]. While the focus is on exercises that are meaningful for IT students, owing to the experience and expertise of the authors, we believe that the insights shared are more broadly applicable.
3. Results
As the project began, we undertook weekly writing exercises; an exact list of the exercises is available from the authors. The weekly meetings were productive, and discussion of the outcomes was insightful for all the authors. These insights are shared in this section. As mentioned in Section 2, our study methodology is non-standard, and there is room for a more systematic and controlled study on a larger cohort of students in the future. However, we believe it is nevertheless valuable to report our current results while Gen-AI tools continue to evolve.
Our results are presented as a series of analyses of writing exercises. For each exercise, we state the research question to which it pertains, describe what was produced, and then present the insights gained by the authors while discussing the writing. The research team effectively followed the metacognitive strategies of monitoring and evaluation [13,14], as discussed further in Section 4.
3.1. Exercise 1: Writing Personal Reflections with AI Assistance
The first exercise we undertook set the scene. A simplified reflective report was proposed: students were asked to write 100 words about what they had learned during their undergraduate studies. The participants were asked to take three different approaches to writing the simplified reflection:
Independent writing: Participants independently authored their responses without external assistance.
AI-assisted writing: Participants used ChatGPT to polish the responses they had written independently in approach 1.
AI-generated content: Participants relied on ChatGPT to generate the responses entirely based on prompts provided.
This exercise pertains to two research questions: first, how can educators guide students to use Gen-AI tools most effectively? Second, how can educators and students share insights about using Gen-AI tools?
The exercise was conducted by two of the authors and some of their friends; twelve students overall undertook the exercise, and they were given a week to respond. ChatGPT was used for the exercise due to its high popularity in the student community: as of 2024, it was the mainstream Gen-AI tool with the most paid monthly active users on the market.
The three versions of the simplified reflective report from each participant were discussed at our second weekly meeting. The student authors gave a qualitative summary of the changes introduced by ChatGPT. The consensus was that the ChatGPT version sounded more "formal" than the initial response, with the sense of formality arising from both the choice of words and the sentence structure. All of the changes made by ChatGPT were discussed, considering whether they accurately represented the intention of the author. The consensus was that ChatGPT often struggled to capture subtle semantic nuances, resulting in deviations from the ideas the author aimed to express. All authors agreed that discussing these changes together was a valuable learning exercise.
An interesting example of nuance that was extensively discussed was triggered by the following response (approach 1): "The first thing that I learned is to be brave". After being refined by ChatGPT (approach 2), it became the following: "First, I've learned the importance of bravery". The revised version, while grammatically correct, altered the original tone: the phrase "to be brave" is more personal and direct, whereas the refined version sounds more abstract and detached, which is less suitable for a personal reflection. Interestingly, this distinction was not initially apparent to the student authors.
Another issue we identified was a tendency towards over-boasting. For example, when students used modest and humble expressions such as "acquired the skill" or "capable of doing", the Gen-AI tools often replaced these terms with more exaggerated ones such as "comprehensively or swiftly extract vital information". The AI-generated responses described outcomes in a more confident and affirmative manner, which often misrepresented the student's original intention. Specific examples include the following:
"Know" replaced by "master".
"Use" replaced by "leverage".
"More" replaced by "significantly".
"Are prepared" replaced by "well prepared".
"Capable of doing" replaced by "have a good command of".
"Have developed" replaced by "gain a deep understanding of".
"Have learned" replaced by "have gained proficiency".
This overconfident language can distort students' intended meaning and tone, jeopardising their academic integrity in academic writing, where modesty and accuracy are priorities.
We were able to extract more significant findings from the refined responses (approach 3). Participants may "lose their voice" after their responses are refined. Firstly, international students may have a limited ability to judge the semantic and contextual appropriateness of word choices in English, especially when alternative words are suggested by LLMs. For instance, one response mentioned the participant's ability to utilise academic resources provided by the university to improve their academic performance. In the refined response, the word "utilise" was replaced with "leverage". It may not be challenging for a native English speaker to recognise that "leverage" is inappropriate here, based on years of experience with the language, but many international students may fail to realise the implications of this word choice. Additionally, the fact that Gen-AI tools tend to write in an overconfident tone and choose more advanced words (regardless of appropriateness) may create an illusion for international students that Gen-AI tools are expert writers, leading to excessive trust in their abilities.
Interestingly, as we were writing this paper, the grammar checker provided by the LaTeX editor we used, Overleaf, became a focus of attention. We discussed its suggestions and realised that we did not want to accept them automatically, because they could alter the tone; it was not solely an issue of correcting grammar.
An interesting case in point was the suggestion to change the phrase "excel in" to "excel at". This prompted an interesting discussion and an Internet search as to which was the correct usage. There was no clear answer, and the international students were exposed to a subtlety of the English language that would not have been apparent if the change had been accepted automatically. Indeed, discussions about points of language were consistently insightful, and such discussions do not arise in standard feedback on assignments.
Another change concerned whether to use collected responses or responses collected. To explain the subtle difference between these phrases: "collected responses" is a noun phrase in which "collected" modifies "responses", suggesting responses that have been gathered or assembled; "responses collected" places the past participle after the noun, as in a reduced relative clause, and often appears in constructions like "the number of responses collected" or "responses collected during the survey". Both are grammatically correct but serve slightly different functions in a sentence: "Collected responses were analysed" and "The responses collected showed interesting trends" are both appropriate. The above explanation was edited using Claude 3.5 Sonnet, which then asked, "Would you like me to provide some example sentences to illustrate the difference in usage?".
Here is another interesting little subtlety. For the posting "Definitely something <person1> and <person2> need to refer to", the grammar checker suggested adding a comma after "Definitely": "Definitely, something <person1> and <person2> need to refer to". The "corrected" version changes the tone: in the former sentence, "Definitely" describes the degree of certainty, while in the latter it reads as a standalone exclamation. However, it would be hard for an international student not to accept the change, as they may not appreciate the subtle distinction.
We conducted a similar exercise six months later with a group of Chinese students visiting on a study tour. The students were asked to write 100 words about their impressions of Melbourne, and then to refine their writing with a Gen-AI tool. We compared their initial drafts to the refined versions. While grammar was improved and spelling mistakes were eliminated, the Gen-AI tools changed the language, making it more forceful and adding impressions that the students may not have had.
All the students benefited from the discussion and from seeing the consistently over-boastful writing produced by the Gen-AI tools. In addition, the students noticed the use of unfamiliar words such as "breathtaking", which appeared in multiple refinements. One notable refinement added the word "mesmerising", a word whose meaning none of the students knew. We explained that it is not good practice to use words one does not know.
3.2. Exercise 2: Comparing Gen-AI Tools
The landscape of Gen-AI tools is diverse, with many options such as ChatGPT, Claude, Meta AI, etc., each possessing unique strengths and limitations. A list of the Gen-AI tools and the corresponding LLMs used in this research is given in Appendix A.
Heartened by the discussions stemming from the first exercise, we undertook more exercises. One of the respondents in the first exercise had used a different Gen-AI tool, and the modifications it made were different. We were all curious about how effective the various Gen-AI tools were, and the research question we were considering was as follows: "What are the capabilities of Gen-AI tools?" Detailed information about the LLMs we used is provided in Table A1.
Based on the responses to the first exercise, the second exercise sought to explore the qualities (including credibility, semantic consistency and answer integrity) of text generated by various Gen-AI tools. The question for which a response was requested was as follows: "Why do I need to understand and learn artificial intelligence, and why is artificial intelligence important?" Participants in the exercise were asked to write a 200-word response to this question.
We experimented with multiple Gen-AI tools. We collected responses to the question and then refined them using three different LLMs (ChatGPT-4, Claude 3.5 Sonnet, and Llama 3) with the same prompt: "refine it". A complete set of the responses is available on request.
By comparing the results generated by different LLMs with the same prompt, we found that performance and output quality could be significantly affected by technical limitations, such as token limits and the maximum context length, also referred to as the context window. For instance, ChatGPT-4's maximum token limit is 4096 tokens, while Claude 3.5's is 8192 tokens and Llama 3 supports up to 128k tokens. This difference can significantly affect performance when dealing with larger contexts. ChatGPT-4 tended to omit instructions or generate truncated responses when longer prompts were provided, whereas Claude 3.5 and Llama 3 128k handled the same prompts better because of their larger token capacities. In our exercises, Claude 3.5 Sonnet stood out for its ability to conduct literature reviews and follow complex instructions due to its strong contextual handling capabilities. Claude 3.5's token limit and context length far exceed those of GPT-4, allowing it to process and output long texts without losing critical information. This became particularly evident during long-text testing, where GPT-4 occasionally struggled to fully read and process extended inputs, whereas Claude 3.5 was able to handle the entire context effectively.
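To make the practical effect of these limits concrete, the following minimal Python sketch (assuming the tiktoken library, and using the figures quoted above purely as illustrative limits) shows how a student might check whether a long prompt fits within a model's token budget before submitting it.

# Minimal sketch: checking whether a prompt fits within an assumed token budget.
# The limits below simply echo the figures quoted in the text and are illustrative only.
import tiktoken

ASSUMED_TOKEN_LIMITS = {
    "chatgpt-4": 4096,
    "claude-3.5-sonnet": 8192,
    "llama-3-128k": 128_000,
}

def count_tokens(text: str) -> int:
    """Approximate token count using the cl100k_base encoding."""
    encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text))

def fits_budget(prompt: str, model: str, reserve_for_output: int = 1024) -> bool:
    """Return True if the prompt leaves room for a reply within the assumed limit."""
    return count_tokens(prompt) + reserve_for_output <= ASSUMED_TOKEN_LIMITS[model]

long_prompt = "Refine the following reflection: " + "... " * 2000
for model in ASSUMED_TOKEN_LIMITS:
    print(model, fits_budget(long_prompt, model))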
ChatGPT-4 is often preferred for its balanced performance across a wide range of academic tasks, benefiting from its fine-tuned contextual awareness and general-purpose adaptability. During an internship, one of the authors (Ian) experimented with connecting these three models to an agent for executing complex instruction sets. The results showed that GPT-4 outperformed the other models in the precision with which it executed specific instructions, making it the most suitable choice for tasks requiring exact follow-through.
Meanwhile, Llama 3, though newer, is optimised for token efficiency, allowing it to handle longer-form writing tasks with less computational overhead. According to a study by Novita AI, Llama 3 70B can be up to 50 times cheaper and 10 times faster than GPT-4 when used through cloud API providers [15]. This makes it particularly advantageous for tasks that require processing large amounts of data or generating long-form content without excessive computational cost. These differences in performance, context sensitivity and task specialisation make it harder for students to choose the right model for their specific writing needs, and effectively exploiting each model's strengths while compensating for its limitations further complicates matching the right tool to the task at hand.
During the discussion, the first author shared a response he had received from ChatGPT 3.5 to the following prompt: “Write a 100-word description of the fire at my father’s paint factory when I was 9”.
The response was the following: “When you were nine, a blazing inferno engulfed your father’s paint factory, painting the sky with hues of fiery orange and billowing smoke. Flames danced voraciously, consuming the structure, while the acrid scent of burning chemicals filled the air. Panic gripped the scene as firefighters battled tirelessly, their efforts highlighted by flickering red and blue lights. Amidst the chaos, memories of your father’s hard work and dedication mingled with fear for his safety. Despite the devastation, the community rallied, offering support and solidarity. The aftermath left an indelible mark, a testament to resilience and the fragility of human endeavour”.
The response is well written, but it is not an account of the fire. How could it be, when it contains no details of the actual fire? ChatGPT generated an essay about a (fictional) memory, rather than describing a memory; they are not the same thing.
Interestingly, we entered the same prompt into Claude and received a more suitable response. Claude responded that it could not recount a memory of the fire; however, it could give a 100-word description of a fire, which was a much more useful response.
The discussions surrounding this exercise suggest that when students refine their prompts using Gen-AI tools, they should carefully consider the trade-offs between tool capacity and task requirements, especially when engaging in long-form or detailed writing. This led to the next exercise.
3.3. Exercise 3: Prompt Tuning Insights
In the third exercise, we built on the results of the second exercise and conducted further experiments. This time, we compared different prompt structures and tested multiple Gen-AI tools to assess how well they could handle personal writing tasks. Again, we were addressing the research question of how students should learn to use Gen-AI tools.
Our motivation stemmed from observing in previous exercises that the level of detail in prompts influenced the balance between creativity and standardisation in AI-generated responses. The Gen-AI tools used in this exercise remained the same, but the prompts ranged from simple instructions to more detailed ones that included personal background, writing context, and even bullet points. The three prompt structures we used were as follows (a minimal scripting sketch of how such prompt variants might be issued is given after the list):
Minimal context: The original question plus “refine it”.
Moderately detailed context: The original question plus “refine it and check grammar while keeping the same tone”.
Highly detailed context: The original question plus “refine it as a paragraph, academic writing, as an undergraduate student, major in CS, University of Melbourne, 200-words, answering ‘Why do I need to understand and learn artificial intelligence, and why is artificial intelligence important?’”.
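The following Python sketch illustrates how such prompt variants might be issued programmatically and the responses grouped by prompt type. It assumes the OpenAI Python SDK, a placeholder model name, and illustrative participant drafts; it is not the exact script used in the study.

# Minimal sketch: sending one draft with each prompt variant and grouping the outputs.
# Model name, drafts and environment setup are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

BASE_QUESTION = ("Why do I need to understand and learn artificial intelligence, "
                 "and why is artificial intelligence important?")

PROMPT_VARIANTS = {
    "minimal": "refine it",
    "moderate": "refine it and check grammar while keeping the same tone",
    "detailed": ("refine it as a paragraph, academic writing, as an undergraduate student, "
                 "major in CS, University of Melbourne, 200-words, answering '" + BASE_QUESTION + "'"),
}

def refine(draft: str, variant: str, model: str = "gpt-4o") -> str:
    """Send one participant draft plus one prompt variant; return the refined text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": draft + "\n\n" + PROMPT_VARIANTS[variant]}],
    )
    return response.choices[0].message.content

drafts = ["<participant draft 1>", "<participant draft 2>"]
grouped = {variant: [refine(d, variant) for d in drafts] for variant in PROMPT_VARIANTS}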
We conducted more tests using multiple LLMs (ChatGPT-4o, Claude 3.5 Sonnet, Llama 3 70B) and tested each of the three prompts under controlled conditions. There were ten participants in this experiment, most of them students majoring in computer science, which ensured a consistent background and allowed for meaningful comparisons. For each prompt, the AI-generated responses were collected, grouped by prompt type, and then analysed for the following:
Relevance: The degree to which the response addresses the prompt.
Diversity: The linguistic, structural and content variation between responses.
Tone Preservation: The degree to which the response retains the distinctive tone and style implied by the original question.
By comparing the responses generated from different prompt styles and Gen-AI tools, we identified several key issues:
Our findings indicate that while detailed prompts can improve relevance and specificity, they also increase the risk of generating uniform and predictable responses. In general, the more detailed the prompt, the more uniform and standardised the responses became. For example, when the prompt explicitly mentioned the student's background as "a CS student at the University of Melbourne", ChatGPT-4o tended to produce highly similar responses across different participants.
Here are two notable sets of similar generated paragraphs:
- Set 1
Paragraph 1: "As an undergraduate student majoring in Computer Science at the University of Melbourne, understanding and learning artificial intelligence (AI) is crucial for both my academic and professional development."
Paragraph 2: "As an undergraduate student majoring in Computer Science at the University of Melbourne, I recognise the critical importance of understanding and learning artificial intelligence (AI)."
- Set 2
Paragraph 3: "Understanding and learning artificial intelligence (AI) is crucial for several reasons, particularly from the perspective of an undergraduate student majoring in Computer Science at the University of Melbourne."
Paragraph 4: "Understanding and learning artificial intelligence (AI) is essential for me as an undergraduate student majoring in Computer Science at the University of Melbourne, as it aligns with both my personal interests and future career aspirations."
The two sets of examples above show that, when given sufficiently detailed information in a prompt, ChatGPT-4o is more likely to paraphrase the prompt than to genuinely refine the draft based on the information provided. Thus, given the same prompt, ChatGPT-4o will likely produce very similar content for different users.
Similarly, if detailed bullet points were provided in the prompt, the AI would incorporate all of them into the response, often without much variation. This raises concerns in contexts like a classroom where many students may use similar prompts.
In such cases, responses could become repetitive. For instance, if most students identify as computer science students, ChatGPT might consistently generate phrases such as “version control systems” or other highly specific technical terms, which would lead to homogeneity in responses. This is problematic, as the goal of academic writing is to encourage original thoughts and varied perspectives.
We found that simpler prompts such as “refine it” or “check its grammar and keep the same tone” were preferred when students were writing personal reflections. These instructions allowed them to maintain control over their writing while enhancing their English fluency without losing their voice.
3.4. Exercise 4: Summarisation Ability of LLMs
We used ChatGPT-4o and Claude 3.5 Sonnet to test Gen-AI tools' performance in different scenarios where international students face challenges. We observed that Gen-AI tools perform better at summarising content in very specific contexts than at generating content for open-ended questions in general contexts.
To begin with, we simulated a scenario that challenges many international students: reading and understanding long assignment specifications. We collected many undergraduate-level computer science assignment specifications from the University of Melbourne, cleansed the text by removing irrelevant content such as headers and footers, and supplied them to both Gen-AI tools as context with the prompt "Summarise it". Both tools successfully extracted the main points from the specifications in a straightforward and accurate manner. Minor errors did exist; for example, ChatGPT-4o assumed that the total mark for an assignment was 100, although the specification never explicitly stated this. We then posed a more challenging scenario: "I am an international student struggling to understand this specs, the deadline is approaching, and I don't even know where to start". ChatGPT-4o and Claude 3.5 Sonnet were both capable of accurately identifying the implementation priorities and extracting the grading emphases, and gave sensible advice on earning the most essential marks in a time-sensitive manner. The advice was articulated concisely in bullet points, with subheadings such as "Immediate steps to take" and "Technical must-haves", minimising the effort required to navigate the information. At the end, both Gen-AI tools offered further assistance by asking "Let me know if you need help with specific coding tasks!" or "Would you like specific guidance on where to begin with either the UML design or the code implementation?".
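As a rough illustration of the cleansing step, the following Python sketch (using illustrative regular-expression heuristics and a hypothetical file name, not the exact rules applied in the study) strips page numbers and running headers from an extracted specification before it is passed to a Gen-AI tool with the "Summarise it" prompt.

# Minimal sketch: removing page numbers and running headers from an extracted
# assignment specification. The patterns and file name are illustrative assumptions.
import re

def cleanse_spec(raw_text: str) -> str:
    kept = []
    for line in raw_text.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"Page \d+( of \d+)?", stripped):   # page numbers
            continue
        if re.fullmatch(r"COMP\d{5}.*", stripped):           # running subject-code headers
            continue
        kept.append(line)
    # Collapse the runs of blank lines left behind by the removals.
    return re.sub(r"\n{3,}", "\n\n", "\n".join(kept)).strip()

with open("assignment_spec.txt", encoding="utf-8") as f:
    spec_text = f.read()
prompt = cleanse_spec(spec_text) + "\n\nSummarise it"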
We also tested the Gen-AI tools' ability to assist with GRE (Graduate Record Examinations) questions, such as questions requiring the extraction of main ideas from given reading materials, which typically consist of several passages. These passages contain substantial information and a large number of details, and are written with a clear structure. Using such reading materials to assess Gen-AI tools' summarisation capabilities provides a convenient and accurate way to identify potential issues, such as oversimplification, confusion of subjects, incorrect chronological order, etc. For testing, we randomly selected three long passages from GRE reading comprehension questions that appeared in actual exams. Both ChatGPT-o1 and Claude 3.5 Sonnet showed tendencies towards simple restatement, merely paraphrasing the original text rather than gathering and synthesising information scattered across the passage. Additionally, ChatGPT-o1 occasionally presented information "made out of thin air", generating details that were not present in the original passage. For example, consider the following sentence extracted from an actual exam question:
Original: Recently, an unusually high number of dolphins have been found dead from infectious diseases, and most of these had abnormally high tissue concentrations of certain compounds that, even in low concentrations, reduce dolphins’ resistance to infection. The only source of these compounds in the dolphins’ environment is boat paint. Therefore, since dolphins rid their bodies of the compounds rapidly once exposure ceases, their mortality rate should decline rapidly if such boat paints are banned.
ChatGPT-o1’s summary: A recent surge in dolphin deaths from infectious diseases has been linked to elevated concentrations of infection-weakening compounds in their tissues (Made out of thin air: The original article did not mention the recent increase in concentrations of toxic compounds). These chemicals come exclusively from boat paint. Because dolphins quickly eliminate these substances once exposure stops, prohibiting the use of such paints should rapidly lower their death rates.
Meanwhile, Claude 3.5 Sonnet sometimes failed to provide a comprehensive summary, omitting essential information. Overall, these Gen-AI tools are not yet fully reliable as reading assistants, and we still need to verify accuracy manually by referring to the original text.
In contrast, both ChatGPT-4o and Claude 3.5 Sonnet performed worse on open-ended questions with broad contexts. For instance, we sought career advice with questions such as "What would a career as a data science engineer be like". The responses were too general and simply repeated information that could easily be found on the internet. By analogy, it is as if, in the earlier scenario of "a struggling student coping with an approaching deadline", the tools had offered suggestions such as "calm down and meditate", "make sure that you read the assignment specs thoroughly" and "reviewing the lecture content and the learning materials would help", instead of advising on which specific tasks to prioritise and which implementation steps to follow. In the real world, when asked an open-ended question with a broad context, a person may seek more background information to better understand the intention and provide a more tailored response. However, both ChatGPT-4o and Claude 3.5 Sonnet provided general responses without asking any follow-up questions for clarification.
To verify the integrity and reliability of LLMs, we collected more than 20 assignment specifications from 11 computer science subjects at the University of Melbourne, converted them into Markdown format with redundant text (such as headers and page numbers) cleansed, summarised these specifications using LLMs, and evaluated and compared the results using metrics such as Semantic Similarity, BERTScore, ROUGE-1 (unigrams), ROUGE-2, and ROUGE-L, adopting and modifying the metrics used in a previous study [16]. Semantic Similarity measures the semantic closeness between the generated and original text, indicating the extent of meaning preservation. BERTScore evaluates contextual and semantic alignment using BERT embeddings, providing precision, recall, and F1 scores. ROUGE-1 and ROUGE-2 calculate the overlap of unigrams and bigrams, reflecting lexical coverage and local coherence. ROUGE-L assesses global structural similarity based on the LCS (longest common subsequence). Together, these metrics allow a multidimensional analysis of summary quality, combining lexical and semantic perspectives.
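As an illustration of how these metrics can be computed, the following Python sketch uses the rouge-score, bert-score and sentence-transformers packages; the example texts and embedding model are illustrative assumptions rather than the exact evaluation pipeline used in the study.

# Minimal sketch: scoring one generated summary against its source document.
from rouge_score import rouge_scorer
from bert_score import score as bert_score
from sentence_transformers import SentenceTransformer, util

def evaluate_summary(original: str, summary: str) -> dict:
    # ROUGE-1/2 capture unigram/bigram overlap; ROUGE-L uses the longest common subsequence.
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    rouge = scorer.score(original, summary)

    # BERTScore measures contextual alignment with BERT embeddings (precision, recall, F1).
    _, _, f1 = bert_score([summary], [original], lang="en", verbose=False)

    # Semantic similarity: cosine similarity between sentence embeddings.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    emb = embedder.encode([original, summary], convert_to_tensor=True)
    semantic_similarity = util.cos_sim(emb[0], emb[1]).item()

    return {
        "rouge1_f": rouge["rouge1"].fmeasure,
        "rouge2_f": rouge["rouge2"].fmeasure,
        "rougeL_f": rouge["rougeL"].fmeasure,
        "bertscore_f1": f1.item(),
        "semantic_similarity": semantic_similarity,
    }

spec = "<cleansed assignment specification text>"
summary = "<LLM-generated summary of the specification>"
print(evaluate_summary(spec, summary))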
We used ChatGPT-4o and Claude 3.5 Sonnet as the models under test and employed four different prompts to elicit summaries from the LLMs:
“Summarise this”.
“Summarise in detail”.
“I am a student and my assignment is due soon, but I don’t have time to read everything thoroughly. Please summarise this and extract the key points.”
“I am a student and I am struggling with figuring out where to start, while the deadline is approaching. Could you give me some hints on what to do?”
We used these metrics (ROUGE, BERTScore and Semantic Similarity) to evaluate summary accuracy and to compare the different models and prompt variations. The results are given in Table 1 and visualised in Figure 1.
According to the experimental data, ChatGPT-4o and Claude 3.5 Sonnet performed almost equally well in summarisation, especially in keyword extraction and semantic information retention (as indicated by high ROUGE-1 and BERTScore values), and were capable of handling general summarisation tasks. However, their ability to retain local coherence and global structure, indicated by the ROUGE-2 and ROUGE-L scores, needs improvement, especially when dealing with complex logical relationships or highly structured content. In general, LLMs are suitable for quickly generating summaries, but for summarisation tasks that require high precision and complex semantics, we cannot rely on them alone.
We also presented a summarisation exercise to the visiting Chinese students who repeated the first exercise. The students were asked to summarise three articles and compare their summaries with ones generated by Gen-AI tools. We have not performed a detailed analysis; however, the quality of the Gen-AI-generated summaries varied. For technical articles, the summaries were sometimes effectively just jargon. Also, the student-written summaries varied markedly in length, depending on the context of what was being summarised, whereas the Gen-AI summaries were more uniform. We intend to investigate summarisation further in future work.
To further understand summarisation ability, we asked ChatGPT-4o and Claude 3.5 Sonnet to summarise an article by the first author [17]. Both LLMs produced coherent summaries, and their accuracy and coherence surprised the first author.
4. Discussion
The overall objective of our research is to provide strategies for students to use Gen-AI tools for a variety of writing tasks, while maintaining academic standards, ensuring academic integrity, and preserving students’ unique voices. The exercises described in this work show potential for improving the use of Gen-AI tools by international students for academic writing. We now discuss how this work fits in a broader context.
4.2. Ethical Considerations
There are certainly ethical considerations when it comes to using Gen-AI tools. Universities worldwide are struggling with the questions of how to introduce Gen-AI and what restrictions to implement. Our research has taken the pragmatic position that students are using these tools, motivating us to give them guidance on how to use these tools better.
There are also concerns about bias in Gen-AI tools, stemming from the material on which they are trained. We are aware of such concerns, but they did not impact this research; again, we focused on pragmatically giving students guidance on how to use existing tools. Some discussion of ethics and bias arose in our weekly meetings, but it did not significantly affect our reported observations.
Here are some qualitative reflections on ethics. Regarding the issue of bias in generative AI, we are all aware of cultural and linguistic differences, as well as various types of bias caused by imbalance and incompleteness in training datasets, in addition to problems such as plagiarism and fabrication. All AI-generated content involved in our paper has been reviewed manually to ensure that these issues were avoided.
In terms of ethical concerns, our paper clearly states that all use of AI tools should follow the principles of transparency and accountability for the generated content. We understand how to adhere to standards of academic integrity and avoid unethical behaviours that may result from the inappropriate use of AI tools. We recognise that generative AI may reflect cultural, linguistic or racial biases during content generation, largely due to imbalanced training data. For English as a Second Language speakers, this could lead to misleading expressions and academic risks. In our paper, we clearly emphasise that AI tools must be used transparently and responsibly in academic contexts, and any usage must be disclosed to avoid academic misconduct arising from unauthorised assistance. Furthermore, we remain cautious about the potential risks of plagiarism, fabricated content and fake citations caused by over-reliance on AI, and we stress the importance of applying critical thinking when reviewing AI outputs.
Arpin and Rahmat investigated students' use of metacognitive writing strategies at their university in Malaysia [14], and were interested in possible differences in writing strategies by gender. While this is an interesting and important issue, we did not consider gender in our studies. We believe the insights gleaned about using the tools, as reported in the previous section, are independent of gender. Investigating gender differences is a topic for further research and is beyond the scope of this paper.
Intellectual property ownership is another ethical concern. Considering that international students may face challenges in written expression, we did use large language models during the writing process to polish and optimise the original text of this paper. However, we can confirm that the structure, ideas, viewpoints and analysis of the paper were entirely developed by the students and the professor, which preserves the originality of the content and establishes clear intellectual property ownership. In addition, the AI-generated content produced in our research was used solely for evaluation and comparative analysis and was not included in the final version of the paper. As such, our work complies with academic institutions' definitions and requirements regarding intellectual property.