Source-Based Argumentation as a Form of Sustainable Academic Skill: An Exploratory Study Comparing Secondary School Students’ L1 and L2 Writing

Abstract: Argumentative writing is the most commonly used genre in writing classroom practices and assessments. To draft an argumentative essay in authentic settings, writers are usually required to evaluate and use content knowledge from outside sources. Although source-based argumentation is a sustainable skill that is crucial for students' academic careers, this area remains under-researched. Hence, this paper presents a within-subject study that investigated Hong Kong secondary school students' argumentation construction in L1 and L2 source-based writing from both product-oriented and process-oriented perspectives. Multiple sources of data were collected, including L1 and L2 source-based argumentative texts, eye-tracking metrics and recorded videos, and stimulated recall interviews. The findings show that the L1 source-based argumentative compositions of the Hong Kong secondary student writers differed greatly from their L2 ones in terms of argument structure, source use, and reasoning quality. Analyses of four cases further revealed that a multitude of factors, such as self-regulation and cultural orientations, came into play in the similar and different argumentation performance between L1 and L2 source-based writing tasks. This study contributes new knowledge about argumentation in L1 and L2 source-based writing, yielding meaningful implications for pedagogy and assessment in this field.


Introduction
Argumentation competence is an important element at multiple levels of education because of its close connections to critical and higher-order thinking [1,2]. Whereas it is developed earlier in an oral form, argumentation is a more formalized process in writing [3]. Argumentative writing involves thoughtful consideration of both sides of a debatable issue [4]. It requires students to consider how they should use language appropriately to justify their position and refute others. Although it appears regularly in various disciplines, students face difficulties in writing essays in this genre [5,6]. Whether in L1 or L2, these essays are often limited to a simple argumentative structure consisting of the claim statement and supporting evidence [7] but not counterarguments or rebuttals [4,8]. In addition to being structurally flawed, these essays may also lack adequate reasoning procedures [9]. If the writers' claims are not well supported by their reasoning, the arguments may appear ill-founded and unconvincing [3].
Writing from sources is a necessary literacy skill for students engaging in academic studies [10]. While source-based writing tasks contribute to test fairness by providing identical content for all test takers [11], the format adds a complicating factor to argumentative writing by requiring writers to select and integrate information from multiple documents into their argument. The bulk of argumentative writing research has focused on the linguistic features and rhetorical structures of students' essays [12–14] or the effectiveness of designed instructional interventions [15,16]. However, little attention has been given to the compositional process of argumentation based on source texts. It is often unclear in what ways source texts and other factors interact during the creation of L1 and L2 written argumentation. As an essential skill that will sustain students' academic success, source-based argumentation should be thoroughly researched to understand how students build written arguments and what difficulties they experience. Moreover, previous process-oriented studies on writing have long been questioned for relying largely on qualitative methods such as verbal protocols and retrospective interviews [17], whereas eye-tracking has gained popularity in this field because it affords millisecond-precise information, collected non-intrusively, about how language learners distribute visual attention during a language task [18]. To tap further into what is going on in writers' minds with a rich source of data, this study made an innovative attempt to combine eye-tracking, stimulated recall interviews, and written texts to better examine how writers interact with the source texts to build written arguments. The participants' argumentation quality was demonstrated in their source-based essays, while the eye-tracking data and stimulated recall interview data complemented each other to show how the participants dealt with the given source texts to argue in writing. Insights into both writing products and processes could better illuminate language learners' argumentation behavior with source support. Not limited to one language context, the findings also contribute to the understanding of within-writer similarities and differences in source-based argumentative writing between L1 and L2.

Establishing a Good Written Argument
Argumentation is a type of communication used for resolving different opinions on controversial topics [19]. Argumentative writing has been used regularly in highly influential language tests such as the College English Test (CET), the International English Language Testing System (IELTS) and the Test of English as a Foreign Language (TOEFL). Students are required to demonstrate their argumentation ability in writing not only for successful test performance but also for sustainable academic development [20]. Although the conceptual definition varies considerably across disciplines, it is generally suggested that at least two points should be considered when assessing the quality of written argumentation: the extent to which the writers follow structural conventions by including invariant argument elements, and whether the claims are well supported by evidence [21,22].
Studies examining surface structure have described the close relationship between argument elements and the quality of argumentative writing [4,23,24]. Crammond [23] investigated the differences in the complexity of argument elements between expert and student L1 persuasive essays. Her structural analysis revealed that expert writers produced sophisticated arguments with extensive use of claim-data complexes, rebuttals, and warrants. This suggests that complex argumentative structures can improve the quality of argumentative writing. Qin and Karabacak [24] examined the frequency of argument elements in L2 argumentative papers written by second-year English majors in a Chinese university. They found that counterarguments and rebuttals were used at a lower frequency than other elements, which undermined the overall quality of the papers. This concurs with reports of myside bias from related studies, which found that students were more likely to produce one-sided arguments in both L1 and L2 argumentative writing [5,25,26].
Researchers have cautioned against evaluating written argumentation solely by counting the frequency of argument elements, as this may prevent a fuller understanding and assessment of argumentation knowledge and competence. Studies have shown that students' essays are often unpersuasive because of their poor quality of reasoning, despite having a well-established argumentative structure [27,28]. In other words, the justification of the proposed claims is another important factor in constructing a well-written argument [29,30]. Arguing reasonably is a challenge for developing writers especially when they write in L2. In his examination of argumentative essays written by Indonesian English as a Foreign Language (EFL) learners, Rusfandi [31] found that students frequently generated reasons to support their claims but seldom anticipated reasons to weaken opposing views. Their low English proficiency was postulated as a possible factor behind this phenomenon. In their qualitative study of Iranian graduate EFL students, Abdollahzadeh, Farsani and Beikmohammadi [32] found that the students' argumentative essays had far from satisfactory reasoning despite having good argumentative structures. Given this, researchers have proposed assessing the underlying reasoning to complement structural coding. To date, three criteria have been widely accepted: (1) relevance, the prerequisite determining whether the evidence can be used to support the claim or not; (2) acceptability, or the reasonableness of situating the premises into the immediate context; and (3) sufficiency, which assesses whether the evidence provided is adequate to support the final conclusion [22,27,33]. Recently, using relevance and acceptability, Stapleton and Wu [3] evaluated the soundness of arguments in essays by Hong Kong high school students. 
They identified several patterns of reasoning deficiencies, emphasized the importance of reasoning quality in determining the validity of written arguments, and called for its further investigation. Although the evaluation criteria for reasoning quality have been well documented, studies have rarely examined the soundness of written argumentation by linking it to structural analysis. Hence, to offer a fuller picture of the argumentation performance of Hong Kong secondary students, this study used the simplified Toulmin model to code the surface structure and scored the reasoning quality using the relevance and acceptability criteria.

Source-Based Argumentative Writing
Source-based writing tasks, which assess writing ability along with other skills, have recently increased in popularity in classroom writing practices and assessments. They require writers to comprehend, select, and synthesize ideas from multiple documents according to pre-assigned goals [34]. For source-based argumentative writing in particular, students are expected to construct arguments in writing based on listening and/or reading materials. Wingate [35] contended that the ability "to analyse and evaluate content knowledge" is an important component of written argumentation in addition to being able to develop a position and present it coherently. Plakans and Gebril [36] argued that source use in writing not only serves as a language repository but also contributes to the generation of ideas. Nevertheless, few studies have examined the role of source use in the establishment of written arguments. How do writers deal with source texts containing irrelevant, supportive, or contradictory information? How do they navigate these sources and integrate information into their written arguments? From which parts of the source texts do the data originate? These questions should be addressed through an elucidation of the source-based argumentation writing process.
Studies have examined L1 and L2 source-based writing using a variety of analytical approaches. One such approach is to distinguish the similarities and differences between the two languages. Conflicting results have been reported in this line of research. Campbell [37] compared the textual borrowing practices of 10 native English speakers and 20 English as a Second Language (ESL) learners in an in-class writing practice. He found that copying was a major strategy used by both groups of students but that ESL students utilized source information with acknowledgment more frequently than native speakers. Conversely, in Shi's [38] comparative analysis of native English speakers and Chinese ESL learners, the latter borrowed longer strings of source texts without explicit references at a higher frequency. Recent work by Doolan [39] investigated 260 essays from native English speakers and ESL learners at a post-secondary institution in the U.S. to better understand their source text use. By comparing their source integration and use of ideational units, Doolan found that both groups of students synthesized poorly and that L2 writers used a wider variety of source integration types. Summarizing recent research, Cumming, Lai, and Cho [10] pointed out that in both L1 and L2 contexts, students experience difficulties in writing from sources, and that these difficulties are influenced by a myriad of factors. They argued that a within-writer comparison between L1 and L2 conditions would provide better insights into writers' abilities to work with multiple texts, as well as the sociocultural, discoursal, and identity-related dimensions of writing in general. However, as they pointed out, there is surprisingly little research on this comparison. Moreover, few studies have examined the ways in which source use supports or inhibits other required skills such as argumentation in writing. To fill the aforementioned gaps, we analyzed and compared argumentation in relation to source use in the L1 and L2 writing of a group of Hong Kong secondary students.

Eye-Tracking in Writing Process Research
An important part of understanding how a composition is produced comes from elucidating the process that underlies writing. However, compared to written products, little research has focused on the compositional process. One possible reason may have to do with the data collection methods. Traditional process-oriented writing research has relied heavily on qualitative methods including observation, think-aloud protocols, and interviews [40]. These methods provide rich data about the underlying cognitive processes used in writing, but they are inherently limited in that they interrupt the processes or produce delayed reports. Hence, there is a need for more valid tools to measure and record writing processes [41,42]. Over the past decade, some researchers have used eye tracking to generate online data on these processes. According to the eye-mind assumption proposed by Just and Carpenter [43], "there is no appreciable lag between what is fixated and what is being processed" (p. 135). Rayner [44,45] further confirmed that where the eyes are directed and where attention is allocated overlap during the processing of most visual tasks. Therefore, an eye-tracking device enables researchers to detect and record participants' eye movements non-intrusively while they look at particular visual stimuli, and to make valid inferences about their cognitive processing from data on the location and timing of eye movements during the task. To date, the application of eye tracking in writing research has centered on reading behavior during composition [17]. The method has facilitated observations on how participants read visual stimuli such as graph prompts, emerging texts, and automated feedback when producing compositions [40,46,47]. However, eye-tracking studies of the cognitive processes of writers are scarce, and to the best of our knowledge, none have explored source-based argumentation behavior in L1 and L2 writing. To fill this important gap, the eye-tracking method needs to be combined with a post-task stimulated recall interview to provide more insights. This study aimed to answer the following questions:

1. What are the differences in argumentative structure in L1 and L2 source-based writing?
2. What are the differences in source use between L1 and L2 source-based writing, and to what extent do the source data support the claims?
3. What is the role of source texts and other influential factors identified in the L1 and L2 written argumentation processes?

Methodology
The data used in this study were initially collected in a larger research project focusing on the cognitive processing of secondary students in L1 and L2 source-based writing tasks. However, the students' source-based argumentation performance was not analyzed. The study reported in this article focused on six cases that provide a further comparison of source-based argumentation in L1 and L2 writing contexts, using multiple sources that included the combination of written texts, eye-tracking data, and stimulated recall interviews.

Participants
Our study involved 6 Grade 10 students from Hong Kong high schools, of whom 2 were female and 4 were male. All of the participants had been learning English as a second language through formal classroom instruction since Grade 1. They were recruited into this study through the recommendations of their Chinese teachers based on the requirement that participants had to be familiar with the computer test system and input methods used in this study. As suggested by their English teacher, they also represented a group of EFL learners ranging from lower to higher English proficiency levels according to their performance in one of the most recent school-based English tests. Informed consent was obtained from all participants prior to the research. All the participants signed the assent form and voluntarily participated in this study after being informed of the research procedure and purposes. Their parents also gave consent for their participation.

Instruments and Onscreen Source-Based Writing Tasks
The Chinese (L1) and English (L2) source-based argumentative writing tasks used the same format as the Hong Kong Diploma of Secondary Education language test. For the L1 Chinese writing task, the topic was priority seats in public transport systems, while for the L2 English writing task, the topic was underage organ donation. These writing topics are commonly used in language classrooms and were therefore familiar to the participants. The source texts of the L1 writing task were written in Chinese and those of the L2 writing task were written in English. There were six source texts for each writing test, including an e-mail or a poster leading to the topic, one bar chart indicating the trend of the phenomenon, proverbs and two news articles on related aspects of the topics, and one essay discussing opinions from two sides.
The test materials were displayed on a computer screen in three sections. The left half of the screen consisted of the source materials. The participants could highlight the important parts by pressing 1 on the keyboard and undo the highlights by pressing 2. The upper right displayed a Word document that the participants could use for note-taking when listening to the audio recording. They typed their essays in another Word document in the lower right side of the screen. The two writing tasks had been assessed by local teachers and language assessment experts to ensure their clarity and their appropriateness in terms of topic familiarity and standard of language use. The tasks were also piloted with two undergraduate students to inform the interface design. All the participants had 3 min to skim the six reading passages and 12 min to listen to an audio recording. They were then required to write a speech in response to the given prompt within one hour, during which the reading materials remained available. They were prompted to summarize different views contained in the source materials and articulate their opinions on the topics with evidence. The minimum lengths for the Chinese and English writing tasks were 500 characters and 400 words, respectively. A Tobii TX300 with a sampling rate of 300 Hz was used to record and track participants' eye movements in a natural writing setting.

Data Collection
Before carrying out the two source-based writing tasks, the participants underwent training to become familiar with the eye-tracking device and onscreen test system. The study then proceeded as follows:
Step 1: Tobii TX300 was used to calibrate each participant's fixation measures to ensure the accuracy of eye-tracking data during task completion.
Step 2: The participants did the Chinese (L1) and English (L2) writing tasks separately while their eyes were being tracked for each task. A researcher observed the participants' test-taking process on a monitor computer and noted down critical episodes to be used for reference in the subsequent interviews. Critical episodes are any events indicating the participants' cognitive processing such as editing or deleting one sentence and noting down key information. Three participants did the Chinese task first, while the other three did the English task first. A break of 20 min was arranged between the two writing tasks.
Step 3: The participants took part in a stimulated recall interview immediately after completing the writing tasks. The eye-trace overlaid videos were replayed as prompts to facilitate the participants' reports of their mental activities. Critical episodes in their test-taking process that were noted by the researchers were also used as prompts to ensure that important information was not missed. Each participant was interviewed twice (once for each writing task), and each interview lasted approximately 20 min. The participants reported mainly in Cantonese, which is their mother tongue. The whole interview procedure was audio-recorded for in-depth analysis.

Data Analysis
Our data analysis consisted of two stages. In the first stage, we analyzed and characterized the argumentation performance for each source-based writing task. The role of source use in argument construction was inferred from the ways in which participants directed their visual attention when reading texts and adopted source information as data.
In the second stage, four cases were selected for a more in-depth study of the complex interactions between the various factors giving rise to similarities and differences between within-subject L1 and L2 written argumentation.

Analyzing Students' Source-Based Argumentation in L1 and L2 Texts
Three steps were involved in this stage to address research questions 1 and 2. The first step was to examine the extent to which the participants incorporated the Toulmin argument elements into their essays: claim, data, counterargument claim, counterargument data, rebuttal claim, and rebuttal data. After the students' final stance was ascertained, the texts were coded according to the frequency of the Toulmin argument elements. Two researchers coded all of the texts separately and generated an intercoder coefficient value of 75%, which indicated an acceptable level of agreement. The disagreements were resolved through discussion. Some excerpts from the students' L2 papers are provided in Appendix A to illustrate the structural coding.
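As an illustration of how agreement between two coders can be quantified, the sketch below computes simple percent agreement over a set of coded text segments. It rests on assumptions: the article does not specify which intercoder coefficient was used, so percent agreement is a stand-in, and the function name `percent_agreement` and the example labels are hypothetical rather than study data.

```python
def percent_agreement(coder_a, coder_b):
    """Share of segments that two coders labeled identically.

    Assumption: simple percent agreement; the study's actual
    coefficient is not specified in the article.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must label the same segments.")
    if not coder_a:
        raise ValueError("At least one coded segment is required.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Hypothetical labels for five text segments (not data from the study).
coder_a = ["claim", "data", "data", "rebuttal claim", "rebuttal data"]
coder_b = ["claim", "data", "counterargument data", "rebuttal claim", "rebuttal data"]

agreement = percent_agreement(coder_a, coder_b)
print(f"Agreement: {agreement:.0%}")
```

Percent agreement does not correct for chance agreement; a chance-corrected index such as Cohen's kappa is a common alternative when category frequencies are uneven.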
The second step centered on source use in written argumentation. The eye-tracking metrics (fixation count, total fixation duration, visit count and total visit duration) were imported into SPSS version 24 and run through a descriptive analysis based on four groups of areas of interest (AOIs) in the reading texts: (1) pro-side AOIs containing information that could be used to argue for the topic; (2) con-side AOIs containing information that could be used to argue against the topic; (3) irrelevant AOIs containing no stance information; and (4) two-side AOIs containing information that could be used to argue for and against the topic. The six reading texts in the Chinese writing task were divided into 9 AOIs: 2 pro-side AOIs, 2 con-side AOIs, 2 irrelevant AOIs and 3 two-side AOIs. The six English reading texts were divided into 1 pro-side AOI, 1 con-side AOI, 4 irrelevant AOIs and 2 two-side AOIs. The analysis was generated according to the participants' stances expressed in their essays, which determined what the pro-sides and con-sides were for each participant. The eye-tracking metrics on four groups of AOIs were analyzed to see how the participants distributed their visual attention to different information. By linking the eye-trace overlaid videos (as shown in Figures 1 and 2) to the participants' interviews, we also traced the sources of data elements to gain a better understanding of their reasoning behavior in source-based writing.
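The attention-distribution analysis described above can be sketched as a small calculation: sum a metric such as total fixation duration within each AOI group and express each group as a share of the total. This is a minimal sketch under assumptions, not the SPSS procedure used in the study; the function name `attention_share`, the AOI identifiers, and the durations are hypothetical.

```python
from collections import defaultdict


def attention_share(fixation_durations, aoi_groups):
    """Return each AOI group's share of total fixation duration.

    fixation_durations: AOI id -> total fixation duration (seconds)
    aoi_groups: AOI id -> group label, already assigned relative to
        the participant's stance (pro-side, con-side, two-side, irrelevant)
    """
    totals = defaultdict(float)
    for aoi, duration in fixation_durations.items():
        totals[aoi_groups[aoi]] += duration
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("No fixation time recorded on any AOI.")
    return {group: total / grand_total for group, total in totals.items()}


# Hypothetical values for one participant (not data from the study).
durations = {"AOI1": 30.0, "AOI2": 10.0, "AOI3": 40.0, "AOI4": 20.0}
groups = {
    "AOI1": "con-side",
    "AOI2": "pro-side",
    "AOI3": "two-side",
    "AOI4": "irrelevant",
}

shares = attention_share(durations, groups)
for group, share in shares.items():
    print(f"{group}: {share:.0%}")
```

The same function could be applied to fixation count, visit count, or total visit duration, since each is a per-AOI quantity that can be summed within a group.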
The third step was to analyze reasoning quality using the relevance and acceptability criteria [3]. First, two researchers evaluated the relevance of the reasons to the claims using a dichotomous scale (i.e., relevant or not relevant). If the reason was judged to be relevant, it was then assessed on a 3-point scale of acceptability (i.e., 1 = Not Acceptable, 2 = Weakly Acceptable and 3 = Acceptable). The scoring agreement of the two researchers was 77%. The two researchers then discussed the discrepancies and decided the final score. Finally, the reasoning quality was derived by dividing the total score by the number of reasons. Appendix B illustrates the analytical process using Participant 1's L2 paper as an example.
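The scoring rule in this step (total score divided by the number of reasons) can be sketched as follows. One point is an assumption rather than something the article states: reasons judged not relevant are given a score of 0 and still count in the denominator. The function name `reasoning_quality` and the example ratings are hypothetical.

```python
def reasoning_quality(reasons):
    """Average reasoning score for one essay.

    reasons: list of (relevant, acceptability) pairs, where
        acceptability is 1, 2 or 3 for relevant reasons and is
        ignored for reasons judged not relevant.
    Assumption: a reason judged not relevant contributes 0 to the
    total but is still counted as a reason.
    """
    if not reasons:
        raise ValueError("An essay needs at least one reason to score.")
    total = 0
    for relevant, acceptability in reasons:
        if relevant:
            if acceptability not in (1, 2, 3):
                raise ValueError("Acceptability must be 1, 2 or 3.")
            total += acceptability
    return total / len(reasons)


# Hypothetical ratings for four reasons (not data from the study).
ratings = [(True, 3), (True, 2), (False, None), (True, 1)]
quality = reasoning_quality(ratings)
print(quality)
```

Under a different reading, in which only relevant reasons enter the denominator, the same ratings would give a higher average, so the treatment of irrelevant reasons should be fixed before scores are compared across essays.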
Sustainability 2021, 132, 12869

Comparing Typical Argumentation Performance in L1 and L2 Source-Based Writing
The complex interactions involved in source-based argumentation processes were explored, with a focus on similar and different performance in the L1 and L2 writing tasks, to answer research question 3. The interview transcripts were coded in line with thematic analysis [48]. The interviews were first transcribed by the researchers using the eye-trace overlaid videos and audio recordings. The recorded videos enabled the researchers to link the participants' reports to what they were looking at and writing simultaneously. The interview transcripts were then segmented according to the critical episodes. Meaningful codes were initially generated for interview segments that were of interest and relevance to influential factors in written argumentation. The generated codes then underwent iterative refinement and were finally categorized into a short list. The four interviewees were selected for their typically similar and different argumentation performance in L1 and L2 texts, including the surface structure and the underlying reasoning. Several typical excerpts for each of the four cases were chosen in this stage for close examination of the influential factors underlying their L1 and L2 argumentation performance in writing.
A comparison of the average frequencies of the argument elements that appeared in L1 and L2 source-based writing (see Table 1) showed that the six argument elements were used more in L1 than in L2. The average frequencies of claim (1.5), data (2.33), counterargument data (1), rebuttal claim (1.17) and rebuttal data (1) collectively suggested a relatively complete argumentative structure in L1 papers. In contrast, not every L2 paper contained claim, counterargument claim, counterargument data, rebuttal claim and rebuttal data, as the average frequencies of these elements were less than 1.
The participants were first divided into those with clear stances and those without one, by judging whether they had clearly stated their personal opinions or not. For the participants with clear stances, basic argument elements (i.e., claim and data) and higher-level argument elements (i.e., counterargument and rebuttal elements) were included in both L1 and L2 argumentative writing. According to the mean values for these participants, counterargument claims and data appeared less often than rebuttal claims and data in both L1 and L2 writing. The participants without clear stances (Participant 4 in L1 writing and Participants 2, 4 and 6 in L2 writing) used no argument elements in their writing; rather, they produced a summary of the positive and negative effects on the given topic. Analyzing the L1 paper of Participant 4 and the L2 papers of Participants 2, 4 and 6, we found accounts of both pros and cons in relation to the assigned topics but no semantic structures or linguistic patterns signaling personal opinions.
We compared eye-tracking metrics for different groups of AOIs in the two writing contexts (L1 and L2) to see how the participants visually attended to reading texts with different positions (see Figure 3). For the participants who stated their opinions in both source-based writing tasks (Participants 1, 3 and 5), similar visual patterns were observed. They all directed relatively more attention to AOIs containing similar views than to those with opposite ones. For example, Participants 1 and 3, who opposed underage organ donation and transplantation in their L2 essays, spent 29% and 31% of the time fixating on con-side AOIs compared to 14% and 24% on pro-side AOIs. The other three eye-tracking metrics also indicated similar visual behavior, thus corroborating this observation.
For participants who summarized in their essays, we found that visual attention was unevenly distributed between the two groups of single-side AOIs, thus indicating a visual preference for a particular side. However, such visual patterns seem to contradict the lack of a clear stance in their written essays. The two-side AOIs attracted more visual attention from most participants than did either of the two groups of one-side AOIs in both writing tasks.
With regard to the highest percentages in the eye-tracking metrics, Participants 3 and 5 in the L1 writing task and Participant 6 in the L2 writing task spent the most time on irrelevant AOIs. More specifically, irrelevant AOIs received 73% and 63% of the total fixation duration from Participants 3 and 5 in the L1 writing task and 62% of that from Participant 6 in the L2 writing task. This heavy visual attention to irrelevant AOIs influenced L1 and L2 argumentative writing in different ways, which are discussed in the following section. Other participants spent relatively lower proportions of their time reading irrelevant information, with the proportions of total fixation duration ranging from 6% to 24% in L1 writing and from 15% to 21% in L2 writing. This may imply that most participants could effectively differentiate between relevant and irrelevant information when building written arguments.
We further located the source of data to elucidate what information the participants used as evidence for their arguments by relating the eye-trace overlaid videos to interviews and written products. The data presented in Table 2 suggested that there were three main sources: the listening materials, the reading texts and the participants' prior knowledge. In L2 writing, the participants used the reading texts the most to back up their assertions, followed by their personal knowledge and the listening materials, while the proportions of the three sources were relatively equal in L1 writing. With regard to each source of information, the proportion of data from the reading texts increased considerably from L1 to L2 writing (42.3% to 61%), and the use of listening material decreased from 23.1% to 7.7%. The proportion of data from personal knowledge remained relatively similar in the two writing tasks. A within-writer comparison of Participants 1, 3, and 5 further confirmed the increased use of reading texts and the decreased use of listening materials in L2 writing. In particular, Participants 1 and 3 did not use the listening materials at all in L2 writing. Instead, they relied on prior knowledge and the reading texts, respectively.

Table 2. Sources of data elements in participants' L1 and L2 source-based writing papers.

Columns: Chinese (L1) Source-Based Writing Papers and English (L2) Source-Based Writing Papers, each reporting Listening Material, Reading Texts, and Personal Knowledge as n (%).

Taking a closer look at how participants processed the reading sources, we further located the reasons generated from the reading texts in the groups of AOIs. As shown in Table 3, most of the relevant AOIs (i.e., pro-side, con-side, and two-side AOIs) gave rise to the data argument elements in both writing tasks. The participants utilized all of the relevant AOIs except AOIs 3 and 6 in the L1 writing task as an evidence base, and they fully utilized the relevant AOIs in the L2 writing task. These results showed that the participants used various pieces of information from the reading texts when arguing in writing.

Reasoning Quality of L1 and L2 Argumentative Writing Texts
As shown in Table 4, the reasoning quality of most L1 and L2 argumentative essays was between 1 and 2, indicating a weakly acceptable level. Because Participants 2, 4 and 6 summarized in the L1 and/or L2 writing tasks, we compared the reasoning quality across Participants 1, 3 and 5 only, to see whether their argumentation performance differed between the two writing tasks. The results indicated that the reasoning quality of L1 writing papers was higher than that of L2 writing papers. In the papers written by Participant 3 in particular, the reasoning score was 1.75 (almost at a weakly acceptable level) in L1 writing, whereas in L2 writing it reached only 0.67 (close to a not acceptable level).

Factors That Shaped the Similar and Different Source-Based Argumentation in L1 and L2 Writing
In response to RQ3 regarding the source-based argumentative writing process, four cases (Irelynne, Mark, Kelvin and Tony; all pseudonyms) were chosen to elucidate the factors that may explain similar and different argumentation performance in L1 and L2 contexts, drawing on a combination of eye-tracking metrics, eye-trace video descriptions, interview data and written texts.
Irelynne: Relatively complete argumentative structure and good reasoning quality in both L1 and L2 source-based writing

The L1 and L2 papers written by Irelynne presented a relatively complete argumentative structure (with the exception of rebuttal data) and good reasoning quality (1.75 for L1 and 1.40 for L2). Two self-regulation behaviors were found to explain the similar argumentation in the two writing contexts. First, Irelynne planned before writing and searched effectively for useful source information according to the plan. When asked in the interview how she organized the L1 paper, Irelynne stated that she formed a clear plan before starting to write: I came up with a general plan before I started to write. The first paragraph specified the definition and controversies of priority seats. Then, most importantly, I presented and proved my opinions that more priority seats should be added to public transportation.
Guided by the pre-writing plan, she searched the reading texts cyclically for information: When drafting the first paragraph, I scanned through these texts to find out the positive and negative effects of priority seats. I read News 1 describing a 20-year-old pregnant woman who was abused because she was sitting on a priority seat. I think this piece of news is an example of the negative effects of priority seats . . . I reread it, added it to my text with different wordings.
Eye-tracking metrics also provided evidence for her effective reading. In both the L1 and L2 writing contexts, irrelevant AOIs occupied a minor proportion of fixation duration for Irelynne (6% in L1 writing and 15% in L2 writing), whereas two-side AOIs occupied the most (39% in L1 writing and 42% in L2 writing). Additionally, the high achiever fixated much longer on AOIs sharing similar views to hers than those expressing opposite ones in the two writing tasks. Such visual patterns suggest that Irelynne distinguished irrelevant from relevant information effectively and allocated attention selectively and purposively. She followed her set goals strictly throughout the writing processes, devoting her visual attention to source texts in a pattern consistent with her plan and attitude prior to reading.
Second, this high-performing participant also constantly monitored and revised language and content effectively. In the L1 writing task, she recalled a previous in-class writing experience and replaced the first-person subject of her speech composition with a third-person phrase: Here I deleted this sentence because of first person pronoun 'I'. My teacher once told us that you should try to avoid using the first-person point of view in speech writing because it may sound too subjective . . . replacing them with third-person is better.
From the simultaneously replayed eye-trace overlaid videos, we found that the fixation points of this participant moved backward to the beginning of the sentence "我来为大家说一说关于关爱座正反两方面的观点" (I will elaborate on the supporting and opposing viewpoints on priority seats). The sentence was then deleted and replaced by another with a third-person subject: "社会上对于关爱座有不同的声音" (There are different voices debating the issue of priority seats), as shown in the participant's essay Word document.
When doing the L2 source-based argumentative task, Irelynne reread what she wrote and added more information into the second paragraph after recognizing the lack of negative evidence for the topic: In the first half of this paragraph, I want to describe the negative effects of underage organ donation and transplantation. That is why people think children should not be allowed to donate organs. I read what I have written and find that it is insufficient to demonstrate the severity of the harm that donating organs may bring to children, so I switch back to Source 3 and attempt to get more valuable information. Then I found here . . . the short-term and long-term effects of donors . . . I read it quickly to figure out what it mainly says and add two more sentences . . . with my own words.
As the eye-trace overlaid videos show, Irelynne's eyes first moved through the corresponding writing text that she had just completed, jumped onto Source 3 quickly, paused on the second paragraph, then switched back to the essay Word document. Shortly after a series of eye movements, Irelynne continued her L2 text editing.
Kelvin: Different reasoning quality in L1 and L2 source-based writing

The L1 and L2 writing texts of Kelvin were similar in argumentative structure but considerably different in reasoning quality. The reasoning quality of the L1 paper was scored at 1.75 (weakly acceptable), while that of the L2 paper was scored at 0.67 (not acceptable). Moreover, the visual patterns contradicted the reasoning quality results of the two essays. Eye-tracking metrics showed that more than 70% of the fixations occurred on irrelevant AOIs, while only 9% occurred on pro-side AOIs for L1 writing. In contrast, the participant allocated visual attention effectively in L2 writing by attending more to con-side AOIs than to irrelevant ones. Considering Kelvin's reasoning performance and writing process together, we found that he was highly engaged in accessing source information with the help of personal knowledge in L1 writing but hindered by the use of source information in L2 writing: L1 and L2 writing are different. L1 reading texts are easy to grasp, so I read them quickly and use source information as I want . . . I can also give more accounts of my feelings, thoughts, and experiences. In the L2 writing task, I spend more time reading the texts and try really hard to use these texts.
Despite the ineffective allocation of visual attention in the L1 task, Kelvin (Participant 3) understood the source texts easily. Furthermore, personal knowledge and experience acted as a valuable resource for him to build written arguments, giving rise to a quarter of the data. However, while Kelvin relied entirely on the reading texts to generate data in the L2 task, he failed to utilize these sources accurately and effectively. Instead of paraphrasing, he mostly copied excerpts from the source texts, which undermined the reasoning quality: I make a long pause and have no idea about what to write. I read Source 4 and try to paraphrase it. I don't know how . . . so I copied the first two sentences here . . . maybe it is about the risks of being a living donor . . . I'm not sure.
Tony: Argumentation in L1 source-based writing but not in L2 source-based writing

Tony stated his opinions in the L1 source-based writing task, but not in L2. His L1 essay was complete in argumentative structure and marginally acceptable in reasoning quality, whereas his L2 essay only summarized the sources. The different argumentation performance in L1 and L2 source-based writing can be explained by the fact that Tony maneuvered source information successfully according to task requirements in L1 writing, whereas in L2 writing he struggled cognitively with the competition for attention between argumentation and source use.
As shown by the eye-tracking metrics, the reading texts elicited different visual patterns from Tony in the two writing tasks. In the L1 task, Tony fixated the most on two-side AOIs (59% of the fixation duration), and there was a clear contrast of visual attention between pro-side and con-side AOIs (19% of the fixation duration for con-side AOIs and 4% for pro-side AOIs). Such visual patterns suggested effective reading behavior for the source texts. According to the interview, although Tony found the L1 reading texts to be challenging because of the different stances presented, he had a clear task representation and self-regulated his compositional process successfully: The reading texts are quite hard, I think . . . There are six reading texts, and they all present different aspects of priority seats . . . so I need to organize and paraphrase what I have read efficiently before writing. I know it clearly that the writing task has two goals. One is to summarize the source information about priority seats, and the other is to offer my personal views. I wrote two paragraphs to meet the two goals respectively. When I reread Sources 2 and 4, I recognized that some information can be used to link the two paragraphs coherently. Then I added another paragraph between them to describe the differing effects that priority seats bring to people such as elders and pregnant women.
However, in the L2 writing task, Tony focused mostly on irrelevant AOIs (more than 60% of the time), and no contrasting visual attention between the two single-side AOIs was found (7% and 8% of the time on pro-side AOIs and con-side AOIs, respectively). This implied that the participant was weak in differentiating relevant from irrelevant source information and that no clear visual preference could be observed from the eye-tracking metrics. Integrating source information and establishing arguments were of great importance in this task. Because of the limited capacity of cognitive and L2 resources, writers attending mostly to one aspect would compromise performance on the other. In this case, understanding and integrating source information used up most of Tony's time and cognitive resources, leaving scarcely any for building arguments, thus causing the absence of the writer's views. As he reported in the interview: I repeatedly read Sources 3 and 4 to access information about the effects of living organ donation. I read Source 4, Source 3, and also Source 2. However, I only get something from the table in Source 2 . . . I think it shows the death toll of organ donation. Most of my time is spent on these source texts. I worked so hard to understand them while I performed badly in constructing and integrating their key points . . . no time to organize my personal views, so I choose a neutral position to use source information as much as possible. It is a strategy that helps me complete the writing task.
Mark: Lack of argumentation in both L1 and L2 source-based writing

Mark summarized two sides of the assigned topics in both L1 and L2 source-based writing. However, his eye-tracking metrics showed otherwise. Visual attention was unevenly distributed between the two groups of single-side AOIs in both writing tasks, suggesting his visual preference for a particular side. Specifically, the participant focused much longer on the con-side AOIs of L1 source texts (50% of total fixation duration and total visit duration) and the pro-side AOIs of L2 source texts (33% and 35% of total fixation duration and total visit duration, respectively). However, these visual patterns did not correspond with a clear position in the two written products. Further analysis of the interview data unveiled different reasons for the mismatch between visual patterns and written representations.
First, collectivist thinking invoked a frequent interplay of personal experience and source information in L1 writing. From the interview transcripts of Mark, we found that personal experience and source information frequently intersected in his L1 writing. Inspired by the prevailing collectivist thinking in Chinese culture, he preferred to consider the issue of "priority seats" as a double-edged sword: As reported in the news of Source 2, priority seats have triggered a public debate over the years. I take the subway to school every day, so I have relevant experience. I always see a few empty seats with people standing around them. They are not willing to take the seats. Young people will be criticized for taking these seats, and old people may be mistaken to be taking advantage of their seniority. But as the Chinese saying goes, everything has its pros and cons. It is hard to decide on this controversial issue, so . . . better to give a balanced summarization of both sides.
As the video clips showed, this participant fixated for a while on Source 2, which contained competing information, then scanned through the rest of the source texts quickly, then brought Source 2 back into view. He repeatedly switched between Source 2 and the essay Word document and wrote "关爱座在亚洲地区十分普遍, 它的原意是关心弱势群体, 一部分人都认为这是种传统美德的传承, 到了现在也有人认为关爱座的原意已渐渐变质" (Priority seats are a common phenomenon in Asia. They were created to care for groups in need. Some people see them as continuing a legacy of traditional virtues, while others find that their original meaning has been lost).
Second, a neutral position was chosen in L2 writing due to time constraints. For the L2 writing task, this participant encountered great difficulties in understanding the reading texts accurately and gave up on developing his personal opinions. Summarizing both sides of the debate was used as a compromising strategy to complete the writing task within the time limit: I seldom write in English, so I don't know how to compose this paper. I focus on Sources 2 and 3 since they are relatively easy for me to understand. I spent most of my time reading the two sources and tried to find some useful information, but I failed. I just wrote down whatever came into my head . . . weighing both the pros and cons, to reach 400 words. I know my paper only partly conforms to the task requirements, but I have no time . . . to take a side.
Coinciding with the participant's report, the video clips showed him beginning to write something down after scanning frequently between Sources 2 and 3. During the subsequent compositional process, the participant repeated the pattern of scanning a source text and writing a few lines until he had completed his essay.

Discussion and Implications
The present study zoomed in on six Hong Kong secondary students and explored their argumentation behavior in L1 and L2 source-based writing. Findings of this study may promote the sustainability of L1 and L2 writing development in terms of learning, teaching and assessment. In general, students performed better in L1 source-based argumentative writing than in L2 in both argumentative structure and reasoning quality. In contrast to the "myside bias" found in previous studies [49], the present study provided some evidence that secondary school students are able to acknowledge and refute alternative viewpoints in argumentative writing when provided with various source texts. This suggests that well-designed source materials may enrich students' written argumentation from a structural perspective. When drawing on the source texts, however, the participants tended to refute opposing views directly without first specifying and evaluating them. It should be noted that elaborated counterargumentation has a marked impact on the persuasiveness of argumentative essays [8] and significantly relates to the overall writing quality [4,24]. Hence, language teachers should pay special attention to the explicit teaching of counterargumentation to prevent students from falling into the trap of arguing against what is presented in source texts.
Another key determinant of the quality of written argumentation is the data in support of claim elements. The assertion made in an argumentative essay will not be compelling if it is not reinforced by a solid evidential basis. However, the participants' reasoning performance in this study was far from satisfactory, especially in the L2 context. Instead of stating a clear position, several participants summarized the two sides of the given issues, especially in their L2 argumentative writing. Unlike the myside bias reported in previous research [4,49], this study found that the quality of written argumentation in Chinese secondary students is also undermined by the absence of the writer's own position. Effective argumentation involves weighing alternative perspectives to support the writer's final stance [50] rather than purely summarizing competing information from sources. Hence, students should be helped to become conscious of the rhetorical purposes of the given writing tasks, which is a key component of task representation influencing their writing processes and performance [51]. Collaborative classroom activities such as group feedback can be used to encourage students to voice their opinions and write critically with supporting evidence. Similar to Stapleton and Wu [3], this study also found an inconsistency between the number of data elements and the reasoning quality in students' argumentative essays. This finding implies that students need more explicit guidance on how to reason effectively rather than general suggestions about providing more evidence to support their views. Special instruction to develop students' knowledge of this genre is also advised to enhance their sustainable ability in argumentation and writing [52].
Source use makes a unique contribution to the construction of written argumentation. However, its role might be moderated by language context, as its effects differed between L1 and L2 writing in this study. For ESL learners, the provision of source materials may add an extra burden to L2 writing because source use and argumentation compete for attention, as suggested by the cases of Kelvin and Tony. Given the limited attentional resources, argumentation performance may be undermined when most attention is allocated to the translation and summarization of source texts [15,53]. Especially in L2 source-based writing, students in this study relied heavily on the reading texts and less on listening materials and their own knowledge, which likely resulted in the failure to express their personal opinions. Similar to previous studies [36,54], students may copy the source texts frequently to fulfil the L2 writing tasks owing to a lack of language proficiency, which then weakens the reasoning quality. However, source texts played a less important role in the L1 writing task because participants were able to access prior knowledge and experience successfully. Differential reliance on source texts might widen the gap in reasoning quality between L1 and L2 essays.
The eye-tracking data showed that the participants who had a clear stance attended more to AOIs with supportive information for their viewpoints. This visual preference can be explained by the mechanism of "selective exposure to information," especially when the viewpoint is strongly held [55]. It means that people tend to search for myside information while neglecting the information that does not agree with their pre-existing attitudes [56]. A few participants focused the most on irrelevant information, which suggested poor intertextual comprehension and integration. Therefore, more explicit instruction in appropriately attending to different sources is also needed when teaching argumentative writing [50]. Additionally, the facilitative effects of self-regulation are evident in both L1 and L2 argumentative writing. Argumentative writing is a problem-solving process requiring self-regulation to better achieve the goal of persuasion [57]. As Zimmerman [58] argued, "self-regulated learners plan, set goals, organize, self-monitor, and self-evaluate at various points during the process of acquisition". Students need instructional support for self-regulation to overcome the difficulties in drafting argumentative essays [59]. In this study, the high achiever Irelynne benefited from self-regulation when building arguments in both argumentative writing tasks. These processes enabled the writer to allocate limited visual attention to source texts effectively and to select and adjust source information for better written argumentation. Arguably, if the goal is to foster students' argumentative skills in source-based writing, students will need time and opportunities for planning and monitoring their writing. From an instructional perspective, language teachers should encourage students to allocate more time to analyzing and weighing the controversial issues at hand before writing so that they may express their personal opinions more successfully.
Finally, argumentation in writing is also influenced by individual characteristics. One learner feature revealed in this study is cultural orientation, which affirms that writing is not simply a cognitive task but also a culturally shaped product. The fundamental philosophy of Chinese cultural values is collectivism, in which harmony and deference are respected [60]. Therefore, Chinese students like Mark tend to be nonaggressive and weigh the benefits against the costs of assigned topics, especially in L1 argumentative writing. Focusing on both products and processes of written argumentation is expected to sustainably guide students to write reasonably, logically and convincingly.

Conclusions and Limitation
Theoretically, this study was an exploratory investigation of students' argumentation behavior in terms of surface structure, underlying reasoning and source use in L1 and L2 writing. Methodologically, it was an innovative attempt to examine the complex interactions involved in written argumentation processes by triangulating data from eye-tracking (i.e., quantitative eye-tracking metrics and qualitative visual videos), stimulated recall interviews and writing tasks. It gave a closer look into the possible interactions between individual, cognitive, and contextual factors underlying L1 and L2 source-based written argumentation. More specifically, factors such as source texts, prior attitude, self-regulation, cultural orientation, and time limit may interact with each other to influence argumentation in source-based writing, which warrants further investigation. However, we acknowledge that there are limitations in this study. Firstly, the potential to generalize the results was extremely restricted because of the small sample size. A larger sample of writers from different demographic and educational backgrounds is needed for broader claims to be made. Secondly, the AOIs should be further controlled to generate more valid insights into reading and writing behavior at the sentence level or lower. Thirdly, although this study traced the data elements back to the source texts from which they came, the ways in which these data elements interact with other structural elements remain unknown. Future studies can explore the use of source information in the construction of written arguments in more detail to better support the growth of L1 and L2 writing and the attainment of sustainable academic achievements.

Informed Consent Statement: Informed consent was obtained from all subjects involved in this study.
Data Availability Statement: Data is not publicly available, though the data may be made available on request from the corresponding author.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A

Table A1. Illustrative excerpts for the Toulmin argument elements.

| Argument Elements | Illustrative Excerpts |
| --- | --- |
| Claim | I agree with the view that teenagers are not allowed to take organ donation and transplant. |
| Data | The donors will be confronted with health risks since no surgical procedure is 100% safe. |
| Counterargument claim | Teens who are about 18 years old should be given the right to be a liver donor for their dying parents. |
| Counterargument data | The seventeen-year-old Hong Kong teen Michelle is willing to try all means to save her dying mother who is suffering from acute liver failure. |
| Rebuttal claim | Although underaged people are willing to donate their organs, it is inevitable that they will encounter with more health risks. |
| Rebuttal data | As the news says, organ transplant will bring the risks of infection from the surgery and then hurt the immune system. |
Appendix B