Sustainable Development of EFL Learners’ Research Writing Competence and Their Identity Construction: Chinese Novice Writer-Researchers’ Metadiscourse Use in English Research Articles

Abstract: English as a foreign language (EFL) novice writer-researchers face increasing pressure to publish internationally as a prerequisite for sustainable career development in academia. The use of metadiscourse, as a key indicator of their discourse competence, has been a subject of research for English for Academic Purposes (EAP) and/or English for Specific Purposes (ESP) scholars. This study investigates metadiscourse features of the results and discussion (R&D) sections of research articles (RAs) written by Chinese PhD students and the writer identities reflected through their metadiscourse choice. A corpus was built, consisting of a subcorpus of R&D sections of unpublished RAs written by Chinese PhD students (CNWs) and one of the same part-genre by English-speaking expert writers (EEWs). Metadiscourse markers used by the two groups were identified based on Hyland’s interpersonal model of metadiscourse. Quantitative analyses of the frequency and variety of metadiscourse markers found a significant difference not only in interactional metadiscourse but also in some subcategories of interactive and interactional metadiscourse, indicating that CNWs attach more importance to the organisation of ideas than to the persuasiveness of arguments. A questionnaire survey was conducted to explore the influence of the CNWs’ perception of RA writing on their metadiscourse choice. It revealed that knowledge of generic conventions and metadiscourse functions, awareness of the writer–reader relationship, and confidence in language competence may influence metadiscourse choice. The paper concludes that the CNWs generally view themselves as recounters and reporters of their research, remaining conservative when presenting an authoritative voice and a confident identity as a knowledge creator.


Introduction
English as a foreign language (EFL) novice writer-researchers face increasing pressure to publish internationally as a prerequisite for sustainable career development in academia. They need to be more cautious than English-speaking expert writers when looking for appropriate linguistic resources and rhetorical strategies to unfold new knowledge through texts as knowledge creators on the one hand and negotiate existing discourses as disciplinary community members on the other [1]. Metadiscourse contributes a great deal in these two aspects [2], thus attracting increasing attention from scholars in English for Academic Purposes (EAP) and/or English for Specific Purposes (ESP) writing. Metadiscourse is defined by Hyland as "aspects of the text which explicitly refer to the organisation of the discourse or the writer's stance towards either its content or the reader" [3] (p. 109). It plays the roles of organising the text, evaluating findings, engaging readers, and manifesting research significance in ways "that are meaningful and appropriate to a particular disciplinary community" [4] (pp. 438-440).
Although there are numerous studies on metadiscourse features in research articles (RAs) written by L1 writers, including those on disciplinary variations [5,6], cross-cultural variations [7,8], or metadiscourse features of a particular category [9,10], studies targeting RAs written by L2 novice writers are scarce [11]. EFL writers have been found to be inadequate in deploying metadiscourse resources [12-14]. There are several reasons for this. First, the use of metadiscourse can be quite flexible, involving a wide range of linguistic devices and serving a variety of functions [2]. Second, metadiscourse use often varies across cultures, and unawareness of such cross-cultural differences may lead to ineffective communication with potential readers of scholarly publications [15]. In addition, metadiscourse is often neglected in research and writing courses, which tend to focus on the lexico-grammar of propositional ideas rather than on rhetorical strategies for the interpersonal aspects of writing [8]. Therefore, research into the metadiscourse use of EFL writers is much needed to provide EAP/ESP instructors and novice writers with evidence and new perspectives on metadiscourse features in this research genre.
Previous work investigating metadiscourse features in different RA sections has mainly focused on the abstract [9] and introduction [11], yet results and discussion (R&D) remains an under-researched part-genre. This study focuses on the combined results and discussion sections of an RA because of their critical role in establishing the significance of the study based on research findings [16,17]. R&D is where writers demonstrate their ability to think critically about their own research and that of others, and to convince readers of the value of the study [18]. Metadiscourse use that does not conform to the purpose of this part-genre in a particular discipline may weaken the stimulating and convincing strength of the argument.
One aspect of RA metadiscourse features in EFL writing that remains underexplored is their relationship with writer identity. RA production involves the writer's conscious or unconscious construction of a writer identity by assuming the roles of both researcher and discourse constructor [19]. Studies reveal that EFL writers encounter challenges not only in deploying linguistic and rhetorical resources [12,20], but also in developing a scholarly voice [1,21,22].
This study aims to explore novice writers' metadiscourse features and identity construction, focusing on Chinese engineering PhD students. This group of writers are generally required to publish at least 2-3 RAs in English-medium SCI-indexed journals in order to graduate, yet they often suffer rejections from targeted journals due to a lack of competence in the academic genre. Although universities in China have recently begun to incorporate research writing instruction into the graduate EAP curriculum, such instruction is often provided by inexperienced instructors who are still in the process of learning genre-based pedagogies [8]. They mainly focus on basic, conventional rules of research writing while often neglecting lexico-grammatical features and the relationship between these features and rhetorical purposes. Thus, it is important to investigate in what areas and to what extent the metadiscourse used by Chinese EFL writers of RAs is inadequate and how their metadiscourse use reflects their writer identity, so as to facilitate genre-based EAP/ESP writing instruction.
The present study aims to achieve the following objectives: (1) To explore, through a corpus-based approach, metadiscourse features of the R&D of unpublished RAs written by Chinese PhD engineering students by comparing them with those displayed in the same part-genre by English expert writers in engineering; (2) To explore, through a questionnaire survey, whether Chinese novice writers' choice of metadiscourse is a result of their perception of RA writing; (3) To explore how metadiscourse choice by these novice writer-researchers reflects the construction of their writer identity. Thus, the research questions are: (1) What are the metadiscourse features of the R&D of RAs written by Chinese novice writer-researchers? (2) Is Chinese novice writers' choice of metadiscourse a result of their perception of RA writing? (3) How does metadiscourse choice by these novice writer-researchers reflect the construction of their writer identity?
By combining a corpus-based approach with qualitative inquiry, this study hopes to offer insightful findings on the relationship between metadiscourse use and writer identity construction.

Metadiscourse in RAs
The role of metadiscourse in "facilitating communication, supporting a writer's position and building a relationship with an audience" has been increasingly recognised in research writing [4] (p. 438). Successful research writers utilize a wide range of metadiscourse resources to achieve informational clarity on the one hand and guide readers to desired interpretations and a manifest stance on the other [23]. By doing so, they also strive to achieve "a balance between objective information, subjective evaluation and interpersonal negotiation" [24] (p. 294) so as to meet discourse community expectations and to gain acceptance for their statements [7].
Studies have tended to adopt a corpus-based approach to explore metadiscourse features in RAs, normally by comparing RAs with other genres, across disciplines, and across cultures. Swales [25] and Bunton [26] found that RAs engage in less metadiscourse than dissertation writing. More specifically, Kawase, by comparing various types of metadiscourse in the introduction section of RAs and PhD dissertations, revealed that RAs tended to be less explicit in their exposition than dissertations, and speculated that variations in metadiscourse use between these two part-genres may be attributed to different writer-reader relationships [11].
It is generally agreed that metadiscourse in RAs is "socially authorised and contextually constrained by the disciplinary communities in which it occurs" [4] (p. 448). Numerous studies have found differences across disciplines in the use of interactive metadiscourse (e.g., [5,27]) and interactional metadiscourse (e.g., [28,29]). Cao and Hu [5] and Hu and Cao [6] found differences in the use of several interactive and interactional metadiscourse resources between the natural and human sciences and attributed such differences to the different knowledge-knower structures of the two fields. Hyland observed lower frequencies of interpersonal metadiscourse in hard disciplines than in soft ones and speculated that there was a reluctance in the hard disciplines "to project a prominent authorial presence in presenting claims to their target community" [4] (p. 448).
Cross-cultural and cross-linguistic differences in metadiscourse use between RA writing in English and other languages are also manifested, since people from different cultures may assign different roles to writers, readers, and the text, thus using metadiscourse in different ways [2,7,23,30]. It is also revealed that, although metadiscourse elements are similar in texts written in different languages, strategies used to realize interpersonal functions seem to be "partly influenced and constrained by the lingua-cultural contexts in which texts are produced and consumed" [23] (p. 50). Comparative studies between English RAs and Chinese RAs have revealed that the former tend to use more interactional metadiscourse features to involve the audience in the text [7,31]. Writers from English-speaking cultures tend to adopt a more active role in making their ideas organized and explicit for readers to understand, while Chinese writers are found to be more implicit in expressing stance and engaging readers, passing the responsibility for interpreting the text to readers [8]. Li and Xu revealed that, compared with English RA writers, Chinese writers preferred metadiscourse markers for commenting on text to those for interacting with their readers, which contributed to "a more objective and detached style" in Chinese RAs than in English ones [8] (p. 54).

Metadiscourse in EFL Academic Writing
EFL novice writer-researchers' metadiscourse use is a key indicator of their discourse competence and has been a subject of research for EAP/ESP writing scholars. Some studies have focused on the struggles that EFL writers encounter in the process of RA writing [22,32], while others concern the ability of these writers to convey a clear position of the "author" on their arguments and to engage the reader directly in the texts [33,34]. EFL learners have been found to differ from L1 English writers in the frequency and purpose of the metadiscourse resources they use [30,35]. For instance, Lee and Deakin found that EFL academic essays generally contained fewer interactional metadiscourse markers than L1 English ones and were thus less persuasive [35]. Hyland found that EFL writers had problems with both the type and range of metadiscourse markers, affecting the persuasive effect of their argument [4]. Specifically, EFL student writers were less strategic in stance-taking for establishing persuasive argumentation [36]. Zhang and Zhan suggested that EFL learners' voice development may be influenced by dynamic changes in culture, language, and education in a globalized society [37].
Metadiscourse use has been found to affect the effectiveness of writing. RAs written by high-rated L2 writers often contain more interactional metadiscourse markers, especially hedges, attitudinal markers, and engagement markers, since such expressions help authors establish good relationships with their potential readers [38,39]. The writing of successful EFL writers may display less reliance on interactive resources, a balanced use of interactional resources, and an increased range of metadiscourse markers, indicating a greater dialogic sense of interaction and audience involvement in advanced EFL learners' writing [40,41]. Successful writers also use a greater variety of metadiscourse resources than less successful ones, who tend to be over-reliant on high-frequency text connectives [41,42].
Despite a proliferation of studies on the metadiscourse features of English writing by Chinese EFL learners, most have focused on academic writing by undergraduates. Given the cross-genre differences in metadiscourse use, such studies may not provide direct evidence or implications for novice writer-researchers. Studies on RAs written by Chinese writers mainly address published Chinese RAs in soft disciplines (e.g., applied linguistics [7], sociology [8]), while those on English RAs written by Chinese novice engineering researchers are rare. Considering that engineering researchers in China generally receive less formal training in research writing and are often under more pressure to publish internationally than those in soft disciplines, the writing of this group calls for more attention. In addition, most findings on the factors influencing metadiscourse use are postulations based on quantitative results, which lack support from writers' own reflections. It would be more illuminating if a qualitative inquiry were undertaken to explore how novice writers perceive RA writing and how such perceptions influence their decision-making about metadiscourse use.

Academic Writer Identity
According to Ivanič, "writing is an act of identity in which people align themselves with socio-culturally shaped possibilities of self-hood" [19] (p. 32). The social constructivist view sees writing as a social engagement in which writers interact with their potential readers, not only to convey messages, but to facilitate understanding [4]. In this process, writers also construct their own personality through their presentation of self and the way they explain their actions [28]. This means that writers adopt an appropriate identity through their choice of words to present ideas in ways that make sense to their readers and are acceptable by the disciplinary communities they belong to [43].
Despite the existence of disciplinary conventions, it is provincial to regard writer identity as fixed. Instead, writer identity is constantly changing due to the reality of heteroglossia, reflected by the dynamic, dialogic interactions between writers, texts, and the prospective audience [36]. Writer identity is not mono-faceted either, for different aspects of identity are constructed simultaneously in a text through lexico-grammatical choice [31]. Ivanič establishes a framework of writer identity, which contains four interrelated aspects: Autobiographical self (i.e., what a writer brings to the discourse from his past experience), discoursal self (i.e., a writer's self-representation constructed through discourse characteristics), authorial self (i.e., a writer's presence or authoritativeness displayed in the text), and possibilities for self (i.e., possible identities available in the socio-cultural context of writing) [19]. Among these aspects, discoursal self and authorial self are closely related to discourse characteristics, such as metadiscourse, and are thus the focus of this study. In RAs, discoursal identities are "constructed through the discourse characteristics of a text that reflect values, beliefs and power relations in the social context in which they were written" [19] (p. 25) and are associated with how writers are inclined to play various roles by choosing certain linguistic features. Authorial identity concerns how writers present self and establish authority for their claims in their writing in the sense of their position, opinions, and beliefs [19] (p. 27).
In RA writing, the presentation of discoursal and authorial self is reflected in several roles that the writer takes by taking into consideration the contextual factors, including the nature of his research, discipline conventions, his power relationship with readers, and his sense of self [31]. The roles of a writer-researcher can be classified into six types: Conveyor of general knowledge, guide or navigator, conductor of research, evaluator of previous claims, originator of claims, and reflexive self [19,44]. These roles display an ascending order of authority, with knowledge conveyor being the least authoritative and reflexive self being the most.
A number of studies on EFL writing have investigated writers' identity construction process by exploring specific metadiscourse markers. For instance, studies on authorial identity have revealed that English L1 speakers are more ready to present themselves explicitly in their RAs through relatively frequent use of self-mentions or stance and engagement markers, while writers of other languages display a covert author presence by avoiding the use of these metadiscourse markers, especially self-mentions [44,45]. Lee and Deakin, by exploring interactional metadiscourse in argumentative essays, found that Chinese EFL students were unwilling to construct an authoritative writer identity in their writing [35]. Similarly, Wu found that Chinese RA writers used stance and engagement markers less frequently than English RA writers, suggesting that the former were less willing to present a voice or community-recognized personality [31]. In contrast, Geng and Warton, by investigating engagement resources, refuted the view that Chinese students were reluctant to critique in order to preserve and maintain public image [12]. These inconsistent findings point to the need to further explore the relationship between metadiscourse choice and identity construction, so as to provide more empirical evidence on this relationship.

Framework of Metadiscourse Markers
An interpersonal model of metadiscourse by Hyland [2] was adopted for data analysis in this study, since many recent studies [6,7,35] have shown this model to be reliable for studying metadiscourse in research genres and it reflects the latest development in the methodology of metadiscourse analysis. The model classifies metadiscourse into two macro-types: interactive metadiscourse (Table 1) and interactional metadiscourse (Table 2). Interactive metadiscourse refers to resources that "manage the information flow to explicitly establish his or her preferred interpretations" [3] (p. 138). It contains Transitions, Frame Markers, Endophoric Markers, Evidentials, and Code Glosses. Interactional metadiscourse concerns "the writer's efforts to control the level of personality in a text and establish a suitable relationship to his/her data, arguments, and audience, marking the degree of intimacy, the expression of attitude, the communication of commitments, and the extent of reader involvement" [3] (p. 141). It consists of Hedges, Boosters, Attitude Markers, Self-mentions, and Engagement Markers. Each category is further classified in terms of more nuanced functions, following [5,6,23]. However, minor modifications were made in our study when classifying and identifying metadiscourse markers. First, instead of classifying Hedges into writer- and reader-oriented types as Lee and Casal [23] did, we classified them into the subcategories of Probability, Inference, Approximate quantity, Approximate frequency, Approximate degree, and Approximate limitation, because this classification more clearly reflects the salient functions of hedging expressions. Second, the criteria proposed by Hyland [2] for identifying various types of metadiscourse markers were slightly altered.
For example, Attitude Markers in our study included not only expressions concerning the writer's attitude towards his/her proposition (Example 1), but also those appraising the writer's or others' studies, which demonstrates the writer's effort to assert the value of his/her study (Example 2).

Example 2.
In this study, all Cu based samples showed better field emission results, . . . , compared with their corresponding CuO-based samples. (EEW49).

Corpora Construction
To study the metadiscourse features of Chinese novice writers (CNWs), we compared the results and discussion (R&D) of unpublished English RAs by Chinese PhD students with the same section of published RAs by English expert writers (EEWs). The unpublished RAs were collected from the writing center of a top Chinese university during 2018-2020. They were written by Chinese engineering PhD students of this university, who had all passed the university's English proficiency test for PhD students and can be considered intermediate-advanced EFL learners. They submitted their RA drafts to the writing center for revision in the hope of reaching the standard of their targeted international journals. The selection of RAs for the present study observed three criteria: (1) Each selected RA should have the conventional IMRD structure (though subtitles for each section may be different), with an integrated R&D section; (2) Each selected RA should be linguistically comprehensible, with grammatical errors and inappropriate uses not severely interfering with the reader's understanding of the RA; (3) The student-writer of each selected RA should have received little formal training in RA writing and have little experience in publishing RAs (as revealed in their application form for RA revision). Fifty RAs were selected, producing a CNWs' R&D corpus of 73,165 words. For comparison, a parallel EEWs' R&D corpus of published RAs was built, containing 92,584 words. These RAs were selected by the Chinese PhD students from international journals with high impact factors (JCR Q1 or Q2) to which they intended to submit their RAs. They were similar to the selected unpublished RAs in discipline, research topic, and generic structure, and were written by renowned scholars who are either L1 English speakers or affiliated with English-speaking institutions. A description of the two subcorpora is given in Table 3.

Analysis of Corpora
A word or a chunk of words expressing a metadiscursive function serves as the unit of analysis. Expressions with similar patterns, such as Table 1 and Figure 1, were counted as the same metadiscourse marker. Following the classification in Section 3.1, all the metadiscourse markers were manually identified and annotated, since metadiscourse is highly contextual and a particular expression can serve different functions. For example, the modal verb would serves a metadiscursive function when it refers to the result or effect of a possible situation, but is not counted as a metadiscourse marker when it is used as the past form of will to project a future happening in the past. Another example is the conjunction and, which serves a metadiscursive function when connecting two clauses to indicate a clausal relationship, but not when connecting two words or phrases.
The two authors first identified metadiscourse markers of each subcategory in an R&D sample together against the definitions given by Hyland [2] (pp. 48-54) to make sure they understood and agreed on the coding criteria. Any inconsistencies during the coding process were discussed. They then annotated the data individually. Intercoder reliability was measured and a coding reliability coefficient of 0.82 was obtained, indicating a relatively high level of agreement between coders on the categorization of metadiscourse markers.
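The agreement check described above can be illustrated with a short sketch. The source does not name the reliability coefficient that produced 0.82, so Cohen's kappa is used here purely as an assumption, being one common choice for two coders assigning categorical labels. The function name and the sample labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders labelling the same items.

    Assumption: the paper does not say which coefficient it used;
    kappa is shown only as one plausible option.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both coders must label the same non-empty item list")
    n = len(labels_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each coder's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of four candidate markers by two coders.
coder_1 = ["Hedge", "Booster", "Hedge", "Booster"]
coder_2 = ["Hedge", "Booster", "Hedge", "Booster"]
kappa = cohens_kappa(coder_1, coder_2)
```

Kappa corrects raw percentage agreement for the agreement two coders would reach by chance, which matters when a few categories, such as Transitions, dominate the annotations.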
Quantitative analyses of the coded metadiscourse markers were then conducted. The annotated text files were uploaded to AntConc (v. 3.2.4) for metadiscourse frequency counts. Identified metadiscourse markers of each subcategory were counted to reveal the writers' metadiscourse repertoire and preference. The counts were then normalized to 10,000 words for comparison between the subcorpora. Chi-square analyses, a non-parametric test commonly used in corpus research, were undertaken to determine any statistically significant differences in the frequency of metadiscourse categories between the two subcorpora. The significance level was set at 0.05 for all statistical tests.
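A minimal sketch of the two computations named above, normalization to 10,000 words and a chi-square comparison of marker frequency between two subcorpora, is given below. The subcorpus sizes are those reported for the CNW and EEW corpora; the raw marker counts are hypothetical placeholders, and the 2 x 2 Pearson chi-square without continuity correction is an assumption, since the source does not state which variant was run.

```python
import math

CNW_WORDS = 73_165  # CNW subcorpus size reported in the study
EEW_WORDS = 92_584  # EEW subcorpus size reported in the study


def per_10k(count, corpus_size):
    """Normalize a raw marker count to occurrences per 10,000 words."""
    return count / corpus_size * 10_000


def chi_square_2x2(count_a, size_a, count_b, size_b):
    """Pearson chi-square (df = 1, no continuity correction) comparing
    marker tokens against all remaining words in two corpora."""
    observed = [
        [count_a, size_a - count_a],
        [count_b, size_b - count_b],
    ]
    total = size_a + size_b
    row_totals = [size_a, size_b]
    col_totals = [count_a + count_b, total - (count_a + count_b)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical raw counts of one metadiscourse category (not from the paper).
cnw_count, eew_count = 1_200, 1_100
cnw_rate = per_10k(cnw_count, CNW_WORDS)
eew_rate = per_10k(eew_count, EEW_WORDS)
chi2, p_value = chi_square_2x2(cnw_count, CNW_WORDS, eew_count, EEW_WORDS)
significant = p_value < 0.05
```

Testing on raw counts against corpus size, rather than on the normalized rates, keeps the test sensitive to the actual amount of data in each subcorpus.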

Questionnaire Survey
An online questionnaire was also administered to the CNWs, aiming to explore their perception of metadiscourse in RAs' R&D. The questionnaire focused on perceptions of RAs' R&D functions, writer-reader interactions, and metadiscourse use. It contained two open-ended questions and 10 Likert-scale questions. The first open-ended question asked for their views on the functions of R&D in RAs. The second asked whether they would consider the potential readers when organising the information of their RAs and in what aspects they would adjust their RA's R&D in anticipation of readers' needs. The 10 Likert-scale questions measured how often the CNWs think each category of metadiscourse should be used in RAs' R&D. A brief definition and a few typical examples of each category were given to make sure each respondent understood the terms. Each question contained 6 options: 0 stood for "I have no idea whether to use it", 1 for never, 2 for rarely, 3 for occasionally, 4 for quite often, and 5 for always. The respondents were also asked to provide reasons for their choice. The questionnaire link was sent to the CNWs after their RAs were selected into the subcorpus, and they were asked to complete the survey before a deadline. Eventually, we collected responses from 47 students.
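One way the six-option items could be summarised is sketched below. The source does not describe how the responses were aggregated, so treating option 0 ("I have no idea whether to use it") as a separate share rather than as the lowest point of the frequency scale is an assumption, as are the function name and the sample responses.

```python
from statistics import mean

# Response options as defined in the questionnaire.
LIKERT_LABELS = {
    0: "no idea whether to use it",
    1: "never",
    2: "rarely",
    3: "occasionally",
    4: "quite often",
    5: "always",
}


def summarise_item(responses):
    """Summarise one Likert item.

    Assumption: option 0 is reported separately and excluded from the
    mean, because it signals uncertainty rather than a frequency.
    """
    if not responses:
        raise ValueError("No responses to summarise")
    if any(r not in LIKERT_LABELS for r in responses):
        raise ValueError("Responses must be integers from 0 to 5")
    rated = [r for r in responses if r != 0]
    return {
        "n": len(responses),
        "no_idea_share": (len(responses) - len(rated)) / len(responses),
        "mean_rating": mean(rated) if rated else None,
    }


# Hypothetical responses to one item (not data from the study).
summary = summarise_item([0, 4, 5, 3])
```

Keeping the "no idea" share apart from the mean makes it possible to distinguish a category that respondents think should rarely be used from one they simply do not know how to use.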

Findings of the Corpus Study
To answer the first research question, i.e., what are the metadiscourse features of the R&D of RAs written by Chinese novice writer-researchers, we compared the frequency and variety of metadiscourse markers of each subcategory used by the CNWs with those of the EEWs. The numbers of interactive and interactional metadiscourse markers in the RAs' R&D sections by the CNWs and the EEWs were normalized to per 10,000 words (Table 4). To compare the two groups' salient patterns of metadiscourse choice, and to explore the metadiscourse repertoire of the CNWs, all subcategories and the metadiscourse markers under each subcategory used by the two groups were ordered by frequency. Overall, our study found that the total frequencies (per 10,000 words) of metadiscourse markers used by the two groups were very close, and Chi-square analysis showed no significant difference between them (p > 0.05). However, a significant difference was found in both interactive and interactional metadiscourse between groups. The CNWs used significantly more interactive metadiscourse markers than the EEWs but significantly fewer interactional ones (p < 0.05, Table 4). The results seem to indicate that, like other EFL writers, the CNWs attach greater importance to guiding readers through texts, but are less sensitive to the need to involve readers in their texts and communicate with them. Both groups used interactive metadiscourse at a higher frequency than interactional metadiscourse. This could be attributed to the particularity of RAs in engineering disciplines: the major task of authors is to guide the reader through the text before they can represent themselves to connect with their readers [7].
In general, the CNWs used significantly more interactive metadiscourse markers in their RAs' R&D sections than the EEWs (Table 5). Among the five categories, both groups used Transitions most frequently, followed by Endophorics, Frame Markers, Code Glosses, and Evidentials. Between the two groups, the CNWs used more Transitions and Endophorics, with a significant difference in Transitions, while the EEWs used the other three categories more frequently.

Transitions
Transitions was the most frequent subcategory of interactive metadiscourse in both subcorpora, accounting for over one third of the total interactive metadiscourse markers. Among the five categories of interactive metadiscourse, it was the only one that the CNWs used significantly more frequently than the EEWs (p < 0.0001). In contrast to previous findings [7,8] that Chinese L1 writers of RAs normally used significantly fewer Transitions than English writers, our study found that the CNWs showed a stronger preference for Transitions than the EEWs when writing the R&D sections of English RAs (Table 6). All three categories of Transitions saw higher occurrences in the CNW subcorpus than in the EEW one. Specifically, the CNWs used more than double the Additions that the EEWs used, especially for presenting results, while the EEWs preferred Contrasts, which were mainly used to compare their studies with other studies.
The CNWs seemed to be overdependent on explicit cohesive devices to realize discoursal relations while neglecting the role of other discourse strategies, such as the logical arrangement of information, in building coherence. An extreme case of such overuse was that some CNWs redundantly used paratactic or progressive transitions to lead every single sentence of a paragraph, as in Example 3.

Example 3.
The elastic-like property of PS/CeO2 makes them deformed when stressed by the polishing pressure . . . . Meanwhile, as PS/CeO2 can well adapt to pad asperities, a lower and more uniform contact stress is formed on the wafer surface, making the wafer removed gently and uniformly. Moreover, the lower contact stress will induce a lower mechanical damage on the wafer surface. Also, the buoyant effect of PS/CeO2 is helpful to improve polishing quality, because it makes PS/CeO2 difficult to precipitate and agglomerate. Therefore, PS/CeO2 can contribute to better polishing quality compared with CeO2. (CNW32).
In Example 3, Additions such as meanwhile, moreover, and also were used as the starting point of every sentence in order to highlight the parallel or progressive relationship between adjacent sentences. However, such use seems awkward when the relationship between clauses is quite obvious or when other relationships are more appropriate, e.g., the cause-and-effect relationship between the first two sentences.
The variety of Transitions used by the CNWs was similar to that of the EEWs. However, some Transitions used by the CNWs seemed to be inappropriate. In Example 4, whereas is mistakenly used as an adverb rather than a conjunction to indicate a contrast. The CNWs also overused sentence-initial and, but, and so, markers generally preferred in oral rather than written English, to show additive, contrastive, or causal relations between clauses. Such inappropriate uses suggest that the CNWs have inadequate knowledge of how Transitions are used in academic genres.

Example 4.
For gap width, whereas, the most probable reason for the great influence on spark discharge energy was the volume of the plasma tunnel increases with the increase of gap width. (CNW27).

Endophoric Markers
Compared with other sections of RAs, the R&D section normally sees a much higher density of Endophoric Markers because they are frequently used in this section to direct readers towards visuals as well as facts, theories, methods, and research results presented before or after. In our study, Endophoric Markers had the second highest occurrences in both subcorpora, which showed similarity in both quantity and variety, despite some minor differences: the EEWs used slightly more non-linear Endophoric Markers but fewer linear ones than the CNWs (Table 7). This result is similar to Lee and Casal's findings [23] on the metadiscourse features in R&D chapters of English PhD dissertations. Both groups used a great number of non-linear Endophoric Markers to refer to figures or tables. As argued by Hyland [3], writings in hard disciplines are "typically semiotic hybrids", and are thus characterised by a great density of Endophoric Markers to build connections between text and visuals. These markers were used especially when writers guided readers through the reporting or interpretation of results, providing concrete support to enhance argument validity (Example 5).
Note: X stands for an Arabic numeral.

Example 5.
Figure 7 shows the results of the experiments. . . . It was found that the SDE increase with the increase of discharge gap width, the SDE is 591 µJ with the gap width of 1.5 µm, 517 µJ with gap width of 3.5 µm and 948 µJ with the gap width of 6 µm as Figure 7a shows. (CNW27).
Both groups used Endophoric Markers with similar linguistic structures, with the prepositional phrase in + visual (e.g., in Figure 1) being the most frequent. However, the degree of preference was slightly different. While the EEWs used the parenthesized structure (Figure X) much more frequently than the CNWs, the CNWs adopted an as shown/seen clause much more often when presenting the information of visuals. Among linear Endophoric Markers, the EEWs used the retrospective above and prospective following far less than the CNWs, which is probably because the former preferred to precisely locate the mentioned materials (e.g., in Section X).
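The three structures distinguished above (in + visual, the parenthesized form, and the as shown/seen clause) can be told apart with simple patterns. The sketch below is only an illustration under stated assumptions: the regular expressions, the labels, and the function name `classify_endophoric` are hypothetical and do not reproduce the coding scheme of this study, and the test sentences in the usage lines are invented.

```python
import re

# Illustrative patterns only; the labels and regexes are assumptions,
# not the coding scheme used in the study. Order matters: an
# "as shown in Figure X" clause also contains "in Figure X", so the
# more specific pattern is tried first.
VISUAL = r"(?:Figure|Fig\.|Table)\s*\d+[a-z]?"

PATTERNS = [
    ("as shown/seen clause",
     re.compile(rf"\bas\s+(?:shown|seen)\s+(?:in|from)\s+{VISUAL}", re.IGNORECASE)),
    ("parenthesized",
     re.compile(rf"\(\s*(?:see\s+)?{VISUAL}\s*\)", re.IGNORECASE)),
    ("in + visual",
     re.compile(rf"\bin\s+{VISUAL}", re.IGNORECASE)),
]


def classify_endophoric(sentence):
    """Return the label of the first matching structure, or None."""
    for label, pattern in PATTERNS:
        if pattern.search(sentence):
            return label
    return None


if __name__ == "__main__":
    # Invented sentences, used only to exercise the patterns.
    for s in (
        "The removal rates are listed in Table 2.",
        "The removal rate increased with pressure (Figure 3).",
        "As shown in Figure 4, the removal rate increased.",
    ):
        print(classify_endophoric(s), "<-", s)
```

A classifier of this kind would still need manual checking, since the same surface string can serve different functions in context.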

Frame Markers
The two subcorpora show similarity in both the frequency and variety of Frame Markers (Table 8). Congruent with Lee and Casal [23], Sequencers predominated in both subcorpora, followed by Announcers, while Discourse labels and Topicalisers appeared rarely. Although the CNWs used a similar number of Sequencers to the EEWs, they used the word then much more frequently when introducing the next item in a series of actions. In addition, the CNWs used far fewer Announcers than the EEWs. When reporting the aim or results of the study, the EEWs normally referred to this study (Example 6), while the CNWs often spoke of this paper (Example 7). This paper generally collocates with reporting verbs such as discuss to present the content of the RA. However, in the CNW data, it often inappropriately collocated with action verbs such as use to report the work done by the researchers.

Example 6.
This study is to investigate the structural changes occurring to hardwood Alcell TM lignin as a result of fiber devolatilization. (EEW7).

Example 7.
In this paper, multiple linear regression method is used to study the correlation between operating temperature, wind speed, relative humidity and the coefficients k, n in the model. (CNW25).

Code Glosses
Code Glosses were infrequent in both subcorpora, and no significant difference was found (Table 9). Between the two categories of Code Glosses, Reformulations were used only slightly more than Exemplifications, which is somewhat different from Lee and Casal's study [23], which found significantly more Reformulations than Exemplifications in their corpus. The two groups also showed similarity in the variety of expressions used for the two subcategories of Code Glosses. A difference between the two groups lies in the preferred Reformulation, with the EEWs favoring the abbreviation i.e. (Example 8) and the CNWs preferring mean (Example 9). While both expressions elaborate or explain ideas, the former is more concise than the latter, suggesting that the CNWs were probably less adept at presenting information concisely.

Example 8.
This is also observed in this work, i.e., a decrease of the lattice spacing and an increase in orientation, crystallite size and pore diameter. (EEW6).

Example 9.
PSNR of OMP is finite, which means there are some errors in reconstruction. (CNW10).

Evidentials
Evidentials occurred infrequently in both subcorpora (Table 10). Evidentials are expected to occur in RA's R&Ds when writers compare their own findings with those of others or draw upon others' findings to interpret or support their own. However, unlike introductions, where literature is densely referenced, R&Ds require relatively few references to the literature. The CNWs used fewer Evidentials than the EEWs, suggesting that they relied less on the studies of other researchers to support their arguments or made fewer comparisons with other studies. Both groups used many more Non-integral Evidentials (a cited source within parentheses) than Integral Evidentials (a cited source incorporated as part of the statement), which is in line with Hyland's finding that non-integral citations are preferred in hard disciplines to give prominence to the research and less emphasis to the role of the researchers [2].
The CNWs used about half as many integral Evidentials as the EEWs. Integral Evidentials function to "foreground individual interpretations, alternative perspectives, and human agency in knowledge construction" and "allow writers to show their stance and make evaluations" [5] (p. 28). The low frequency of this type in the CNW subcorpus indicates that the CNWs intended to make their writing more objective and impersonal by concealing the role of evaluators. By using Non-integral Evidentials instead of Integral ones, the CNWs tended to "adopt a non-committal stance that acknowledges or distances themselves from cited sources" [46]. This could be attributed to the CNWs' "self-perceived peripheral status in the academic discipline, or their traditional Chinese values of collectivism" [36] (p. 15).

Interactional Metadiscourse
The CNWs used significantly fewer interactional metadiscourse markers than the EEWs (Table 11, p < 0.0001). This result echoes the findings of some previous studies [8], and suggests that the CNWs probably attend less to the writer-reader interaction than the EEWs. Among the five categories, the CNWs employed significantly fewer Hedges and Attitude Markers (p < 0.0001) but significantly more Self-mentions (p = 0.0004) than the EEWs.

Hedges
Hedges were the most frequent interactional metadiscourse and accounted for almost half of the total interactional metadiscourse in both subcorpora (Table 12). Indeed, Hedges tend to occur more frequently in R&D than in any other RA section because they facilitate the presenting of claims or arguments in a polite and cautious manner, opening up an agreeable space for the negotiation of alternative explanations of potential results, speculating on the limitations of the present study, and indicating the practical implications for future research [6]. However, the CNWs used far fewer Hedges, less than two-thirds of the amount used by the EEWs. In cases where it would be more appropriate to use Hedges to demonstrate tentativeness in presenting plausible reasoning, some CNWs made bare assertions, as in Example 10. In Example 10, the use of overestimates without any downtoner or support for this claim makes the criticism of another researcher's work rather face-threatening. This shows that some CNWs were neither clearly aware of their relationship with readers in negotiating propositional information, nor familiar with the devices they could use to downplay their assertiveness and achieve politeness.
Both groups used the subcategories Probability (Example 11) and Inference (Example 12) predominantly, while Approximate degree and Approximate frequency were relatively infrequent. In R&Ds, writers tend to make speculations about the possible meaning or significance of their research findings and, at the same time, avoid direct personal responsibility for their statements.

Example 11.
The non-detection of a-SiC in the XRD results above may be attributed to two factors. (CNW2).

Example 12.
This also can indicate that GASA can reconstruct noiseless signal accurately. (CNW16).
The varieties of hedging expressions used by the two groups were similar, though the CNWs used slightly fewer varieties than the EEWs. Both groups mainly resorted to modal adverbs and lexical verbs to indicate cautiousness in making claims, followed by the use of modal verbs to express uncertainty, probability, or approximation. Adjectives and nouns were rarely used, especially by the CNWs. They also used a few compound hedging structures (two or more hedges used together, e.g., seem to indicate) to increase the strength of tentativeness. In addition, various hedging expressions ranked similarly in terms of frequency in the two subcorpora, albeit with a few exceptions. Words such as approximately, typically, likely, and seem were much more strongly preferred by the EEWs than by the CNWs, and a few advanced words that the EEWs used only occasionally, such as predominantly, postulate, and indicative, did not occur in the CNW subcorpus at all.
Additionally, the misuse of hedges occurred in the CNWs' data. A typical case was the modal verb could, which was often used in the phrase it could be seen, where can should be used to express ability. In Example 13, could contradictorily co-occurred with clearly, the former expressing uncertainty while the latter certainty. This suggests that the CNWs were probably not clear about the functions of some hedging expressions.

Example 13.
It could be clearly seen that the differences of coverage ratio among the different algorithms are decreasing. (CNW8).

Attitude Markers
As Hyland has pointed out, RA writers frequently use Attitude Markers to express their judgments, evaluations, and views on textual information [47]. Both the CNWs and the EEWs used fairly large numbers of Attitude Markers in their R&Ds (Table 13). However, the EEWs used many more Attitude Markers than the CNWs, indicating that the latter were more conservative in explicitly marking personal attitudes in R&Ds. This is probably because the CNWs regarded Attitude Markers as signifying subjectivity, which may conflict with their perceived conventions of RAs [35]. In addition, the CNWs drew on a much smaller range of Attitude Markers than the EEWs, indicating a limited repertoire; the gap was widest for Evaluative adjectives. This was the most common subcategory in both subcorpora, taking up about half of the total Attitude Markers, with important and good being most frequent (Example 14). Attitude verbs were the second preferred subcategory, while Obligation modals, Affective adverbs, and Attitude nouns appeared infrequently in both subcorpora. The verb confirm (Example 15) and the two phrases be consistent with (Example 16) and be in agreement with were preferred by both groups to support their initial hypothesis or to compare their findings with previous studies. This shows the tendency for engineering writers to strengthen persuasiveness by showing how they themselves respond to the referenced material. However, the verb expect (Example 17), which topped the EEWs' list when stating whether their results were in line with their own or readers' expectations, was far less frequently used by the CNWs, indicating that the CNWs were less ready to address their readers' perceptions or to get them involved in the reasoning processes.
In addition, the verb allow (Example 18), which was preferred by the EEWs, often to denote a benefit of a finding, a material, etc., did not appear in the CNW subcorpus at all, suggesting the CNWs' unfamiliarity with such usage of this word.

Example 14.
What is more important, convergence time of NPCGA increases very slowly and slightly with N, K or R increases. (CNW8).

Self-Mentions
Self-mention was the third most frequent interactional metadiscourse in both subcorpora (Table 14). Contrary to previous findings which showed that Chinese writers tended to avoid the use of Self-mentions [7], our corpus reveals a significantly more frequent use of Self-mentions by the CNWs than the EEWs. Self-mention is a promotional device which gives a community-approved persona and consolidates the writer's credibility among other community members [29]. Both groups tended to use plural forms of the first-person pronoun to indicate the unity of the research group or the spirit of collectivism; no singular first-person pronoun was found in our corpus. An examination of the distribution of we by its main discoursal functions reveals differences between the two groups (Table 15). In the EEW subcorpus, we was mostly used to present an argument or speculation. In particular, it most frequently collocated with epistemic verbs of judgement, such as assume and believe, which accounted for about one-third of the instances of we used by the EEWs. Uses of we for describing experimental procedures and for reporting findings each accounted for one-fourth. In the CNW subcorpus, however, we mostly co-occurred with action verbs such as compare, conduct, and perform, for describing experimental procedures (Example 19), which took up over half of the instances of we used by the CNWs. This indicates that the CNWs were more ready to assume the role of researchers than of evaluators. Additionally, we tended to appear frequently in such collocations as we can see from and we can find, mainly for reporting findings (Example 20), yet its co-occurrence with epistemic verbs was rather infrequent (Example 21).

Example 21.
We presume that more TiO2 existed in native oxide films on α phase due to its higher Ti content. (CNW34).

Boosters
Both the CNWs and the EEWs used Boosters at a relatively low frequency (Table 16). As Hyland demonstrates, this is probably because the results in engineering disciplines are less open to negotiation than those in the arts and humanities, and engineering writers tend not to use strong generalizations so as to avoid refutation by readers [48]. The frequencies of Boosters used by the two groups were very close. Between the two subcategories of Boosters, Emphatic Boosters were far more frequent than Amplifying Boosters. Both groups resorted to adverbs, projected clauses, and verbs to intensify their claims, with adverbs taking up the great majority. However, the degree of their preference for various boosting expressions displayed some differences. For example, the adverb obviously was the Booster most favored by the CNWs (Example 22), yet it did not occur in the EEWs' data. The CNWs used fewer projected structures, such as it is clear that, to show confidence in their arguments than the EEWs.

Example 22.
Obviously, the composite coatings prepared by conventional methods mainly focus on corrosion resistance, whereas electrical insulation is usually ignored with few published literatures. (CNW44).

Engagement Markers
Engagement Markers are mainly employed to connect with readers and attract their attention to the writers' findings or statements [34]. In this study, both groups used Engagement Markers in their RAs sparingly (Table 17). The CNWs used fewer Engagement Markers than the EEWs, indicating that the former were probably less ready to create dialogic space with readers. Among the five subcategories of Engagement Markers, Directives were the main strategy adopted by both groups to engage readers, followed by some shared knowledge references and a few reader mentions; neither used Questions or Personal asides. Within the subcategory of Directives, the CNWs used more Predicative structures than Imperatives, while the EEWs used more Imperatives. The imperatives see and note predominated in the EEWs' list (Example 23), especially when navigating readers towards tables or figures, while they were used only occasionally by the CNWs, who favored the predicative structure it can be seen (Example 24). Since predicative structures are somewhat less explicit than imperatives in initiating reader participation, our findings seem to suggest that the CNWs tended to keep a certain distance from readers when presenting results or arguments.
[Table 17 fragment: it is/can be seen (4.0), it should be noted that (0.8), it is worth noting that (0.3), it should/can be noted that . . .]

Example 23.
Note that after decoding one frame, the global UTC time is available and therefore the timing module can be safely powered off. (CNW5).

Example 24.
It can be seen from Figure 9 that the volume of pores decreases with increasing amount of OSRA, which is consistent with the result of total porosity in Table 9. (CNW20).

Findings of the Questionnaire Survey
To answer the second research question, i.e., whether the Chinese novice writers' choice of metadiscourse is a result of their perception of RA writing, we conducted a questionnaire survey on the CNWs' perception of metadiscourse use in RA's R&Ds.

Functions of RA's R&D
We asked the respondents to give their ideas on the major functions of RA's R&Ds. Almost all the respondents reported that R&Ds should include a brief summary of results, an interpretation of results, and directions for future study. This reveals that the CNWs were aware of the major purpose of this part-genre in interpreting their results so as to reveal the answers to research questions. However, most respondents failed to recognize other important functions of RA's R&Ds, such as negotiating knowledge claims in the context of published knowledge and demonstrating the value of the study. Only about 20% of the respondents mentioned these functions. Thus, the respondents seemed to have an incomplete knowledge of the functions of RA's R&Ds, especially regarding their role in negotiating different views and claiming novelty. This may limit their use of some categories of metadiscourse, such as Evidentials and Attitude Markers.

Awareness of Writer-Reader Interactions
Writer-reader interaction emphasizes the dialogic nature of a text, meaning that writers should anticipate readers' questions or reactions to what is written and address such reactions accordingly [49]. To explore the CNWs' awareness of writer-reader interactions, we asked whether they would anticipate the potential readers of their RAs. About two-thirds of the respondents reported that they would never or rarely do so, saying that they simply tried to imitate RAs written by experts but did not think about how they should deploy the language to meet readers' expectations. This reveals their lack of awareness of RA writing as a kind of social interaction, wherein knowledge is mutually constructed, negotiated, and created. This also indicates their traditional view of RAs as a passive expression of knowledge which requires little use of rhetorical strategies.
The respondents were also asked how they would revise their RAs in anticipation of readers' needs. About three-fourths of the respondents said that they would make their writing more easily understood, mainly by elaborating results, increasing coherence, defining jargon, and attending to the accuracy of information. This reveals their efforts to facilitate readers' comprehension by increasing the clarity of information presentation, which further shows their emphasis on propositional ideas and their readiness to use interactive metadiscourse in their writing. However, their responses revealed little awareness of adjusting the interpersonal relationship with readers, e.g., the choice of personal pronouns to adjust the writer-reader distance, the use of persuasive strategies to convince readers of the research's validity, or the use of an appropriate tone to make readers emotionally comfortable. Thus, it seems that they were not fully aware of the dialogic strategies for inviting or restricting reader participation.

Perception of Metadiscourse Use
The respondents were asked how often they thought each category of metadiscourse should be used in RA's R&Ds. Table 18 summarizes the mean score of their perceived frequency for each category, and Figure 1 shows the distribution of the CNWs' perceived frequency of using each category. For each category, about 15-21% of the respondents had no idea whether they should use it in R&Ds. This shows that a considerable proportion of the CNWs were not clear about the functions of metadiscourse, or that their choice of metadiscourse was an unconscious behavior. The categories of interactive metadiscourse generally received higher mean scores than interactional metadiscourse, indicating that the CNWs recognized the need to use more interactive markers than interactional ones in R&Ds. Among the 10 categories, Evidentials obtained the highest mean score (3.34), while Self-mentions obtained the lowest (1.96).
The respondents were also asked to give reasons for their perception. For interactive categories, especially Evidentials, Transitions, and Endophorics, most respondents chose "often" or "always" regarding the frequency of their use. The main reasons the CNWs gave for their frequent use of Evidentials were to avoid plagiarism, maintain academic integrity, and gain credibility, as seen, for example, in Response 1:

Response 1.
It's very important to provide the source of a citation. This is the rule of scientific writing. Otherwise, it will be considered as plagiarism.
This shows the CNWs generally have an awareness of academic ethics. The reasons for using Transitions and Frame Markers are quite similar, mainly to increase logicality, coherence, and organization. For example, Response 2 noted:

Response 2.
Research articles should be in logical order. These expressions (Transitions) can help writer express his views logically and make the writing more organized.
Interestingly, a few respondents also referred to the difference between English and Chinese when giving reasons for frequent use of Transitions (Response 3):

Response 3.
I want to make the writing more logical and organized. I learned in my English classes that English writing usually uses more connecting words to show logical relationship between sentences than Chinese writing.
Note: The responses were originally in Chinese and were translated by the authors.
The main reason for using Endophorics was also related to the coherence of the text, as most respondents put it, and as shown in Response 4:

Response 4.
These expressions (Endophorics) can increase the connectiveness between various parts of the text.
A few respondents took readers into consideration, saying that using these markers could make it easier for readers to follow, such as in Response 5:

Response 5.
In writing, these expressions (Frame Markers) can give readers a logical order, so that they can well grasp the author's thoughts and intentions.
In terms of interactional metadiscourse, most respondents expressed reservations about using it. The main reason for their avoidance of Self-mentions was that these markers might increase the subjectivity of claims and reduce the professionalism of the study, as shown in Response 6:

Response 6.
Of course I will never use Self-mentions when writing Results and Discussion. These expressions may highlight the role of authors and make the writing subjective. It is quite inappropriate to use them in RAs since we should keep neutral towards our results. I always use passive voice to avoid the use of first-person pronouns.
When commenting on the use of Attitude Markers (Response 7) and Boosters (Response 8), respondents gave similar perceptions, saying that these expressions explicitly emphasized the writer's personal attitude and were inconsistent with the RA's feature of objectivity. For Hedges, most respondents believed that such expressions were seldom used in scientific writing because they would reduce the credibility of their conclusions, as in Response 9. Some believed that it was inappropriate or even wrong to employ expressions with emotional tendencies in their RAs because their supervisors did not encourage them to do that, as seen in Response 10.

Response 7.
The emphasis of Results and Discussion should be on the analysis of results, so I would be cautious in using these expressions so as not to reveal my personal inclination.

Response 8.
Using these words (Boosters) does not conform to RA conventions. They either make my claims sound unreliable or make me appear overconfident.

Response 9.
I would not use these words (Hedges) because they reduce the certainty of my findings and thus reduce rigorousness of reasoning.

Response 10.
I learned from my supervisors that we should report our findings as objectively as possible when writing research articles. Thus, we should not express my attitude or emotion too often.
Note: The words in parentheses are added by the authors.
A few respondents attributed avoidance of interactional metadiscourse to their novice writer-researcher identity, as shown in Response 11:

Response 11.
As a novice writer, I do not have the confidence in strengthening my argument. Only experts have the authority to do that.
To further reveal the CNWs' general perception of the writing of the RA's R&D, we conducted thematic analyses to identify common ideas that emerged repeatedly in their responses. For the interactive metadiscourse, the most frequent words occurring in their responses included "objective" and "logic", showing that the CNWs considered the RA's R&D section as a genre of objectiveness and logicality that does not need much embellishment. This probably explains the exceptionally high occurrence of Transitions in the CNW subcorpus, since Transitions function to facilitate the logical connection between sentences. For interactional metadiscourse, the most frequent words in their responses included "rigorous", "accurate", and "objective". This again shows that the CNWs probably attached greater importance to objectivity than to other characteristics of RAs.

Discussion: Metadiscourse Features and Identity Construction
Based on the results of the corpus study and the questionnaire survey, we can answer the third research question, i.e., how the metadiscourse choice of these novice writer-researchers reflects the construction of their writer identity, especially their discoursal self and authorial self.

Discoursal Self
Discoursal self refers to the intentional or unintentional self-representation of writers to claim membership of a community they feel affiliated with [19]. The questionnaire survey reveals the CNWs' awareness of the RA as a specific discoursal genre, which requires the observation of specific conventions in order to be accepted by the research discourse communities. The CNWs used significantly fewer interactional metadiscourse markers than the EEWs. One reason, as revealed by the questionnaire responses, may be that the CNWs attach less importance to the interpersonal functions of RAs realized by metadiscourse markers, such as recognizing, constructing, and negotiating social relations, than to propositional content. Among all kinds of metadiscourse, the CNWs used significantly more Transitions, but significantly fewer Hedges and fewer other markers of attitude and stance than the EEWs, indicating their tendency to focus more on logicality and organization than on interaction. This is mainly due to their perception of RAs as a genre of representations of objective reality and their perception of themselves as reporters of their research and guides and navigators of the text rather than as evaluators of various views or as creative originators. They also fail to see themselves as discourse constructors and promoters who can manipulate rhetorical strategies to enhance the credibility of their research. This perception may arise from the reinforced instruction that natural sciences are characterized by hierarchical knowledge structures, which give rise to "an explicit, coherent, systematically principled and hierarchical organization of knowledge" [50] (p. 172, cited in [5]).
Their reserved use of interactional metadiscourse also reflects their view of RAs as a closed, fixed monologue by the writer himself/herself about the objective world, which leaves little need to deploy various genre options to engage readers. Failing to see the flexibility of disciplinary conventions, they feel more comfortable strictly conforming to the fixed sets of rules they mostly learn from EFL classes or from their supervisors. To make their writing sound more impersonal and objective, they purposefully withhold their own attitude in their RAs, and follow other conventions to achieve objectivity, which have been repeatedly reinforced by their EFL teachers or supervisors and which they consider the most secure way to be recognized by the disciplinary community. In fact, many CNWs fail to recognize RA writing as a process in which knowledge is socially constructed through the deployment of rhetorical strategies to make their RAs both understandable and persuasive. Some CNWs even go so far as to see no point in arguing for the novelty and significance of their research, on the grounds that these will be self-evident if new findings are presented objectively.

Authorial Self
The authorial self is mainly associated with writers' willingness to make claims and/or their reliance on external authorities to support those claims [19]. The questionnaire survey reveals that many CNWs reported avoiding interactional metadiscourse markers, such as Attitude Markers and Boosters, to explicitly present their position, indicating their reluctance to deploy these resources to strengthen epistemic conviction. However, it should be noted that, although hard disciplines, compared with soft disciplines, may depend more on "procedural adequacy and methodological rigor" to achieve empirical authority than on personal voice or authority [6] (p. 20), appropriate use of these metadiscourse resources will accentuate the persuasiveness and authoritativeness of the claims.
The authorial self is most typically represented by the use of first-person pronouns [45]. The high frequency of Self-mentions in the CNW subcorpus apparently indicates that they did not hesitate to present a personal ethos in their writings. However, whereas the EEWs used the pronoun we mainly to present arguments and claims, the CNWs used it mainly for reporting factual information, such as procedures or findings, which they were more confident in presenting, and only infrequently for interpreting results or presenting opinions, which carries more risk of being refuted. This agrees with some previous findings that Chinese EFL writers are reluctant to stake out a firm authorial identity in their RA's R&Ds [35,44]. This is further confirmed by the questionnaire results, which reveal that the CNWs tend to avoid first-person pronouns, mainly to mitigate the subjective voice of researchers and maximize the objectivity of the text.
Our results on the distribution of Self-mentions in the two subcorpora also reveal the different degrees of authority the two groups assumed for their research. One factor contributing to such a difference may be their different preferred rhetorical moves. Most of the CNWs began their R&Ds with a summary of procedures, while the EEWs often began directly by reporting results. In addition, the CNWs' R&Ds focused more on reporting and interpreting results, leaving little room for evaluating related studies or for claiming the novelty of the present study. This corroborates Wu's research, which revealed that Chinese RA writers adopted a plain reporting of the research procedure and results as the main writing style [31]. This further shows that the CNWs tend to give much more prominence to the role of recounter of research than to that of an evaluator or originator.
As inexperienced EFL writer-researchers, the CNWs' deficiency in both linguistic competence and disciplinary expertise may prevent them from taking more authoritative roles. Due to a limited linguistic repertoire, they mainly resort to imitating experts' writing, especially when recounting methodology and reporting results, which requires a relatively low degree of language creativity. However, for evaluations and opinions, which require more originality of language, the CNWs often have to sacrifice the length and depth of those moves to reduce language mistakes. As Rafoth has suggested, inadequate language proficiency of EFL writers may hinder them from expressing their novel ideas and from constructing an identity as a confident researcher [51]. In addition, due to a lack of confidence in their disciplinary knowledge, they often hesitate to take an authoritative stance from which to comment on others' and their own studies, and thus tend to hide their authorial identity and adopt a remote stance.
As Ivanič has pointed out, the construction of writer identity is both a conscious and subconscious choice among the many possibilities of selfhood available to writers in the social context of writing [19]. In this process, a writer needs to deal with the relationship between proximity, "relationship between self and community", and positioning, "the relationship between the speaker and what is being said" [48] (p. 36). A successful writer is able to locate an appropriate place where he/she "proclaims both individuality and membership of a group and a culture" (Ibid). For the CNWs, however, their perceived disciplinary conventions seem to prevent them from positioning themselves as competent researchers. Thus, when trying to achieve disciplinary proximity, they sacrifice their authority for objectivity. Their identity construction seems to lack flexibility of choice, constrained by their limited genre knowledge and by their attempts to align with the disciplinary norms they perceive as prescribed, which are reinforced through EFL education. It is also constrained by their lack of awareness of reader-writer interactions, which their EFL education often neglects. It is important for EFL students to move beyond the one-dimensional perception of RAs as objective and detached writing, and to understand how proximity can serve as a resource that enables individual positioning and how positioning themselves as more authoritative writer-researchers can help them gain credit in the disciplinary community [48].

Conclusions
The present study investigates metadiscourse features of RA results and discussion sections written by Chinese PhD students through a corpus-based approach. Although no significant difference was found in the total frequency of metadiscourse markers between the CNWs and the EEWs, the CNWs were found to use significantly more interactive metadiscourse but significantly less interactional metadiscourse. In addition, various degrees of difference were found in the frequency of specific categories and subcategories of metadiscourse. Our findings indicate that Chinese novice writer-researchers tend to attach more importance to the organization and clarity of ideas than to the persuasiveness of their arguments. In addition, the variety of metadiscourse markers used by the CNWs in each subcategory was generally smaller than that of the EEWs, showing that the CNWs may have a limited repertoire of metadiscourse resources. The CNWs' perception of metadiscourse function in this part-genre was also studied through a questionnaire survey to see the influence of such perceptions on their practice of metadiscourse choice and their writer identity as reflected through such choices. The questionnaire survey reveals that these writers generally view the RA as an organized, objective genre, and thus try to achieve logicality in the text as a recounter and reporter of research while remaining conservative in projecting their authorial voice as an evaluator or knowledge creator.
This study shows that metadiscourse choice is somewhat influenced by the writer's perception of generic conventions, reader-writer interaction, and metadiscourse functions. EAP/ESP writing instruction plays an important role in developing appropriate perceptions among EFL writers. At present, EAP/ESP writing instruction in China has begun to adopt a genre-based pedagogy, focusing on the macro-structure of the research genre and rhetorical moves in various sections. Micro-level instruction mainly concerns typical sentence patterns and the general grammatical rules of the research genre. Such rules often leave learners with the impression that research writing is a fairly fixed act rather than a flexible process that allows individual choice in discourse construction. In addition, while much attention is placed on how to improve the accuracy of expressions, little is taught about how to achieve effective communication with readers, e.g., relationships between lexico-grammatical features and their rhetorical purposes. The study points to the need to place greater importance on salient linguistic features, e.g., metadiscourse, in the EAP/ESP curriculum. For instance, the functions of different categories of metadiscourse and the linguistic resources used to achieve these functions can be explored by studying authentic RA examples.
By comparing the CNWs' metadiscourse features with those of the EEWs, we do not intend to suggest that EFL novice writer-researchers should strictly follow the practice of L1 English expert writers. Our aim is to show that novice writers should be assisted in developing their awareness of rhetorical purposes and strategies to facilitate the sustainable development of their writing competence. Corpus and discourse analysis approaches can be adopted in teaching to help learners understand possible metadiscourse patterns in RAs of different disciplines. Such approaches also increase their awareness of certain lexico-grammatical features that are not a result of fixed rules but reflect the preference of a disciplinary community or the choice made by the writer to achieve certain rhetorical purposes. EAP/ESP teachers should help EFL learners move beyond the one-dimensional perception of RAs as objective and detached writing and help them to understand written texts as writer-reader interaction. In addition, teachers should equip learners with metadiscursive strategies to express their professional opinions in RAs and establish their ethos as experts in their research areas.
The present study has some limitations, though. Firstly, the size of the corpus was relatively small because the RA R&Ds in our language center database that fit our selection criteria were limited. Future studies will be based on a larger corpus covering different disciplines so as to obtain more generalizable findings. Secondly, the questionnaire survey was conducted only among Chinese novice writers. Future research may consider surveying expert writers to investigate how they perceive the use of these rhetorical resources. Oral interviews could also be adopted to elicit more salient responses from participants.
Despite its limitations, the present study contributes to the understanding of metadiscourse features of Chinese EFL writer-researchers and their relationship with writer identity construction. It also provides lists of the metadiscourse markers of various subcategories and the writers' degrees of preference for them, offering EAP/ESP instructors and RA writers useful references for metadiscourse choice. By adopting a questionnaire survey as well as quantitative and qualitative analyses of discourse, the study offers an explanation of metadiscourse choice by Chinese novice writer-researchers, indicating the benefits of using combined methods to conduct studies on discourse.