This section first presents an overall quantitative comparison between the two groups, including the number of bundle types and tokens. It then offers a detailed structural and functional comparison of lexical bundles in student and expert writing.
4.2. Comparison of the Structural Categories of Lexical Bundles
Overall, VP-based bundles account for 35.4% and 19.0% of the bundles in student writing and expert writing, respectively, whereas phrasal bundles, which include NP-based bundles and PP-based bundles, amount to 58.4% in student writing and 79.8% in expert writing. This shows that student writers rely more on VP-based bundles and less on NP-based and PP-based bundles than expert writers. Previous studies suggest that the lexical bundles most frequently used in academic writing are parts of noun or prepositional phrases [11], whereas clausal bundles are more typical in spoken registers. Biber et al. [4] concluded that 70% of lexical bundles in academic prose are phrasal bundles and 90% of bundles in conversation are clausal bundles. Student writing in the current study uses phrasal bundles much less than expert writing, reflecting a mixed style of academic prose and conversational spoken register.
Table 3 shows the distribution of each subcategory together with the log-likelihood test results. (The present study used the log-likelihood calculator created by Jiajin Xu (http://corpus.bfsu.edu.cn/TOOLS.htm, accessed on 6 December 2020). Throughout the paper, * = significant at p < 0.05; ** = significant at p < 0.01; *** = significant at p < 0.001. The raw frequency is shown in brackets. “+” means that the token frequency in Corpus 1 (SWC) is higher than in Corpus 2 (EWC), i.e., the item is overused by students; “-” means that it is underused.) In the following section, lexical bundles in the structural subcategories will be compared.
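To make the significance markers concrete, the following is a minimal sketch of a two-corpus log-likelihood comparison. It assumes the two-term form of the statistic that is widely used in corpus linguistics; the exact implementation of the calculator cited above is not described here, so its output may differ in detail. The function names and the frequencies in the usage lines are hypothetical. The thresholds 3.84, 6.63, and 10.83 are the chi-square critical values for one degree of freedom at p < 0.05, p < 0.01, and p < 0.001.

```python
import math


def log_likelihood(freq1, size1, freq2, size2):
    """Two-corpus log-likelihood for one item (assumed two-term form).

    freq1, freq2: raw frequency of the item in Corpus 1 and Corpus 2
    size1, size2: total number of tokens in Corpus 1 and Corpus 2
    """
    total = freq1 + freq2
    # Expected frequencies if the item were spread evenly across both corpora
    expected1 = size1 * total / (size1 + size2)
    expected2 = size2 * total / (size1 + size2)
    ll = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def significance(ll):
    """Map a log-likelihood value to the star notation used in the tables."""
    if ll >= 10.83:
        return "***"
    if ll >= 6.63:
        return "**"
    if ll >= 3.84:
        return "*"
    return ""


def direction(freq1, size1, freq2, size2):
    """Return '+' if Corpus 1 has the higher relative frequency, '-' if lower."""
    rel1, rel2 = freq1 / size1, freq2 / size2
    if rel1 > rel2:
        return "+"
    if rel1 < rel2:
        return "-"
    return ""


# Hypothetical counts, for illustration only (not taken from the study)
ll = log_likelihood(100, 1_000_000, 200, 1_000_000)
print(direction(100, 1_000_000, 200, 1_000_000), round(ll, 2), significance(ll))
```

Because the statistic works on raw frequencies and corpus sizes together, it can flag a difference as significant even when the two corpora differ in size, which is why the tables report raw frequencies alongside the test result.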
Student writers employ a smaller number and lower proportion of NP-based bundles than expert writers. The two groups differ most in the use of
NP with other postmodifier fragment bundles. Expert writers use twice as many tokens as student writers. The findings presented in
Table 4 support Chen and Baker’s [
14] finding that
NP with other postmodifier fragment bundles are usually part of relative clauses.
Student writers use
the ways in which significantly less often than expert writers. Another frequently used bundle by expert writers,
the extent to which, is not found in student writing. Although
the fact that the and
the relationship between the are shared bundles, students use them significantly less often than experts. The infrequent use of embedded relative clauses as postmodifiers by Chinese MA students lends support to Chen and Baker [
14] and Pan and Liu [
32].
In addition, student writing is notably different from expert writing in the
Other NP subcategory. As shown in
Table 5, expert writers use five times as many types and twice as many tokens in bundles of this subtype as students. It seems that this structure is used by expert writers to highlight their research methods and remind readers of their research questions. Student writers do not use this category as frequently as experts do, and, according to the concordance lines, they use this structure only to summarize their findings.
Lexical bundles of the
NP with of-phrase fragment subcategory can be mostly grouped into the frame “
the + noun + of + the/a”, which was considered a fixed frame by Biber et al. [
37].
Table 6 presents the nouns collocating with this frame. Student writers use a slightly more extensive range of noun types but fewer tokens than expert writers. Lexical bundles of this frame are described as “extremely productive frames” [
37] (p. 78). However, their importance is underestimated by student writers. The underuse of “
the + noun + of + the/a” bundles is also found in Chen and Baker’s [
14] study, which concluded that neither British students nor Chinese students use this frame as experts do. This underuse may therefore be regarded as a feature of student writing that stems from underdeveloped writing proficiency rather than from L1 background.
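The grouping of bundles into a frame can be illustrated with a short sketch. It assumes that four-word bundles and their token counts are already available as a dictionary; the function name `slot_fillers`, the frame representation, and all counts below are hypothetical and serve only as an illustration. The sketch matches word forms only and does not check that the slot filler is a noun, which would require part-of-speech tagging.

```python
from collections import defaultdict

# Frame "the + noun + of + the/a": each position lists the allowed words,
# and None marks the open slot (assumed representation, not from the study)
FRAME = ({"the"}, None, {"of"}, {"the", "a"})


def slot_fillers(bundle_counts, frame=FRAME):
    """Sum token counts per word that fills the open slot of the frame."""
    slot = frame.index(None)
    fillers = defaultdict(int)
    for bundle, count in bundle_counts.items():
        words = bundle.lower().split()
        if len(words) != len(frame):
            continue  # only bundles of the same length can fit the frame
        if all(allowed is None or word in allowed
               for word, allowed in zip(words, frame)):
            fillers[words[slot]] += count
    return dict(fillers)


# Hypothetical counts, for illustration only
counts = {
    "the use of the": 40,
    "the use of a": 5,
    "the form of a": 12,
    "in the form of": 30,       # does not fit: first word is not "the"
    "the extent to which": 20,  # does not fit: third word is not "of"
}
fillers = slot_fillers(counts)
print(len(fillers), "filler types;", sum(fillers.values()), "tokens")
```

The number of distinct fillers corresponds to the range of noun types in the frame, and the summed counts correspond to tokens, which are the two quantities compared between the groups. Other four-word frames can be checked by passing a different `frame` tuple.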
Another notable finding is that, as shown in
Table 7, the majority of these bundles begin with
the in both groups.
Expert writers use more tokens of bundles beginning with the than student writers do, whereas student writers overuse bundles beginning with a/an, such as a better understanding of, a wide range of, a summary of the, and a larger number of. Expert writers use only two such bundles: a high level of and a wide range of.
One more observation is that both groups use only one bundle ending with the indefinite article
a (
the form of a in SWC,
the use of a in EWC) and more bundles end with the definite article
the. This might be a result of disciplinary characteristics. For example,
the use of a indicates any individual case or generic set, whereas
the use of the refers to specific cases, which are more characteristic of soft science [
7]. The reliance on the bundles ending with the definite article
the in student writing may reflect students’ awareness of the disciplinary features of academic writing.
PP-based bundles are the type most frequently employed by the expert group. Expert writers use a smaller number but a higher proportion of them than student writers.
The
Prepositional Phrase with Embedded Of-Clause structure accounts for the largest number of bundles (both types and tokens) among the 14 categories in both corpora. Several bundles of this structure are among the most frequently employed ones (e.g.,
from the perspective of in the SWC, and
in the case of in the EWC). Many
Prepositional Phrase with Embedded Of-Clause bundles fill the frame “
in the + noun + of”, which is another “extremely productive frame” proposed by Biber et al. [
37].
Table 8 shows such lexical bundles.
Both groups use six types of lexical bundles in this frame. The number of tokens is slightly higher in student writing than in expert writing, which may result from the repetitive employment of in the use of (80 tokens). Three bundles in this frame are shared by both groups but are used in different ways. For example, student writers use in the form of as an adverbial, as shown in example (1), but expert writers mostly use it as a postnominal modifier, as shown in example (2).
- (1)
Their realization patterns are to be summarized in the form of tables to see whether certain categories are used more often than others. (S15)
- (2)
Qualitative assessment in the form of feedback or written comment is more appropriate for novice assessees. (E66)
The results also suggest that the two groups fill this frame for different functions. Expert writers use in the case of and in the context of to provide research background information, whereas student writers use in the use of and in the process of significantly more frequently to introduce the research procedure.
As for other prepositional phrase fragment bundles, student writers use a higher number but a smaller proportion than expert writers. Many bundles in student writing are used to direct readers’ attention to certain positions in the text, such as in the following table and in the above example. Expert writers are found to use more lexical bundles indicating logical relations, such as in relation to the and in line with the. These bundles are among the 30 most frequent ones in expert writing but can hardly be found in student writing.
Verb phrase-based lexical bundles are the most frequently used category in student writing. Student writers use significantly more VP-based bundles than expert writers: more than twice as many types and three times as many tokens. The results show that experts use no bundles of the
Pronoun/NP + be fragment and
VP with active verb structures. In the other VP-based categories, student writers use more types and tokens of bundles than expert writers. Consistent with Chen and Baker’s [
28] study, which concluded that students with lower proficiency employ more VP-based lexical bundles, our results suggest that novice academic writers tend to rely too much on VP-based lexical bundles.
Student writers use more
Anticipatory it + verb phrase/adjective phrase bundles than expert writers. Pan and Liu [
32] reported infrequent use of this structure by L2 expert writers and explained that it might be due to the lack of a counterpart in Chinese. However, L2 student writers in their study used many more
anticipatory-it bundles than L2 experts. It seems that MA students are aware of this pattern and use it effectively in thesis writing. According to Hyland [
10], this structure can downgrade the personal role in interpretation without identifying the source of evaluation. The frequent occurrence of this structure may partly be explained by the genre difference between student theses and research articles. MA students appear to realize the possible risks of explicitly attributing the source of evaluation to themselves in a high-stakes genre where students are under the pressure of assessment. Both groups use this structure to highlight significance, provide explanation, and report findings. Although student writers seem to realize the importance of distancing themselves from judgment, they show a preference for different adjectives such as
necessary, obvious, and
clear. Expert writers use only two adjectives to fill this structure (i.e.,
important,
possible).
Another structure student writers significantly overuse is the
V + that frame, which is more commonly found in conversation [
38]. Student writers frequently use human subjects such as
we can see that to indicate findings, as shown in (3). When expert writers use the
V + that frame, they tend to express similar meanings more objectively, with impersonal subjects, as in (4). This is a strategy to strengthen the objectivity of their findings and interpretations.
- (3)
We can see that the latter genre is expanding the previous one by specifying in detail the very focus of it and thus elaborates the previous genre. (S12)
- (4)
The results showed that the learners portrayed a high level of pragmatic awareness in the three languages even though their L1 and L2 languages were still developing. (E59)
Overall, these findings demonstrate that student writers use bundles more frequently and show a strong preference for clausal bundles. Expert writers generate fewer bundle types and tokens, and they rely more on phrasal bundles than student writers. Phrasal bundles cannot be acquired naturally [
39]; such linguistic resources could therefore be incorporated into academic writing courses. The above structural analysis offers only a partial picture of bundle use. In the following section, a comparison of functional categories will be presented.
4.3. Comparison of Functional Categories of Lexical Bundles
Text-oriented lexical bundles rank as the largest category in both corpora, accounting for 53.6% of bundles in the EWC and 46.0% in the SWC. Research-oriented bundles make up a higher proportion in the SWC (44.2%) than in the EWC (41.7%). Participant-oriented bundles form the smallest category in both corpora, accounting for 9.7% in the SWC and 4.8% in the EWC.
Table 9 shows the proportional distributions of subcategories in the two corpora. The results reveal that the type and token frequencies in all three main categories and most subcategories are higher in student writing. In terms of proportion, student writers rely more on research-oriented and participant-oriented bundles and less on text-oriented bundles than expert writers.
Research-oriented lexical bundles can be used to express real-world activities and experiences. In the current study, student writers use quantification and description bundles more frequently, whereas expert writers use location and procedure bundles more frequently.
Location bundles are used to indicate time and place. Although this category does not contain many different types, location bundles turn out to be the most commonly used ones, and most of them are shared between the two corpora, such as at the same time and at the end of. A close examination reveals differences in the use of shared bundles. For example, student writers often use at the end of to indicate a particular place in their writing, as in (5), whereas expert writers tend to use it to refer to a research stage, as in (6):
- (5)
At the end of the chapter, there is a description of the procedures of the research. (S18)
- (6)
It is only at the end of the activity that the instructor combines the students’ ideas and briefly explains the conceptual meaning. (E42)
Both groups use description bundles that fit the noun + of the structural pattern. These different nouns reveal the different functions they serve in student writing and expert writing. Expert writers use this pattern for more abstract functions, with nouns such as nature, meaning, quality, and context describing quality and property. Student writers, on the other hand, use this frame to provide more primary information with nouns such as basis, structure, and form.
Student writers use quantification bundles more frequently than expert writers. This finding is inconsistent with Pan et al. [
18], who concluded that novice writers produce fewer quantification bundles than expert writers. A further examination reveals that the two groups employ quantification bundles for different purposes. It seems that student writers attach more importance to detailed quantitative information by using bundles like
the frequency of the and
more than half of. Expert writers, on the other hand, use bundles describing more generalized and abstract information such as
the extent to which and
a high level of. Hyland [
11] concluded that student writers are under pressure to demonstrate their ability to handle research and their familiarity with the subject content. This pressure may explain why student writers employ more research-oriented bundles than expert writers and why they use this category to provide more basic and detailed information.
Text-oriented lexical bundles are concerned with the organization of texts and make up the largest proportion in both corpora. Student writers use a significantly higher number but a smaller proportion of text-oriented bundles than expert writers. This heavy concentration of text-oriented bundles is in line with previous studies [
16,
18] and indicates the discursive and evaluative characteristics of soft science language [
11].
Framing signals appear to be the most frequent group of text-oriented bundles in both corpora. These bundles are heavily employed to frame arguments by highlighting limitations and specifying cases. There are 10 and 9 types of framing bundles among the top 30 in the SWC and EWC, respectively. Bundles such as
on the basis of and
in the form of are shared and frequently used by both groups. In line with Hyland’s observation [
10], many framing signals found in the current study were
preposition + of structures, e.g.,
in terms of the, on the basis of. Both student writers and expert writers use a large number of PP-based bundles beginning with
in.
Student writers use framing signals to introduce helpful resources when describing their research procedure, such as the software/tool they used or the experienced researcher who helped them make a decision. It seems that such information makes their methodologies sound convincing, as in (7) and (8):
- (7)
Then with the help of other two teachers who have been teaching English for 20 years, the researcher identifies the relative clause errors. (S19)
- (8)
The data are analyzed with the aid of AntConc 3.3.5. (S18)
Expert writers use more bundles to set detailed criteria or limitations to their arguments, as in examples (9) and (10):
- (9)
With the exception of a small-scale study with low-proficiency EFL students (Shehadeh, 2011), previous research has not compared the longer-term effects of writing practice...(E53)
- (10)
There are also signs, at least in the case of universities, of national proclivities in the choice of adjectives. (E14)
These expressions specify the special cases in which the argument can be accepted, thus protecting the writers from direct contradiction with other research findings. In short, framing signal bundles help writers make their research methods and conclusions more convincing.
Structuring signals are the second most frequently employed subcategory in both corpora. The use of structuring signals makes arguments well managed and organized. Some structuring signals are used to point to additional material to make it more salient [
40], and student writers in the present study frequently use bundles like
as shown in table and
in the following examples. Because of page layout, further explanations often do not immediately follow the tables, figures, and examples, so writers need to remind their readers where to find the expected information. These expressions help readers follow their analysis. Student writers also use structuring bundles to summarize their findings, such as
based on the above. The nouns usually following this bundle include
discussion, analysis, and
results. Some structuring signals can announce discourse goals [
11], and such bundles are connected with different
attended this structures, allowing writers to build interaction with readers [
41]. Student writers use the
noun + active verb pattern (
this thesis aims to), whereas experts prefer the
noun + of pattern (the aim of this). Another observation is that expert writers use structuring bundles to scaffold the text. For example, they use
in the next section to introduce what they will discuss next. Such guiding expressions function as a road map that helps readers follow their writing in the way the writer expects. However, it seems that student writers have not yet realized the importance of stating the purpose of the subsequent sections. Other bundles frequently used in the EWC but rarely found in the SWC are those containing “
research question”. The use of such bundles is a reflection of reader awareness. Expert writers tend to remind their readers of research questions at different stages in the text. They present the question right before their explanation, helping readers know what can be expected in the following sections, as in (11):
- (11)
In order to address the second research question of the study, that is, the effect of sociocultural adaptation on production of routines, a first analysis was focused on the cultural congruity factor (RQ2a). (E50)
Transitional signals are used to build additive or contrastive links between elements. These bundles help to maintain the cohesion and coherence of the writing. Transitional bundles have comparable proportions in the two groups, and many of them are among the most frequently employed ones, such as on the other hand and as well as the. Student writers and expert writers both use transitional bundles to establish connections in the text. However, student writers sometimes misuse the bundle on the other hand. Instead of using it to indicate a contrary situation or alternative viewpoint (see example (13)), students sometimes use it simply to link two sentences and fail to achieve the cohesion they intended, as in (12):
- (12)
On the one hand, word list, key word list, or adjective list can only offer us a little information about words expressing opinions in reports. On the other hand, although concordances analysis is an efficient way to identify opinions in news, it falls short of dealing with larger text. (S3)
- (13)
Translingual scholars such as Makoni and Pennycook (2007), Blommaert (2005), Canagarajah (2017a), or Li (2017), on the other hand, offer an alternative perspective. (E7)
Resultative signals are used to establish inferential or causative relations between elements. Both groups use bundles containing result. Student writers prefer the bundle the result of the and usually use show, indicate, and present after it. Expert writers, on the other hand, use the results show that more frequently. Another finding is that both student writers and expert writers favor the verbs find and show, but student writers also use bundles containing point out. Furthermore, student writers often use it is found that, while expert writers use it was found that. The use of the past tense indicates that the result or finding is reasonable in the specific case, thus opening a space where writers feel free to challenge the conclusion.
Participant-oriented lexical bundles comprise the smallest proportions in the two corpora. Student writers use a larger number and a higher proportion of these bundles than expert writers. Stance bundles in the current study are expressed impersonally and show a connection to the anticipatory-it structures, such as it is important that, it is clear that, and others. Expert writers employ participant-oriented bundles to convey a reluctance to express full commitment. The use of hedges helps writers express opinions with a degree of uncertainty, thus protecting them from potential disagreement with others. When they interpret results, expert writers seem to express their opinions in a more tentative and cautious way, as illustrated in example (14):
- (14)
It is possible that this was due to stronger semantic connections with word groups (animals, food) that were more familiar to the children... (E47)
However, student writers have not realized the importance of hedging bundles, and they often use bundles to indicate necessity or certainty, as illustrated in examples (15)–(16):
- (15)
It is necessary to study the effects of enhanced model on high school students’ incidental noticing. (S7)
- (16)
In the two examples above, it is obvious that students do not know which conjunctions can lead relative clauses. (S19)
Engagement bundles are employed to engage readers at a certain point in the text. The majority of engagement bundles in the present study used modal words to express the writer’s attitude of absolute necessity or importance, e.g.,
it should be noted and others. Student writers use bundles beginning with personal pronouns (e.g.,
we can see that), whereas expert writers do not. Pan and Liu [
32] reported that MA writers frequently use
we can see that to present a proposition based on information from a table or figure. Findings from our corpora demonstrate different patterns. Student writers mostly use this bundle to report findings based on reviewing previous studies, as in examples (17)–(18):
- (17)
As reviewed above, we can see that this essential perspective has apparently been embedded within SFL linguists’ conception of genre. (S12)
- (18)
From the research results, we can see that the majority of the respondents are bilingual and diglossic speakers. (S1)
Taken together, student writers and expert writers demonstrate comparable functional proportions. Both groups employ lexical bundles to help construct propositions, unfold the text, and engage readers in a reader-friendly way. However, student writers tend to focus on detailed information and usually shape findings and conclusions with a high degree of certainty. Expert writers often develop their writing in such a way that readers understand the text as the writer expects. They also carefully express opinions and interpret results with hedging bundles, which creates a space for readers to argue with them.