4.3. The EW Method for Ranking Keywords
The EW method [46,47] assumes that each criterion has the same importance. If the problem to be solved contains $n$ parameters, the weight assigned by the EW method is $w_j = 1/n$ for every parameter $j$. Let $x_{ij}$ be the assessment value of criterion $j$ for keyword $i$. The aggregated values under the EW weights are computed as shown in Equation (8).
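As a minimal sketch (the function name and sample values are ours, not the paper's), the EW aggregation of Equation (8) reduces to an equally weighted sum of the normalized parameter values:

```python
# Minimal sketch of EW aggregation (Equation (8)): with n parameters,
# each weight is w_j = 1/n, so the aggregated value is an equal-weight sum.
def ew_aggregate(values):
    """values: normalized parameter values (e.g., keyness, frequency, range)."""
    n = len(values)
    return sum(x / n for x in values)

# Hypothetical normalized values for one keyword:
print(ew_aggregate([0.8, 0.5, 0.6]))  # -> 0.6333...
```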
When the EW method was adopted to compute the parameters of this paper (i.e., the keyness, frequency, and range) for ranking keywords, several deficiencies emerged. First, from the linguistic perspective, if the target corpus was not optimized, the keyness calculations would be subject to interference from function words and meaningless letters, biasing the keyness values from the outset. Second, although the EW method can consider all parameters simultaneously, the relative importance of each parameter should not be the same; hence, it was difficult to meet the experts' expectations.
4.4. The Proposed Extended AHP-Based Corpus Assessment Approach
To address the deficiencies of the two aforementioned methods, this paper adopted the target corpus as the empirical case to demonstrate and verify the efficacy and practicality of the proposed approach. Detailed descriptions of each step are as follows.
The target corpus in this paper was built from 53 SCI-indexed research articles retrieved from the Web of Science (WOS). Its lexical features included 10,595 word types, 189,680 tokens, and a type–token ratio (TTR) of 0.05586 (representing lexical diversity).
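As a quick check of the reported figure, the TTR is simply the number of word types divided by the number of tokens:

```python
# TTR of the target corpus: word types / tokens
types, tokens = 10_595, 189_680
print(round(types / tokens, 5))  # -> 0.05586
```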
To retrieve the keywords, the software's algorithm calculates each word's keyness value to determine whether it is a domain-oriented word, by identifying words that occur with high frequency in the target corpus but low frequency in the benchmark corpus. From the perspective of linguistic analysis, when the target corpus consists of textual data from professional fields, the benchmark corpus should be drawn from more general-purpose data (i.e., EGP). COCA is considered the largest genre-balanced EGP corpus and is widely adopted as the benchmark corpus by corpus-based researchers [11,21], as it was in this paper. After processing by the software, the lexical features of the benchmark corpus (i.e., COCA) included 109,306 word types, 8,266,198 tokens, and a TTR of 0.01322.
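For readers unfamiliar with the likelihood ratio method [11], the sketch below shows a common Dunning-style log-likelihood formulation of keyness based on 2 × 2 contingency counts. It is an illustrative reconstruction under our own variable names, not the exact algorithm implemented in the software:

```python
import math

def log_likelihood(a, b, c, d):
    """Dunning-style log-likelihood keyness for one word.
    a: word frequency in the target corpus
    b: word frequency in the benchmark corpus
    c: target corpus size (tokens); d: benchmark corpus size (tokens)
    """
    e1 = c * (a + b) / (c + d)   # expected frequency in the target corpus
    e2 = d * (a + b) / (c + d)   # expected frequency in the benchmark corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

# A word that is relatively frequent in the target corpus gets high keyness.
print(log_likelihood(a=150, b=200, c=189_680, d=8_266_198))
```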
To increase the accuracy of keyword extraction, this step adopted the corpus-based machine optimization approach to eliminate function words and meaningless letters [21]. Table 3 shows the refined target corpus: 217 word types and 81,097 tokens were eliminated, downsizing the target corpus by 43%. Without the interference of function words and meaningless letters, the keyword generator could retrieve more domain-oriented or content words to form a more accurate keyword list.
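A minimal sketch of this refinement step (the stop wordlist here is a tiny illustrative subset, not the full list used in the paper):

```python
# Sketch of the corpus optimization step: remove function words and
# meaningless single letters before keyness computation.
STOP_WORDS = {"the", "and", "of", "in", "to", "a", "is", "for"}

def refine(tokens):
    return [t for t in tokens
            if t.lower() not in STOP_WORDS   # drop function words
            and len(t) > 1]                  # drop meaningless single letters

print(refine(["The", "keyness", "of", "a", "token", "in", "COVID-19", "data"]))
# -> ['keyness', 'token', 'COVID-19', 'data']
```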
Once the target corpus, the benchmark corpus, and the stop wordlist are input into AntConc 3.5.8 [1], the traditional keyword-list-generating approach excludes function words and meaningless letters, calculates each token's keyness value, and determines the keyword list (see Figure 2). However, at this step, the keyword list still remains at the single-parameter evaluation stage.
In this step, the evaluation parameters decided by the experts were determined as the tokens' keyness, frequency, and range values for the following evaluation processes. The evaluation team in this study included three experts whose academic specialties covered NLP, corpus linguistics, teaching English to speakers of other languages (TESOL), performance evaluation, and fuzzy logic. Based on Table 1, the three experts each determined the pairwise comparison results of the evaluation parameters; the results are shown in Table 4.
Next, the researchers arithmetically averaged each element of the matrices given by the experts, summarized the results as shown in Table 5, and then used Equation (1) to create the matrix for computation in the following steps.
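The aggregation is an element-wise arithmetic mean of the experts' matrices; a sketch with illustrative judgments (not the actual values in Table 4) follows:

```python
import numpy as np

# Element-wise arithmetic mean of the experts' pairwise comparison matrices.
# Rows/columns: keyness, frequency, range; the judgments are illustrative.
expert_matrices = [
    np.array([[1, 1/2, 1/3], [2, 1, 1/2], [3, 2, 1]]),
    np.array([[1, 1/2, 1/2], [2, 1, 1/2], [2, 2, 1]]),
    np.array([[1, 1,   1/3], [1, 1, 1/2], [3, 2, 1]]),
]
A = sum(expert_matrices) / len(expert_matrices)  # aggregated matrix (Table 5)
print(A)
```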
After computing the aggregated pairwise comparison matrix (see Table 5) using Equations (2) and (3), the maximum eigenvalue $\lambda_{\max}$ was 3.003, and the relative weights for the keyness, frequency, and range were 0.195, 0.278, and 0.527, respectively. The relative weights were derived from the experts' evaluations through the AHP computing process and indicate the relative importance of the parameters. Based on the priority order range (0.527) > frequency (0.278) > keyness (0.195), we reasoned that the experts' overall assessments indicated that so-called keywords should also occur widely and frequently in the corpus data.
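A sketch of the eigenvector computation follows; the matrix entries are illustrative only, chosen to be roughly consistent with the reported weights, and are not the values in Table 5:

```python
import numpy as np

# Principal-eigenvector weights (Equations (2) and (3)) for an aggregated
# 3x3 pairwise comparison matrix over (keyness, frequency, range).
A = np.array([[1.00, 0.70, 0.37],
              [1.43, 1.00, 0.53],
              [2.70, 1.90, 1.00]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)          # index of the maximum eigenvalue
lambda_max = eigvals.real[k]
w = eigvecs[:, k].real
w = w / w.sum()                      # normalize to a priority vector

print(round(lambda_max, 3))          # close to 3 for a nearly consistent matrix
print(w.round(3))                    # approximate relative weights
```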
To verify the reliability and validity of the relative weights, Equations (4) and (5) and Table 2 were used to compute the CI and CR values. The CR value was 0.003, which is less than 0.1, indicating that the results were acceptable.
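The reported numbers check out under Saaty's random index values (assuming Table 2 lists RI = 0.58 for n = 3):

```python
# Consistency check (Equations (4) and (5)) with the reported lambda_max.
n, lambda_max = 3, 3.003
RI = 0.58                            # random index for n = 3 (assumed Table 2)
CI = (lambda_max - n) / (n - 1)      # consistency index -> 0.0015
CR = CI / RI                         # consistency ratio -> ~0.0026
print(round(CR, 3), CR < 0.1)        # -> 0.003 True
```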
Equation (6) was used to normalize each parameter for the subsequent aggregated-value computation.
Once all parameters were normalized, the researchers used Equation (7) to compute the aggregated value of the keywords. The partial results of the keywords' aggregated values are presented in Table 6.
Based on each keyword's aggregated value, the researchers re-ranked the keyword list (see Table 6) to form the ultimate optimized keyword list.
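The last two steps can be sketched together. Column-maximum normalization is assumed for Equation (6), and the keyword rows and raw values below are hypothetical, not taken from Table 6:

```python
import numpy as np

# Normalize each parameter (Equation (6); column-max normalization assumed),
# aggregate with the AHP weights (Equation (7)), and re-rank the keywords.
weights = np.array([0.195, 0.278, 0.527])    # keyness, frequency, range
keywords = ["pandemic", "aerosol", "sampling"]          # hypothetical rows
raw = np.array([[950.0, 310.0, 40.0],
                [720.0, 450.0, 48.0],
                [610.0, 280.0, 52.0]])

norm = raw / raw.max(axis=0)                 # per-column normalization
agg = norm @ weights                         # weighted aggregated values

for kw, v in sorted(zip(keywords, agg), key=lambda p: -p[1]):
    print(f"{kw}: {v:.3f}")
# aerosol: 0.912 / sampling: 0.825 / pandemic: 0.792 -- the word with the
# highest keyness is no longer ranked first once range and frequency count.
```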
The results of the ultimate optimized keyword list can be integrated with the experts' overall evaluation results to provide a more comprehensive benchmark for defining critical lexical units, thereby improving the efficiency and accuracy of NLP.
4.5. Comparison and Discussion
To enhance the accuracy of corpus evaluation results, a corpus assessment approach must be able to compute multiple parameters at the same time and consider the relative importance between different parameters. However, the traditional keyword-list-generating approach [10] only uses the likelihood ratio method [11] to determine and rank keywords in the target corpus, which is a deficiency of corpus assessment [2,3,10,22]. Thus, to address the aforementioned issues, this paper proposed an extended AHP-based corpus assessment approach that integrated the likelihood ratio method, the corpus optimization approach, and the AHP method to refine corpus data, simultaneously handle multiple parameters, and consider the relative importance between different parameters for accurately evaluating keywords. COVID-19-related research articles (N = 53) from the environmental science discipline were adopted as the target corpus and used as an empirical example to verify the proposed approach.
This paper compared the three approaches from three perspectives: (1) corpus optimization; (2) considering multiple parameters simultaneously; and (3) considering the relative importance between different parameters, to highlight the contributions of the proposed approach (see Table 7).
Firstly, for corpus optimization, Table 6 indicates that function words, such as the, and, of, and in, appeared on the keyword lists generated by the traditional keyword-list-generating approach [10] and the EW method [47]. Because function words are essential elements for forming meaningful sentences, these tokens usually account for over 40% of the corpus data. If the function words are not eliminated beforehand, the likelihood ratio method [11] will treat them as keywords because their extremely high frequency values distort the keyness computation results. Once function words are included in the keyword list, content words that may be true keywords will be excluded, thus biasing the computation results. Before entering the algorithm's computation process, the proposed approach therefore adopted the corpus optimization approach to eliminate function words and meaningless letters and thereby enhance computation accuracy.
Secondly, when considering multiple parameters simultaneously, the traditional keyword-list-generating approach [10] is insufficient, as it relies on only one parameter (the keyness) to rank keywords. To make the evaluation results less controversial, the EW method [47] and the proposed approach were used to take three parameters (i.e., the keyness, frequency, and range) into consideration simultaneously, and each keyword's aggregated value was used to re-rank the keyword list.
Finally, in considering the relative importance of different parameters, the researchers discovered the major problem of the EW method [47]: although it can consider the three parameters at the same time, it treats their importance as equal, so the relative importance between the parameters cannot be established. To compensate for this deficiency, the proposed approach integrated the AHP method [12] to calculate the relative weight of each parameter and identify the relative importance between parameters. After using the AHP method to calculate the experts' evaluation scores, the researchers found that the relative weights of the keyness, frequency, and range were 0.195, 0.278, and 0.527, respectively, which were not equal. The implication of the unequal relative weights is that, after generating the keyword list, the experts wanted to identify the most widely and frequently used keywords in the target corpus; hence, their assessment results determined the relative importance of the three parameters as range > frequency > keyness.
In summary, to handle the single-parameter evaluation deficiency of keyword ranking and optimize the traditional corpus-based assessment approach, the proposed extended AHP-based corpus assessment approach was able to exclude function words and meaningless letters, simultaneously compute multiple parameters, and consider the relative importance between different parameters.