Article

Empowering Weak Languages Through Cross-Language Hyperlink Recommendation

by
Nhu Nguyen
1,*,
Hideaki Takeda
2,3 and
Lakshan Karunathilake
2,3
1
Faculty of Information Technology, Vietnam Maritime University, 484 Lachtray St., Lechan, Haiphong 040313, Vietnam
2
Department of Informatics, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies (SOKENDAI), Shonan Village, Hayama 240-0193, Kanagawa, Japan
3
National Institute of Informatics (NII), 2 Chome-1-2 Hitotsubashi, Chiyoda City 101-8430, Tokyo, Japan
*
Author to whom correspondence should be addressed.
Information 2025, 16(9), 749; https://doi.org/10.3390/info16090749
Submission received: 10 June 2025 / Revised: 13 August 2025 / Accepted: 22 August 2025 / Published: 29 August 2025

Abstract

Wikipedia is an important platform for promoting language inclusivity and sharing global knowledge. However, while languages with more resources have a lot of content, languages with fewer resources face challenges in accessibility and cultural representation. To help address this gap, we use multilingual datasets and neural graph collaborative filtering to recommend missing hyperlinks, helping to improve low-resource languages on Wikipedia. By encouraging cross-language collaboration, this method strengthens the connections and content of these languages, promoting cultural sustainability and digital inclusion. Experimental results show significant improvement in recommendation quality, with clear benefits for weaker languages. This highlights the role of recommender systems in preserving unique cultural aspects, building connections between language communities, and supporting fair knowledge sharing in a globalized world.

1. Introduction

Wikipedia, a flagship project of the Wikimedia Foundation (https://wikimediafoundation.org, accessed on 10 June 2025), is a prime example of a collaborative platform. As one of the most visited websites worldwide, it serves as a vital repository of information on a wide range of subjects. As of 20 May 2024, Wikipedia (https://www.wikipedia.org, accessed on 10 June 2025) exists in 342 languages, 19 of which have more than one million articles each. However, despite its vast multilingual reach, significant disparities persist in content coverage between resource-rich languages, such as English, and low-resource languages. This imbalance creates gaps in knowledge accessibility, digital inclusion, and cultural representation. Although English Wikipedia boasts over six million articles, comprising nearly one-third of the platform's content, more than 80% of the editions have fewer than a hundred thousand articles. For example, Table 1 displays the overlap of topics among the top 15 Wikipedia editions by content volume. Each column illustrates the proportion of topics shared between a pair of editions (e.g., the overlap from English (en) to Vietnamese (vi) is 67.4%, while in the opposite direction it is only 12%). Such disparities not only limit access to critical information for speakers of underrepresented languages but also perpetuate systemic inequalities in the digital knowledge landscape.
Previous efforts to address content inequality on Wikipedia include tools like “Not in the other language” and “redlinks”, which help identify missing articles. However, these tools often generate long and unsorted lists, placing additional burdens on volunteer editors. Given the voluntary and unpaid nature of Wikipedia contributions, streamlining the discovery and recommendation of missing content is essential to improve editor engagement and sustain content growth in weaker languages [1]. To bridge the gap between languages, we leverage hyperlink information. Each article on Wikipedia includes hyperlinks, such as the “Mickey Mouse” article, which links to “Minnie Mouse” and “Donald Duck.” These hyperlinks are also known as intra-language links. They connect related articles within the same language, allowing the reader to navigate seamlessly between relevant topics [2,3]. By forming an interconnected network of knowledge, hyperlinks improve content accessibility and facilitate efficient exploration. Although hyperlinks can extend across languages and media, this study focuses exclusively on intra-language links, unlike the works in [4,5], which used the links in Wikipedia’s “See also” section. Rather than enriching intra-language links in Wikipedia articles by transferring links between pairs of different language articles [6], we combine multilingual articles on the same topic and their hyperlinks to identify patterns and provide suitable suggestions.
Our research shares a high-level goal with [7] in facilitating the sharing of knowledge between online knowledge bases. However, while [7] focuses on aligning articles in different encyclopedias using textual similarity, we prioritize topics to capture the unique characteristics of each language. Our approach leverages hyperlink information as weak supervision, emphasizing editorial autonomy and cultural diversity rather than enforcing content alignment. We respect the culture and society that influence each Wikipedia edition. The diversity of articles in various languages should be preserved and further developed. We recognize that articles in different language editions of Wikipedia on the same topic may share some content while still retaining their distinct characteristics.
In this paper, we focus on hyperlink recommendation to support Wikipedia editors. Text writing is the primary contribution of editors and should remain untouched, while identifying related information (articles) is necessary but time-consuming. Therefore, our goal is to provide hyperlink recommendations for articles, balancing editorial freedom and ease of editing. To support low-resource Wikipedia editions, we follow these principles. First, we do not believe that the English Wikipedia should be the sole standard that other language Wikipedias follow or merely replicate. Each language-specific Wikipedia is maintained by the culture and society associated with that language, and its content reflects these cultural and societal aspects. Second, the originality of the content in each language should be preserved and developed. Naturally, content in one language is inherently related to content in other languages. When a single topic appears in multiple language-specific Wikipedias, it is reasonable for them to share some content to a certain degree. In summary, we propose the following hypotheses.
Hypothesis 1. 
If articles on a single topic exist in multiple language-specific Wikipedias, they can share some content with each other. Figure 1 illustrates the first hypothesis with a portion of the “Mickey Mouse” article (https://en.wikipedia.org/w/index.php?title=Mickey_Mouse, accessed on 10 June 2025) in the English and Sinhala versions. The content in the Sinhala version is significantly more limited than in the English version. In such cases, the article with more comprehensive content can supplement the other.
Hypothesis 2. 
If articles on a single topic exist across multiple language-specific Wikipedias, each article may also have unique content that differs from the others. Figure 2 illustrates the second hypothesis, showing the “Mickey Mouse” article in English and Japanese. Although both versions appear intuitively rich, they differ semantically, reflecting the cultural perspectives of each language community. The Japanese article emphasizes the industrial and contractual circumstances surrounding Walt Disney’s loss of Oswald the Lucky Rabbit, which directly prompted the creation of Mickey Mouse. The English article, in contrast, highlights Mickey Mouse as a cultural product, focusing on his characterization, narrative roles, and subsequent media expansion. These differences show how the cultural perspectives of each community shape the portrayal of the topic, providing complementary information.
To support Wikipedia editors, we adopt collaborative filtering (CF). Intuitively, similar articles share some hyperlinks in their content. By treating hyperlinks as items and articles as users, we can represent this situation as multiple users having relationships (e.g., use or non–use) with items. In this context, similar users are expected to have similar relationships with the items. This CF–based recommendation approach works well in a single language. We then extend this idea to multiple languages of Wikipedia. By referring to Wikidata links, we can obtain sets of articles from different languages corresponding to a single topic. Specifically, each user (article) establishes relationships with items (hyperlinks) derived from articles in two or more Wikipedias on the same topic.
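The treatment of articles as users and hyperlinks as items, pooled across language editions via Wikidata, can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline; the QID link sets below are hypothetical.

```python
# Pooling hyperlinks for one topic across language editions, keyed by
# Wikidata identifiers. The link sets below are invented for illustration;
# only the idea (union of items, per-language implicit rows) follows the text.
article_links = {
    "en": {"Q11936", "Q8704", "Q6550"},   # hypothetical QIDs of en hyperlinks
    "ja": {"Q11936", "Q1137062"},         # hypothetical QIDs of ja hyperlinks
    "vi": {"Q11936"},                     # hypothetical QIDs of vi hyperlinks
}

def pooled_interactions(article_links):
    """Each language edition of the topic becomes a 'user'; the union of
    hyperlinks across editions becomes the shared item vocabulary."""
    item_pool = sorted(set().union(*article_links.values()))
    return {lang: {item: int(item in links) for item in item_pool}
            for lang, links in article_links.items()}

rows = pooled_interactions(article_links)
# Items used by other editions but absent from vi are candidate suggestions.
candidates_vi = [item for item, used in rows["vi"].items() if used == 0]
```

Items observed in the English or Japanese rows but absent from the Vietnamese row become the candidates that a CF model ranks for recommendation.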
Recommendations are generated using information from two or more Wikipedias, and the final recommendation for each article combines insights from these multiple sources. As a result, each Wikipedia edition can benefit from the information shared by others. By implementing this approach, we aim to address the following research questions:
  • RQ1: Can reliable hyperlink recommendations be made?
  • RQ2: How does the volume of language–specific Wikipedias affect recommendation performance? Specifically, can Wikipedias of low–resource languages benefit more from recommendations?
  • RQ3: What are the characteristics of an effective recommendation model for hyperlinks?
  • RQ4: To what extent do the recommendation results align with the editors’ intent?
The remainder of this paper is structured as follows. Section 2 reviews relevant studies. Section 3 outlines the methodology, while Section 4 describes the data collection process for the experimental setup. Section 5 presents and evaluates the experimental results from multiple perspectives. Section 6 introduces a demonstration application of our method for suggesting hyperlinks. Finally, Section 7 discusses the limitations, followed by the conclusion in Section 8.

2. Related Works

Previous studies have extensively examined the information gap across different editions of Wikipedia and proposed various strategies to address it [8,9,10,11,12,13,14,15,16,17]. For example, Roy et al. provide a browser for comparing topics across different versions of Wikipedia [10]. Ref. [11] tackles the challenge of aligning text passages between interlingual article pairs on Wikipedia. Their approach involves developing methods to identify and link text passages written in different languages that share overlapping information. Refs. [12,13] highlight the disparity in content coverage between high-resource and low-resource languages, emphasizing the importance of cross-lingual data integration. Piccardi et al. introduced Wikipedia-based Polyglot Dirichlet Allocation (WikiPDA), a cross-lingual topic model that represents Wikipedia articles in any language as distributions over a unified set of language-independent topics [14]. This model uses the inherent linking of Wikipedia articles and their association with concepts in the Wikidata knowledge base to represent articles in a language-independent way, viewed as collections of links. Ref. [15] detects fine-grained differences in the content conveyed in different languages for the analysis of multilingual NLP and corpora. However, this presents a challenging machine learning problem, as annotation is expensive and difficult to scale. They focus on cross-lingual semantic divergences between the English and French versions of Wikipedia articles.
To improve Wikipedia article recommendations, previous studies have investigated multiple factors such as article structure, hyperlink usage, content alignment, and cross-language considerations [14,18,19]. For instance, ref. [14] highlights the importance of structural information but pays limited attention to content, hyperlink patterns, and multilingual aspects. In contrast, ref. [20] focuses primarily on content relevance but does not adequately address structural organization, link structures, or language diversity. Ref. [21] has also examined how the layout and coherence of articles influence recommendations; for example, it introduced a method to suggest article sections based on title prediction. As core components of Wikipedia entries, sections not only improve navigation and readability but also serve as a structured foundation for article creation and enrichment. Ref. [22] adopts an approach similar to [20], focusing on content support while paying minimal attention to other aspects. In contrast, ref. [23] strikes a more balanced approach, addressing both structural aspects and cross-language integration effectively. Meanwhile, ref. [24] focuses on content, hyperlink usage, and cross-language integration, although structural support remains relatively underexplored in their work. In [25], an empirically evaluated system was introduced to address content disparities between different Wikipedia languages. Their system involved several steps: leveraging the Wikipedia knowledge graph to identify articles present in one language but absent in another, ranking these missing articles by importance through accurate prediction of their potential future page view counts, and recommending them to editors in the target language.
Hyperlinks and cross-language links have also been recognized as crucial data elements for bridging these gaps [26]. Ref. [19] introduces a language-independent method to identify missing cross-language links, particularly between the English and Japanese editions. In the context of hyperlink-based approaches, ref. [4] demonstrates the use of hyperlinks in recommending related articles, while ref. [18] employs a bridge model to analyze document relationships through hyperlink structures. Ref. [7] also leveraged inter-language links to measure the similarity of articles in different languages. Ref. [27] discovered missing hypertext links in Wikipedia by computing a cluster of highly similar pages around a given page and then identifying candidate links from similar pages that might be missing on the given page. This study is close to our idea. However, limiting the focus to mono-language Wikipedia articles might overlook valuable cross-language connections. Ref. [28] took a different approach, taking advantage of the rich knowledge in articles across 100 languages on Wikipedia. However, the computational cost is significant, and the prediction task only suggests missing hyperlinks without considering the specific trends of each language community. The study by Akhil et al. also shows that adding hyperlinks improves article visibility, highlighting the importance of automated, cross-lingual support for editors [29]. However, their work specifically focuses on orphan articles that lack both incoming and outgoing links. Despite their importance, hyperlinks and their associated relationships within Wikipedia remain an insufficiently explored yet highly valuable resource, especially in multilingual contexts [6].
Unlike the approach in [30], which relies on existing hyperlinks to make recommendations based on user preferences, our research focuses on suggesting missing hyperlinks. We propose a language-independent methodology based on collaborative filtering, which has proven effective in recommendation tasks [31,32]. Among CF approaches, neural graph-based collaborative filtering has emerged as a promising model for deeply capturing collaboration signals within recommendation systems [33]. This approach allows us to leverage hyperlink attributes extensively, indicating their potential for cross-language integration. Thus, the method is intended not only to enhance the content of articles in low-resource languages but also to consider the preferences of each language community.

3. Hyperlink Recommendation Method

This section outlines the hyperlink recommendation task and introduces several collaborative filtering (CF)–based models adopted in our approach.

3.1. Hyperlink Recommendation Task

This task aims to provide suggestions to editors in selecting and adding hyperlinks while ensuring alignment with the preferences or characteristics of local languages. Thus, hyperlink recommendation differs from the problem of link prediction [34], which predicts a missing head or tail in a triple (head, relation, tail). Our problem is categorized into recommendation system tasks.
We construct a user–item matrix for the hyperlink dataset, where articles and hyperlinks serve as the users U and items I, respectively. For this hyperlink recommendation task, we work with implicit ratings, or implicit feedback. Table 2 and Figure 3 show an example of implicit feedback in the hyperlink dataset. Specifically, the articles and hyperlinks in Table 2 are converted into a user–item matrix, as shown in Figure 3. The rating is determined by whether the user (article) mentions the item (hyperlink) or not. To represent these data as a matrix, we denote by M and N the number of users and items, respectively. We define the user–item interaction matrix $Y \in \mathbb{R}^{M \times N}$ based on users' implicit feedback as follows:
$$y_{ui} = \begin{cases} 1, & \text{if interaction (user } u\text{, item } i\text{) is observed;} \\ 0, & \text{otherwise.} \end{cases}$$
Using implicit ratings derived from the dataset, we apply collaborative filtering (CF) approaches to identify the most relevant hyperlinks with the highest y ^ u i scores, ensuring that they align with the preferences of each local community.
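A minimal sketch of constructing the interaction matrix defined above from observed article–hyperlink pairs (the article and hyperlink names are placeholders):

```python
import numpy as np

# Building the implicit-feedback matrix Y from observed article-hyperlink
# pairs. Article and hyperlink names are placeholders.
articles = ["A1", "A2", "A3"]                        # users U (M = 3)
hyperlinks = ["h1", "h2", "h3", "h4"]                # items I (N = 4)
observed = {("A1", "h1"), ("A1", "h3"), ("A2", "h2"), ("A3", "h1")}

M, N = len(articles), len(hyperlinks)
Y = np.zeros((M, N), dtype=int)
for u, article in enumerate(articles):
    for i, link in enumerate(hyperlinks):
        # y_ui = 1 iff the article mentions the hyperlink.
        Y[u, i] = int((article, link) in observed)
```

A CF model then fills in scores for the zero entries, and the highest-scoring unused hyperlinks become the recommendations for each article.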

3.2. Bayesian Personalized Ranking Model

We selected this model as our initial collaborative filtering (CF) approach because it is specifically designed for implicit feedback datasets. Bayesian Personalized Ranking (BPR) [35] enhances the matrix factorization (MF) model, represented as follows:
$$\hat{r}_{ui} = U_u \cdot V_i$$
where $U_u$ is the vector representing user $u$ and $V_i$ is the vector representing item $i$ in the latent space.
BPRMF optimizes the model using the following loss function:
$$\mathrm{Loss} = -\sum_{(u,i,j) \in D} \ln S(\hat{y}_{uij}) + \lambda_\Theta \lVert \Theta \rVert^2$$
where $D$ is the set of triplets $(u, i, j)$ in which user $u$ is assumed to prefer item $i$ over item $j$, $S(\cdot)$ is the sigmoid function, and $\lambda_\Theta$ are the model regularization parameters. This formulation enables the BPRMF model to effectively learn personalized rankings from implicit user feedback.
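The scoring and loss above can be sketched in NumPy. This is a toy illustration under the assumption of randomly initialized latent factors and hand-picked triplets; no training loop is shown.

```python
import numpy as np

# Toy sketch of BPR-MF: randomly initialized latent factors and the
# pairwise BPR loss, without gradient updates. All sizes and triplets
# are invented for illustration.
rng = np.random.default_rng(0)
M, N, d = 4, 6, 8                        # users, items, latent dimension
U = rng.normal(scale=0.1, size=(M, d))   # user factors U_u
V = rng.normal(scale=0.1, size=(N, d))   # item factors V_i

def score(u, i):
    # r_ui = U_u . V_i
    return U[u] @ V[i]

def bpr_loss(triplets, lam=0.01):
    # -sum ln sigmoid(r_ui - r_uj), plus L2 regularization over the factors.
    loss = 0.0
    for u, i, j in triplets:
        x_uij = score(u, i) - score(u, j)
        loss -= np.log(1.0 / (1.0 + np.exp(-x_uij)))
    return loss + lam * (np.sum(U ** 2) + np.sum(V ** 2))

D = [(0, 1, 2), (1, 3, 0)]               # user u prefers item i over item j
loss = bpr_loss(D)
```

In training, this loss is minimized with stochastic gradient descent over sampled triplets, pushing observed items above unobserved ones in each user's ranking.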

3.3. Graph Neural Network Model

Recently, Graph Neural Networks (GNNs) have garnered considerable interest and demonstrated state-of-the-art (SOTA) performance in various recommendation tasks. This success can be attributed to their robust capacity for node representation, particularly in leveraging high-order information [36,37]. In particular, neural graph collaborative filtering (NGCF) [33] has proven especially successful for CF tasks, making it the second approach chosen for hyperlink recommendation. In learnable CF models, there are generally two essential components:
  • Embedding, which converts users and items into vectorized representation;
  • Interaction modeling, which reconstructs historical interactions using embedding.
The advancement of deep learning applied to CF primarily involves transforming or enhancing these components to achieve higher performance. Among such methods, neural graph collaborative filtering is effective at capturing collaborative signals that reveal the behavioral similarity between users (or items), which previous algorithms overlooked. NGCF explores collaborative signals, which represent the latent features of user–item interactions, in the embedding component. To achieve this, NGCF captures connectivity by analyzing indirect relationships between users and items through multi-step paths, which carry valuable collaborative signals. NGCF then uses a neural network-based embedding propagation layer to refine and share embeddings across the graph.
In the embedding layer, every user $u$ (or item $i$) in the interaction matrix has an embedding vector $e_u \in \mathbb{R}^d$ ($e_i \in \mathbb{R}^d$), where $d$ is the embedding size. The embedding of $u$ ($i$) at layer $l$ is defined as
$$e_u^{(l)} = \sigma\Big(m_{u \leftarrow u}^{(l)} + \sum_{i \in N_u} m_{u \leftarrow i}^{(l)}\Big), \qquad e_i^{(l)} = \sigma\Big(m_{i \leftarrow i}^{(l)} + \sum_{u \in N_i} m_{i \leftarrow u}^{(l)}\Big)$$
The activation function $\sigma$ is LeakyReLU [38], which encodes both positive and small negative signals. $N_u$ denotes the set of items user $u$ has interacted with ($N_i$ denotes the set of users who have interacted with item $i$). NGCF also uses the message-passing architecture of GNNs, where the messages are defined as follows:
$$m_{u \leftarrow i}^{(l)} = p_{ui}\Big(W_1^{(l)} e_i^{(l-1)} + W_2^{(l)}\big(e_i^{(l-1)} \odot e_u^{(l-1)}\big)\Big), \qquad m_{u \leftarrow u}^{(l)} = W_1^{(l)} e_u^{(l-1)}$$
where $W_1^{(l)}, W_2^{(l)} \in \mathbb{R}^{d_l \times d_{l-1}}$ are the trainable transformation matrices, $d_l$ is the transformation size, $\odot$ denotes the element-wise product, $p_{ui}$ is a normalization coefficient, and $e_i^{(l-1)}$ comes from the previous message-passing step involving $(l-1)$-hop neighbors. The embeddings from all layers are concatenated to obtain the final user and item embeddings (Formula (5)), which are then used to compute the prediction scores $\hat{y}_{ui}$ via the inner product [33].
$$e_u^{*} = e_u^{(0)} \,\Vert \cdots \Vert\, e_u^{(L)}, \qquad e_i^{*} = e_i^{(0)} \,\Vert \cdots \Vert\, e_i^{(L)}$$
The optimization process also uses the BPR loss function in Formula (2) from [35]. BPR operates under the assumption that observed interactions, which more accurately reflect a user’s preferences, should be assigned higher prediction values than unobserved ones.
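A single embedding-propagation layer of the kind described above can be sketched in NumPy. The interaction matrix, sizes, and random initialization are toy values; the normalization $p_{ui} = 1/\sqrt{|N_u|\,|N_i|}$ follows the original NGCF paper.

```python
import numpy as np

# One NGCF-style embedding-propagation layer, as a minimal NumPy sketch.
# Everything here is a toy illustration, not the paper's implementation.
rng = np.random.default_rng(1)
M, N, d = 3, 4, 8                                 # users, items, embedding size
Y = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 1, 0, 1]])                      # implicit interactions y_ui
E_u = rng.normal(scale=0.1, size=(M, d))          # e_u^(0)
E_i = rng.normal(scale=0.1, size=(N, d))          # e_i^(0)
W1 = rng.normal(scale=0.1, size=(d, d))           # W_1^(1)
W2 = rng.normal(scale=0.1, size=(d, d))           # W_2^(1)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

deg_u, deg_i = Y.sum(axis=1), Y.sum(axis=0)
E_u_next = np.zeros_like(E_u)
for u in range(M):
    msg = E_u[u] @ W1                             # self-message m_{u<-u}
    for i in np.nonzero(Y[u])[0]:
        p_ui = 1.0 / np.sqrt(deg_u[u] * deg_i[i])
        # m_{u<-i}: linear term plus element-wise interaction term
        msg += p_ui * (E_i[i] @ W1 + (E_i[i] * E_u[u]) @ W2)
    E_u_next[u] = leaky_relu(msg)

# Final user representation: concatenate the embeddings of all layers,
# then score candidate items with an inner product.
e_u_star = np.concatenate([E_u[0], E_u_next[0]])
```

Stacking several such layers lets multi-hop neighbors (article–hyperlink–article paths) influence each embedding, which is the collaborative signal the model exploits.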

4. Dataset

In this section, we present the process of constructing the datasets used in our hyperlink recommendation experiments, highlighting the multilingual and cross–lingual aspects that are central to our approach.

4.1. Data Creation

We mainly extract articles and hyperlinks from three editions of Wikipedia: English, Japanese, and Vietnamese. We also gather data from Sinhala, a low–resource language, for further evaluation. A unique aspect of our research is that we retrieve cross–language data under the condition that articles in different languages must have internal links to each other. We also leverage the interconnectedness of Wikipedia articles, which link to each other and are associated with concepts in the Wikidata knowledge base. This graph–like representation of links makes articles inherently language–independent. Each dataset configuration differs to illustrate the experimental aspects.
Hyperlink datasets for Wikipedia have been published previously. Consonni et al. [39] offer a complete dataset of internal Wikipedia links for the nine largest language editions, covering 17 years from Wikipedia's establishment in 2001 to 1 March 2018. We cannot use it, however, because it has not been updated since then. We therefore propose a lightweight, language-independent pipeline [40] for extracting research data. The process begins by retrieving articles from Wikipedia dumps (Table 3) and processing them with the Dump Parser and Data Indexing tools. Relevant details are extracted, including the titles, IDs, and Wikidata identifiers of articles and hyperlinks. As a result, we obtained 130,000 articles sharing the same topics in three languages (English, Japanese, and Vietnamese) and 20,441 articles in four languages (English, Japanese, Vietnamese, and Sinhala).

4.2. Data in the Different Cases

Table 4 shows the statistics of the datasets in different cases. To ensure comparability, we selected 1000 articles (users) from each language, based on the filtered articles in Section 4.1. The interaction column shows the number of interactions in each dataset, where an interaction represents whether an article mentions a hyperlink in its content.
  • Case 1: The data consists of seven datasets, each configured according to a language combination or a single language. The articles in this dataset were selected using strict criteria, requiring that the articles in all three languages share the same topic and have more than 100 hyperlinks. Articles from this dataset were also used in our previous study [41], where it demonstrated effectiveness in recommending hyperlink types that align with local interests in different language communities.
  • Case 2: This is the multilingual dataset in three languages (en, ja, vi) with the same number of users (articles) as Case 1, but these articles are randomly selected. Each article may pertain to a distinct topic and is not necessarily related.
  • Case 3: This dataset is extended with Sinhala (si) Wikipedia articles. It also maintains our condition that articles share the same topic across the four languages.
These datasets are available in the Hyperlink Recommendation directory at https://github.com/nhunthp/Hlink_RS (accessed on 10 June 2025).

4.3. Hyperparameters

For a fair comparison, we search for optimal parameters in the validation data and evaluate these models in the test data. To ensure a level playing field, we constrain the maximum model size for all methods by setting the same upper bound for the number of hidden units per layer at 1024. For computational efficiency, we assign an embedding dimension of 128 to all the methods. We employ LeakyReLU [38] as the activation function for all models. During the training phase, we set the batch size to 2048 and the learning rate to 0.001.
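These shared settings can be collected in a single configuration object, as in the sketch below (the key names are our own, not from the paper's code).

```python
# Shared training configuration from this section, gathered into a plain
# dictionary. A sketch; the key names are ours.
config = {
    "max_hidden_units_per_layer": 1024,   # same upper bound for all methods
    "embedding_dim": 128,                 # shared embedding dimension
    "activation": "LeakyReLU",
    "batch_size": 2048,
    "learning_rate": 1e-3,
}
```

Fixing one configuration across both models keeps capacity comparable, so performance differences reflect the modeling approach rather than tuning.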

5. Result and Evaluation

5.1. Evaluation Metrics

To assess the effectiveness of item recommendation, we employ the widely adopted leave-one-out evaluation method, which has been extensively used in the literature [35,42,43]. In this evaluation, we reserved the most recent interaction of each user as the test set and utilized the remaining data for training purposes. Since ranking all items for every user during evaluation would be overly time–consuming, we followed a common strategy [36,44] of randomly selecting 100 items with which the user had not interacted and ranking the test item among this subset of 100 items. The evaluation of a ranked list was based on normalized discounted cumulative gain (NDCG), recall, and hit ratio (HR) metrics [45].
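The leave-one-out protocol with 100 sampled negatives can be sketched as follows; the catalog size and the user's history below are toy values.

```python
import random

# Leave-one-out evaluation sketch: hold out each user's most recent
# interaction as the test item and rank it against 100 sampled items the
# user never interacted with.
random.seed(42)
n_items = 5000
user_history = [10, 42, 77, 500]              # interactions, oldest to newest
test_item, train_items = user_history[-1], user_history[:-1]

negatives = []
while len(negatives) < 100:
    j = random.randrange(n_items)
    if j not in user_history and j not in negatives:
        negatives.append(j)

# The model then ranks this pool of 101 items; metrics such as HR and
# NDCG check where test_item lands in the ranking.
ranking_pool = negatives + [test_item]
```

Sampling negatives instead of ranking the full catalog keeps evaluation tractable while preserving the relative ordering the metrics need.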
Normalized discounted cumulative gain (NDCG) is a measure used to assess the quality of rankings [46], widely applied to recommendation and other information retrieval systems. Its value is determined by comparing the relevance of the items returned by a model with the relevance of the items that a hypothetical “ideal” model would return. Formula (6) normalizes the discounted cumulative gain (DCG) by the ideal discounted cumulative gain (IDCG) at k, where k means that only the top k items are considered. The formula for NDCG at k is
$$\mathrm{NDCG@}k = \frac{\mathrm{DCG@}k}{\mathrm{IDCG@}k}$$
In this context, DCG is the sum of gains associated with items within a search query, with an additional step of discounting the gains based on the rank of the items (Formula (7)). IDCG stands for ideal discounted cumulative gain, representing the highest possible DCG value achievable for a given set of items. It is determined by calculating the DCG for the ideal order of the items based on their gains (Formula (8)).
$$\mathrm{DCG@}k = \sum_{i=1}^{k} \frac{2^{rel_i} - 1}{\log_2(i+1)}$$
$$\mathrm{IDCG@}k = \sum_{i=1}^{k} \frac{1}{\log_2(i+1)}$$
Recall at k:
$$\mathrm{Recall@}k = \frac{\text{Number of relevant items recommended}}{\text{Total number of relevant items}}$$
The hit ratio (HR) is the metric used to evaluate recommendation systems. It measures the proportion of test cases where the true item was successfully included in the top k recommendations provided by the model. In other words, if the model recommends a list of k items to a user, the hit ratio is the percentage of those lists where at least one of the items in the user’s test set (i.e., the items they interacted with or liked) is present in the top k recommendations. The hit ratio at k is calculated as follows:
$$\mathrm{HR} = \frac{\text{Number of successful recommendations}}{\text{Total number of recommendations}}$$
These metrics are often used together to provide a comprehensive view of a system's performance. As k increases, recall and NDCG tend to increase because a larger set of proposed items is evaluated; however, choosing an appropriate value of k depends on the requirements of the application and the user experience. The hit ratio measures the proportion of users for whom at least one of the top k suggested items is relevant. Like recall, it also tends to increase with k, since there are more opportunities for a suggested item to match. Whether to evaluate the hit ratio at a large or small k depends on the goals of the system: a system that aims to recommend a small set of high-quality items can evaluate at a small k, whereas a system aiming for more diverse recommendations can choose a larger k.
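The three metrics can be sketched for a single ranked list with binary relevance. These are our own minimal implementations, matching the formulas above; the ranked list and relevant set are toy values.

```python
import math

# Minimal NDCG@k, Recall@k, and hit@k for one ranked list with binary relevance.
def ndcg_at_k(ranked, relevant, k):
    # DCG with rel_i in {0, 1}: each hit at 0-based rank r contributes
    # (2^1 - 1) / log2(r + 2) = 1 / log2(r + 2).
    dcg = sum(1.0 / math.log2(r + 2)
              for r, item in enumerate(ranked[:k]) if item in relevant)
    # IDCG: all relevant items placed at the top of the list.
    idcg = sum(1.0 / math.log2(r + 2) for r in range(min(k, len(relevant))))
    return dcg / idcg if idcg > 0 else 0.0

def recall_at_k(ranked, relevant, k):
    return len(set(ranked[:k]) & relevant) / len(relevant)

def hit_at_k(ranked, relevant, k):
    # 1 if at least one relevant item appears in the top k, else 0;
    # averaged over users this gives the hit ratio.
    return int(bool(set(ranked[:k]) & relevant))

ranked = [3, 7, 1, 9, 4]      # model's ranking of candidate items
relevant = {7, 9}             # held-out items the user actually linked
```

Averaging the per-user values of these functions over all test users yields the reported NDCG, recall, and HR scores.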

5.2. Experiment Result

We carried out experiments to assess the performance of the CF-based models on our datasets for generating top-n recommendations. We applied the evaluation metrics described above, varying k over the range (20, 40, …, 100). Data were partitioned into 80% for training and 20% for testing to ensure reliable evaluation results. Our goal is to assess the system's performance at large recommendation-list sizes, evaluate the effectiveness of the models in cross-language recommendations, and compare their performance against monolingual and non-cross-language datasets.

5.2.1. Experiment 1: Comparison of the Performance of Different Models in Various Cases

In this experiment, we compare the performance of different models on datasets from Case 1, Case 2, and Case 3, as described in Section 4.2. To ensure a fair comparison, we use the three-language dataset configuration from Case 1 for this evaluation. Table 5 presents a comparative analysis of the two recommendation models, BPRMF and NGCF, on the three datasets. In all cases, NGCF outperforms the BPRMF model. The reason is that the high-order connectivity exploited by NGCF aligns well with the cross-language structure of our data, allowing it to better capture implicit relationships across languages. This also demonstrates the effectiveness of GNN methods in recommendation tasks.
When examining the metrics of the NGCF model in each case, it becomes evident that Case 3 exhibits the highest recall, while Case 1 demonstrates the highest NDCG and hit rate. In detail, Figure 4 shows the performance of the NGCF model on the datasets from Case 1 and Case 2. On all three metrics, the Case 1 dataset yields better recommendation results. This emphasizes the positive influence of cross-language data with shared entities, rather than random selection. Figure 4 also illustrates the performance of the NGCF model in Case 1 and Case 3. Case 1 exhibits higher NDCG and hit-rate values than Case 3, whereas Case 3 demonstrates higher recall. The reason is that Case 3 contains fewer hyperlinks than Case 1. The issue arises with very weak languages, such as Sinhala, for which it is quite challenging to find cross-language articles with a significant number of hyperlinks. This indicates that, for optimal recommendation results, the dataset should ideally contain one weak language, while the remaining languages should be strong.
The experimental results confirm that reliable hyperlink recommendations can be achieved. CF-based models, particularly NGCF, are effective in generating relevant and efficient recommendations. As the value of k increases, the performance of the model improves in terms of key metrics such as NDCG and the hit rate, thus addressing RQ1.

5.2.2. Experiment 2: Comparison of Performance Across Different Language Configurations in Case 1

In this experiment, our goal is to examine the influence of multilingual configurations on the recommendation results. Based on the findings of Experiment 1, where the NGCF model consistently outperformed the other models in all cases, we focus our analysis solely on the evaluation metrics of the NGCF model across the datasets in Case 1. Figure 5 presents a performance comparison for different values of k across various language configurations of English, Japanese, and Vietnamese.
There is a general trend of improvement across all languages as k increases. Our main focus here lies in highlighting the distinctions among language configurations. The highest values for NDCG and HR occur in the multilingual dataset with three languages, at 51.3% and 98.37%, respectively. Recall values are generally high in mixed-language datasets, except for the Vietnamese dataset. This is because the number of relevant items in the Vietnamese data is smaller than in the others.
The volume of language-specific Wikipedias affects recommendation performance. Languages with more resources, such as English and Japanese, perform better in terms of NDCG, hit rate, and recall compared to low-resource languages like Vietnamese. However, low-resource languages can still benefit from recommendations, especially when included in multilingual datasets that leverage shared entities. Despite this, they face challenges such as fewer relevant hyperlinks, which limit their performance. These findings address RQ2.

5.3. Generated Recommendations

The results of the recommendations of our model demonstrate a keen sensitivity to the unique characteristics of each language, reflecting a thoughtful consideration of cultural and contextual differences (see Table 6). For example, in the top 50 recommendations for Japanese (ja) articles related to Mickey Mouse (Q11934), we observe a strong emphasis on entities and topics closely related to the Japanese media landscape, such as TV Tokyo and Tokyo Disneyland. This tailored approach ensures that the recommendations are highly relevant to Japanese readers, highlighting entities such as Marvel Entertainment, Harrison Ford, and Nintendo, which have a significant presence in Japan. In contrast, the recommendations for Vietnamese (vi) articles show a distinct set of suggestions, aligning with the interests and relevance of Vietnamese readers. Key recommendations include Game Boy Advance, Nihon Keizai Shimbun, The Walt Disney Company, and historical and cultural references such as 1991, 1996, and the United States of America. This indicates that the model adeptly adjusts its output to meet the preferences of Vietnamese users and contextual relevance.
Based on the results of the two experiments in Section 5.2 and the evaluation in this section, we find that an effective hyperlink recommendation model should capture complex relationships and be suitable for all languages. In this study, our goal is to propose hyperlinks that enhance the connectivity of content between languages while preserving the unique characteristics of each language. Although BPRMF learns latent factors through matrix factorization to capture user–item preferences, it may struggle to identify more complex hidden patterns in the dataset. Deep learning models such as NGCF better capture these intricate relationships, which addresses RQ3.
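As a sketch of how BPRMF learns its latent factors, the following illustrates repeated SGD steps on the BPR pairwise objective [35] with randomly initialized factors; the matrix sizes, learning rate, regularization, and training triple are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_articles, n_links, d = 5, 10, 4                # illustrative sizes
P = rng.normal(scale=0.1, size=(n_articles, d))  # article (user) factors
Q = rng.normal(scale=0.1, size=(n_links, d))     # hyperlink (item) factors

def bpr_step(u, i, j, lr=0.05, reg=0.01):
    """One SGD step on the BPR objective for (article u, observed link i,
    unobserved link j): push the score of i above the score of j."""
    pu, qi, qj = P[u].copy(), Q[i].copy(), Q[j].copy()
    x_uij = pu @ (qi - qj)              # current score margin
    sig = 1.0 / (1.0 + np.exp(x_uij))   # sigmoid(-x_uij), gradient weight
    P[u] += lr * (sig * (qi - qj) - reg * pu)
    Q[i] += lr * (sig * pu - reg * qi)
    Q[j] += lr * (-sig * pu - reg * qj)

# Repeatedly pushing one observed hyperlink above an unobserved one
# widens the margin between their scores.
before = P[0] @ (Q[2] - Q[7])
for _ in range(200):
    bpr_step(u=0, i=2, j=7)
after = P[0] @ (Q[2] - Q[7])
print(after > before)  # True
```

Because each score is a plain inner product of two latent vectors, BPRMF is linear in the factors; NGCF's propagation layers, by contrast, mix neighbor information nonlinearly, which is what lets it capture the more intricate patterns noted above.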

5.4. Evaluation by Local Wikipedia Editors

To answer RQ4, a survey was conducted using Google Forms (https://forms.gle/kHSSam55nYPeaHQQ9, accessed on 10 June 2025) to evaluate the effectiveness of hyperlink recommendations. The survey involved six experienced editors from the Vietnamese Wikipedia, most of whom had 1 to 5 years of editing experience, with some having more than 5 years. The editors were asked to rate the top 20 hyperlink suggestions for articles on various topics, including prominent figures, geography, society, history, and astronomy. Specifically, they assessed the usefulness of the first 20 recommended links, with checkboxes provided to indicate whether each hyperlink was useful.
We grouped the editors' responses and calculated the percentage consensus on the usefulness of the recommendations for each topic, as presented in Table 7. According to the results, the hyperlinks for Astronomy and Geography have the highest usefulness rates. This illustrates the method's efficiency in delivering high-quality and relevant hyperlink suggestions for Vietnamese Wikipedia articles.
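The per-topic usefulness rate can be computed as the share of all (editor, link) judgments marked useful. The per-editor counts below are hypothetical, chosen only so that the resulting rates reproduce the Geography and History rows of Table 7.

```python
# Hypothetical checkbox counts: for each topic, how many of the 20
# suggested links each of the six editors marked as useful.
marks = {
    "Geography": [15, 12, 14, 13, 11, 13],
    "History":   [6, 4, 5, 5, 6, 4],
}

def usefulness_rate(checks, n_links=20):
    """Share of (editor, link) judgments marked useful, as a percentage."""
    return 100 * sum(checks) / (len(checks) * n_links)

for topic, checks in marks.items():
    print(topic, round(usefulness_rate(checks), 1))
# Geography 65.0
# History 25.0
```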

6. Proof-of-Concept Application

This section presents a proof-of-concept (POC) application designed to evaluate the feasibility of hyperlink recommendation in multiple languages. The application is available at https://lakshankarunathilake.github.io/HyperLink/ (accessed on 10 June 2025). It incorporates two main features: Topic Trending and Hyperlink Recommendation. The Topic Trending feature visualizes the linguistic variations between English (en), Japanese (ja), and Vietnamese (vi) using word cloud charts (see Figure 6). Users engage with this feature by clicking to access the generated charts based on predefined local datasets. The purpose of this feature is to highlight differences between Wikipedia versions on the same topic, as revealed in our study [41].
The Hyperlink Recommendation feature generates hyperlink suggestions for article titles based on user input, including language and topic. As an example, we use articles on topics such as country and city. Users choose a target language (en, ja, or vi) and provide a text input (e.g., “Mongol Empire”) to receive a list of recommended hyperlinks, ranked by confidence score. The results are returned in multiple languages to help users identify those with high predictability. Figure 7 shows the results of the hyperlink recommendations for the article “Mongol Empire” in Japanese. The output is presented in four columns: Qid, which contains the Wikidata identifiers, followed by the corresponding labels in en, vi, and ja. However, in some cases, the results may appear only in certain languages due to the locality of the method. This approach ensures that user preferences for each language are preserved. This POC application provides a foundational framework for further research and development in cross-lingual hyperlinking systems.
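The four-column output described above can be sketched as follows. The Qids, confidence scores, and locally available labels are illustrative, and a label missing from the local data is left blank, mimicking the locality behavior just described.

```python
# Hypothetical assembly of the Hyperlink Recommendation output table.
scores = {"Q30": 0.91, "Q17": 0.84, "Q881": 0.42}   # illustrative confidences
labels = {
    "Q30":  {"en": "United States of America", "vi": "Hoa Kỳ", "ja": "アメリカ合衆国"},
    "Q17":  {"en": "Japan", "ja": "日本"},          # no vi label in this local set
    "Q881": {"en": "Vietnam", "vi": "Việt Nam"},   # no ja label in this local set
}

# Rank by confidence and emit the four display columns: Qid, en, vi, ja.
rows = [
    (qid, labels[qid].get("en", ""), labels[qid].get("vi", ""), labels[qid].get("ja", ""))
    for qid, _ in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
]
for row in rows:
    print(row)
```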

7. Discussion

7.1. Limitations of Dataset

The experiments carried out in this study have explored several editions of Wikipedia, including English, Japanese, Vietnamese, and Sinhala, to suggest links and article types. The data extraction process is executed through a simple and convenient pipeline. However, we also identified some limitations of the methods. Specifically, as our study focuses on a cross-language dataset, there exists an inverse relationship: the larger the number of languages considered, the fewer articles share common topics among them. This disparity poses a challenge for Wikipedia. Therefore, simply increasing the number of Wikipedia versions is not the solution; instead, it is more reasonable to identify the language groups with the highest number of common article topics.

7.2. Limitations of Method

The algorithms currently under consideration rely primarily on the interaction dynamics between articles and hyperlink types, based on the similarity observed across languages. However, this methodology overlooks the potential benefits of incorporating additional features, such as the textual content of the articles. Without these attributes, the algorithms may be limited in their ability to fully capture the nuanced semantics of the textual information, resulting in suboptimal performance and a restricted scope that hinders their capacity to harness the rich diversity of information within the articles. Thus, a key direction for future enhancement is to integrate additional features, expanding the algorithms' capabilities and enabling more comprehensive and accurate analyses of article interactions.

8. Conclusions

This research introduces a hyperlink recommendation method using implicit multilingual rating datasets. Experimental results show that multilingual datasets improve model performance. By modeling deeper layers of connectivity, NGCF effectively uncovers latent cross-lingual associations embedded in the hyperlink structure, leading to improved recommendations for low-resource communities. The GNN model thus outperforms the BPRMF model in capturing cross-lingual behavioral patterns and providing meaningful recommendations. The proposed approach enhances digital inclusion by improving knowledge accessibility in low-resource language communities and addressing content scarcity and educational gaps. It also strengthens cross-cultural connections, promoting knowledge sharing and solidarity across linguistic barriers through relevant hyperlink recommendations. In addition, it supports cultural sustainability by enriching underrepresented languages and preserving unique cultural traits. However, the scope of the experiments was limited to NGCF. Future research could explore more advanced algorithms to further improve performance. Expanding these methods to include more language editions and larger evaluation datasets will be crucial for a more comprehensive assessment of their effectiveness in diverse linguistic and cultural contexts.

Author Contributions

Conceptualization, N.N. and H.T.; data curation, N.N.; investigation, N.N.; data analysis, N.N.; resources, N.N. and L.K.; writing—original draft, N.N.; review and editing, H.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are publicly accessible on GitHub at https://github.com/nhunthp/Hlink_RS (accessed on 10 June 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Moas, P.M.; Lopes, C.T. Automatic Quality Assessment of Wikipedia Articles—A Systematic Literature Review. ACM Comput. Surv. 2023, 56, 1–37. [Google Scholar] [CrossRef]
  2. Kim, J.; Kim, S.; Lee, C. Anticipating technological convergence: Link prediction using Wikipedia hyperlinks. Technovation 2019, 79, 25–34. [Google Scholar] [CrossRef]
  3. West, R.; Paranjape, A.; Leskovec, J. Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia. In Proceedings of the WWW ’15: 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; Republic and Canton of Geneva: Geneva, Switzerland, 2015; pp. 1242–1252. [Google Scholar]
  4. Schwarzer, M.; Schubotz, M.; Meuschke, N.; Breitinger, C.; Markl, V.; Gipp, B. Evaluating Link-based Recommendations for Wikipedia. In Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, Newark, NJ, USA, 19–23 June 2016; pp. 191–200. [Google Scholar]
  5. Labhishetty, S.; Siddiqa, A.; Nagipogu, R.; Chakraborti, S. WikiSeeAlso: Suggesting tangentially related concepts (see also links) for Wikipedia articles. In Proceedings of the Mining Intelligence and Knowledge Exploration: 5th International Conference, MIKE 2017, Hyderabad, India, 13–15 December 2017; pp. 274–286. [Google Scholar]
  6. Tsunakawa, T.; Araya, M.; Kaji, H. Enriching Wikipedia’s Intra-language Links by their Cross-language Transfer. In Proceedings of the COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, Dublin, Ireland, 23–29 August 2014; pp. 1260–1268. [Google Scholar]
  7. Wang, Y.C.; Chuang, C.M.; Wu, C.K.; Pan, C.L.; Tsai, R.T.H. Cross-language article linking with deep neural network based paragraph encoding. Comput. Speech Lang. 2022, 72, 101279. [Google Scholar] [CrossRef]
  8. Lih, A. The Wikipedia revolution: How a bunch of nobodies created the world’s greatest encyclopedia. In The Wikipedia Revolution: How a Bunch of Nobodies Created the World’s Greatest Encyclopedia; Hyperion: New York, NY, USA, 2009. [Google Scholar]
  9. Ashrafimoghari, V. Detecting Cross-Lingual Information Gaps in Wikipedia. In Proceedings of the Companion Proceedings of the ACM Web Conference 2023 (WWW ’23 Companion), Austin, TX, USA, 30 April–4 May 2023; pp. 581–585. [Google Scholar]
  10. Roy, D.; Bhatia, S.; Jain, P. Information asymmetry in Wikipedia across different languages: A statistical analysis. J. Assoc. Inf. Sci. Technol. 2022, 73, 347–361. [Google Scholar] [CrossRef]
  11. Gottschalk, S.; Demidova, E. MultiWiki: Interlingual Text Passage Alignment in Wikipedia. ACM Trans. Web 2017, 11, 1–30. [Google Scholar] [CrossRef]
  12. Boschin, A.; Bonald, T. Enriching Wikidata with Semantified Wikipedia Hyperlinks. In Proceedings of the Wikidata Workshop ISWC, Virtual Conference, 24 October 2021. [Google Scholar]
  13. Miz, V.; Hanna, J.; Aspert, N.; Ricaud, B.; Vandergheynst, P. What is Trending on Wikipedia? Capturing Trends and Language Biases Across Wikipedia Editions. In Proceedings of the WWW ’20: Companion Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; ACM: New York, NY, USA, 2020; pp. 794–801. [Google Scholar]
  14. Piccardi, T.; West, R. Crosslingual Topic Modeling with WikiPDA. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 3032–3041. [Google Scholar]
  15. Briakou, E.; Carpuat, M. Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 1563–1580. [Google Scholar]
  16. Roy, D.; Bhatia, S.; Jain, P. A Topic-Aligned Multilingual Corpus of Wikipedia Articles for Studying Information Asymmetry in Low Resource Languages. In Proceedings of the Twelfth Language Resources and Evaluation Conference, Marseille, France, 11–16 May 2020; pp. 2373–2380. [Google Scholar]
  17. Lewoniewski, W.; Węcel, K.; Abramowicz, W. Companies in Multilingual Wikipedia: Articles Quality and Important Sources of Information. In Proceedings of the FedCSIS-AIST 2022, ISM 2022, Sofia, Bulgaria, 4–7 September 2022; pp. 48–67. [Google Scholar]
  18. Wu, J.; Zhang, X.; Zhu, Y.; Liu, Z.; Guo, Z.; Fei, Z.; Lai, R.; Wu, Y.; Cao, Z.; Dou, Z. Pre-training for Information Retrieval: Are Hyperlinks Fully Explored? arXiv 2022, arXiv:2209.06583. [Google Scholar] [CrossRef]
  19. Oh, J.H.; Kawahara, D.; Uchimoto, K.; Kazama, J.I.; Torisawa, K. Enriching Multilingual Language Resources by Discovering Missing Cross-Language Links in Wikipedia. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Sydney, NSW, Australia, 9–12 December 2008; IEEE: New York, NY, USA, 2008. [Google Scholar]
  20. Yang, D.; Halfaker, A.; Kraut, R.; Hovy, E. Identifying Semantic Edit Intentions from Revisions in Wikipedia. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 7–11 September 2017; pp. 2000–2010. [Google Scholar]
  21. Piccardi, T.; Catasta, M.; Zia, L.; West, R. Structuring Wikipedia Articles with Section Recommendations. In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR ’18), Ann Arbor, MI, USA, 8–12 July 2018; pp. 665–674. [Google Scholar]
  22. Faltings, F.; Galley, M.; Hintz, G.; Brockett, C.; Quirk, C.; Gao, J.; Dolan, B. Text Editing by Command. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; pp. 5259–5274. [Google Scholar]
  23. Difallah, D.; Saez-Trumper, D.; Augustine, E.; West, R.; Zia, L. Crosslingual Section Title Alignment in Wikipedia. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan, 17–20 December 2022; pp. 5892–5901. [Google Scholar]
  24. Schick, T.; Yu, J.A.; Jiang, Z.; Petroni, F.; Lewis, P.; Izacard, G.; You, Q.; Nalmpantis, C.; Grave, E.; Riedel, S. PEER: A Collaborative Language Model. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
  25. Wulczyn, E.; West, R.; Zia, L.; Leskovec, J. Growing Wikipedia across languages via recommendation. In Proceedings of the WWW ’16: 25th International Conference on World Wide Web, Montreal, QC, Canada, 11–15 April 2016; ACM: New York, NY, USA, 2016. [Google Scholar]
  26. Gundala, L.A.; Spezzano, F. Readers’ Demanded Hyperlink Prediction in Wikipedia. In Proceedings of the Companion Proceedings of the WWW ’18: The Web Conference 2018, Lyon, France, 23–27 April 2018; Republic and Canton of Geneva: Geneva, Switzerland, 2018; pp. 1805–1807. [Google Scholar]
  27. Adafre, S.F.; de Rijke, M. Discovering missing links in Wikipedia. In Proceedings of the LinkKDD ’05: 3rd International Workshop on Link Discovery, Chicago, IL, USA, 21 August 2005; pp. 90–97. [Google Scholar]
  28. Calixto, I.; Raganato, A.; Pasini, T. Wikipedia Entities as Rendezvous across Languages: Grounding Multilingual Language Models by Predicting Wikipedia Hyperlinks. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; pp. 3651–3661. [Google Scholar]
  29. Arora, A.; West, R.; Gerlach, M. Orphan Articles: The Dark Matter of Wikipedia. In Proceedings of the International AAAI Conference on Web and Social Media 2024, Buffalo, NY, USA, 3–6 June 2024; Volume 18, pp. 100–112. [Google Scholar]
  30. Bompotas, A.; Triantafyllopoulos, P.; Raptis, G.E.; Katsini, C.; Makris, C. Towards Exploring Personalized Hyperlink Recommendations Through Machine Learning. In Proceedings of the UMAP Adjunct ’24: Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization, Cagliari, Italy, 1–4 July 2024; pp. 528–533. [Google Scholar]
  31. Sedhain, S.; Menon, A.K.; Sanner, S.; Xie, L. AutoRec: Autoencoders Meet Collaborative Filtering. In Proceedings of the WWW ’15 Companion: 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 111–112. [Google Scholar]
  32. Zheng, Y.; Tang, B.; Ding, W.; Zhou, H. A neural autoregressive approach to collaborative filtering. In Proceedings of the ICML’16: 33rd International Conference on International Conference on Machine Learning—Volume 48, New York, NY, USA, 19–24 June 2016; pp. 764–773. [Google Scholar]
  33. Wang, X.; He, X.; Wang, M.; Feng, F.; Chua, T.S. Neural Graph Collaborative Filtering. In Proceedings of the SIGIR’19: 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; pp. 165–174. [Google Scholar]
  34. Kumar, A.; Singh, S.S.; Singh, K.; Biswas, B. Link prediction techniques, applications, and performance: A survey. Phys. A Stat. Mech. Its Appl. 2020, 553, 124289. [Google Scholar] [CrossRef]
  35. Rendle, S.; Freudenthaler, C.; Gantner, Z.; Schmidt-Thieme, L. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the UAI ’09: Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, Arlington, VA, USA, 18–21 June 2009; pp. 452–461. [Google Scholar]
  36. Elkahky, A.M.; Song, Y.; He, X. A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems. In Proceedings of the WWW ’15: 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; Republic and Canton of Geneva: Geneva, Switzerland, 2015; pp. 278–288. [Google Scholar]
  37. Wang, Q.; Wu, S.; Bai, Y.; Liu, Q.; Shi, X. Neighbor importance-aware graph collaborative filtering for item recommendation. Neurocomputing 2023, 549, 126429. [Google Scholar] [CrossRef]
  38. Maas, A.L.; Hannun, A.Y.; Ng, A.Y. Rectifier Nonlinearities Improve Neural Network Acoustic Models. In Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA, 16–21 June 2013; p. 3. [Google Scholar]
  39. Consonni, C.; Laniado, D.; Montresor, A. WikiLinkGraphs: A complete, longitudinal and multi-language dataset of the Wikipedia link networks. In Proceedings of the International Conference on Web and Social Media, Munich, Germany, 11–14 June 2019. [Google Scholar]
  40. Nguyen, N.; Takeda, H. Exploring Cross-Language Differences in Wikidata-based Hyperlink Types for Enhanced Editorial Support on Wikipedia. In Proceedings of the 12th International Joint Conference on Knowledge Graphs (IJCKG 2023), Tokyo, Japan, 8–9 December 2023. [Google Scholar]
  41. Nguyen, N.; Takeda, H. Augmenting Low-Resource Language Wikipedia through Hyperlink Type Recommendation. IEICE Trans. Inf. Syst. 2025, 12, 2024EDP7258. [Google Scholar] [CrossRef]
  42. Bayer, I.; He, X.; Kanagal, B.; Rendle, S. A Generic Coordinate Descent Framework for Learning from Implicit Feedback. In Proceedings of the WWW ’17: 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017. [Google Scholar]
  43. He, X.; Zhang, H.; Kan, M.Y.; Chua, T.S. Fast Matrix Factorization for Online Recommendation with Implicit Feedback. In Proceedings of the SIGIR ’16: 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pisa, Italy, 17–21 July 2016; pp. 549–558. [Google Scholar]
  44. Koren, Y. Factorization meets the neighborhood: A multifaceted collaborative filtering model. In Proceedings of the KDD’08: 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA, 24–27 August 2008; pp. 426–434. [Google Scholar]
  45. He, X.; Chen, T.; Kan, M.Y.; Chen, X. TriRank: Review-aware Explainable Recommendation by Modeling Aspects. In Proceedings of the CIKM ’15: 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Australia, 19–23 October 2015; pp. 1661–1670. [Google Scholar]
  46. Wang, Y.; Wang, L.; Li, Y.; He, D.; Liu, T.Y. A Theoretical Analysis of NDCG Type Ranking Measures. In Proceedings of the 26th Annual Conference on Learning Theory, Princeton, NJ, USA, 12–14 June 2013; Shalev-Shwartz, S., Steinwart, I., Eds.; Proceedings of Machine Learning Research: Cambridge, MA, USA, 2013; Volume 30, pp. 25–54. [Google Scholar]
Figure 1. An example for Hypothesis 1.
Figure 2. An example for Hypothesis 2.
Figure 3. An example of hyperlink datasets.
Figure 4. Comparison of performance of NGCF on different cases.
Figure 5. Comparison of performance of NGCF on different language configurations in Case 1.
Figure 6. The difference of local topic trending in three languages.
Figure 7. Results of hyperlink recommendation by topic.
Table 1. The percentage of topic overlap among the language versions of Wikipedia. (Data referenced from Wikipedia 2022. Source: https://meta.wikimedia.org/wiki/List_of_Wikipedias, accessed on 10 June 2025).

Language | Code | en | fr | de | nl | pl | es | it | ru | ja | pt | vi
English | en | - | 29.2 | 24.1 | 16.4 | 17 | 22.6 | 22.2 | 19.5 | 12.5 | 16.9 | 12
French | fr | 78.7 | - | 44.4 | 30.9 | 33.8 | 40.6 | 43.7 | 35.6 | 22.3 | 32.5 | 17.5
German | de | 58.5 | 40 | - | 24.2 | 27 | 28.7 | 32.6 | 28.6 | 17 | 22.7 | 11.6
Dutch | nl | 50.7 | 35.4 | 30.8 | - | 26.2 | 30.5 | 29.2 | 23.8 | 14.6 | 24.9 | 29.6
Polish | pl | 72.9 | 53.7 | 47.7 | 36.3 | - | 41.7 | 45.9 | 42.9 | 23.2 | 35.8 | 19.8
Spanish | es | 83.4 | 55.7 | 43.7 | 36.5 | 36 | - | 47.5 | 39.5 | 25.5 | 41.6 | 25.5
Italian | it | 82.2 | 60.1 | 49.8 | 35 | 39.7 | 47.7 | - | 41.4 | 25.2 | 38.3 | 19.1
Russian | ru | 70.3 | 47.6 | 42.4 | 27.8 | 36.1 | 38.6 | 40.3 | - | 24.8 | 32.2 | 17
Japanese | ja | 75.7 | 40.7 | 41.6 | 23.2 | 25.9 | 34 | 33.4 | 33.8 | - | 28.6 | 17.4
Portuguese | pt | 81.4 | 60.2 | 55.6 | 47.9 | 49.8 | 67.0 | 61.4 | 53.1 | 34.6 | - | 34.2
Vietnamese | vi | 67.4 | 32.8 | 24.2 | 48.4 | 23.4 | 36.3 | 26 | 23.8 | 17.9 | 29 | -
Table 2. Articles and their hyperlinks.

Article (User ID) | Hyperlink (Item ID)
Mickey Mouse (1) | New York (1), Walt Disney (2)
Tokyo (2) | New York (1), World War I (3), Kanto (4)
Marie Curie (3) | World War I (3), Nobel Prize (5)
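The mapping in Table 2 can be sketched as a binary implicit-feedback matrix, the input consumed by the collaborative filtering models:

```python
import numpy as np

# Rows are articles ("users"), columns are hyperlinks ("items");
# an entry is 1 if the article contains the hyperlink.
interactions = {
    1: [1, 2],     # Mickey Mouse -> New York, Walt Disney
    2: [1, 3, 4],  # Tokyo -> New York, World War I, Kanto
    3: [3, 5],     # Marie Curie -> World War I, Nobel Prize
}

n_users, n_items = 3, 5
R = np.zeros((n_users, n_items), dtype=int)
for u, items in interactions.items():
    for i in items:
        R[u - 1, i - 1] = 1  # table IDs are 1-based

print(R)
# [[1 1 0 0 0]
#  [1 0 1 1 0]
#  [0 0 1 0 1]]
```

Shared items such as New York (1) and World War I (3) are what connect otherwise unrelated articles in this matrix.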
Table 3. Statistics of the number of articles in Wikipedia dumps by language.

Language | #Articles
English | 6,928,834
Japanese | 1,441,422
Vietnamese | 1,294,331
Sinhala | 21,950
Table 4. Dataset statistics across various cases with different languages.

Dataset | Language Configuration | #Users | #Items | #Interactions | Density
Case 1 | en | 1000 | 201,047 | 406,877 | 0.00202
Case 1 | ja | 1000 | 111,075 | 284,165 | 0.00256
Case 1 | vi | 1000 | 43,592 | 144,301 | 0.00331
Case 1 | en + ja | 2000 | 257,214 | 691,041 | 0.00134
Case 1 | en + vi | 2000 | 209,810 | 551,177 | 0.00131
Case 1 | ja + vi | 2000 | 126,816 | 428,466 | 0.00169
Case 1 | en + ja + vi | 3000 | 263,396 | 835,342 | 0.00106
Case 2 | en + ja + vi | 3000 | 98,792 | 167,890 | 0.00057
Case 3 | en + ja + vi + si | 4000 | 98,791 | 319,080 | 0.00073
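The Density column in Table 4 is the ratio of observed interactions to all possible user-item pairs; a quick worked check using two of its rows:

```python
# Density = #interactions / (#users * #items), Case 1 Vietnamese row.
users, items, interactions = 1000, 43_592, 144_301
density = interactions / (users * items)
print(round(density, 5))  # 0.00331

# Combining languages grows the item space faster than the interactions,
# which is why the en + ja + vi dataset is sparser.
combined = 835_342 / (3000 * 263_396)
print(round(combined, 5))  # 0.00106
```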
Table 5. Comparison of performance metrics for different cases and algorithms.

Top-k | Model | Recall (%): Case 1 | Case 2 | Case 3 | NDCG (%): Case 1 | Case 2 | Case 3 | Hit Rate (%): Case 1 | Case 2 | Case 3
k = 20 | BPRMF | 4.44 | 11.33 | 11.14 | 22.38 | 20.31 | 21.4 | 67.00 | 41.22 | 60.49
k = 20 | NGCF | 8.63 | 11.48 | 24.74 | 13.7 | 30.01 | 18.68 | 89.67 | 45.19 | 71.84
k = 40 | BPRMF | 6.56 | 14.17 | 15.66 | 27.91 | 22.72 | 25.89 | 78.20 | 48.91 | 71.46
k = 40 | NGCF | 13.05 | 14.83 | 20.02 | 37.90 | 21.66 | 30.82 | 95.17 | 54.26 | 81.53
k = 60 | BPRMF | 8.11 | 15.65 | 19.16 | 31.52 | 24.15 | 29.37 | 83.03 | 53.64 | 77.93
k = 60 | NGCF | 16.44 | 17.39 | 24.61 | 43.54 | 23.74 | 34.73 | 97.17 | 59.69 | 86.43
k = 80 | BPRMF | 9.40 | 16.87 | 21.66 | 34.40 | 25.20 | 31.84 | 86.13 | 56.48 | 81.62
k = 80 | NGCF | 19.16 | 19.22 | 28.29 | 47.82 | 25.18 | 37.81 | 98.07 | 63.18 | 89.31
k = 100 | BPRMF | 10.59 | 17.88 | 23.65 | 36.90 | 26.10 | 33.84 | 88.27 | 59.32 | 83.87
k = 100 | NGCF | 21.56 | 20.82 | 31.08 | 51.30 | 26.36 | 40.12 | 98.37 | 66.10 | 91.12
Table 6. Top recommendations of the “Mickey Mouse” article in Japanese and Vietnamese.

Recommendation for ja | Recommendation for vi
Walt Disney Studios | 2012
Marvel Entertainment | 2006
IGN | TV Tokyo
Disney Channel | Nintendo
Walt Disney Studios Motion Pictures | Game Boy Advance
Harrison Ford | 2020
superhero | comics
Metacritic | The Walt Disney Company
Universal Pictures | 1991
Academy Award for Best Visual Effects | Nihon Keizai Shimbun
Paramount Pictures | 1999
Steven Spielberg | -
Nintendo | Steamboat Willie
Walt Disney Pictures | 1996
- | United States of America
animation | video game
Tokyo Disneyland | PlayStation 2
Table 7. Evaluation of usefulness of recommendations across different topics by editors.

Topic | Usefulness Rate (%)
Prominent Figure | 40
Geography | 65
Society | 45
History | 25
Astronomy | 70

Share and Cite

MDPI and ACS Style

Nguyen, N.; Takeda, H.; Karunathilake, L. Empowering Weak Languages Through Cross-Language Hyperlink Recommendation. Information 2025, 16, 749. https://doi.org/10.3390/info16090749

