A Recommendation Algorithm Combining Local and Global Interest Features

Abstract: Because knowledge graphs (KGs) can effectively alleviate the sparsity problem of collaborative filtering, they have been widely studied and applied as auxiliary information in the field of recommendation systems. However, existing KG-based recommendation methods mainly learn the representation of a target item from its neighborhood, ignoring the influence of other items on the target item. Such learning focuses on the local feature representation of the target item, which is not sufficient to effectively explore the user's degree of preference for the target item. To address these issues, this paper proposes an approach combining users' local interest features with global interest features (KGG) to efficiently explore the user's preference level for the target item; it learns the user's local and global interest features for the target item through a Knowledge Graph Convolutional Network and a Generative Adversarial Network (GAN). Specifically, this paper first utilizes the Knowledge Graph Convolutional Network to mine related attributes in the knowledge graph, effectively capturing item correlations and obtaining the local feature representation of the target item, and then uses matrix factorization to learn the user's local interest features for target items. Second, it uses a GAN to learn the user's global interest features for target items from the implicit interaction matrix. Finally, a linear fusion layer is designed to effectively fuse the user's local and global interests towards target items to obtain the final click prediction. Experimental results on three real datasets show that the proposed method not only effectively integrates the user's local and global interests but also further alleviates the problem of data sparsity. Compared with current baselines for knowledge graph-based systems, KGG achieves maximum improvements of 8.1% and 7.6% in AUC and ACC, respectively.


Introduction
In recent years, with the continuous development of Internet technology, recommendation systems have been widely used in e-commerce, social networks, music and news to address the problem of information explosion. Among them, collaborative filtering (CF)-based recommendation algorithms are among the most widely used and popular technologies [1]. Collaborative filtering [2][3][4] infers unknown data from the historical data of users' interactions with items to recommend items that users may be interested in. However, collaborative filtering-based recommendation algorithms usually suffer from data sparsity and cold-start problems. Therefore, many researchers have proposed using knowledge graphs [5][6][7][8][9] as auxiliary information to alleviate the data sparsity problem.
A knowledge graph is a commonly used auxiliary information in recommendation systems, which usually contains richer facts and connections about items. A knowledge graph is a heterogeneous graph, mainly composed of triples (entity, relation, entity). Among them, nodes correspond to entities, and edges correspond to relations. The recommendation algorithm based on a knowledge graph uses the rich semantic association between items to improve the performance of the recommendation system. Specifically, it aggregates neighbor nodes around target items in knowledge graphs to enrich the representation vector of target items. At present, knowledge graphs have been successfully applied to recommendation systems, and researchers have studied many classic algorithms.
PER [10] is a path-based recommendation algorithm that views the KG as a heterogeneous information network and extracts latent features between users and items from manually constructed entity-entity paths. However, PER relies on path design, and it is difficult to construct reasonable entity paths manually from massive data. CKE [11] combines the CF module with knowledge embedding, text embedding and item image embedding in a unified Bayesian framework to improve the performance of the recommendation system, but it does not consider the differences between modal features during information fusion. DKN [12] combines news title embeddings with entity embeddings from the knowledge graph to enhance news feature representation. However, DKN is not trained end-to-end and requires pre-computed entity embeddings, which increases model complexity. RippleNet [13] views the user's historical interests as seed sets in the KG and iteratively expands the user's interests along KG links to discover latent interest in candidate items. However, when exploring users' interests along KG links, it ignores the influence of relations on user interests. MKR [14] shares high-order interaction representations learned from the KG through multi-task learning to supplement the insufficient item representations caused by data sparsity. KGNN-LS [15] transforms the knowledge graph into a user-specific weighted graph using label smoothness regularization and then applies graph neural networks to compute personalized item embeddings; finally, the user's click probability is predicted through the inner product. KGCN [16] aggregates neighborhood information to obtain high-order structural and semantic information from the KG to enrich the target entity representation and improve recommendation performance.
KGAT [17] combines an embedding propagation layer with an attention mechanism, which adaptively propagates embeddings from the neighbors of the target item to update its representation, ultimately capturing high-order structural information based on both user behavior and item attributes and efficiently utilizing the rich information in the knowledge graph. CKAN [18] explicitly encodes collaborative signals through collaborative propagation and combines them with knowledge associations to enrich user and item embeddings. In the propagation process, it selects neighbors through a knowledge-aware attention mechanism and aggregates the representations of users and items from different propagation layers as input to the prediction layer, predicting the preference score of the user for the item. KGIN [19] proposes an intent network based on the knowledge graph, which identifies important intents and relation paths to characterize the correlation between users and items, providing interpretability and better recommendation performance. CKEN [20] explicitly encodes user-item collaboration information using a collaborative propagation method and propagates it in the knowledge graph to obtain item representations. In addition, it designs a feature interaction layer to enable feature sharing between the items in the recommendation system and the entities in the knowledge graph, further enriching the latent vector representations of the items. However, although the above methods focus on the impact of entities and relations on users when obtaining rich item representations through knowledge graphs, they are limited by the size of the receptive field, which makes it difficult to involve every neighboring node. They also lack an exploration of users' global interest features and are therefore insufficient to effectively explore users' interest distributions.
In order to better capture users' local and global interests for recommendation, this paper proposes a recommendation algorithm combining local interest features and global interest features, called KGG, which integrates knowledge graph information and rating information. Figure 1 is the framework diagram of KGG. KGG consists of three parts: a local feature learning module, a global feature learning module and a linear fusion module. First, in the local feature learning module, we use KGCN to aggregate node information in the neighborhood to obtain the feature representation of the target item, and then compute the dot product between the obtained item feature vector and the user feature vector to obtain local interaction features. Second, in the global feature learning module, we use a Generative Adversarial Network (GAN) to learn the user interest distribution from the rating matrix and obtain global interaction features. Finally, we use a linear fusion module to unify the local and global features to predict the user's click probability. We conducted extensive experiments on three real-world datasets to evaluate the performance of our model.
The experimental results validated the complementarity of the two feature learning modules and showed better prediction performance when they are fused. In addition, our proposed KGG significantly outperformed existing baselines on all three datasets. In summary, the contributions of this paper are as follows:

1. We organically combine the Knowledge Graph Convolutional Network and the Generative Adversarial Network, so that our model can effectively learn both local and global interests to explore users' interest preferences.

2. By utilizing both rating information and knowledge graph information, we further alleviate the data sparsity problem in recommendation systems.

3. Extensive experiments on three real-world datasets show that our proposed KGG achieves better click prediction accuracy than current state-of-the-art methods.
The rest of the paper is organized as follows: Section 2 summarizes the related work. Section 3 provides the problem definition and proposed model. Section 4 analyzes the experimental results. In Section 5, we conclude this paper.


Recommendation Method Based on Generative Adversarial Network
In recent years, Generative Adversarial Networks (GANs) [21][22][23] have been applied to the recommendation field due to their strong ability to learn complex data distributions. With the continuous development of GANs, Jun Wang et al. first applied mature GAN models to the recommendation field in 2017 and proposed the IRGAN model [24], which uses GANs to learn users' interest distributions and combines collaborative filtering to predict how interested users are in items. Dong-Kyu Chae et al. proposed the CFGAN model [25], which uses real-valued vectors for training, thus fully utilizing GAN's adversarial characteristics. Y. Tong et al. proposed the CGAN model [26], which replaces the generator in the GAN with a variational autoencoder [27], transforming adversarial training from a discrete space to a continuous space and thus improving the robustness of the model. Yuan Lin et al. proposed the IFGAN model [28], which introduces an additional generator into the original GAN, increasing the information flow between generators and reducing the differences between generators and discriminators.

Recommendation Method Based on Knowledge Graph
Currently, recommendation methods based on knowledge graphs can be divided into three categories: (1) Embedding-based methods [29][30][31][32][33][34], which obtain rich item representations by learning low-dimensional embeddings of the entities and relations in the knowledge graph. Embedding-based methods are highly flexible in assisting recommendation, but because they focus on modeling semantic relevance, they are better suited to in-graph applications.
(2) Path-based methods [35][36][37][38][39], which assist recommendation by exploring the various connection patterns between items in the knowledge graph. Although path-based methods can use the KG intuitively, path design often requires manual work, which is time-consuming and laborious in practice. (3) Hybrid methods [40][41][42][43], which combine embedding-based and path-based methods to obtain item embedding representations.

Method
In this section, we introduce the proposed KGG model. First, we define the research question. Then, we introduce the various modules of KGG in detail. Finally, we present the complete learning algorithm of KGG.

Problem Definition
We assume that there are M users U = {u_1, u_2, ..., u_m} and N items I = {i_1, i_2, ..., i_n} in a particular scenario.
In this paper, we use an implicit interaction matrix Y ∈ R^{M×N}, where if a user has rated or interacted with an item, the corresponding position in the implicit interaction matrix is set to 1 (y_ui = 1); if not, the value at that position is 0 (y_ui = 0). Additionally, we use a knowledge graph G consisting of triples (h, r, t) of head entities, relations and tail entities, where h ∈ H, r ∈ R, t ∈ H, H is the set of entities, and R is the set of relations. For example, the triple (Full River Red, movie.movie.director, Zhang Yimou) indicates that Full River Red was directed by Zhang Yimou. In many recommendation application scenarios, an item i ∈ I also corresponds to an entity e ∈ H; as in the above example, Full River Red also corresponds to an entity in the knowledge graph.
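Building the implicit interaction matrix Y from explicit ratings can be sketched as follows; the rating triples and matrix sizes are illustrative toy values, not taken from the paper's datasets:

```python
import numpy as np

# Hypothetical explicit ratings as (user_index, item_index, rating) triples.
ratings = [(0, 1, 4.5), (0, 3, 2.0), (2, 0, 5.0)]
M, N = 3, 4  # numbers of users and items in this toy example

# Implicit interaction matrix Y: y_ui = 1 if user u rated or interacted
# with item i (regardless of the rating value), 0 otherwise.
Y = np.zeros((M, N), dtype=np.int8)
for u, i, _ in ratings:
    Y[u, i] = 1
```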
In this paper, we use a GAN to learn users' global interest features from the implicit interaction matrix Y, and the KGCN method to learn users' local interest features from the knowledge graph. Our ultimate goal is to predict whether user u has potential interest in an item i with which they have not interacted, namely ŷ_ui = F(u, i | θ, Y, G), where θ represents the model parameters of function F.

Local Feature Learning Module
In the local feature learning module, this paper mainly follows the KGCN [16] method proposed by Hongwei Wang et al. KGCN is a graph-based deep learning framework that captures the relationships between entities by mapping node and edge features into a low-dimensional space to obtain the feature representation of the target item. In this section, we focus on the principle of KGCN.
In the KGCN layer, different users have different levels of interest in different relations. For example, u_1 loves Zhang Yimou very much, so he gives a high score to the movie Full River Red, while u_2 only likes romance movies and is not interested in who directed a film, so he will not watch Full River Red, or will give it a low score after watching. Therefore, the KGCN model needs to calculate the user's degree of interest in each relation, namely the scores, and normalize all the relation scores to obtain the probability of the user liking each relation, which serves as the weight for the neighbor node representations.
The score between a user and a relation is calculated as shown in Formula (1):

S^u_r = u \cdot r \quad (1)

where S^u_r represents the score between user u and relation r, \cdot denotes the dot product, and user u and relation r have the same dimensionality. After calculating the user's score for each relation, all scores are normalized to obtain the final probability of the user's interest in each relation, as shown in Formula (2):

\tilde{S}^u_{r_{i,e}} = \frac{\exp(S^u_{r_{i,e}})}{\sum_{e' \in N(i)} \exp(S^u_{r_{i,e'}})} \quad (2)

where r_{i,e} represents the relation between entity i and entity e, S^u_{r_{i,e}} represents the corresponding user-relation score, and N(i) represents the set of entities directly connected to item i. As shown in Figure 2, red nodes represent target items and blue nodes represent neighbor nodes. If the receptive field size is 1 (D = 1) and three entities e_1, e_2, e_3 are directly connected to the red node, then N(i) = {e_1, e_2, e_3}.

The probability of the user being interested in each relation is then used as the weight of the corresponding neighbor node's representation, computed according to Formula (3):

v^u_{N(i)} = \sum_{e \in N(i)} \tilde{S}^u_{r_{i,e}} \, e \quad (3)

If the receptive field size is 1 (D = 1) in Figure 2, then the neighborhood representation is v = \tilde{S}^u_{r_1,e_1} \cdot e_1 + \tilde{S}^u_{r_2,e_2} \cdot e_2 + \tilde{S}^u_{r_3,e_3} \cdot e_3. The last step of the KGCN layer is to aggregate the neighborhood representation and the target item into a single vector. There are three main aggregation methods: the Sum aggregator, the Concat aggregator and the Neighbor aggregator. In this model, we mainly use the Sum aggregator, calculated as shown in Formula (4):

i^u = \sigma\big(W \cdot (i + v^u_{N(i)}) + b\big) \quad (4)

where i is the vector representation of the target item itself, and v^u_{N(i)} is the vector representation of its neighborhood.
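The single KGCN layer described above can be sketched with NumPy on toy embeddings; all embeddings, the random weight matrix W, and the choice of tanh as the nonlinearity are illustrative assumptions, not the paper's trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension (illustrative)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy embeddings for one user u, a target item i, and its three neighbors
# e1..e3 reached through relations r1..r3 (all in the same d-dim space).
u = rng.normal(size=d)
item = rng.normal(size=d)
neighbors = rng.normal(size=(3, d))  # e1, e2, e3
relations = rng.normal(size=(3, d))  # r1, r2, r3

# Formula (1): user-relation scores via dot products.
scores = relations @ u
# Formula (2): softmax-normalized interest in each relation.
weights = softmax(scores)
# Formula (3): weighted sum of neighbor embeddings.
v_neigh = weights @ neighbors
# Formula (4): Sum aggregator (tanh stands in for the nonlinearity sigma).
W = rng.normal(size=(d, d)) / np.sqrt(d)
b = np.zeros(d)
i_u = np.tanh(W @ (item + v_neigh) + b)
# Formula (5): local click prediction from the user-item dot product.
y_local = 1.0 / (1.0 + np.exp(-(u @ i_u)))
```

With receptive field depth D > 1, this aggregation is simply repeated outward from the target item, layer by layer.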
Finally, we predict the user's interest in the target item by combining the item feature representation learned by the KGCN layer with the user representation, as shown in Formula (5):

\hat{y}_{ui} = f(u, i^u) \quad (5)

where ŷ_ui represents the local interest prediction, i^u is the feature vector of item i corresponding to user u, and f is the dot product operation.
In the local feature learning module, we use the cross-entropy loss for training, calculated as shown in Formula (6):

\mathcal{L} = \sum_{u \in U} \sum_{i \in I} C\big(y_{ui}, \hat{y}_{ui}\big) + \lambda \lVert F \rVert_2^2 \quad (6)

where C represents the cross-entropy loss and \lVert F \rVert_2^2 is the L2 regularization term over the model parameters.
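A minimal sketch of this training objective, assuming `params` is a list of model parameter arrays and `lam` mirrors the dataset-dependent L2 weight λ (e.g. 10^-7 for Last.FM):

```python
import numpy as np

def local_loss(y_true, y_pred, params, lam=1e-7):
    """Sketch of Formula (6): mean cross-entropy over user-item labels
    plus an L2 penalty on the model parameters."""
    eps = 1e-12  # avoid log(0)
    ce = -np.mean(y_true * np.log(y_pred + eps)
                  + (1 - y_true) * np.log(1 - y_pred + eps))
    l2 = lam * sum(np.sum(p ** 2) for p in params)
    return ce + l2
```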

Global Feature Learning Module
In the global feature learning module, this paper adopts the CFGAN model framework. In training schemes proposed by previous researchers, the generated samples were discrete item indices, which easily leads to the discriminator (D) being unable to distinguish the source of its input and thus prevents adversarial training from fully learning the user's interest distribution. To address this issue, Dong-Kyu Chae et al. proposed the CFGAN model [25]. CFGAN trains G and D adversarially, where G tries to generate continuous real-valued purchase vectors instead of discrete item indices, and D tries to distinguish generated real-valued purchase vectors from real ones. This avoids the confusion of D caused by mutually contradictory labels in earlier GAN-based recommendation algorithms and lets the discriminator guide G through back-propagation so that generated vectors become closer to actual purchase vectors during the iterations, ultimately achieving higher-precision recommendation.
In this paper, we only introduce the key parts of CFGAN. Figure 3 shows the framework of the CFGAN model, which is conditioned on specific users, that is, it learns the model parameters while taking user personalization into account. Each user is associated with a specific condition vector c_u (user attribute features) and a real purchase vector r_u (positions where the user has interacted with items are 1, otherwise 0). The purpose of G is to generate vectors that are as close as possible to real purchase vectors, while D's goal is to distinguish as well as possible whether an input vector comes from the generator or from the real data. Formally, D's objective function, denoted J^D, is as follows:

J^D = -\sum_{u} \Big( \log D(r_u \mid c_u) + \log\big(1 - D(\hat{r}_u \odot (e_u + k_u) \mid c_u)\big) \Big) \quad (7)

where r_u is the true purchase vector, \hat{r}_u is the generated purchase vector, \odot is the element-wise product, e_u is an indicator vector (if u purchased i, then e_ui = 1; otherwise e_ui = 0), and k_u is an indicator vector (if j belongs to the negative sample set sampled during training, then k_uj = 1; otherwise k_uj = 0).
The loss function of G is shown in Formula (8):

J^G = \sum_{u} \Big( \log\big(1 - D(\hat{r}_u \odot (e_u + k_u) \mid c_u)\big) + \alpha \sum_{j} \big(k_{uj} \, \hat{r}_{uj}\big)^2 \Big) \quad (8)

where α is the weight of the zero-reconstruction term, j ranges over the sampled non-interacted items, and an L2 regularization term with coefficient γ is additionally applied.
For D's loss function J^D, the main goal is that if the input vector comes from the generator, its output value after passing through D should be close to 0; if the input vector is a sampled real purchase vector, its output value after passing through D should be close to 1. D back-propagates the obtained information to G and guides G's training. For G's loss function J^G, the main goal is that the output value of the G-generated purchase vector after passing through D is as close to 1 as possible, namely, that the generated vector is as close as possible to a real purchase vector.
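The two adversarial objectives can be sketched as plain functions for a single user; `toy_D` is a hypothetical stand-in discriminator (a real D would be a conditioned neural network), and the masking with e_u + k_u and the zero-reconstruction weight `alpha` follow the description above rather than a verified CFGAN implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def toy_D(vec, cond):
    """Hypothetical stand-in discriminator: a logistic score of the
    mean input, conditioned on the user vector."""
    return sigmoid(vec.mean() + cond.mean())

def d_loss(D, r_u, r_hat, e_u, k_u, c_u):
    """Sketch of Formula (7): push D(real) toward 1 and D(masked fake)
    toward 0; the fake vector is masked to purchased items (e_u) plus
    sampled negatives (k_u)."""
    real = D(r_u, c_u)
    fake = D(r_hat * (e_u + k_u), c_u)
    return -(np.log(real + 1e-12) + np.log(1.0 - fake + 1e-12))

def g_loss(D, r_hat, e_u, k_u, c_u, alpha=0.1):
    """Sketch of Formula (8): fool D, plus a zero-reconstruction term
    (weight alpha, an assumed value) pushing sampled negatives to 0."""
    fake = D(r_hat * (e_u + k_u), c_u)
    zero_recon = alpha * np.sum((k_u * r_hat) ** 2)
    return np.log(1.0 - fake + 1e-12) + zero_recon
```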
After continuous adversarial training, G and D eventually reach an equilibrium at which D can no longer distinguish the source of the input; at this point, model training is complete. The dense vectors generated by G serve as scores for the items each user may like, and by sorting these scores and selecting the top N items to recommend to the user, the predicted result ŷ_ui(global) is obtained.

Linear Fusion Module
The above two modules capture users' local and global interest features from different sources (the rating matrix and the knowledge graph), respectively. This paper unifies the local and global features through a designed linear fusion module, calculated as shown in Formula (9):

\hat{y}_{ui} = \sigma\big(\beta \cdot \hat{y}_{ui}(\text{local}) + \alpha \cdot \hat{y}_{ui}(\text{global})\big) \quad (9)

where the sigmoid function σ limits the prediction value to between 0 and 1, and β and α are the weights of the local and global interest features. Different proportions of local and global interest features have different impacts on prediction accuracy; we analyze these two hyperparameters in the experimental section.
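Formula (9) is a one-liner; the default weights β = 0.1 and α = 0.9 below are the best values found in the hyperparameter analysis:

```python
import numpy as np

def fuse(y_local, y_global, beta=0.1, alpha=0.9):
    """Formula (9): sigmoid-bounded linear fusion of the local and
    global interest predictions."""
    return 1.0 / (1.0 + np.exp(-(beta * y_local + alpha * y_global)))
```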

KGG Model
The KGG model adopts a step-wise training approach, where the local feature learning module and global feature learning module are independent of each other and trained separately. Finally, the linear fusion module is used to obtain the final click prediction result.

Experiment
We conducted experiments and evaluations on three real-world datasets with our proposed KGG model. Our main research questions are as follows: (1) How does our KGG model compare to existing knowledge graph-based recommendation models? (2) What is the impact of the hyperparameters of the linear fusion module on our model? (3) Is it effective to combine the local interest feature learning module and the global interest feature learning module?

Datasets
In order to verify the performance of the KGG model, we used three common datasets: Last.FM, Movielens-1M and Book-Crossing. These datasets all contain user and item rating histories, which we processed as implicit ratings where the position of items rated is 1 and 0 otherwise. For each dataset, we randomly selected 60% of the user interactions as a training set, 20% as a test set and 20% as a validation set. The information on the datasets is shown in Table 1.
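The 60/20/20 split described above can be sketched as follows; the random seed and the list-of-interactions input format are illustrative assumptions:

```python
import numpy as np

def split_interactions(interactions, seed=42):
    """Randomly split user-item interactions 60/20/20 into
    train/test/validation sets, matching the protocol used for each
    dataset."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(interactions))
    n_train = int(0.6 * len(idx))
    n_test = int(0.2 * len(idx))
    train = [interactions[i] for i in idx[:n_train]]
    test = [interactions[i] for i in idx[n_train:n_train + n_test]]
    valid = [interactions[i] for i in idx[n_train + n_test:]]
    return train, test, valid
```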

Baselines
We compared our proposed KGG model with the following baselines, including one without KG and seven based on KG: PER [10]: Views the KG as a heterogeneous information network and explores the various relationships between items through path-based methods.
CKE [11]: An integrated recommendation framework that combines collaborative filtering with structural, textual, and visual knowledge.
DKN [12]: Using CNN to combine entity embedding and word embedding as multiple channels for click prediction.
Wide&Deep [44]: Combines a generalized linear model with embeddings and multilayer perceptrons from deep learning to predict clicks.
RippleNet [13]: A memory-network-like approach that propagates users' preferences over the KG for recommendation.
KGCN [16]: Utilize the neighbor nodes of the target item in the knowledge graph reasonably to enrich the feature representation of the target item and explore users' personalized and latent interests.
KGAT [17]: A recommendation algorithm that adaptively obtains the embedding representation of the target item from its neighboring nodes through an attention mechanism.
KGIN [19]: A knowledge graph-based intent network is proposed, which utilizes users' intents and relation paths to characterize the correlation between users and items.

Experimental Setup
In click-through rate prediction, AUC (area under the curve) and ACC [45] are selected as evaluation metrics to measure the performance of KGG. AUC is calculated as follows:

AUC = \frac{1}{M \times N} \sum I\big(P_{positive}, P_{negative}\big) \quad (10)

I\big(P_{positive}, P_{negative}\big) = \begin{cases} 1, & P_{positive} > P_{negative} \\ 0.5, & P_{positive} = P_{negative} \\ 0, & P_{positive} < P_{negative} \end{cases} \quad (11)

where M represents the number of positive samples and N represents the number of negative samples; there are M × N sample pairs in total (each pair consists of one positive sample and one negative sample). AUC counts the proportion of pairs in which the predicted probability of the positive sample is greater than that of the negative sample.
ACC is calculated as follows:

ACC = \frac{TP + TN}{TP + TN + FP + FN} \quad (12)

where TP represents true positives, TN true negatives, FP false positives, and FN false negatives. In the KGG model, the local and global feature learning modules are trained step-by-step, so two sets of experimental parameters are involved. The specific settings are as follows. In the local interest feature learning module, different parameters are set for different datasets. In Last.FM, the entity embedding dimension d is 2048, the neighbor sample size K is 8, the receptive field depth D is 1, the number of epochs is 20, the batch size is 1024, the L2 regularization weight λ is 10^{-7}, and the model training learning rate η is 10^{-3}. In Movielens-1M, the entity embedding dimension d is 512, the neighbor sample size K is 8, the receptive field depth D is 1, the number of epochs is 20, the batch size is 256, λ is 10^{-7}, and η is 10^{-3}. In Book-Crossing, the entity embedding dimension d is 256, the neighbor sample size K is 8, the receptive field depth D is 3, the number of epochs is 20, the batch size is 20, λ is 10^{-5}, and η is 10^{-3}.
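The pairwise AUC of Formulas (10)-(11) and the ACC of Formula (12) can be implemented directly; this is a straightforward O(M×N) sketch, not an optimized implementation:

```python
def pairwise_auc(pos_scores, neg_scores):
    """AUC via Formulas (10)-(11): the fraction of (positive, negative)
    pairs where the positive sample is scored higher; ties count 0.5."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            total += 1.0 if p > n else (0.5 if p == n else 0.0)
    return total / (len(pos_scores) * len(neg_scores))

def accuracy(tp, tn, fp, fn):
    """ACC via Formula (12)."""
    return (tp + tn) / (tp + tn + fp + fn)
```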
In the Global Interest Feature Learning module, the network layers of the generator and encoder are 4, the hidden layer size is 400, the regularization weight is 0.1, and the learning rate is 0.001. In Last.FM, epochs are 1000 and batchsize is 32. In Movielens-1M, epochs are 1500 and batchsize is 64. In Book-Crossing, epochs are 1500 and batchsize is 256.

Comparison with the Baselines (RQ1)
In this section, we compare KGG with the baselines introduced in Section 4.2; the experimental results are shown in Table 2. Intuitively, our proposed KGG model achieves clear improvements over the existing baselines. We draw the following conclusions from the experimental results: In Last.FM, our proposed KGG method improves AUC by 4.0% and ACC by 6.5% compared with KGCN. In Movielens-1M, AUC improves by 0.7% and ACC by 1.2%. In Book-Crossing, AUC improves by 8.1% and ACC by 7.6%. Although RippleNet, DKN, CKE, and PER can capture the rich semantic relationships among items from the knowledge graph and improve the interpretability of recommendations, these knowledge graph-based approaches suffer from certain shortcomings and cannot capture the global interest distribution of users. Compared with them, KGG's maximum improvements in AUC and ACC are 23.1% and 20.6% in Last.FM, 23.3% and 22.1% in Movielens-1M, and 14.9% and 10.5% in Book-Crossing. The results show that, compared with knowledge graph-based methods, KGG combines local interest features with global interest features to fully explore the overall interest distribution of users, improving the accuracy of the recommendation model.
Wide&Deep is a classical model combining linear and nonlinear components, which synthesizes the linear and nonlinear features of users and can be regarded as a model that explores the global interest distribution of users. Compared with Wide&Deep, in Last.FM, AUC is improved by 8.1% and ACC by 8.4%; in Movielens-1M, AUC is improved by 3.0% and ACC by 3.8%; in Book-Crossing, AUC is improved by 2.6% and ACC by 6.6%. The advantage of KGG is that, in addition to combining linear and nonlinear components, it fuses the local and global interest features extracted by neural networks through a linear layer to obtain complete user interest features; at the same time, it uses more sources of information to effectively explore the distribution of users' interests.
Compared with the aforementioned methods on three datasets, KGAT and KGIN achieve better recommendation performance, and the attention mechanism and user intent in KGAT and KGIN can further explore the rich information in the knowledge graph, indicating that different neighbor nodes and relationships have varying degrees of importance for users. However, KGG outperforms KGAT and KGIN, indicating that local interest features alone are insufficient to represent the overall interests of users, and the fusion of local and global interest features can more effectively explore the interest distribution of users.
In terms of the evaluation metrics, KGG shows excellent performance on Last.FM, Book-Crossing, and Movielens-1M. Among the datasets, Last.FM and Book-Crossing are much sparser than Movielens-1M, and KGG's improvement on Last.FM and Book-Crossing is correspondingly higher than on Movielens-1M. We believe that KGG, by combining local and global information, can not only comprehensively explore users' interests and preferences but also alleviate the data sparsity problem, improving the accuracy of click-through-rate prediction.
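For reference, the two evaluation metrics used above can be computed directly from a model's click-probability scores. The sketch below is a minimal pure-Python illustration (the function and variable names are ours, not from the paper's code): AUC is computed as the probability that a randomly chosen positive sample is scored above a randomly chosen negative one, and ACC thresholds the scores at 0.5.

```python
def auc(labels, scores):
    # Rank-based AUC: fraction of (positive, negative) pairs where the
    # positive sample receives the higher score (ties count as 0.5).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def acc(labels, scores, threshold=0.5):
    # ACC: fraction of samples whose thresholded score matches the label.
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # 0.75
print(acc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # 0.5
```
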

Hyperparameter Analysis (RQ2)
In this analysis, we fix the other parameters of the model as described in Section 4.4.1. Figures 4-6 show the influence of β and α on model performance on Last.FM, Movielens-1M, and Book-Crossing, respectively, where the abscissa indicates the values of β and α. From these three figures, it can be seen that as the weight α of the global interest features increases, the accuracy of click-through-rate prediction improves on all datasets. The model reaches its best performance when β is 0.1 and α is 0.9. We analyze the cases β = 0, α = 1 and β = 1, α = 0 in the ablation experiments.
From the experimental results shown in Figures 4-6, we can see that as the weight of the global interest features increases, the prediction accuracy of the KGG model also increases. This illustrates, to some extent, the importance of the global interest features in the KGG model: they play a key role in effectively exploring the distribution of user interests.

Ablation Experiment (RQ3)
As shown in Figure 7, on all three datasets the performance of KGG exceeds that of the Local Interest Feature Module alone. This, to some extent, proves that a local interest feature module by itself cannot sufficiently and effectively explore the user's interest distribution. The combination of local and global features can not only alleviate the data sparsity problem by drawing on different sources of information but also comprehensively integrate users' multiple interests to recommend items they are more interested in.

Analysis of Linear Fusion Module
In order to verify the effectiveness of the linear fusion module designed in this paper, we compared three fusion methods: additive, mean, and linear weighted. From Table 3, it can be seen that the performance of the additive and mean methods is inferior to that of linear weighted fusion, indicating that local and global interests do not contribute equally to a user's overall interest distribution; controlling their proportions with tunable parameters is more conducive to improving model performance.
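The three fusion strategies compared in Table 3 can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: we assume the local and global modules each produce a per-item click score and that fusion is element-wise, with β = 0.1 and α = 0.9 taken from the best setting in Figures 4-6.

```python
def fuse_additive(local, glob):
    # Additive fusion: unweighted sum of local and global scores.
    return [l + g for l, g in zip(local, glob)]

def fuse_mean(local, glob):
    # Mean fusion: average of the two scores.
    return [(l + g) / 2 for l, g in zip(local, glob)]

def fuse_linear(local, glob, beta=0.1, alpha=0.9):
    # Linear weighted fusion: beta weights the local interest score,
    # alpha the global one (defaults are the best reported setting).
    return [beta * l + alpha * g for l, g in zip(local, glob)]

local_scores = [0.8, 0.2, 0.5]   # hypothetical local-module scores
global_scores = [0.4, 0.9, 0.6]  # hypothetical global-module scores
print(fuse_linear(local_scores, global_scores))
```

Because the additive and mean variants fix the two contributions at equal weight, they are special cases of the linear form; learning or tuning (β, α) lets the model express the unequal importance the experiments observe.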

Results in Sparse Scenarios
In KGG, we explore users' overall interest preferences using both rating information and knowledge graph information, which further alleviates the data sparsity problem. To investigate the efficacy of KGG in sparse scenarios, we varied the ratio of the training set of MovieLens-1M from 100% to 10% (while the validation and test sets were kept fixed) and report the AUC results in CTR prediction for all methods. The results are shown in Table 4. We observe that the performance of all methods degrades as the training set shrinks; however, the degradation of KGG is consistently smaller than that of the comparative methods. When the training set is reduced to 10%, the AUC score decreases by 15.8%, 15.9%, 11.6%, 12.2%, 8.4%, and 12.1% for PER, CKE, DKN, Wide&Deep, RippleNet, and KGCN, respectively, compared with using the full training set. In contrast, the AUC of KGG decreases by only 6.0%, which demonstrates that KGG can still maintain decent performance in the case of data sparsity.
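The sparse-scenario protocol above, shrinking the training split while freezing validation and test, can be sketched as follows. This is a minimal illustration with dummy interaction pairs; the helper name and seed are ours, not from the paper's code.

```python
import random

def subsample_train(train_interactions, ratio, seed=42):
    # Shrink only the training split to the given ratio; the validation
    # and test splits are held fixed elsewhere so AUC stays comparable.
    rng = random.Random(seed)
    k = max(1, int(len(train_interactions) * ratio))
    return rng.sample(train_interactions, k)

# 1000 dummy (user, item) training interactions
train = [(u, i) for u in range(100) for i in range(10)]
for ratio in (1.0, 0.5, 0.2, 0.1):
    subset = subsample_train(train, ratio)
    # train the model on `subset`, then report AUC on the fixed test set
    print(ratio, len(subset))
```

Re-sampling with a fixed seed keeps the nested subsets reproducible across methods, so the per-ratio AUC drops in Table 4 reflect model robustness rather than sampling noise.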

Conclusions
This paper proposes a recommendation algorithm that combines local and global interest features. KGG utilizes Knowledge Graph Convolutional Networks to capture rich target-item representations from the knowledge graph, acquires users' local interest features through matrix factorization, and learns global interest features from the implicit interaction matrix using GANs. Finally, the designed linear fusion module effectively integrates the local and global interest features. Extensive experiments on three real-world datasets demonstrate the effectiveness of the KGG model.
In the future, we plan to improve the model's training procedure, moving from step-by-step training to joint training to form an end-to-end model. In addition, in the global interest feature module, we will incorporate currently unused historical interaction information (such as user comments) to improve the performance of the global interest module.