Attention Knowledge Network Combining Explicit and Implicit Information

Abstract: Existing knowledge graph embedding (KGE) methods have achieved good performance in recommendation systems. However, the degree of relevance among entities decreases gradually as information propagates through the knowledge graph. Focusing on the explicit and implicit relationships among entities, this paper proposes an attention knowledge network combining explicit and implicit information (AKNEI) to effectively capture and accurately describe the correlation between entities in the knowledge graph. First, we design an information-sharing layer (ISL) to realize information sharing between items and entities through implicit interaction. Second, we propose a cross feature fusion module to extract high-order feature information. At the same time, we use an attention mechanism to address the decline of information relevance during knowledge graph propagation. Finally, the features of the KGE and the cross feature fusion module are integrated into an end-to-end learning framework, in which the item information of the recommendation task and the entity information of the knowledge graph interact both implicitly and explicitly, and the features shared between them are learned automatically. We performed extensive experiments on multiple public datasets covering movies, music, and books. According to the experimental results, our model improves on the latest baselines.


Introduction
Currently, the amount of data has exploded, and recommendation systems have become one of the main methods for dealing with data overload. Collaborative filtering (CF) [1] is one of the most classic recommendation algorithms and is widely used in various recommendation scenarios. CF uses users' historical interactions and shared preferences to provide personalized recommendations. However, CF suffers from the cold-start and data sparsity problems. To alleviate these problems and improve recommendation performance, researchers usually add rich scene-related information: social networks [2], which not only reduce data sparsity but also express users' preferences more accurately; user-item attributes [3], which provide richer types of information, allow more specific and accurate item selection, and enrich the data; and context information [4]. Although adding scene information improves recommendation performance to a certain extent, it ignores the correlation between the pieces of information, so such improvements are limited. In further studies [5][6][7][8][9], researchers noted that information is interconnected and that, through these associations, entity information (users or items) can be combined to form a knowledge graph (KG). For example, the LBSN [5] network integrates an RNN-based network and a key-value memory network (KV-MN), and the model also uses the correlation information of the knowledge base to enhance its semantic representation. Normally, a KG is a directed graph composed of nodes and the connections between them. The nodes represent entities (which can be users or items), and the connections between entities are called edges, each representing the relationship between two entities. Owing to this construction, KG-based algorithms have the advantages of interpretability and strong scalability.
Therefore, KG-based algorithms have been widely applied in recommendation systems. Among the studies that use KGs to improve recommendation performance, one line of work is based on knowledge graph embedding (KGE) [6,9,10]. For example, the deep knowledge-aware network (DKN) [7] is a model that combines entity embedding and a convolutional neural network (CNN) [11]. It is a content-based click-prediction framework that integrates knowledge graph representation for news recommendation. The key part of the model is a knowledge-aware convolutional neural network using multi-channel word-entity alignment, which can dynamically aggregate users' history records with respect to candidate news. Embedding-based methods are highly flexible, but they are more suitable for graph applications such as link prediction than for personalized recommendation. In path-based methods, the KG is usually regarded as a heterogeneous information network, as in the personalized entity recommendation (PER) [8] proposed by Xiao Yu et al. The model combines the latent relationships along KG paths in different ways, and the hidden information in the user's historical interactions is represented by meta-paths. However, PER relies on manually designed meta-paths, which makes automatic feature learning impossible. RippleNet [6] propagates user preferences in the KG: the model divides the KG into multiple levels and aggregates the user's preferences at different levels so as to uncover the user's potential interests. However, because users have different preferences for different layers of the KG, RippleNet cannot accurately capture user interests in each layer. The above models also share a common problem: how to obtain the latent feature information between the entities in the knowledge graph and the items to be predicted.
To address these limitations, this paper proposes a deep network framework (AKNEI) that combines a multi-attention mechanism with graph embedding and uses KGE to assist the recommendation task. Recommendation tasks and KGE tasks are highly related to each other. The ISL embeds the KG entities formed from the user interaction history together with the items so that the two supplement and share information. The high-order features extracted by the multi-layer cross feature fusion network are fused with the information extracted by the attention-based ripple network to form an end-to-end model. This article makes the following contributions: (1) We design an ISL between the KGE and recommendation tasks to connect them for feature sharing, automatically transfer interaction information during training, and obtain the implicit semantics of entities and items. The ISL improves the model's robustness to noise and its generalization capability. (2) We propose a cross feature fusion network that extracts features explicitly. It performs high-order explicit cross fusion of features, and the interaction occurs at the vector level, which reduces the number of parameters compared with traditional neural network models, in which interaction occurs at the element level. The network retains the key information of each layer during propagation, preventing it from being lost. (3) Based on the ISL, we design a multilayer ripple network with multihead attention, which accounts for the changes in users' interests at different levels and achieves more accurate recommendations. Finally, combined with the cross feature fusion layer for high-order feature interaction, the shared information between KG entities and items can be fully utilized. We conducted experiments on multiple datasets and achieved good results on both large-scale and small datasets.

AKNEI Model
In this part, the proposed AKNEI model is introduced, and we give a detailed explanation of our research methods and details.

Problem Statement
In a recommendation model combined with a KG, we are given a set of users $U$ and a set of items $V$. The interactions between users and items constitute a matrix $Y = \{y_{uv} \mid u \in U, v \in V\}$. When $y_{uv} = 1$, user $u$ has interacted with item $v$ in the past, for example by watching, collecting, or playing it. Conversely, $y_{uv} = 0$ means that the user has never interacted with the item. The knowledge graph $G$ consists of entity-relation-entity triples, $G = \{(h, r, t)\}$, where $h \in E$ is the head entity, $t \in E$ is the tail entity, $E = \{e_1, e_2, \cdots\}$ is the entity set of the knowledge graph, and $r \in R$ with $R = \{r_1, r_2, \cdots\}$ is the relationship between the two entities. For example, Mark Osborne, the director of the movie Kung Fu Panda, also made the movie The Little Prince. The two movies can be regarded as entities in a KG, and the common director can be regarded as the relationship between them. Our goal is to use the interaction matrix $Y$ and the knowledge graph $G$ to analyze the user's behavior and predict the items in $V$ that the user is interested in, that is, items that the user may like but has never interacted with. The prediction function is $\hat{y}_{uv} = F(u, v; \theta, Y, G)$, where $\hat{y}_{uv}$ is the predicted probability that user $u$ interacts with item $v$ and $\theta$ denotes the parameters of $F$.
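As a minimal sketch of the inputs described in the problem statement, the snippet below builds an implicit-feedback matrix and a triple set from toy data. The user and item identifiers, the relation name `directed_by`, and the helper `candidates` are illustrative assumptions, not part of the original method.

```python
# Hypothetical toy data; identifiers and relation names are illustrative only.
users = ["u1", "u2"]
items = ["kung_fu_panda", "the_little_prince", "inception"]
interactions = [("u1", "kung_fu_panda"), ("u2", "the_little_prince")]

# Implicit feedback matrix Y: Y[u][v] = 1 if user u interacted with item v, else 0.
observed = set(interactions)
Y = {u: {v: int((u, v) in observed) for v in items} for u in users}

# Knowledge graph G as a set of (head, relation, tail) triples.
G = {
    ("kung_fu_panda", "directed_by", "mark_osborne"),
    ("the_little_prince", "directed_by", "mark_osborne"),
}


def candidates(user):
    """Items the user has not interacted with, i.e. the items to score with F."""
    return [v for v in items if Y[user][v] == 0]
```

Here `candidates("u1")` lists the items for which a prediction $\hat{y}_{uv}$ would be needed; the shared tail entity `mark_osborne` is what links the two movies in the graph.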

Model Framework
The overall framework of the proposed AKNEI is shown in Figure 1. The input of the recommendation layer consists of the user and item embeddings, and the input of the KG layer is the set of triples in $G$. In the recommendation module, the ripple set of each layer $k$ is a set of knowledge graph triples $\{(h, r, t)\}$. These sets interact with the item embeddings, and the attention module extracts the interaction information of each layer to form embeddings. CIN performs high-order interactions between the item embeddings and the user preferences of each layer of the ripple network, combines the result with the information extracted by the attention module to form the final embedding, and finally makes predictions. The ISL establishes a low-level information-sharing channel between the recommendation task and the KGE, which can automatically learn feature interactions and complement the information of both.

Information-Sharing Layer (ISL)
The ISL lets the item information of the recommendation task interact implicitly with the entity information of the KG, so that each part supplements the other. The design comes from [12]. As shown in Figure 2, for an item $v$ and its associated entity $e$ in the KG, with features $v_l \in \mathbb{R}^d$ and $e_l \in \mathbb{R}^d$ at layer $l$, where $d$ is the hidden layer dimension, the interaction produces a matrix $S_l \in \mathbb{R}^{d \times d}$. Through this interaction, information is shared between the item and the entity, and their feature interactions are modeled explicitly. This matrix is then used as the input to the next layer of interaction, where $w_l \in \mathbb{R}^d$ and $b_l \in \mathbb{R}^d$ are trainable weights and bias terms, respectively. This operation compresses the interaction matrix space $\mathbb{R}^{d \times d}$ back into the vector space $\mathbb{R}^d$. The ISL usually shares information more effectively in the lower layers of the network: the deeper the layer, the more specialized the higher-order features become, and their transferability decreases significantly [13].
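The two steps described above (form a $d \times d$ interaction matrix, then compress it back to $\mathbb{R}^d$) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' exact layer: it assumes the interaction matrix is the outer product $S_l = v_l e_l^{\top}$ and that the compression uses four weight vectors (here named `w_vv`, `w_ev`, `w_ve`, `w_ee`) in the style of a cross-and-compress unit. Those names and the exact weighting scheme are not given in the text.

```python
def outer(v, e):
    """Cross step (assumed): S = v e^T, a d x d matrix of pairwise products."""
    return [[vi * ej for ej in e] for vi in v]


def matvec(m, w):
    """Multiply matrix m (list of rows) by vector w."""
    return [sum(mij * wj for mij, wj in zip(row, w)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def isl_layer(v, e, w_vv, w_ev, w_ve, w_ee, b_v, b_e):
    """One information-sharing step: cross item and entity, then compress to R^d."""
    s = outer(v, e)       # item-to-entity interactions
    s_t = transpose(s)    # entity-to-item interactions
    v_next = [a + b + c for a, b, c in zip(matvec(s, w_vv), matvec(s_t, w_ev), b_v)]
    e_next = [a + b + c for a, b, c in zip(matvec(s, w_ve), matvec(s_t, w_ee), b_e)]
    return v_next, e_next


v1, e1 = isl_layer([1, 2], [3, 4], [1, 0], [1, 0], [1, 0], [1, 0], [0, 0], [0, 0])
```

Both outputs have the same dimension as the inputs, so layers of this kind can be stacked.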

ISL Analysis
We next introduce the models related to the ISL layer theory to prove the feature interaction capabilities of ISL and explain them conceptually.
Factorization machine (FM). The factorization machine [14] is a commonly used method in recommendation systems. FM uses factorization to model the interactions between input features and estimates parameters well on sparse data. In the 2-degree factorization machine model, $x_i$ denotes the $i$-th value of the input vector $x$, $p$ is the weight of the input vector, and $\langle \cdot, \cdot \rangle$ denotes the dot product between vectors. This model is similar to a 1-layer ISL. In the L1-norm of $v_1$ and $e_1$ after one layer of interaction, the coefficient of each pairwise term is $\langle w_i, w_j \rangle = w_i + w_j$, a sum of scalars. The proof writes out the layer without retaining the previous layer's information and then takes the L1-norm of $v_1$; the proof for $e_1$ is similar. Unlike FM, where the coefficient of each interaction term is the dot product of the weight vectors, in ISL it is the sum of the weight parameters. Compared with FM, the total number of parameters decreases.
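To make the comparison concrete, the sketch below contrasts the two kinds of pairwise coefficient: a dot product of factor vectors (standard 2-degree FM) and a sum of scalar weights (the ISL-style coefficient described above). The bias name `p0` and both function names are illustrative assumptions, and `isl_pairwise` uses a single scalar weight per index only to keep the example small.

```python
def fm_predict(x, p0, p, w):
    """Standard 2-degree FM: bias + linear terms + <w_i, w_j> x_i x_j for i < j."""
    linear = sum(pi * xi for pi, xi in zip(p, x))
    pairwise = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(w[i], w[j]))  # factor-vector dot product
            pairwise += dot * x[i] * x[j]
    return p0 + linear + pairwise


def isl_pairwise(v, e, w):
    """ISL-style pairwise sum: the coefficient of v_i e_j is the scalar w_i + w_j."""
    total = sum((w[i] + w[j]) * v[i] * e[j]
                for i in range(len(v)) for j in range(len(e)))
    return abs(total)


fm_value = fm_predict([1, 2], 0.5, [1, 1], [[1, 0], [2, 3]])
isl_value = isl_pairwise([1, 2], [3, 4], [1, 2])
```

In the FM case each feature carries a factor vector, whereas the ISL-style coefficient needs only one scalar per index, which is the source of the parameter saving claimed in the text.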
Cross-stitch network. Cross-stitch units [12] establish sharing between two tasks while maintaining a task-specific representation for each. In a cross-stitch unit, $y_n$ and $y_m$ are the feature maps of the two tasks, $\delta$ denotes the weights between the two task feature maps, and $(i, j)$ is a position in the feature map. The unit provides the input to the next layer of filters through linear combinations. This is similar to our ISL unit, which can be written in an analogous form when the bias is ignored. Like cross-stitch networks, the ISL unit can adapt to specific tasks through weight distribution and obtain shared information between different tasks.
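A cross-stitch unit of the kind referred to here can be sketched as a 2 x 2 linear mixing of two task activations at each position. The weight layout `delta[task_out][task_in]` and the function name are assumptions made for illustration.

```python
def cross_stitch(y_n, y_m, delta):
    """Mix two task feature maps position by position with a 2x2 weight matrix.

    delta[0] holds the weights producing the new task-n activation,
    delta[1] those producing the new task-m activation (assumed layout).
    """
    new_n = [delta[0][0] * a + delta[0][1] * b for a, b in zip(y_n, y_m)]
    new_m = [delta[1][0] * a + delta[1][1] * b for a, b in zip(y_n, y_m)]
    return new_n, new_m


mixed_n, mixed_m = cross_stitch([1.0, 2.0], [3.0, 4.0], [[0.9, 0.1], [0.2, 0.8]])
```

Weights near the identity keep the tasks separate, while larger off-diagonal weights share more information between them.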

KG Layer
The knowledge graph module represents the nodes of the KG and the associations between them as vectors. Many KGE methods exist, such as the relation embedding of Antoine Bordes et al. [15], the entity and relation embedding model used by Yankai Lin et al. for knowledge graph completion [16], the multi-relation embedding model of Hanxiao Liu et al. [17], and Maximilian Nickel's holographic embedding model of knowledge graphs [18]. These models provide different knowledge graph embedding methods. Our model uses the features of the head node $h$ and the relationship $r$ between nodes in the knowledge graph to predict the tail node $t$. In this operation, $\sigma(\cdot)$ is a nonlinear activation function, $S(h)$ is the set of all nodes $h$ in the KG, and $N$ is the set of entities associated with $h$. We denote the interaction through the ISL by $I(v, h)$, and the function $f_{KG}$ is the score function.

Attention Ripple Layer
This layer belongs to the recommendation module of the AKNEI framework. It propagates user preferences over the KG and improves on the framework proposed by RippleNet [6]. Ripple propagation starts from a clicked entity and extends outward along the different relationships between entities. As propagation moves outward, the correlation between entities gradually declines, much as a water droplet hitting the water surface produces ripples that weaken as they spread outward, as shown in Figure 3. The KG is used to spread user preferences in layers [6]. Considering that the user's preferences for the entities in each hop (each layer) are different, we add a multihead attention mechanism [19] to assign different weights to each layer, which makes the model more accurate in predicting user preferences. The multihead attention mechanism performs very well in modeling complex relationships: it has shown excellent performance in machine translation [20] and sentence embedding [21], and it has also been applied to capturing the similarity of nodes in graph embedding [22].
We are given an interaction matrix $Y = I(v, h)$ and a knowledge graph $G$. The user's entity set at layer $k$ is derived from $\xi_u^0 = V_u = \{v \mid y_{uv} = 1\}$, the collection of items that the user has interacted with, which can be regarded as the 0th layer of the ripple. The ripple set of the $k$-th hop, $\xi_u^k$, is obtained from the set $\xi_u^{k-1}$ of the previous layer. As shown in Figure 1, item $v$ is represented by the item embedding $v \in \mathbb{R}^d$, and it is combined with the head node $h_i$ and the relationship $r_i$ of each triple $(h_i, r_i, t_i)$ in the ripple set $\xi_u^k$ to assign an association probability $p_i$. Here $R_i \in \mathbb{R}^{d \times d}$ is the embedding of relation $r_i$, $h_i \in \mathbb{R}^d$ is the embedding of head node $h_i$, and $p_i$ is the similarity between item $v$ and head node $h_i$ in the relation space $R_i$. The tail nodes in $Q_u^1$ are then weighted by these probabilities and summed, where $t_i \in \mathbb{R}^d$ is the embedding of tail node $t_i$, and the resulting vector $b_u^1$ is the response of user $u$'s click history $V_u$ to item $v$. Performing the above operations for each layer of the ripple network yields the second- to $n$-th-layer responses $b_u^2, \ldots, b_u^n$.
Unlike RippleNet, we feed the response of each layer into the multihead attention module instead of simply adding them. The correlation $\alpha$ between any two layer responses $b_u^k$ and $b_u^{k+1}$ is defined under attention head $i$ through a similarity function $\psi^{(i)}(\cdot, \cdot)$, which can be a neural network or an inner product. In the model, we use the inner product because it is simple and efficient. $W_{Key} \in \mathbb{R}^{d' \times d}$ is the transformation matrix that maps the original space $\mathbb{R}^d$ to the new feature space $\mathbb{R}^{d'}$. Next, we update the feature of $b_u^{k+1}$ in the subspace of attention head $i$ by combining the relevant features weighted by the coefficient $\alpha$, where $\tilde{b}_u^{k+1(i)}$ is the feature updated in subspace $i$ and $W_{Value} \in \mathbb{R}^{d' \times d}$. Because a feature can take part in several different combinations, multiple heads are used to create different subspaces for different feature interactions. The combined features of all subspaces are concatenated, where $\oplus$ is the concatenation operator that joins multiple vectors into one vector and $i$ indexes the heads of the multihead attention mechanism. Finally, we retain the original features through a residual connection, where $W_{Res} \in \mathbb{R}^{d' i \times d}$ is the projection matrix used in the case of dimensional mismatch [23], $w^T$ is the weight matrix, $b_{bias}$ is the bias term, and $\sigma(x) = 1/(1 + e^{-x})$ converts the value to the user's click-through rate. We can stack multiple such interaction layers for feature updating, with the output of the previous layer serving as the input of the current layer, so that multilayer features can be modeled. The multihead attention block diagram is shown in Figure 4. For an explanation, see [24].
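The hop-by-hop construction of ripple sets and the per-hop response can be sketched as below. This follows the RippleNet-style recipe that the text builds on; it assumes the association probability is a softmax over $v^{\top} R_i h_i$ and that the response is the probability-weighted sum of tail embeddings. The function names and the dictionary-based embedding tables are illustrative.

```python
import math


def ripple_sets(seed_items, triples, n_hops):
    """Collect, for each hop, the triples whose head lies in the previous frontier."""
    hops, frontier = [], set(seed_items)
    for _ in range(n_hops):
        hop = [(h, r, t) for (h, r, t) in triples if h in frontier]
        hops.append(hop)
        frontier = {t for (_, _, t) in hop}  # tails become the next hop's heads
    return hops


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]


def hop_response(v, hop, ent_emb, rel_emb):
    """Weight each tail embedding by p_i = softmax(v^T R_i h_i) and sum (assumed form)."""
    d = len(v)
    if not hop:
        return [0.0] * d
    scores = []
    for h, r, _ in hop:
        r_h = [sum(rij * hj for rij, hj in zip(row, ent_emb[h])) for row in rel_emb[r]]
        scores.append(sum(vi * x for vi, x in zip(v, r_h)))
    probs = softmax(scores)
    return [sum(p * ent_emb[t][k] for p, (_, _, t) in zip(probs, hop)) for k in range(d)]


triples = [("a", "r", "b"), ("b", "r", "c"), ("x", "r", "y")]
ent_emb = {"a": [1.0, 0.0], "b": [0.5, 0.5], "c": [0.0, 1.0]}
rel_emb = {"r": [[1.0, 0.0], [0.0, 1.0]]}
hops = ripple_sets({"a"}, triples, 2)
b1 = hop_response([1.0, 1.0], hops[0], ent_emb, rel_emb)
```

In the model described here, the per-hop responses would then be passed to the multihead attention module rather than summed.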
Figure 4. Block diagram of the multihead attention mechanism, where MatMul is the matrix product, Scale is the scaling step, Softmax is the activation function, and h is the number of heads.
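The MatMul, Scale, and Softmax steps of the block diagram can be sketched for a list of per-hop responses as follows. This is a generic multihead self-attention with a residual projection, written under assumptions: a separate query matrix (`w_query`) and scaling by $\sqrt{d'}$ follow the standard scaled dot-product formulation and are not spelled out in the text, and ReLU is assumed as the activation after the residual connection.

```python
import math


def matvec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]


def attention_head(responses, w_query, w_key, w_value):
    """One head: project, score by scaled inner product, softmax, weighted sum."""
    queries = [matvec(w_query, b) for b in responses]
    keys = [matvec(w_key, b) for b in responses]
    values = [matvec(w_value, b) for b in responses]
    scale = math.sqrt(len(keys[0]))  # assumed scaling by sqrt(d')
    out = []
    for q in queries:
        alpha = softmax([sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys])
        out.append([sum(a * val[j] for a, val in zip(alpha, values))
                    for j in range(len(values[0]))])
    return out


def multihead(responses, heads, w_res):
    """Concatenate all heads, add the projected input (residual), apply ReLU."""
    per_head = [attention_head(responses, *h) for h in heads]
    result = []
    for m, b in enumerate(responses):
        concat = [x for head_out in per_head for x in head_out[m]]
        residual = matvec(w_res, b)
        result.append([max(0.0, c + r) for c, r in zip(concat, residual)])
    return result


identity = [[1.0, 0.0], [0.0, 1.0]]
updated = multihead([[1.0, 0.0], [1.0, 0.0]], [(identity, identity, identity)], identity)
```

With more than one head, `w_res` needs one row per concatenated output dimension so that the residual matches the concatenation.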

Cross Feature Fusion Network (CFFN)
At present, DNNs [11], CNNs [25], and other networks are used to learn feature interactions automatically. Most of these models are built on the framework of factorization machines [14,26,27] and use multilayer fully connected neural networks for automatic feature learning. For high-order interactions, however, this implicit interaction process is unexplainable and unknown. Feature interaction takes place at the element level, and the number of parameters is clearly larger than at the feature (vector) level, which does not conform to the original idea of the factorization machine. The cross feature fusion module performs explicit feature interaction and has good interpretability. The features interact at the vector level, and the number of parameters is relatively small [28]. During propagation, the network retains the key information of each propagation layer and finally aggregates it, which prevents the loss of key feature information. The cross feature fusion module is shown in Figure 5. In its operations, $W_{m \times d}$ is a mapping matrix: the product of the vector $b_u^1$ and the matrix $W_{m \times d}$ maps $b_u^1$ to an $m \times d$-dimensional space. $X^1_{j,*}$ is the $j$-th column vector of the embedding matrix $X^1$, $X^n_{j,*}$ is the embedding vector of the ripple network ($1 \le n < k$) corresponding to the CFFN propagation layer, $X^k \in \mathbb{R}^{H_k \times d}$ is the output matrix of the $k$-th layer in the cross feature fusion network, and $H_k$ is the number of feature vectors in that layer. $W^{k,h}$ is the parameter matrix of the $h$-th feature vector, and $\circ$ denotes the Hadamard product, for example, $(a_1, a_2) \circ (b_1, b_2) = (a_1 b_1, a_2 b_2)$.
The number of cross feature fusion layers determines the degree of high-order feature interaction. The pooling operation connects the output layer and the hidden layers, so the output of each layer depends on the previous hidden layer and an additional input. Therefore, the output unit can obtain feature interaction patterns of different orders. The above formula is closely related to the CNN, as shown in Figure 5. An intermediate tensor $Z^{k+1}$ is introduced, which is the outer product of the hidden layer $X^{k-1}$ and the $n$-th ripple propagation embedding matrix $X^n$. This tensor is similar to an image in a CNN: the embedding dimension $d$ can be seen as the number of channels, and $W^{k,h}$ is a filter that extracts features along the $d$ layers to obtain the hidden vector $X^{k+1}_{i,*}$. Figure 1 shows the propagation process of the cross feature fusion network in the model. We set the maximum number of layers to $T$, so $k \in [1, T]$, and apply pooling to the feature maps of every layer, where $i \in [1, H_k]$. This gives a pooling vector of length $H_k$ for the $k$-th layer. The pooling vectors of all layers are concatenated to connect the hidden layers to the output layer, and the final prediction is computed from this concatenation, where $w^o$ is the regression parameter.
2023, 11, 724
Figure 5. Calculating the outer product of $X^{k-1}$ and $X^n$ to obtain the intermediate tensor $Z^{k+1}$; the red mark indicates that the intermediate tensor is compressed into an embedding vector.
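A single cross feature fusion layer, as described in this section, can be sketched as a weighted sum of Hadamard products between the rows of the previous hidden layer and the rows of a base embedding matrix, followed by pooling over the embedding dimension. The sketch assumes sum pooling and one weight matrix per output feature map; the function and argument names are illustrative.

```python
def cffn_layer(x_prev, x_base, weights):
    """One explicit, vector-level interaction layer.

    x_prev:  H_prev x d hidden matrix of the previous layer
    x_base:  m x d embedding matrix (for example a ripple-hop embedding)
    weights: one H_prev x m matrix per output feature map
    """
    d = len(x_prev[0])
    out = []
    for w_h in weights:
        row = [0.0] * d
        for i, xi in enumerate(x_prev):
            for j, xj in enumerate(x_base):
                for c in range(d):
                    # Hadamard product of the two rows, weighted by w_h[i][j]
                    row[c] += w_h[i][j] * xi[c] * xj[c]
        out.append(row)
    return out


def sum_pool(x):
    """Pool each feature map over the embedding dimension (sum pooling assumed)."""
    return [sum(row) for row in x]


x_next = cffn_layer([[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]], [[[1.0, 1.0]]])
pooled = sum_pool(x_next)
```

Stacking such layers and concatenating the pooled vectors of all layers gives the input to the final prediction step described above.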

Cross Feature Fusion Network Analysis
In this part, we discuss the interaction properties of the cross feature fusion network. We first set the number of feature maps of each hidden layer equal to the number of fields $m$. The first layer contains $h$ feature maps, and in its formula $[m]$ denotes the set of positive integers up to $m$. It can be seen that each feature map of the first layer models pairwise interactions with $O(m^2)$ coefficients. The second layer is calculated in the same way. Our purpose is to show that the cross feature fusion network needs only $O(km^3)$ parameters, whereas a classical $k$-order polynomial needs $O(m^k)$. The feature map of the $k$-th layer can be summarized in the same form. Because the final feature map is expressed in terms of the 0th layer, we write $x_i$ instead of $x^0_i$ for convenience. Superscripts are used to denote operations. For a polynomial of degree $k$ over multiple vectors, the number of parameters of the classical vector polynomial is $O(m^k)$. In the corresponding formula for the cross feature fusion network, the index is a multi-index and $P_{|\beta|}$ is the permutation set of all indices.
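The parameter count can be checked with a short calculation. It assumes that every feature map of layer $l$ owns one $H_{l-1} \times m$ weight matrix and that $H_0 = H_1 = \cdots = H_k = m$, as set at the start of this analysis; the numeric values at the end are only an illustration.

```latex
\[
\#\text{params}_{\text{CFFN}}
  = \sum_{l=1}^{k} H_l \cdot H_{l-1} \cdot m
  = \sum_{l=1}^{k} m \cdot m \cdot m
  = k\,m^{3},
\qquad
\#\text{params}_{\text{poly}} = O\!\left(m^{k}\right).
\]
\[
\text{For } m = 10,\ k = 4:\quad k\,m^{3} = 4 \cdot 10^{3} = 4000
\quad\text{versus}\quad m^{k} = 10^{4} = 10000 .
\]
```

The gap widens quickly with the order $k$, since the first count grows linearly in $k$ and the second exponentially.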

Learning Algorithm
We combine the modules to form the complete loss function, in which $V$ represents the item embedding matrix, $E$ represents the entity embedding matrix, $I_r$ is the slice of the tensor $I$ of the KG for relationship $r$, and $R$ is the embedding matrix of relationship $r$. In the first term, $J(\cdot, \cdot)$ is the cross-entropy loss between the true value $y_{uv}$ and the model prediction $\hat{y}_{uv}$. The second term is the squared error between the true values in the KG and the reconstructed matrix. The last term is a regularization term that prevents overfitting. We optimize with stochastic gradient descent (SGD). To improve computational efficiency, we adopt the negative sampling strategy of ref. [29]. The parameter analysis of the model is given in the experimental part.
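The three terms of the objective (cross entropy, KG reconstruction error, and L2 regularization) can be combined as in the sketch below. The weighting by `mu` and `lam` mirrors the hyperparameters named in the experimental setup, but the exact form of each term and all function names are assumptions made for illustration.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross entropy summed over user-item pairs."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def squared_error(target, recon):
    """Squared error between a KG slice and its reconstruction."""
    return sum((a - b) ** 2
               for row_t, row_r in zip(target, recon)
               for a, b in zip(row_t, row_r))


def l2_penalty(param_lists):
    return sum(x * x for params in param_lists for x in params)


def total_loss(y_true, y_pred, kg_target, kg_recon, params, mu, lam):
    """Recommendation loss + mu * KG loss + lam * L2 regularization (assumed weighting)."""
    return (cross_entropy(y_true, y_pred)
            + mu * squared_error(kg_target, kg_recon)
            + lam * l2_penalty(params))


loss = total_loss([1], [0.5], [[1.0, 0.0]], [[0.0, 0.0]], [[1.0, 2.0]], mu=0.5, lam=0.1)
```

In training, this scalar would be minimized with SGD over mini-batches that mix observed interactions with sampled negatives.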

Links to Existing Work
In this section, we describe the differences between related work and our method. Deep learning is increasingly widely used in recommendation systems, and fitting the interaction behavior of users and items with a neural network model performs well in many recommendation scenarios [11]. For example, neural collaborative filtering [30] is used for high-order interaction between users and items. Our method differs from these methods in how it performs high-order feature interaction. Unlike traditional neural networks, the cross feature fusion network models feature interactions explicitly, which gives good interpretability, and because the interactions occur at the vector level, the number of parameters is effectively reduced.
Interpretability in a recommendation system refers to explaining why the user likes a certain item. By analyzing the reason, a more accurate personalized recommendation can be made for the user. At present, interpretability is generally based on community tags [31], social networks [32], and emotional semantics [33,34]. In the recommendation module, RippleNet uses a KG to find user interests and is highly interpretable because it tracks the user history and the paths to related items [6]. Our feature interaction layer uses a feature matrix method similar to MKR [35] and RuleRec [36] so that KG information and recommendation information complement each other. The ISL enables our model to share implicit information. Our method can be regarded as a multitask framework that uses a KG to assist recommendation and can automatically transfer information and learn feature interactions.

Experiment
Last.FM is a dataset of an online music website that contains the listening information of 2000 users. We preprocess the above datasets before the experiment. For the MovieLens-1M and MovieLens-20M datasets, we treat a rating greater than or equal to 4 as positive feedback. Because Book-Crossing and Last.FM are sparse, all of their rating and listening records are regarded as positive feedback. Microsoft Satori is used as the KG construction engine to extract knowledge triples. Detailed information about the datasets is shown in Table 1.
This model is an end-to-end recommendation method incorporating knowledge graph embeddings, which enhances the representation of latent information.

• CAKR [40]: It designs a new method to optimize the feature interaction between items and their corresponding entities in the knowledge graph and proposes a feature intersection unit combined with the attention mechanism to enhance the recommendation effect.

Experimental Setup
We set the hyperparameters for the four datasets as shown in Table 2; the hyperparameters of the other baselines default to their original settings. In the table, $d$ is the embedding dimension of the model, $H$ is the number of RippleNet layers, $\eta$ is the learning rate of the recommendation task, $\mu$ is the weight of the KG term, and $\lambda$ is the L2 regularization weight. For evaluation, we use AUC and ACC. The AUC is an important indicator of classifier quality. Higher AUC and ACC values are better. Each dataset is divided at a ratio of 6:2:2.
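The 6:2:2 split and the two evaluation metrics can be sketched with the standard library alone. The snippet assumes the three parts are training, validation, and test sets, uses a 0.5 decision threshold for ACC, and computes AUC by direct pairwise comparison, which is simple but quadratic in the number of samples. All names are illustrative.

```python
import random


def split_622(rows, seed=0):
    """Shuffle and split into 60% / 20% / 20% (assumed train / validation / test)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * 6 // 10
    n_val = len(rows) * 2 // 10
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


def accuracy(labels, scores, threshold=0.5):
    """ACC: share of samples whose thresholded score matches the label."""
    hits = sum(1 for y, s in zip(labels, scores) if int(s >= threshold) == y)
    return hits / len(labels)


def auc(labels, scores):
    """AUC: probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


train, val, test = split_622(range(10))
example_auc = auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
example_acc = accuracy([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

For large test sets a rank-based AUC would be preferable to the pairwise loop used here.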

Experimental Results
We performed click-through rate (CTR) prediction with the different models on the four datasets. For top-K recommendation, we use Recall@K as the evaluation indicator. The experimental results are shown in Table 3 and Figure 6. The experimental analysis is as follows:

• Compared with the other baselines, DKN has the worst performance. Whether on movie data with large volumes or on the small and sparse music and book datasets, its performance is unsatisfactory. This is because DKN requires the entity embeddings to be prepared before input, so they cannot participate in the learning process.

• PER performs better than DKN on each dataset, but its performance is still not good enough compared with the other baselines. The reason is that the meta-paths defined by PER can hardly be optimal and cannot integrate heterogeneous relationship information.

• LibFM and Wide&Deep perform relatively well on the four datasets, which shows that these models make good use of the KG information and can extract relevant information well to improve recommendation performance.

• RippleNet performs very well, which shows that it is effective at capturing user preferences. However, the comparison across the four datasets shows that its performance drops on sparse data, so it depends strongly on data density.

• Comparing MKR and KGCN, KGCN is more sensitive to data sparsity and performs worse on the sparse book dataset, but it performs better than MKR on datasets with large data volumes, indicating that KGCN is more suitable for recommendation scenarios with large amounts of data.

• CAKR benefits from its improved interaction module, and its attention-based feature intersection unit has certain advantages in complementing information. Therefore, the model performs better on sparse datasets.

• The experimental results show that the AKNEI model proposed in this paper performs better than the above baselines on the four datasets. Its AUC also increases on the MovieLens-20M dataset, which indicates that our model makes better use of user behavior data.

In this part, we analyze the contribution of each module on the MovieLens-1M and Last.FM datasets. The results are shown in Figure 7. For the Last.FM dataset, the cross feature fusion network layer has the greatest impact on the model, indicating that the cross feature fusion network can alleviate data sparsity to a certain extent. For MovieLens-1M, which has more data, the ISL performs better, indicating that the interaction between items and entities yields more valuable information. The comparison of the two datasets shows that the RippleNet and ISL modules are more sensitive to data sparsity but perform better when data are relatively abundant. The AKNEI model integrates the advantages of each module, which makes it generalize better and achieve good performance in different recommendation scenarios.

Impact of RippleNet Layers and Attention Module
In this part, we analyze the influence of the number of RippleNet layers and of the attention module on model performance. As shown in Table 4, we vary the number of RippleNet layers $H$ and observe the change in performance. The model performs best when $H$ is 3. This is because the user's interest information weakens as it propagates through the graph. At longer distances, noise outweighs the effective information; when the distance is too short, it is difficult to uncover relevant information and dependencies between entities. Table 5 shows the influence of the attention module. Adding the attention module clearly improves the performance of the model, because it assigns different weights to the information of each layer and thereby distinguishes users' preferences for different layers.

Parametric Analysis
In this part, we discuss the impact of embedding dimensions and KGE weights on model performance without changing other parameters.

Impact of Embedding Dimension
We experimentally analyzed the relationship between the embedding dimension and the expressive power of the model. The change in model performance with the embedding dimension is shown in Figure 8: performance increases with the embedding dimension and decreases after reaching a critical point. A larger embedding dimension increases the expressiveness of the model, but it also lets the model fit more noise, so an overly high embedding dimension causes overfitting.

Impact of KGE Weight
As shown in Figure 9, the performance of the model first rises and then falls as the KGE weight increases. This is because too small a weight makes the regularization constraint of the KGE term too weak and causes the model to overfit, while too large a weight causes the objective function to be misled by the KGE term and deviate from the true value.

Sensitivity Analysis of the Model to Data Sparseness
Next, we analyze the performance of the model experimentally in environments of different sparsity. The purpose is to assess the model's ability to deal with data sparsity and alleviate the cold-start problem. We conduct the comparison on the M dataset. For the results in Table 6, the training set is reduced to 30%, 60%, and 90% of its size before being fed to the models. When the amount of training data decreases, the performance of all models decreases accordingly. As the training data decrease, the MKR model performs comparatively well: compared with the full training set, its AUC decreases by 4.5%, 2.2%, and 0.7%, respectively, and its ACC by 5.5%, 3.5%, and 0.1%. For the AKNEI model, the corresponding AUC reductions are 3.8%, 2.5%, and 0.4%, and the ACC reductions are 4.3%, 3.1%, and 0.2%. AKNEI still shows good performance even when the data are sparse.
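The reduced training sets used in this sensitivity analysis can be produced by random subsampling, as in the sketch below. Uniform sampling without replacement and a fixed seed are assumptions; the text does not state how the subsets were drawn.

```python
import random


def subsample(train_rows, ratio, seed=0):
    """Keep a random fraction of the training rows (uniform, without replacement)."""
    rows = list(train_rows)
    keep = int(round(ratio * len(rows)))
    return random.Random(seed).sample(rows, keep)


full_train = list(range(100))
reduced = {ratio: subsample(full_train, ratio) for ratio in (0.3, 0.6, 0.9)}
```

Each model would then be retrained on every reduced set and evaluated on the same held-out test set, so that the drop in AUC and ACC reflects only the loss of training data.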

Figure 1. AKNEI framework diagram, composed of a KG layer, a cross feature fusion network layer, and a multilayer ripple network with attention. Through the ISL, an information-sharing channel is established between the KG and the recommendation task, and information sharing is realized at the bottom layer.

Figure 3. Ripple propagation diagram. Dots are entities, colors represent different categories, and arrows are the relationships between entities. As the number of hops between entities increases, the correlation decreases.

Table 1. Details of four public datasets.

It is a feature model that combines multiple tasks, in which the KG is used as an auxiliary for recommendation tasks.

• KGCN [38]: It uses the relationship attributes of the KG to mine the associations of items to learn the potential interests of users.

• Ripp-MKR [39]:

Table 3. The prediction results on the public datasets.

Table 4. Influence of the number of Ripple layers.

Table 5. The influence of the attention module.