Article

KnoChain: Knowledge-Aware Recommendation for Alleviating Cold Start in Sustainable Procurement

1 Ningbo Development Planning Research Institute, Ningbo 315040, China
2 School of Mathematics and Statistics, Zhengzhou University, Zhengzhou 450001, China
* Author to whom correspondence should be addressed.
Sustainability 2026, 18(1), 506; https://doi.org/10.3390/su18010506
Submission received: 24 November 2025 / Revised: 25 December 2025 / Accepted: 26 December 2025 / Published: 4 January 2026

Abstract

When new purchasers or products are added to a supply chain management system, the recommendation system faces severe challenges of data sparsity and cold start. A knowledge graph that can enrich the representations of both procurement managers and products offers a promising pathway to mitigating these challenges. This paper proposes a knowledge-aware recommendation network for supply chain management, called KnoChain. The proposed model refines purchaser representations through outward propagation along knowledge graph links and enhances product representations via inward aggregation of multi-hop neighbourhood information. This dual approach enables the simultaneous discovery of purchasers’ latent preferences and products’ underlying characteristics, facilitating precise and personalised recommendations. Extensive experiments on three real-world datasets demonstrate that the proposed method consistently outperforms several state-of-the-art baselines, achieving average AUC improvements of 9.36%, 5.91%, and 8.81%, and average accuracy gains of 8.56%, 6.27%, and 8.67% on the movie, book, and music datasets, respectively. These results underscore the model’s potential to enhance recommendation robustness in supply chain management. KnoChain combines purchaser-aware attention with knowledge graphs to improve the accuracy of purchaser–SKU matching. The method can help enhance supply chain resilience and reduce returns caused by over-ordering, inventory backlog, and incorrect procurement. In addition, the model provides interpretable recommendation paths based on the knowledge graph, which improves trust and auditability for procurement personnel and helps balance environmental and operational costs.

1. Introduction

Recommender systems (RS) are widely recognised as essential tools for optimising supply chain operations, enabling dynamic matching between purchasers and procurement demands to improve overall responsiveness [1]. Under disruptive scenarios such as logistics delays or material shortages, alternative solutions can be rapidly identified, thereby mitigating operational risks [2]. However, conventional recommendation approaches heavily rely on historical interaction data. When new products or purchasers with limited records are encountered, the effectiveness of recommendations is often compromised—a phenomenon known as the cold-start problem [3].
Some works explore the integration of side information into RS to solve the cold-start problem. Side information can be easily obtained in real-world scenarios, including text [4], user/item attributes [5], images [6], and context data [7]. Among various forms of side information, knowledge graphs (KGs) are network structures composed of nodes and edges that contain rich relational and semantic data. KGs have been successfully applied in various domains, including search engines [8], question answering [9], and word embeddings [10].
The application of KGs in recommendation systems has been extensively studied in recent years. KG-based recommendation systems primarily utilise semantic and relational information to capture fine-grained relationships between users and items, enabling personalised recommendations. MKR [11] is a multi-task feature learning approach that integrates KG embedding (KGE) into recommendation systems. RippleNet [12] employs preference propagation along KG links to uncover users’ hierarchical interests and enrich their embeddings. KGCN [13] combines neighbourhood and biased information to represent entities within the KG. KGAT [14] addresses user preference recommendations by focusing on similar users and items. The method expands paths between users and items by considering path propagation across various relationships within the KG. DKEN [15] models the semantics of the entity and relationship in the KG using deep learning and KGE techniques. KGIN [16] employs multiple potential intents to model user–item relationships and uses relationship path-aware aggregation to highlight the dependence of relationships in long-term interactions.
KG-based methods are widely used to support supply chain management [17,18,19,20]. The structured relationship mapping between entities in KGs can also alleviate the cold-start problem in supply chain recommendation. Buyer attributes, production dynamics, and logistics status information can be integrated into a semantically rich knowledge graph, whose structural and semantic information enriches the feature representations of buyers and products with insufficient historical interaction data, enabling personalised and accurate recommendations. This approach achieves dynamic and efficient matching between suppliers and purchasing demands, improving the responsiveness and flexibility of the entire supply chain. In disruptive scenarios such as logistics delays, material shortages, or geopolitical events, effective recommendations enable rapid identification of alternative suppliers or materials, thereby reducing operational risks.
In this paper, we propose a recommendation system, called KnoChain, that simultaneously extracts purchaser and product features utilising semantic and relational information from the KG. Inspired by RippleNet [12], KnoChain expands purchaser features by applying an outward propagation technique, utilising higher-order structural information from the KG. Product features are enhanced by aggregating adjacent entity information inward from the KG using common graph convolution methods. KnoChain aims to simultaneously extract purchaser and product features by mining multi-hop relationships and higher-order semantic information between entities. The model is applied to click-through rate (CTR) prediction experiments on movie, book, and music datasets. The experimental results demonstrate that the proposed model achieves average AUC gains of 9.36%, 5.91%, and 8.81%, as well as average ACC gains of 8.56%, 6.27%, and 8.67% across the movie, book, and music datasets, compared to several advanced approaches. Experimental results demonstrate the effectiveness of the dual-path framework, where purchaser behaviours are propagated to capture implicit preferences, while product characteristics are aggregated to improve representation learning.
The main contributions of this paper are summarised as follows:
  • We propose a novel bidirectional feature expansion framework that learns purchaser and product features in supply chain recommendation. Utilising the semantic and connective information of the KG, the sparsity and cold-start problems can be alleviated.
  • We simultaneously design an outward propagation mechanism to expand purchaser representations and an inward aggregation mechanism to enrich product representations, enabling a more holistic capture of supply chain entity relationships.
  • We validate the superiority of the proposed method over several state-of-the-art baselines through comprehensive experiments on three benchmark datasets, with detailed analysis of hyper-parameter sensitivity and component effectiveness.
To provide a clearer roadmap for this study, we explicitly state the core research questions (RQs) that guide our work. These RQs articulate how KG information can be leveraged to jointly enrich purchaser preferences and product features, how the proposed bidirectional feature expansion framework operates, and how its design choices affect recommendation performance—especially under data sparsity and cold-start conditions. Each RQ is addressed through the model design (Section 3), empirical evaluation (Section 4), discussion (Section 5), and analysis of practical implications (Section 6).
Explicit Research Questions:
  • RQ1: How can purchaser preferences and product features be co-enriched from a KG to alleviate data sparsity and cold-start issues in supply chain recommendation? (Answered in Section 3—preference propagation and information aggregation modules; Section 4—cold-start/sparse-case experiments.)
  • RQ2: Can a bidirectional feature expansion framework that uses outward propagation for purchasers and inward aggregation for products effectively learn joint representations? What are its core mechanisms? (Answered in Section 3.2—framework design and algorithm.)
  • RQ3: How does KnoChain perform relative to state-of-the-art baselines across datasets and tasks (e.g., movie/book/music CTR prediction), and how large are the improvements in sparse or cold-start scenarios? (Answered in Section 4.4—comparative experiments and results.)
  • RQ4: Which model components and hyperparameters (e.g., hop number H, receptive-field depth N, interest-set size M, neighbour sampling size K, aggregator choice) most strongly influence performance and robustness? (Answered in Section 4.5—parameter analysis and ablation studies.)
  • RQ5: To what extent can results obtained on public recommendation datasets transfer to real supply chain deployments, and what are the next steps for constructing supply chain-specific interaction data and KGs? (Addressed in Section 4.1—concept mapping; Section 6—limitations and future work.)
We will answer these RQs by describing KnoChain’s bidirectional propagation–aggregation architecture (Section 3), validating its effectiveness through systematic experiments and ablations (Section 4), discussing key points (Section 5), and future steps for supply chain-specific deployment (Section 6).

2. Related Work

2.1. Recommendation System in Supply Chain Management

A bio-inspired Termite Colony Optimisation (TCO) algorithm is proposed to enhance retail RS. Termite movement patterns are modelled to optimise recommendation decisions in continuous data streams, with empirical validation conducted using Big Bazar retail data to demonstrate improved optimisation in inventory management and personalised recommendations [21]. To balance transparency and decentralisation in food supply chain management, a hybrid blockchain model incorporating access control mechanisms is proposed; furthermore, an RS is integrated into this framework, significantly enhancing monitoring capability and inventory efficiency [22]. To address the critical challenge of counterfeit drugs in pharmaceutical supply chains, an Archimedes Optimisation with Enhanced Deep Learning-based Recommendation System (AOAEDL-RS) is developed, where a three-stage methodology incorporating word2vec embedding, context-based BiLSTM-CNN classification, and the Archimedes Optimisation Algorithm is employed to analyse drug reviews and generate reliable recommendations [23].
To enhance supply chain risk mitigation through explainable AI, an RS integrating counterfactual explanation constraints with transportation scheduling optimisation is proposed, where logistic regression-based counterfactual analysis is embedded as constraints to improve system resiliency against transportation delays while maintaining cost efficiency [24]. To overcome the cold-start problem in supply chain RS, a Hybrid Content-Based Association Clustering method is developed, where clustering and association rule techniques are integrated to address data sparsity issues and enhance recommendation accuracy for new users in e-commerce transactions [25]. To address the limitations of conventional friendship-based person-to-person recommendations in social networks, a novel framework leveraging supply chain interactions is proposed, where five hybrid methods combining artificial neural networks and fuzzy strategies are developed to enhance recommendation accuracy, with experimental validation conducted on the LinkedIn platform [26].
A blockchain-enhanced intelligent recommender system (BLC-IRS) framework is proposed to address supply chain disruptions. Smart contracts and system dynamics simulation are integrated to enable real-time resource identification and secure information exchange, providing an executable digital solution for reactive supply chain resilience management [27]. To enhance supply chain resilience through digital technologies, a data-driven intelligent RS framework is proposed. System dynamics modelling validates its effectiveness as a novel communication mechanism for disruption response, enabling supply chain participants to better mitigate and react to disruptions in the Industry 4.0 context [28]. A collaborative filtering-based RS is proposed to optimise supply chains. Similarity coefficients are utilised to identify optimal combinations of suppliers, manufacturers, and distributors that minimise costs and maximise customer satisfaction through order fulfilment process analysis [29]. To address pharmaceutical supply chain vulnerabilities and counterfeit drug challenges, an integrated blockchain and machine learning framework (DSCMR) is proposed [30]. Hyperledger Fabric enables transparent drug tracking, while LightGBM and N-gram models provide evidence-based drug recommendations through REST API integration, validated using UCI benchmark datasets.
A knowledge graph-based recommendation method for cold-chain logistics, called KGRCCL, addresses drawbacks caused by a lack of semantic features and insufficient interpretability [17]. SC-TKGR is based on a temporal KG and uses enhanced time-sensitive graph embedding methods to model temporal behavioural characteristics, incorporating external factors to capture market dynamics and using contrastive learning to efficiently handle sparse information [18]. A novel approach to supplier selection learns purchase and demand–procurement records as well as property-specific and global relatedness from a supply KG [19]. Disasters sometimes cause supply chain disruptions and affect the production and operations of upstream and downstream enterprises; a knowledge-based framework recommends alternative suppliers by utilising information about interactions between buyers and suppliers [20].

2.2. KG-Based Recommendation System

KG-based recommendation systems typically adopt three main strategies: path-based, embedding-based, and hybrid recommendation approaches.

2.2.1. Path-Based Recommendation

Path-based recommendation uses connections in the KG to generate recommendations. This method treats the KG as a network and constructs matching computations based on predefined rules between nodes. In its early development, it mainly integrated concepts from traditional RS to address recommendation tasks. Representative models include HeteroMF [31] and Hete-CF [32]. As research on path-based recommendation has advanced, it has been increasingly applied in real-world scenarios. One study constructed a KG integrating patent domain information and company requirements, employed graph edit distance to generate weighted recommendation graphs, and proposed three strategies, namely supplementary, additive, and hybrid recommendation, for patent matching [33]. Although path-based recommendation effectively utilises the network structure of the KG, it heavily depends on graph connectivity patterns and requires manual meta-path design, which limits its practical applicability.

2.2.2. Embedding-Based Recommendation

Embedding-based recommendation systems use knowledge graph embedding (KGE) to process the KG, mapping entities and relations into low-dimensional representation vectors [34]. These systems generally have two components: a graph embedding module and a recommendation module [35]. The graph embedding module learns features from the KG, while the recommendation module uses these features to generate personalised recommendations. Graph embedding models learn features in two ways: distance-based translation models and semantic-based matching models. Distance-based translation models project entities and relations into a continuous vector space, and the KG is learned by computing the plausibility of triples with a scoring function. Examples of distance-based translation models are TransE [36] and TransH [37]. Semantic-based matching models use similarity-based scoring functions to estimate the probability of triples; entities and relations are mapped into a latent semantic space for similarity measurement [38]. KG-based recommendation systems are widely favoured by researchers because they leverage the semantic relationships in KGs and are less affected by graph expansion. DKN is a content-based news recommendation model that incorporates a KG [39]. The model links headline terms with KG entities, retrieves their associated entities, and generates knowledge-aware news representations through a multi-channel, word–entity-aligned Knowledge-Aware Convolutional Network (KCNN). The MKR model is a multi-task feature learning approach that integrates KGE into RS [11]. The KGE module extracts relational features from KG triples and applies a scoring function with supervised learning to optimise the predicted results. Embedding-based recommendation effectively overcomes the limitations of path-based methods by eliminating dependence on manually designed meta-paths and capturing rich semantic relationships.
However, such models often fail to exploit multi-hop relationships in recommendations, resulting in limited interpretability of certain recommendation outcomes.

2.2.3. Hybrid-Based Recommendation

Some researchers have attempted to combine path-based and embedding-based approaches to form hybrid recommendation models. In these models, user preferences are first propagated throughout the KG, then learned through graph embeddings, and finally used by the recommendation module to generate recommendations. Representative hybrid-based models include RippleNet [12], KGCN [13], and KGAT [14]. RippleNet simulates the propagation of user interests within the KG. It uses items from the user’s interaction history as seed nodes and iteratively expands user interests along KG links, forming multi-layer ripple sets. The aggregation of these multi-layer ripple sets constitutes the user’s final feature representation. The propagation direction in KGCN is opposite to that in RippleNet. In KGCN, each entity representation is updated by aggregating information from its neighbouring entities. This aggregation process is repeated multiple times to obtain the final representation of entity features. KGAT integrates the user–item interaction matrix with the KG to form a collaborative KG. First, item representations are obtained through the embedding layer. Then, in the propagation layer, item representations are refined through recursive multi-hop propagation of neighbouring nodes using an attention mechanism. Finally, the prediction layer estimates the probability of a user interacting with an item to generate the final recommendation. Hybrid-based recommendation integrates the strengths of path-based and embedding-based approaches. It effectively utilises relational connections in the KG while simultaneously learning low-dimensional representations of entities and relations through embedding techniques. For KG construction and dataset preprocessing, we follow standard practices (e.g., entity linking and alignment as in KB4Rec) to ensure consistent entity coverage [40].

3. Method

This section will describe the problem formulation and then demonstrate the model’s overall framework, including preference propagation, information aggregation, recommendation module, and learning algorithm.

3.1. Problem Formulation

In supply chain management, the recommendation problem involves matching supply chain actors (procurement managers, plants) with relevant supply chain resources (suppliers, raw materials, logistics services). Let U = {u_1, u_2, …} denote the set of purchasers and V = {v_1, v_2, …} denote the set of products. The purchaser–product interaction matrix is defined as Y = {y_{uv} | u ∈ U, v ∈ V} according to the implicit feedback of purchasers: y_{uv} = 1 indicates an implicit interaction between purchaser u and product v, such as clicking, browsing, or purchasing; otherwise y_{uv} = 0. In addition to the interaction matrix Y, a KG G is available, composed of massive entity–relation–entity triples (h, r, t). Here, h ∈ ε, r ∈ γ, and t ∈ ε denote the head, relation, and tail of a knowledge triple, respectively, where ε and γ denote the sets of entities and relations in the KG. Given the purchaser–product interaction matrix Y and the KG G, the model aims to learn a prediction function ŷ_{uv} = F(u, v | θ, Y, G), which denotes the probability that purchaser u will interact with product v, where θ denotes the parameters of F. The key notations are summarised in Table 1.
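The formulation above can be made concrete with a small sketch. The following Python snippet (illustrative only; the entity names, relations, and the helper `interest_set` are hypothetical, not from the paper) represents the implicit-feedback matrix Y as a dictionary and the KG G as an adjacency index over (h, r, t) triples:

```python
from collections import defaultdict

# Implicit purchaser-product feedback: y_uv = 1 for an observed interaction
# (click, browse, purchase), y_uv = 0 otherwise.
Y = {("u1", "v1"): 1, ("u1", "v2"): 0, ("u2", "v2"): 1}

# KG triples (h, r, t), indexed by head entity for fast outward expansion.
triples = [
    ("v1", "made_by", "manufacturer_a"),
    ("v2", "made_by", "manufacturer_a"),
    ("v1", "category", "electronics"),
]
kg = defaultdict(list)
for h, r, t in triples:
    kg[h].append((r, t))

def interest_set(user):
    """Historical interest set I_u = {v | y_uv = 1}."""
    return {v for (u, v), y in Y.items() if u == user and y == 1}

print(interest_set("u1"))  # {'v1'}
```

This is exactly the input pair (Y, G) that the prediction function F(u, v | θ, Y, G) consumes.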

3.2. Framework of KnoChain

Figure 1 shows KnoChain’s architecture. It models complex, multi-hop relationships in supply chains. The model takes a purchaser u and a product v as input and outputs the predicted probability ŷ_{uv} that u will buy v.
For the input purchaser u, the historical interest set I_u expands along KG links to form the high-order interest sets γ_{u,h} (h = 1, 2, …, H), where γ_{u,h} lies h hops away from the historical interest set. These high-order interest sets interact iteratively with the product embedding to obtain the responses of purchaser u to product v, which are then combined to form the final H-order purchaser representation u_v. Purchasers’ potential interests can be viewed as layered extensions of their historical interests in the knowledge graph, activated by past interactions and progressively propagated along graph links.
For the input product v, the multi-hop parameter N is first set, and the set of neighbourhood entities of v within N hops is regarded as the receptive field N(v). The purchaser–relation score ρ_r^u is used as a weight to compute the corresponding neighbourhood representation vector b_v. Finally, the multi-hop neighbour representations and the entity representation are aggregated to obtain the final N-order product representation v_u.
In the end, the predicted probability ŷ_{uv} is calculated from the purchaser representation u_v and the product representation v_u.

3.2.1. Preference Propagation

In this section, we will show the process of purchaser feature expansion in detail. It employs outward propagation to refine the purchaser representation from the structural information of the KG. To depict the extended preferences of purchasers over KG, we recursively define the h-order-related entity set of purchaser u as follows:
\varepsilon_{u,h} = \{ t \mid (h, r, t) \in G \text{ and } h \in \varepsilon_{u,h-1} \}, \quad h = 1, 2, \ldots, H
where \varepsilon_{u,0} = I_u = \{ v \mid y_{uv} = 1 \} is the historical interest set of purchaser u.
Next, we define the h-order interest set of purchaser u:
\gamma_{u,h} = \{ (h, r, t) \mid (h, r, t) \in G \text{ and } h \in \varepsilon_{u,h-1} \}, \quad h = 1, 2, \ldots, H
From the above expressions, the purchaser’s potential interest is activated by historical interest and spreads outward layer by layer along the connections of the KG. The high-order interest set may become larger and larger, while the purchaser’s interest intensity gradually weakens as the order h increases. Given this problem, attention should be paid to the following:
  • In practical operation, the further a relevant entity is from the purchaser’s historical interest, the more noise it may introduce, so the maximum order H should not be too large. We discuss the choice of H in Section 4.
  • In KnoChain, to reduce computational overhead, it is unnecessary to use the complete high-order interest set; instead, a neighbour set of fixed size is sampled at each hop. We discuss the choice of the per-hop interest-set size M in Section 4.
Next, we compute the relevance probability. For each triple (h_i, r_i, t_i) in γ_{u,1}, the similarity between the product v and the head h_i is computed in the relational space R_i:
\tau_i = \mathrm{softmax}\left( v^{\top} R_i h_i \right)
where R_i \in \mathbb{R}^{d \times d} and h_i \in \mathbb{R}^{d} are the embeddings of relation r_i and head h_i, respectively.
After the relevance probabilities are calculated, the tails t_i in γ_{u,1} are weighted by the corresponding τ_i and summed to obtain the vector a_{u,1}:
a_{u,1} = \sum_{(h_i, r_i, t_i) \in \gamma_{u,1}} \tau_i \, t_i
where t_i \in \mathbb{R}^{d} is the embedding of the tail; a_{u,1} can be seen as the 1-order response of the purchaser’s click history I_u with respect to product v.
It should be pointed out that, to obtain the second-order response a_{u,2} of purchaser u, v in Equation (3) is replaced by a_{u,1} and the preference-propagation process is repeated. This iteration applies to the high-order interest sets γ_{u,h} of purchaser u for h = 1, 2, …, H.
When h = H, the responses of the different orders, a_{u,1}, a_{u,2}, …, a_{u,H}, are summed to form the embedding of purchaser u:
u_v = a_{u,1} + a_{u,2} + \cdots + a_{u,H}
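As a concrete illustration, one propagation hop can be sketched in a few lines of NumPy (a minimal sketch with random toy embeddings, not the paper’s implementation; the dimensions and variable names are assumptions): the score v^T R_i h_i is computed for every triple in the current interest set, normalised with a softmax, and used to weight the tail embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_triples = 4, 3

v = rng.normal(size=d)                     # product embedding
heads = rng.normal(size=(n_triples, d))    # head embeddings h_i
rels = rng.normal(size=(n_triples, d, d))  # relation matrices R_i
tails = rng.normal(size=(n_triples, d))    # tail embeddings t_i

def propagate(query, heads, rels, tails):
    # tau_i = softmax(query^T R_i h_i) over the triples in the interest set
    scores = np.einsum("d,ndk,nk->n", query, rels, heads)
    tau = np.exp(scores - scores.max())
    tau /= tau.sum()
    # response a_{u,h} = sum_i tau_i * t_i
    return tau @ tails

a_u1 = propagate(v, heads, rels, tails)     # 1-order response
a_u2 = propagate(a_u1, heads, rels, tails)  # replace v with a_{u,1} and repeat
u_v = a_u1 + a_u2                           # purchaser embedding for H = 2
```

Replacing the query vector with the previous response at each hop is what makes the propagation iterative, mirroring the substitution of v by a_{u,1} described above.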

3.2.2. Information Aggregation

This section introduces the product feature expansion process in detail. It uses inward aggregation to refine product representations from their multi-hop neighbours. First, the single-layer expansion of a product is introduced, where the set of entities directly connected to v is denoted N(v), and the relation between entities e_i and e_j is denoted r_{e_i, e_j}.
The score ρ_r^u captures the importance of relation r to purchaser u:
\rho_r^u = f(u, r)
To make better use of the neighbourhood information of product v, we linearly combine all neighbourhood entities of product v, where e is a neighbourhood entity:
b_v = \sum_{e \in N(v)} \tilde{\rho}_{r_{v,e}}^{u} \, e
where \tilde{\rho}_{r_{v,e}}^{u} is the normalised purchaser–relation score:
\tilde{\rho}_{r_{v,e}}^{u} = \frac{\exp\left( \rho_{r_{v,e}}^{u} \right)}{\sum_{e \in N(v)} \exp\left( \rho_{r_{v,e}}^{u} \right)}
Because neighbourhoods are aggregated with bias based on these purchaser-specific scores, the purchaser–relation scores act as an attention mechanism when computing the neighbourhood representation of an entity.
In practical recommendation, we only need to select a neighbour set of a fixed size for each entity. We discuss the size parameter K in Section 4.
After the neighbourhood representation b_v is calculated, we aggregate it with the entity representation v to form the 1-order product representation. Three types of aggregators are considered:
Sum Aggregator
Firstly, the two representation vectors are added together, and then a nonlinear transformation is performed:
\mathrm{agg}_{\mathrm{sum}} = \sigma\left( W \cdot (v + b_v) + b \right)
where W and b are the transformation weight and bias, respectively, and \sigma is the ReLU nonlinearity.
Concat Aggregator [41]
Firstly, the two representation vectors are concatenated, and then a nonlinear transformation is carried out:
\mathrm{agg}_{\mathrm{concat}} = \sigma\left( W \cdot \mathrm{concat}(v, b_v) + b \right)
Neighbour Aggregator [42]
This outputs the neighbourhood representation of entity v directly:
\mathrm{agg}_{\mathrm{neighbour}} = \sigma\left( W \cdot b_v + b \right)
Since the feature representation of a product is generated by aggregating its neighbours, aggregation is a crucial part of the model, and we evaluate the three aggregators in the experiments.
In the previous module, the 1-order entity representation is derived by aggregating the entity itself and its neighbours. To mine product features from multiple dimensions, we extend the above process from one layer to multiple layers. The process works as follows: first, the entity obtains its 1-order feature representation by aggregating with its neighbour entities; then, the 1-order representation is propagated and aggregated to obtain the 2-order representation. These steps are repeated N times to obtain the N-order representation v_u of entity v. As a result, the N-order representation of entity v aggregates its own features with those of its N-hop neighbours.
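A compact NumPy sketch of one inward-aggregation step places the three aggregators side by side (toy dimensions and random weights; the weight shapes and the ReLU choice follow the equations above, but all concrete values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
v = rng.normal(size=d)                  # entity (product) representation
neighbours = rng.normal(size=(3, d))    # K = 3 sampled neighbour embeddings
rho = np.array([0.5, 0.3, 0.2])         # normalised purchaser-relation scores
b_v = rho @ neighbours                  # weighted neighbourhood vector

W_concat = rng.normal(size=(d, 2 * d))  # concat aggregator needs a d x 2d weight
W = W_concat[:, :d]                     # sum/neighbour aggregators use d x d
bias = np.zeros(d)
relu = lambda x: np.maximum(x, 0.0)

agg_sum = relu(W @ (v + b_v) + bias)                           # sigma(W(v + b_v) + b)
agg_concat = relu(W_concat @ np.concatenate([v, b_v]) + bias)  # sigma(W concat(v, b_v) + b)
agg_neighbour = relu(W @ b_v + bias)                           # sigma(W b_v + b)
```

Stacking this step N times, each time feeding the previous layer’s output back in as the entity representation, yields the N-order product representation described above.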

3.2.3. Recommendation Module

Through the purchaser feature expansion module and product feature expansion module described above, we obtain the expanded purchaser and product feature representations. The two terms are input into the prediction function to calculate the probability that purchaser u clicks on product v, where \sigma(x) is the sigmoid function:
\hat{y}_{u,v} = \sigma\left( u_v^{\top} v_u \right)

3.2.4. Learning Algorithm

Equation (13) illustrates the loss function:
\mathcal{L} = \mathcal{L}_{RS} + \mathcal{L}_{KG} + \mathcal{L}_{REG} = \sum_{u \in U, v \in V} \mathcal{F}\left( y_{uv}, \hat{y}_{uv} \right) + \lambda_1 \sum_{r \in R} \left\| I_r - E^{\top} R E \right\|_2^2 + \lambda_2 \left( \| V \|_2^2 + \| E \|_2^2 + \sum_{r \in R} \| R \|_2^2 \right)
In the above equation, the first term is the cross-entropy loss between the ground truth of the implicit interaction matrix Y and the predictions of KnoChain. The second term is the squared loss between the ground truth of I_r and the reconstructed indicator matrix E^{\top} R E in the KG. The last term is the regularisation term, with \lambda_1 and \lambda_2 as balancing parameters. Here, V and E are the embedding matrices of items and entities, respectively. The learning algorithm of KnoChain is presented in Algorithm 1.
Algorithm 1: KnoChain algorithm
Input: interaction matrix Y; knowledge graph G
Output: prediction function F(u, v | θ)
1: Initialise all parameters;
2: Compute the interest sets γ_{u,h} for each purchaser u; compute the receptive field Z for each product v;
3: for each training iteration do
        for h = 1, 2, …, H do
            for (h_i, r_i, t_i) ∈ γ_{u,h} do
                τ_i = softmax(v^T R_i h_i);
            a_{u,h} = Σ_{(h_i, r_i, t_i) ∈ γ_{u,h}} τ_i t_i;
        u_v = a_{u,1} + a_{u,2} + … + a_{u,H};
        for n = 1, 2, …, N do
            for e ∈ Z(n) do
                ρ_r^u = f(u, r);
                ρ̃^u_{r_{v,e}} = exp(ρ^u_{r_{v,e}}) / Σ_{e ∈ N(v)} exp(ρ^u_{r_{v,e}});
            b^[n−1] = Σ_{e ∈ N(v)} ρ̃^u_{r_{v,e}} e^[n−1];
            v^[n] = agg(b^[n−1], v^[n−1]);
        v_u = agg(b^[N−1], v^[N−1]);
4: Compute the predicted probability ŷ_{uv} = f(u_v, v_u);
   Update parameters by gradient descent;
To make the computation more efficient in each training iteration, we employ the stochastic gradient descent (SGD) algorithm when iteratively optimising the loss function. The gradients of the loss L with respect to the model parameters θ are calculated, and all parameters are updated by back-propagation on a small batch of samples. The selection of hyper-parameters is discussed in the experimental section that follows.
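The first (cross-entropy) term of the objective and its SGD update can be sketched for a single purchaser–product pair (a minimal illustration with toy vectors, not the paper’s training code; the closed-form gradient ŷ − y follows from sigmoid cross-entropy, while the learning rate and iteration count are arbitrary assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)
d = 4
u_v = rng.normal(size=d)  # purchaser representation
v_u = rng.normal(size=d)  # product representation
y = 1.0                   # observed implicit feedback y_uv
lr = 0.1

for _ in range(200):
    y_hat = sigmoid(u_v @ v_u)         # prediction sigma(u_v^T v_u)
    grad = y_hat - y                   # d(cross-entropy)/d(logit)
    g_u, g_v = grad * v_u, grad * u_v  # chain rule through the inner product
    u_v -= lr * g_u
    v_u -= lr * g_v
```

After a few hundred steps the predicted probability sigmoid(u_v @ v_u) approaches the label, mirroring how the model fits the first term of Equation (13); the full objective additionally includes the KG reconstruction and regularisation terms.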

3.3. Explainability

Figure 2 aims to reveal why a purchaser might be interested in another product after multi-hop propagation in the KG, which helps improve understanding of the proposed model. KnoChain explores purchasers’ interests based on the product KG; it provides a new view of recommendation by tracking the paths from a purchaser’s history to a product with high relevance probability (Equation (3)). For example, a purchaser’s interest in the product “iPhone 14” might be explained by the path “purchaser –buy→ iPhone 14 –cost→ 4999 ←cost– Huawei Mate 70 Pro”: the product “Huawei Mate 70 Pro” is highly likely to be relevant to “iPhone 14” through “4999” in the purchaser’s one-hop and two-hop sets. KnoChain discovers such relevant paths automatically, rather than manually as in other methods. We provide a visualised example in Figure 2 to demonstrate KnoChain’s explainability.
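The path-tracing idea can be sketched as a breadth-first search over the KG treated as an undirected graph (a toy example reusing the entities from the text; the triple list and the `explain` helper are illustrative assumptions, not the paper’s code):

```python
from collections import deque

triples = [
    ("iPhone 14", "cost", "4999"),
    ("Huawei Mate 70 Pro", "cost", "4999"),
]
adj = {}
for h, r, t in triples:
    adj.setdefault(h, []).append((r, t))
    adj.setdefault(t, []).append((r, h))  # traverse edges in both directions

def explain(start, target, max_hops=3):
    """Return the first path (entities and relations) from start to target."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) // 2 >= max_hops:  # each hop adds a relation and an entity
            continue
        for r, nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"-{r}-", nxt]))
    return None

print(explain("iPhone 14", "Huawei Mate 70 Pro"))
# ['iPhone 14', '-cost-', '4999', '-cost-', 'Huawei Mate 70 Pro']
```

In KnoChain the path weighting comes from the learned relevance probabilities rather than plain BFS, but the recovered path plays the same explanatory role.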

4. Experiments

4.1. Datasets

We evaluate the proposed model on three widely used benchmark datasets, MovieLens-1M (http://grouplens.org/datasets/movielens/1m/ accessed on 23 December 2025), Book-Crossing (https://gitcode.com/open-source-toolkit/7a493 accessed on 23 December 2025), and Last.fm (http://ocelma.net/MusicRecommendationDataset/lastfm-1K.html accessed on 23 December 2025), to demonstrate its effectiveness. Although these datasets do not come directly from the supply chain, the model can be adapted directly to supply chain scenarios. The essence of the cold-start problem in the supply chain is that newly launched products and new purchasers lack historical interaction data and cannot be recommended effectively. This is consistent with the cold-start problem faced by movie, book, and music platforms when recommending new items and serving new users. If a specific supply chain dataset, such as a manufacturing procurement database, is used in the future, the model can be adapted directly to a ”purchaser–product attribute–product” knowledge graph structure, achieving targeted cold-start mitigation. Product attributes can include manufacturer, product category, material, and so on. Taking the datasets used in this paper as examples, the correspondence of key concepts is displayed in Table 2. Basic statistics for the three datasets are shown in Table 3; dataset preprocessing and KG linking follow standard procedures as in KB4Rec [40]. The primary reason we use widely adopted public benchmarks (MovieLens, Book-Crossing, Last.fm) is to evaluate our method in a fair, comparable, and reproducible setting against state-of-the-art baselines. We acknowledge that the abstract concept mapping in Table 2 does not fully capture the complexity of real-world supply chains, such as supplier reliability, multi-level BOM dependencies, variable lead times, capacity constraints, and disruption events.
To make this limitation explicit, we note that the experiments reported here focus on the general cold-start and sparsity aspects of recommendation models rather than a comprehensive validation of every operational facet of supply chain systems. As future work, we plan to collect real procurement and supplier data through industry collaboration to construct domain-specific supply chain datasets and knowledge graphs, and to develop a reproducible simulation framework for controlled ablation and robustness studies. We will then re-evaluate KnoChain and the baselines on these real or simulated supply chain datasets, report both standard recommendation metrics and supply chain-specific KPIs (e.g., on-time fill rate, stockout frequency, average fulfilment time, and successful supplier-switch rate), and include ablation analyses isolating the effects of supplier reliability, lead times, and BOM edges.

4.2. Baselines

We compare KnoChain with the following baselines:
  • RippleNet [12] (KG propagation): RippleNet propagates user interests along KG paths to expand purchaser representations. We include it because it exemplifies outward multi-hop interest propagation, a key component of KnoChain, and is a widely used state-of-the-art KG-based method for cold-start interest expansion.
  • KGCN [13] (KG neighbourhood aggregation): KGCN aggregates multi-hop neighbour information to build entity embeddings. It represents the inward-aggregation family (product-centric neighbourhood aggregation), complementary to RippleNet’s user-centric propagation. Including KGCN lets us compare how different directions of KG information flow affect performance.
  • DKN [39] (KG + text attention): DKN integrates KG information with text (word/entity embeddings plus attention). We include DKN to represent methods that fuse KG structural information with auxiliary textual features; though less suitable for the short item names in our datasets, it is an important representative of KG-plus-content approaches.
  • CKE [6] (KG for embedding augmentation): CKE uses KG structural information to augment item embeddings (and benefits from auxiliary modalities). It represents approaches that treat the KG as external descriptor information appended to standard embedding models.
  • PER [43] (meta-path features): PER constructs meta-path-based features from the KG. Meta-path methods are common in heterogeneous information networks and are useful baselines for capturing higher-order semantic connections.
  • LibFM [44] and Wide & Deep [45] (feature + factorisation/DNN hybrids): These non-KG baselines combine explicit feature engineering with factorisation machines or DNNs and are strong general-purpose recommenders. They serve to show that KG-aware approaches outperform both classic factorisation and DNN hybrid models when KG structure is exploited.

4.3. Experiment Setup

We convert the explicit feedback of these three datasets into implicit feedback. For MovieLens-1M, if a purchaser’s rating of a product is greater than 4, the product is labelled 1. For Book-Crossing and Last.fm, owing to their sparsity, any observed interaction is labelled 1. To balance positive and negative samples, a negative sampling strategy is adopted: for each purchaser, products without interactions are sampled and labelled 0. Microsoft Satori (https://www.dbpedia.org/resources/ accessed on 23 December 2025) is employed to build the KG for each dataset.
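The label construction above can be sketched as follows; the data layout (a list of `(user, item, rating)` tuples, with `None` standing for rating-free interactions) and the one-negative-per-positive balancing are illustrative assumptions.

```python
import random


def to_implicit(ratings, threshold=4):
    """Convert explicit ratings to implicit positives (sketch): ratings
    above the threshold become positives (MovieLens-1M); rating-free
    interactions (rating None, as for Book-Crossing/Last.fm) are always
    positives."""
    return {(u, i) for u, i, r in ratings if r is None or r > threshold}


def negative_sample(positives, all_items, seed=0):
    """Sample one unobserved item per positive so that labels 1 and 0 are
    balanced, as in the experimental setup."""
    rng = random.Random(seed)
    samples = []
    for u, i in positives:
        seen = {it for uu, it in positives if uu == u}
        while True:
            j = rng.choice(all_items)
            if j not in seen:              # only unobserved items become negatives
                samples.append(((u, i), 1))
                samples.append(((u, j), 0))
                break
    return samples
```

A real pipeline would also deduplicate interactions and filter out users or items with no KG linkage before sampling.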
The hyperparameters are set as follows: d denotes the dimension of embedding for purchasers, products, and the KG. H means the hop number. It is also the number of layers that purchasers’ interests propagate on KG. N means the depth of the receptive field. That is, the order of the aggregate entity neighbours. M denotes the size of the interest sets in each order. K represents the size of sampled neighbours of the entity. η indicates the learning rate. λ 1 represents the weight of KGE. λ 2 represents the weight of the regulariser.
Data splits and metrics: For all datasets, we use the same 6:2:2 train:validation:test split and convert explicit feedback to implicit labels. Evaluation metrics are Accuracy (ACC) and AUC, and negative sampling is applied to balance positives/negatives uniformly across methods.
Tuning methodology: For KnoChain and all baselines, we tuned hyperparameters on the validation set using grid search (or guided grid search) within standard ranges. The same tuning protocol and ranges were applied to each method to ensure fairness. We used early stopping based on validation AUC to avoid overfitting.
Reported final settings: Table 4 lists the final hyperparameter settings used for each dataset (embedding dimension d, hops H, interest set size M, sampled neighbour size K, receptive depth N, learning rate η , KGE weight λ 1 , regulariser weight λ 2 ). These are the values selected via validation and used for test reporting.
Typical search ranges: e.g., $d \in \{4, 8, 16, 32\}$, $H \in \{1, 2, 3\}$, $K \in \{8, 16, 32\}$, $M \in \{8, 16, 32\}$, $\eta \in \{10^{-1}, 10^{-2}, 10^{-3}, 10^{-4}\}$, $\lambda_1 \in \{0, 0.01, 0.1\}$, $\lambda_2 \in \{10^{-3}, 10^{-5}, 10^{-7}\}$.
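The tuning protocol above amounts to exhaustive search over these ranges with selection on validation AUC. A minimal sketch is shown below; `train_eval` is a hypothetical callback that trains a model for one hyperparameter setting (with early stopping on validation AUC) and returns that AUC.

```python
from itertools import product


def grid_search(train_eval, grid):
    """Grid search (sketch): try every combination in `grid` and keep the
    setting with the highest validation AUC, mirroring the protocol
    applied uniformly to KnoChain and all baselines."""
    best_params, best_auc = None, -1.0
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        auc = train_eval(params)
        if auc > best_auc:                 # strict improvement keeps first best
            best_params, best_auc = params, auc
    return best_params, best_auc
```

Usage with the paper's ranges would look like `grid_search(train_eval, {"d": [4, 8, 16, 32], "H": [1, 2, 3], "lr": [1e-1, 1e-2, 1e-3, 1e-4]})`.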

4.4. Analysis of Experimental Results

The experimental results are presented in Table 5 and Figure 3. DKN is better suited to news recommendation with long titles; the names of movies, books, and music tracks are short and ambiguous, so these datasets do not suit this model. For PER, it is difficult to design optimal meta-paths in practice. CKE had only structural knowledge available in our setting; it performs better with auxiliary information such as images and text. LibFM and Wide & Deep showed satisfactory performance, indicating that they were able to exploit the knowledge information from the KG as input features. RippleNet and KGCN both use the multi-hop neighbourhood structure of entities in the KG. RippleNet focuses on using this information to propagate purchasers’ interests and enrich purchaser representations, but neglects feature extraction for products. KGCN focuses on aggregating products’ neighbourhood information to enrich product representations, but neglects feature extraction for purchasers. KnoChain uses the high-order semantic and structural information of the KG both to mine purchasers’ interests and to extract products’ features, thus enriching the representations of purchasers and products simultaneously.
In general, KnoChain performs best among all methods on the three datasets for the CTR (CTR: Click-Through Rate; the prediction task is binary click/no-click, and the model outputs click probabilities) task. KnoChain achieves average ACC (ACC: Accuracy, computed as the proportion of correct predictions) gains of 8.76%, 6.57%, and 8.87% in movie, book, and music recommendation, respectively. KnoChain also achieves average AUC (AUC: Area Under the ROC Curve, a measure of the model’s ranking/discrimination ability between positive and negative samples) gains of 9.46%, 5.91%, and 8.91% in movie, book, and music recommendation, respectively. These results illustrate the effectiveness of KnoChain for recommendation.
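For completeness, the two reported metrics can be computed from predicted probabilities as in the pure-Python sketch below (AUC via pairwise comparison of positive and negative scores; ACC with a 0.5 decision threshold).

```python
def auc_score(labels, scores):
    """AUC (sketch): the probability that a randomly chosen positive is
    scored above a randomly chosen negative, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """ACC: fraction of predictions on the correct side of the threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)
```

Library implementations (e.g., rank-based AUC) are preferable at scale; this quadratic version is only for clarity.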
Comparing the three datasets in the CTR prediction experiment, MovieLens-1M performs best, followed by Last.fm, and finally Book-Crossing. This is likely related to dataset sparsity: Book-Crossing is the sparsest and therefore performs worst.

4.5. Parameter Analysis

This section analyses the performance of the model on the three datasets when the key hyperparameters take different values.

4.5.1. Size of Sampled Neighbour (K)

We can observe from Table 6 that if K is too small, the neighbourhood information is not fully utilised, while if K is too large, the model is easily misled by noise. According to the experimental results, K between 4 and 16 works best.

4.5.2. Depth of Receptive Field (N)

We varied N from 1 to 4. The results are shown in Table 7. Because a large N brings considerable noise into the model, performance collapses severely when N = 3. This is intuitive: when inferring similarity between products, entities that are too far apart carry little correlation. According to the experimental results, the model works best when N is 1 or 2.

4.5.3. Dimension of Embedding (d)

Table 8 shows the best results in different dimensions for the different datasets. For the movie dataset, the best results occur when the embedding dimension is large. For the book dataset, the best results occur when the embedding dimension is small. For the music dataset, performance improves initially as d increases but degrades again if d becomes too large. The optimal embedding dimension appears related to data sparsity. To explain the different optimal embedding dimensions reported in Table 8, we link the choice of d to dataset characteristics. MovieLens-1M is relatively dense with richer KG neighbourhood information, so a larger embedding dimension (d = 16) provides the extra capacity needed to capture diverse user/item semantics and improves results. Book-Crossing is much sparser, so large dimensions are more prone to overfitting noisy or insufficient signals, and a smaller d (d = 4) is more robust. Last.fm is intermediate and benefits from moderate d before performance degrades due to overfitting. Overall, these results suggest selecting d according to dataset sparsity and KG richness: increase d when interactions and KG degree are high, and reduce d for sparse datasets.

4.5.4. Size of the High-Order Interest Set in Each Hop (M)

As shown in Table 9, KnoChain’s performance initially improves with a larger high-order interest set, which extracts more information from the KG. However, an excessively large interest set introduces noise and degrades performance. A moderate size of 16 to 64 is suitable.

4.5.5. Number of Maximum Hops (H)

As we can see from Table 10, the performance is best when H is 1 or 2. If H is too small, the entity cannot adequately obtain its neighbourhood information; if H is too large, it brings more noise, which degrades the model’s performance. In practical deployments, hop depth should therefore be limited to short propagation (H = 1 or 2).
Table 4 gives the best value of the hyperparameters.

4.5.6. Debugging the Aggregator

Aggregation is a key process of the model because the feature representation of a product is obtained by aggregating the product itself with its neighbour information. Different aggregation methods can yield different CTR prediction results, so we also test the effect of the aggregator on prediction. We tested three aggregators: the sum aggregator, the concat aggregator, and the neighbour aggregator. We can see from Figure 4 that the sum aggregator performed best, followed by the concat aggregator, while the neighbour aggregator performed worst. Simple additive fusion appears to balance signal integration and noise control better than concatenation or raw neighbour outputs.
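The three aggregators differ only in how the product embedding v and the aggregated neighbourhood vector b are fused, as sketched below. The tanh nonlinearity and the weight shapes are illustrative assumptions; any elementwise nonlinearity would fit the same pattern.

```python
import numpy as np


def sum_aggregator(v, b, W, bias):
    """agg_sum: sigma(W (v + b) + bias), additive fusion of self and
    neighbourhood; W has shape (d, d)."""
    return np.tanh(W @ (v + b) + bias)


def concat_aggregator(v, b, W2, bias):
    """agg_concat: sigma(W2 [v ; b] + bias); W2 has shape (d, 2d)."""
    return np.tanh(W2 @ np.concatenate([v, b]) + bias)


def neighbour_aggregator(v, b, W, bias):
    """agg_neighbour: sigma(W b + bias), discarding the product's own
    embedding entirely."""
    return np.tanh(W @ b + bias)
```

The neighbour aggregator drops v, which is consistent with its weakest observed performance: the product's own features never reach the output.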
In general, K and M control how much local and multi-hop KG information is exposed to the model: larger K and M can be beneficial when the KG and interaction graph are dense (higher average degree and higher interaction density), since there is more useful neighbourhood signal to aggregate, but they introduce noise and overfitting risk on sparse datasets. We therefore recommend modest values (our experiments indicate M in 16–64 and K in the low tens as sensible starting points), scaled with average node degree. H and N determine how far purchaser interests and product receptive fields propagate: denser data and well-connected KGs can tolerate H = N = 2 to capture useful multi-hop relations, whereas very sparse datasets perform best with H = 1 and N = 1 to avoid amplifying weak or noisy paths.
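These guidelines can be condensed into a simple starting-point heuristic. The density thresholds below are illustrative assumptions for demonstration, not tuned values; the returned settings follow the trends in the parameter study above.

```python
def suggest_hyperparams(avg_degree, interaction_density):
    """Heuristic (sketch): choose K/M/H/N from graph density, following the
    trends observed in the parameter analysis. Thresholds are illustrative."""
    dense = avg_degree >= 20 and interaction_density >= 1e-3
    return {
        "K": 16 if dense else 8,   # neighbour sample size
        "M": 32 if dense else 16,  # interest-set size per hop
        "H": 2 if dense else 1,    # propagation hops
        "N": 2 if dense else 1,    # receptive-field depth
    }
```

Any such defaults should still be refined by validation-set tuning as described in Section 4.3.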

5. Discussion

5.1. Model Explainability

Although KnoChain is a deep model, it yields several readily interpretable intermediate signals that support explanations. The relevance probabilities and purchaser-relation scores quantify how much each knowledge-graph triple and relation contributes to the prediction, while the multi-hop responses record how influence propagates from a purchaser’s historical items to the candidate product across hops. These outputs can be used to produce human-readable explanations by ranking top-contributing triples or entities, visualising dominant propagation paths (seed product to intermediate entities to recommended product), and plotting per-hop attention heatmaps. We include an example case in Figure 2 that lists the hop triples up to third order and the main propagation path for a sample recommendation. The detailed explanation can be found in Section 3.3. We emphasise that attention weights are proxies for importance rather than causal proofs, and we validate explanations through extensive experiments to increase explanation fidelity.

5.2. Discussion About Key Components That Drive Improvements

This subsection summarises the model components that appear to contribute most to the observed performance gains.
Bidirectional expansion (outward propagation and inward aggregation). The bidirectional mechanism combines outward propagation from purchaser seed nodes, which activates multi-hop entities related to a purchaser’s history, with inward aggregation for products, which refines product representations by incorporating purchaser-weighted multi-hop neighbours. These complementary flows extend semantic coverage beyond direct interactions, mitigate sparsity and cold-start effects, and improve purchaser–product matching.
Attention/purchaser-aware weighting. Attention weights ($\tau_i$) used to score elements in the interest set, together with neighbour importance weights ($\rho$), allow the model to emphasise KG facts that are most relevant for a specific purchaser–product pair. Compared with uniform aggregation, these purchaser-aware weights provide stronger personalised signals and are a key driver of accuracy improvements.
Aggregator choice. Experimental results indicate that a simple sum (additive) aggregator often outperforms concatenation or raw neighbour-output aggregators. The sum aggregator appears to preserve complementary signals while avoiding over-parameterisation and noise amplification, yielding more robust practice performance.
Hyperparameter choices as inductive biases. Several hyperparameters function as important inductive biases and materially affect performance:
  • Interest-set size M and neighbour sample size K: moderate values (e.g., M in 16–64) capture useful higher-order context while limiting noise; overly large values introduce many irrelevant signals.
  • Hop depth H: shallow propagation (H = 1 or 2) is generally most robust; deeper hops tend to dilute effective signals and reduce accuracy.
  • Embedding dimension d: the optimal d depends on dataset density—denser datasets tolerate larger d, while sparser datasets benefit from smaller d to avoid overfitting.

5.3. Managerial Implications

The following steps illustrate an example application in a practical supply chain scenario: (1) gather interaction logs, product catalogues, and supplier metadata, and normalise IDs; (2) build a domain KG linking purchasers, parts, suppliers, categories, and logistics nodes; (3) run offline validation and a focused pilot (e.g., alternative-part or supplier discovery) with human-in-the-loop review; (4) deploy via an API or procurement UI and monitor KPIs. Expected benefits include faster supplier discovery, reduced stockout risk via automated substitutes, and improved inventory turns and lead-time predictability. Track time-to-source, stockout incidents, substitute acceptance rate, and lead-time variance; validate via A/B tests and perturbation checks.
In supply chain management (SCM), recommender systems (RS) support operational matching (e.g., matching purchasers to suppliers or SKUs) and rapid decision-making under disruption. Cold start is particularly acute in SCM because new SKUs, new suppliers, or newly formed buyer–supplier relationships frequently appear (for example, when product designs change, a new supplier is onboarded, or a supplier is disqualified), and historical interaction data for these entities are typically sparse. This sparsity is compounded by (a) long product lifecycles with infrequent repeat purchases of specialised items; (b) fragmentation of procurement across multiple buyers and plants; and (c) the high business cost of mistaken recommendations (quality, compliance, lead time). Prior work has shown that recommender models can enable agility and risk mitigation in SCM. KGs are a promising side-information source for enriching sparse entity representations and reducing cold-start risk. We make this concrete below by mapping common KG constructs to supply chain concepts and by presenting three representative workflows where KnoChain can be applied.

5.4. Design Rationale for the Bidirectional Architecture

Considering the asymmetry between purchasers and products in recommendation, we combine outward propagation (purchaser → KG) with inward aggregation (KG → product). Outward propagation treats a purchaser’s interacted products as seeds and activates multi-hop entities to recover personalised, traceable high-order interests. Inward aggregation treats each product as a receptive field and aggregates neighbourhood entities and relations to enrich product semantics. Compared to alternatives, this hybrid balances expressiveness, interpretability, and noise control: (1) pure propagation (e.g., RippleNet) enriches purchasers but under-utilises product neighbourhood structure; (2) pure aggregation (e.g., KGCN) enriches products but does not explicitly reconstruct purchaser intent; (3) fully symmetric message-passing GNNs can introduce irrelevant multi-hop signals, increase computation, and reduce interpretability. We mitigate noise and complexity by limiting hop depth, sampling fixed-size neighbour/interest sets, and using relation-aware attention weights during aggregation. Empirically, the dual-path design yields consistent gains over single-path baselines, as shown in Table 5.

5.5. Prospects for Practical Application in Supply Chain Workflows

Practical application in supply chain workflows. KnoChain’s dual-path KG usage maps directly to several SCM tasks. Below, we explain three representative workflows and how the model supports them:
Supplier-to-procurement-category matching. Goal: recommend candidate suppliers or source SKUs for a given procurement request, for example, the procurement of surface-mount capacitors. Mapping: purchasers such as procurement managers and plant buyers are modelled as purchaser nodes; SKUs, suppliers, manufacturers, and product categories are modelled as product and entity nodes; relations include supplied-by, belongs-to-category, and manufactured-by. Outward propagation expands a purchaser’s seed nodes—which consist of past-purchased SKUs and preferred suppliers—into multi-hop supplier and category signals; inward aggregation enriches SKU embeddings with supplier, manufacturer, and certification attributes. Result: improved matching for new SKUs or newly assessed suppliers, because the knowledge graph supplies attribute and relational context when little or no transaction data exist.
Identifying substitute products (material substitution). Goal: recommend substitute parts or alternative materials automatically when a primary material becomes unavailable because of disruption or shortage. Mapping: relations including is substitute of, equivalent spec, and alternative material link SKUs, technical specifications, and manufacturers in the knowledge graph. KnoChain’s inward aggregation gathers multi-hop attributes such as spec similarities, shared suppliers, and certification overlap, and uses the aggregated context to rank substitutes and assign confidence scores despite sparse purchase histories.
Recommending logistics service providers. Goal: recommend carriers or third-party logistics providers for a shipment, for example, selecting a cold-chain carrier or choosing express versus standard transit, constrained by contractual terms, transit geography, and commodity characteristics. Mapping: the graph encodes shipments, carriers, transport modes, regions, and contract terms, with relations including handled by, serves region, and contract with. Outward propagation leverages a purchaser’s shipment history to surface carriers, routing options, and regional attributes; inward aggregation incorporates contract metadata, performance indicators, and capacity constraints into carrier representations. This enables recommending carriers for novel routes or low-usage carriers by supplying contextual signals when transactions are sparse.

6. Conclusions and Future Work

In this paper, we propose a bidirectional feature expansion framework. It employs outward propagation to refine the purchaser representation from the purchaser’s interaction history and uses inward aggregation to refine the product representation from its multi-hop neighbours. The features of both purchasers and products are therefore extracted simultaneously in the proposed model. This model alleviates the sparsity and cold-start problems in RS and achieves personalised recommendations in supply chain management. Experiments on three datasets demonstrate the competitiveness of our model over several advanced approaches.
Implications. Our proposed knowledge-graph-augmented recommender for purchaser–SKU matching contributes to sustainability by improving demand forecasting and procurement decisions, thereby reducing overstock, waste, and unnecessary transportation; by enabling more informed supplier/SKU substitution that favors lower-impact options; and by increasing supply chain resilience through better use of multi-source contextual knowledge (e.g., product attributes, manufacturer relationships). These effects align with prior findings that data-driven and AI methods can improve supply chain sustainability and resilience when integrated with domain knowledge [46,47]. The model’s purchaser-aware attention and explainable KG paths also support trust and acceptability among procurement purchasers, which is critical for adoption of sustainability-oriented recommendations [48].
Limitations. The primary limitation of this study is the lack of dedicated datasets from real-world supply chain settings. We evaluated our approach on public recommendation datasets that are useful for assessing cold-start capabilities but do not fully capture supply chain contract semantics, operational constraints, or characteristic graph structures. Importantly, the proposed method can be directly extended to practical supply chain scenarios; nonetheless, its performance in operational environments depends on access to domain data, the construction of domain knowledge graphs, and task-specific calibration. Accordingly, we have acknowledged this limitation in the conclusion and recommend that future work prioritises the collection of industry datasets, the development of domain knowledge graphs, and validation on real business cases. The work focuses on leveraging side information from a knowledge graph to support automatic recommendations for newly launched products in the supply chain. Therefore, the second limitation is the challenge of real-world complexity, such as supplier reliability, component interdependence, and delivery cycles. When considering these factors, models trained on movie or book datasets exhibit significant limitations in actual supply chain scenarios.
Future work. We will address these gaps by collecting proprietary supply chain interaction data and domain knowledge graphs, extending KnoChain with temporal and dynamic KG techniques, investigating scalable training and inference strategies, improving robustness to noisy graphs, enhancing explanation fidelity and causal evaluation, and validating the approach through online experiments and business level metrics. Practical deployment faces several nontrivial challenges. First, constructing and continuously maintaining a high-quality domain KG—including entity alignment, relation extraction, data cleaning, and versioning—requires substantial engineering effort and domain expertise. Second, supply chain data are typically fragmented across heterogeneous systems, making identifier alignment and data integration a time-consuming prerequisite. Third, supplier and contract records can be sensitive, so deployments must adopt privacy and compliance safeguards, such as field masking, access control, and federated or other privacy-preserving training. Finally, industrial scale graphs with millions of entities and billions of edges impose memory, sampling, and latency constraints that necessitate scalable solutions, such as neighbour sampling, graph partitioning, Cluster GCN or GraphSAINT style mini-batching, and distributed inference. We recommend staged pilots, automated incremental KG pipelines, scalable GNN engineering, and clear monitoring with KPI thresholds for time to source, stockout incidents, and substitution acceptance.

Author Contributions

Conceptualisation, P.L. and Y.M.; Methodology, P.L. and Y.M.; Software, P.L., K.H., and S.L.; Validation, K.H.; Formal analysis, S.L.; Investigation, P.L.; Data curation, S.L.; Writing—original draft, P.L. and Y.M.; Writing—review and editing, Y.M. and K.H.; Visualisation, Y.M., K.H., and S.L.; Supervision, Y.M.; Project administration, Y.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Project of 2025 Zhejiang Philosophy and Social Sciences Planning “Provincial and Municipal Cooperation” under grant No. 25SSHZ053YB, Project of Ningbo University of Technology under grant No. 2022KQ36 and Project of Ningbo Social Science Research Base under grant No. JD6-030.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data were obtained from public databases (http://grouplens.org/datasets/movielens/1m/, http://ocelma.net/MusicRecommendationDataset/lastfm-1K.html, and https://gitcode.com/open-source-toolkit/7a493) (all accessed on 23 December 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Dadouchi, C.; Agard, B. Recommender systems as an agility enabler in supply chain management. J. Intell. Manuf. 2021, 32, 1229–1248. [Google Scholar] [CrossRef]
  2. Liu, M.; Liu, Z.; Chu, F.; Zheng, F.; Chu, C. A new robust dynamic Bayesian network approach for disruption risk assessment under the supply chain ripple effect. Int. J. Prod. Res. 2021, 59, 265–285. [Google Scholar] [CrossRef]
  3. Goldberg, D.; Nichols, D.; Oki, B.M.; Terry, D. Using collaborative filtering to weave an information tapestry. Commun. ACM 1992, 35, 61–70. [Google Scholar] [CrossRef]
  4. Wijewickrema, M.; Petras, V.; Dias, N. Selecting a text similarity measure for a content-based recommender system: A comparison in two corpora. Electron. Libr. 2019, 37, 506–527. [Google Scholar] [CrossRef]
  5. Wang, H.; Zhang, F.; Hou, M.; Xie, X.; Guo, M.; Liu, Q. Shine: Signed heterogeneous information network embedding for sentiment link prediction. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, Marina Del Rey, CA, USA, 5–9 February 2018; pp. 592–600. [Google Scholar]
  6. Zhang, F.; Yuan, N.J.; Lian, D.; Xie, X.; Ma, W.Y. Collaborative knowledge base embedding for recommender systems. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 353–362. [Google Scholar]
  7. Sun, Y.; Yuan, N.J.; Xie, X.; McDonald, K.; Zhang, R. Collaborative intent prediction with real-time contextual data. ACM Trans. Inf. Syst. (TOIS) 2017, 35, 1–33. [Google Scholar] [CrossRef]
  8. Peng, B.; Chen, G.; Tang, Y.; Sun, S.; Sun, Y. Semantic navigation of keyword search based on knowledge graph. In Proceedings of the 12th Chinese Conference on Computer Supported Cooperative Work and Social Computing, Chongqing, China, 22–23 September 2017; pp. 189–192. [Google Scholar]
  9. Dong, L.; Wei, F.; Zhou, M.; Xu, K. Question answering over freebase with multi-column convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015; pp. 260–269. [Google Scholar]
  10. Xu, C.; Bai, Y.; Bian, J.; Gao, B.; Wang, G.; Liu, X.; Liu, T.Y. Rc-net: A general framework for incorporating knowledge into word representations. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, Shanghai, China, 3–7 November 2014; pp. 1219–1228. [Google Scholar]
  11. Wang, H.; Zhang, F.; Zhao, M.; Li, W.; Xie, X.; Guo, M. Multi-task feature learning for knowledge graph enhanced recommendation. In Proceedings of the World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 2000–2010. [Google Scholar]
  12. Wang, H.; Zhang, F.; Wang, J.; Zhao, M.; Li, W.; Xie, X.; Guo, M. Ripplenet: Propagating user preferences on the knowledge graph for recommender systems. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 22–26 October 2018; pp. 417–426. [Google Scholar]
  13. Wang, H.; Zhao, M.; Xie, X.; Li, W.; Guo, M. Knowledge graph convolutional networks for recommender systems. In Proceedings of the World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 3307–3313. [Google Scholar]
  14. Wang, X.; He, X.; Cao, Y.; Liu, M.; Chua, T.S. Kgat: Knowledge graph attention network for recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 950–958. [Google Scholar]
  15. Guo, X.; Lin, W.; Li, Y.; Liu, Z.; Yang, L.; Zhao, S.; Zhu, Z. DKEN: Deep knowledge-enhanced network for recommender systems. Inf. Sci. 2020, 540, 263–277. [Google Scholar] [CrossRef]
  16. Wang, X.; Huang, T.; Wang, D.; Yuan, Y.; Liu, Z.; He, X.; Chua, T.S. Learning intents behind interactions with knowledge graph for recommendation. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 878–887. [Google Scholar]
  17. Li, X.; Xie, Q.; Zhu, Q.; Ren, K.; Sun, J. Knowledge graph-based recommendation method for cold chain logistics. Expert Syst. Appl. 2023, 227, 120230. [Google Scholar] [CrossRef]
  18. Wang, M.; Huo, Y.; Zheng, J.; He, L. SC-TKGR: Temporal knowledge graph-based GNN for recommendations in supply chains. Electronics 2025, 14, 222. [Google Scholar] [CrossRef]
  19. Lv, C.; Lu, Y.; Yan, X.; Lu, W.; Tan, H. Supplier recommendation based on knowledge graph embedding. In Proceedings of the 2020 Management Science Informatization and Economic Innovation Development Conference (MSIEID), Guangzhou, China, 18–20 December 2020; pp. 514–518. [Google Scholar]
  20. Tu, Y.; Li, W.; Song, X.; Gong, K.; Liu, L.; Qin, Y.; Liu, S.; Liu, M. Using graph neural network to conduct supplier recommendation based on large-scale supply chain. Int. J. Prod. Res. 2024, 62, 8595–8608. [Google Scholar] [CrossRef]
  21. Banerjee, S.; Ghali, N.I.; Roy, A.; Hassanein, A.E. A bio-inspired perspective towards retail recommender system: Investigating optimization in retail inventory. In Proceedings of the 2012 12th International Conference on Intelligent Systems Design and Applications (ISDA), Kochi, India, 27–29 November 2012; pp. 161–165. [Google Scholar]
  22. Sun, F.; Wang, P.; Zhang, Y.; Kar, P. βFSCM: An enhanced food supply chain management system using hybrid blockchain and recommender systems. Blockchain Res. Appl. 2025, 6, 100245. [Google Scholar] [CrossRef]
  23. Rathor, K.; Chandre, S.; Thillaivanan, A.; Raju, M.N.; Sikka, V.; Singh, K. Archimedes optimization with enhanced deep learning based recommendation system for drug supply chain management. In Proceedings of the 2023 2nd International Conference on Smart Technologies and Systems for Next Generation Computing (ICSTSN), Villupuram, India, 21–22 April 2023; pp. 1–6. [Google Scholar]
  24. Ordibazar, A.H.; Hussain, O.; Saberi, M. A recommender system and risk mitigation strategy for supply chain management using the counterfactual explanation algorithm. In Proceedings of the International Conference on Service-Oriented Computing, Virtual Event, 22–25 November 2021; pp. 103–116. [Google Scholar]
  25. Amin, R.; Kaur, G. Integration of data science and artificial intelligence for effective online supply chain recommendation system. In Proceedings of the 2024 2nd International Conference on Advancement in Computation & Computer Technologies (InCACCT), Gharuan, India, 2–3 May 2024; pp. 93–98. [Google Scholar]
  26. Zare, A.; Motadel, M.R.; Jalali, A. A hybrid recommendation system based on the supply chain in social networks. J. Web Eng. 2022, 21, 633–659. [Google Scholar] [CrossRef]
  27. Hu, Y. A blockchain-based intelligent recommender system framework for enhancing supply chain resilience. arXiv 2024, arXiv:2404.00306. [Google Scholar]
  28. Hu, Y.; Ghadimi, P. A data-driven intelligent supply chain disruption response recommender system framework. In Proceedings of the 2024 Winter Simulation Conference (WSC), Orlando, FL, USA, 15–18 December 2024; pp. 596–607. [Google Scholar]
  29. Jalali, S.; Golpayegani, S.A.H.; Ghavamipoor, H. Designing a model of decision making in layers of supply, manufacturing, and distribution of the supply chain: A recommender-based system. In Proceedings of the 8th International Conference on e-Commerce in Developing Countries: With Focus on e-Trust, Mashhad, Iran, 24–25 April 2014; pp. 1–9. [Google Scholar]
  30. Abbas, K.; Afaq, M.; Ahmed Khan, T.; Song, W.C. A blockchain and machine learning-based drug supply chain management and recommendation system for smart pharmaceutical industry. Electronics 2020, 9, 852. [Google Scholar] [CrossRef]
  31. Jamali, M.; Lakshmanan, L. Heteromf: Recommendation in heterogeneous information networks using context dependent factor models. In Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 643–654. [Google Scholar]
  32. Luo, C.; Pang, W.; Wang, Z.; Lin, C. Hete-cf: Social-based collaborative filtering recommendation using heterogeneous relations. In Proceedings of the 2014 IEEE International Conference on Data Mining, Shenzhen, China, 14–17 December 2014; pp. 917–922. [Google Scholar]
  33. Deng, W.; Ma, J. A knowledge graph approach for recommending patents to companies. Electron. Commer. Res. 2021, 22, 1435–1466. [Google Scholar] [CrossRef]
  34. Wang, Q.; Mao, Z.; Wang, B.; Guo, L. Knowledge graph embedding: A survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 2017, 29, 2724–2743. [Google Scholar] [CrossRef]
  35. Guo, Q.; Zhuang, F.; Qin, C.; Zhu, H.; Xie, X.; Xiong, H.; He, Q. A survey on knowledge graph-based recommender systems. IEEE Trans. Knowl. Data Eng. 2022, 34, 3549–3568. [Google Scholar] [CrossRef]
  36. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating Embeddings for Modeling Multi-Relational Data. Adv. Neural Inf. Process. Syst. 2013, 26, 2787–2795. [Google Scholar]
  37. Wang, Z.; Zhang, J.; Feng, J.; Chen, Z. Knowledge graph embedding by translating on hyperplanes. Proc. AAAI Conf. Artif. Intell. 2014, 28, 1112–1119. [Google Scholar] [CrossRef]
38. Li, L.; Shi, Y.; Zhang, K.; Ren, Y. A co-attention model with sequential behaviors and side information for session-based recommendation. In Proceedings of the 2020 IEEE International Conference on Web Services (ICWS), Beijing, China, 19–23 October 2020; pp. 118–125. [Google Scholar]
  39. Wang, H.; Zhang, F.; Xie, X.; Guo, M. DKN: Deep knowledge-aware network for news recommendation. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 1835–1844. [Google Scholar]
  40. Zhao, W.X.; He, G.; Yang, K.; Dou, H.; Huang, J.; Ouyang, S.; Wen, J.R. Kb4rec: A data set for linking knowledge bases with recommender systems. Data Intell. 2019, 1, 121–136. [Google Scholar] [CrossRef]
  41. Hamilton, W.L.; Ying, R.; Leskovec, J. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 2017, 30, 1024–1034. [Google Scholar]
  42. Velickovic, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903. [Google Scholar]
43. Zhao, H.; Yao, Q.; Li, J.; Song, Y.; Lee, D.L. Meta-graph based recommendation fusion over heterogeneous information networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 635–644. [Google Scholar]
44. Rendle, S. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol. (TIST) 2012, 3, 1–22. [Google Scholar] [CrossRef]
  45. Cheng, H.-T.; Koc, L.; Harmsen, J.; Shaked, T.; Chandra, T.; Aradhye, H.; Anderson, G.; Corrado, G.; Chai, W.; Ispir, M.; et al. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, Boston, MA, USA, 15 September 2016; pp. 7–10. [Google Scholar]
  46. Riad, M.; Naimi, M.; Okar, C. Enhancing supply chain resilience through artificial intelligence: Developing a comprehensive conceptual framework for AI implementation and supply chain optimization. Logistics 2024, 8, 111. [Google Scholar] [CrossRef]
  47. Aylak, B.L. SustAI-SCM: Intelligent Supply Chain Process Automation with Agentic AI for Sustainability and Cost Efficiency. Sustainability 2025, 17, 2453. [Google Scholar] [CrossRef]
  48. Chhetri, T.R. Improving Decision Making Using Semantic Web Technologies. In Proceedings of the European Semantic Web Conference, Virtual Event, 24–28 October 2021; pp. 165–175. [Google Scholar]
Figure 1. The framework of KnoChain. It takes a purchaser and a product as input, and outputs the predicted probability that the purchaser will buy the product.
Figure 2. Illustration of the interest set of "iPhone 14" in the KG of a mobile phone supply chain. The ellipses denote the sets at different hops. The fading yellow indicates decreasing relatedness between the centre entity and surrounding entities. In practice, the sets at different hops are not necessarily disjoint.
Figure 3. Comparison of CTR prediction results.
Figure 4. Performance comparison of the three aggregators.
Table 1. Notation.

| Symbol | Definition |
| --- | --- |
| U = {u_1, u_2, ...} | Set of purchasers |
| V = {v_1, v_2, ...} | Set of products |
| Y = {y_uv : u ∈ U, v ∈ V} | Purchaser–product interaction matrix |
| y_uv = 1 | Implicit interaction between purchaser u and product v, such as clicking, browsing, or purchasing |
| y_uv = 0 | No observed interaction |
| G = {(h, r, t) : h ∈ E, r ∈ R, t ∈ E} | Knowledge graph; h, r, and t denote the head, relation, and tail of a triple, and E and R denote the entity and relation sets |
| u, v ∈ R^d | Embeddings of purchaser u and product v (dimension d) |
| I_u | Historical interaction set of purchaser u |
| Γ_{u,h}, ε_{u,h} | h-order interest triples and related entities of purchaser u |
| (h_i, r_i, t_i) | KG triple; h_i, t_i ∈ R^d are the head and tail embeddings |
| R_i ∈ R^{d×d} | Transformation matrix of relation r_i |
| τ_i | Attention weight; τ_i = softmax(v^T R_i h_i) |
| a_{u,h} | h-order response, e.g., a_{u,1} = Σ_i τ_i t_i |
| u_v = Σ_{h=1}^{H} a_{u,h} | Purchaser vector with respect to query product v |
| N(v) | Neighbours of v in the receptive field; N is the receptive-field depth |
| ρ_{u,r}, b_v, v_u | Relation weight, local neighbour vector, and final product representation |
| AGG(·) | Aggregator (sum, concat, or neighbour) |
| ŷ_uv = F(u, v; θ, Y, G) | Predicted probability that purchaser u interacts with product v; θ denotes the parameters of F |
| d, K, H, M | Embedding dimension, neighbour sampling size, number of hops, and interest-set size |
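The attention-weighted outward propagation summarised in Table 1 (τ_i = softmax(v^T R_i h_i), a_{u,1} = Σ_i τ_i t_i, u_v = Σ_h a_{u,h}) can be sketched in a few lines of NumPy. This is a minimal illustration of the notation under random toy data, not the authors' implementation; all function names are our own.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # embedding dimension (Table 1)

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def hop_response(v, triples):
    # triples: list of (h_i, R_i, t_i) for one hop.
    # tau_i = softmax(v^T R_i h_i); a = sum_i tau_i * t_i
    scores = np.array([v @ R @ h for h, R, t in triples])
    tau = softmax(scores)
    return sum(w * t for w, (h, R, t) in zip(tau, triples))

def purchaser_vector(v, hops):
    # u_v = sum over hops h = 1..H of a_{u,h}
    return sum(hop_response(v, triples) for triples in hops)

# Toy example: one query product v and two hops of three triples each.
v = rng.normal(size=d)
hops = [[(rng.normal(size=d), rng.normal(size=(d, d)), rng.normal(size=d))
         for _ in range(3)] for _ in range(2)]
u_v = purchaser_vector(v, hops)
y_hat = 1.0 / (1.0 + np.exp(-(u_v @ v)))  # sigmoid(u_v . v)
```

The final sigmoid of the inner product between the purchaser vector u_v and the product embedding v plays the role of ŷ_uv in Table 1.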
Table 2. Mapping public dataset concepts to supply chain domain.

| Public Dataset Concept | Supply Chain Domain Mapping Concept | Explanation and Description |
| --- | --- | --- |
| User | Supply Chain Participant | Can be mapped to "Purchaser", "Manufacturer", "Retailer", or collectively referred to as "Business Unit". |
| Item (Movie/Book) | Supply Chain Material/Product | Can be mapped to "Raw Material", "Component", "Finished Product", or "SKU" (Stock Keeping Unit). |
| Rating/Purchase | Business Interaction | Can be mapped to "Purchase Record", "Transaction Frequency", "Cooperation Satisfaction", or "Supplier Rating". |
| Movie Director/Actor, Book Author/Publisher | Supplier/Manufacturer | These are the creators and sources of "items", similar to "Suppliers" or "Manufacturers" of products in the supply chain. |
| Movie Genre/Book Tag/Music Genre | Material Classification/Product Attribute | Can be mapped to "Material Type" (e.g., electronic, structural), "Product Category", "Technical Specification", or "Industry Standard". |
Table 3. Basic statistics for the three datasets.

| Dataset | Users | Items | Interactions | Edge Types | Entities | Relationships |
| --- | --- | --- | --- | --- | --- | --- |
| MovieLens-1M | 6036 | 2347 | 753,772 | 12 | 182,011 | 2,483,990 |
| Book-Crossing | 17,860 | 14,910 | 139,746 | 25 | 77,903 | 303,000 |
| Last.FM | 1872 | 3846 | 42,346 | 60 | 9366 | 31,036 |
Table 4. Hyper-parameter settings for the three datasets.

| Dataset | Hyper-Parameter Settings |
| --- | --- |
| MovieLens-1M | d = 16, H = 2, M = 32, K = 16, N = 1, η = 10^-2, λ1 = 0.01, λ2 = 10^-7 |
| Book-Crossing | d = 4, H = 1, M = 32, K = 8, N = 1, η = 2 × 10^-3, λ1 = 0.01, λ2 = 10^-5 |
| Last.FM | d = 16, H = 1, M = 32, K = 8, N = 1, η = 7 × 10^-4, λ1 = 0.01, λ2 = 10^-4 |
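For reproducibility, the settings in Table 4 can be collected into a single configuration dictionary. The key names are our own labels (`eta` for the learning rate η, `l1`/`l2` for the regularisation weights λ1/λ2); the values follow the table.

```python
# Hypothetical dictionary form of Table 4's per-dataset settings.
# Keys mirror Table 1's symbols: d = embedding dim, H = hops, M = interest-set
# size, K = neighbour sampling size, N = receptive-field depth.
HYPERPARAMS = {
    "MovieLens-1M":  dict(d=16, H=2, M=32, K=16, N=1, eta=1e-2, l1=0.01, l2=1e-7),
    "Book-Crossing": dict(d=4,  H=1, M=32, K=8,  N=1, eta=2e-3, l1=0.01, l2=1e-5),
    "Last.FM":       dict(d=16, H=1, M=32, K=8,  N=1, eta=7e-4, l1=0.01, l2=1e-4),
}
```

A training script could then select its configuration with `HYPERPARAMS["Last.FM"]` rather than hard-coding each value.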
Table 5. The results of AUC and Accuracy in CTR prediction. Bold indicates the best result; percentages in parentheses give each baseline's gap to KnoChain.

| Model | MovieLens-1M AUC | MovieLens-1M ACC | Book-Crossing AUC | Book-Crossing ACC | Last.FM AUC | Last.FM ACC |
| --- | --- | --- | --- | --- | --- | --- |
| LibFM | 0.892 (-2.9%) | 0.812 (-3.2%) | 0.685 (-5.7%) | 0.640 (-5.6%) | 0.777 (-3.6%) | 0.709 (-4.3%) |
| PER | 0.710 (-21.1%) | 0.664 (-18.0%) | 0.623 (-11.9%) | 0.588 (-10.8%) | 0.633 (-18.0%) | 0.596 (-15.6%) |
| CKE | 0.801 (-12.0%) | 0.742 (-10.2%) | 0.671 (-7.1%) | 0.633 (-6.3%) | 0.744 (-6.9%) | 0.673 (-7.9%) |
| Wide&Deep | 0.898 (-2.3%) | 0.820 (-2.4%) | 0.712 (-3.0%) | 0.624 (-7.2%) | 0.756 (-5.7%) | 0.688 (-6.4%) |
| DKN | 0.655 (-26.6%) | 0.589 (-25.5%) | 0.622 (-12.0%) | 0.598 (-9.8%) | 0.602 (-21.1%) | 0.581 (-17.1%) |
| RippleNet | 0.920 (-0.1%) | 0.842 (-0.2%) | 0.729 (-1.3%) | 0.662 (-3.4%) | 0.768 (-4.5%) | 0.691 (-6.1%) |
| KGCN | 0.916 (-0.5%) | 0.840 (-0.4%) | 0.738 (-0.4%) | 0.688 (-0.8%) | 0.794 (-1.9%) | 0.719 (-3.3%) |
| KnoChain | **0.921** | **0.844** | **0.742** | **0.696** | **0.813** | **0.752** |
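The AUC and ACC metrics reported in Table 5 can be computed from predicted click probabilities as follows. This is a self-contained sketch using the rank-sum (Mann–Whitney) formulation of AUC, which ignores score ties; the toy labels and scores are our own.

```python
import numpy as np

def auc(labels, scores):
    # Rank-sum AUC: the probability that a randomly chosen positive
    # sample is scored above a randomly chosen negative one.
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return float((ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))

def acc(labels, scores, threshold=0.5):
    # Accuracy after thresholding the predicted probabilities at 0.5.
    return float(((scores >= threshold) == labels).mean())

# Toy example: two positives ranked above two negatives.
labels = np.array([1, 1, 0, 0])
scores = np.array([0.9, 0.8, 0.4, 0.3])
```

On this toy data both metrics evaluate to 1.0, since every positive outranks every negative and thresholding at 0.5 classifies all four samples correctly.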
Table 6. The results of AUC and ACC with different neighbour sampling size K. Bold indicates the best result.

| K | MovieLens-1M AUC | MovieLens-1M ACC | Book-Crossing AUC | Book-Crossing ACC | Last.FM AUC | Last.FM ACC |
| --- | --- | --- | --- | --- | --- | --- |
| 2 | 0.914 | 0.836 | 0.743 | 0.693 | 0.812 | 0.749 |
| 4 | 0.917 | 0.841 | **0.744** | 0.693 | **0.813** | **0.754** |
| 8 | 0.919 | **0.844** | 0.743 | **0.696** | **0.813** | 0.751 |
| 16 | **0.920** | **0.844** | 0.742 | 0.694 | **0.813** | 0.751 |
| 32 | **0.920** | 0.843 | 0.740 | 0.691 | 0.811 | 0.749 |
Table 7. The results of AUC and ACC with different depth of receptive field N. Bold indicates the best result.

| N | MovieLens-1M AUC | MovieLens-1M ACC | Book-Crossing AUC | Book-Crossing ACC | Last.FM AUC | Last.FM ACC |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | **0.919** | **0.844** | **0.743** | **0.695** | **0.813** | **0.752** |
| 2 | 0.915 | 0.841 | 0.741 | 0.689 | 0.810 | 0.748 |
| 3 | 0.808 | 0.832 | 0.742 | 0.690 | 0.807 | 0.746 |
Table 8. The results of AUC and ACC with different dimension of embedding d. Bold indicates the best result.

| d | MovieLens-1M AUC | MovieLens-1M ACC | Book-Crossing AUC | Book-Crossing ACC | Last.FM AUC | Last.FM ACC |
| --- | --- | --- | --- | --- | --- | --- |
| 4 | 0.906 | 0.831 | **0.743** | **0.695** | 0.814 | 0.748 |
| 8 | 0.912 | 0.836 | 0.741 | 0.693 | **0.815** | 0.750 |
| 16 | 0.920 | 0.843 | 0.738 | 0.682 | 0.813 | **0.752** |
| 32 | **0.921** | **0.845** | 0.736 | 0.683 | 0.812 | 0.750 |
Table 9. The results of AUC and ACC with different sizes of the high-order interest set M. Bold indicates the best result.

| M | MovieLens-1M AUC | MovieLens-1M ACC | Book-Crossing AUC | Book-Crossing ACC | Last.FM AUC | Last.FM ACC |
| --- | --- | --- | --- | --- | --- | --- |
| 8 | 0.917 | 0.841 | 0.737 | 0.681 | 0.812 | 0.752 |
| 16 | 0.917 | 0.842 | 0.742 | 0.689 | **0.813** | **0.754** |
| 32 | **0.919** | **0.844** | **0.743** | 0.694 | **0.813** | 0.752 |
| 64 | 0.917 | 0.843 | 0.742 | **0.697** | 0.653 | 0.535 |
| 128 | 0.914 | 0.838 | 0.742 | 0.696 | 0.547 | 0.545 |
Table 10. The results of AUC and ACC with different hop numbers H. Bold indicates the best result.

| H | MovieLens-1M AUC | MovieLens-1M ACC | Book-Crossing AUC | Book-Crossing ACC | Last.FM AUC | Last.FM ACC |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 0.915 | 0.840 | **0.743** | **0.696** | **0.813** | **0.752** |
| 2 | 0.921 | **0.846** | 0.735 | 0.687 | 0.720 | 0.656 |
| 3 | 0.921 | 0.845 | 0.713 | 0.666 | 0.709 | 0.655 |
| 4 | **0.922** | **0.846** | 0.737 | 0.694 | 0.510 | 0.500 |
Li, P.; Ma, Y.; Hou, K.; Li, S. KnoChain: Knowledge-Aware Recommendation for Alleviating Cold Start in Sustainable Procurement. Sustainability 2026, 18, 506. https://doi.org/10.3390/su18010506