Multi-Channel Graph Convolutional Network for Evaluating Innovation Capability Toward Sustainable Seed Enterprises

Tang, Shanshan; Wang, Kaiyi; Yang, Feng; Pan, Shouhui

doi:10.3390/su17167522

by Shanshan Tang ¹, Kaiyi Wang ¹,²,³,*, Feng Yang ²,³,* and Shouhui Pan ²,³

¹ College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Daqing 163319, China
² Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
³ National Innovation Center for Digital Seed Industry, Beijing 100097, China
* Authors to whom correspondence should be addressed.

Sustainability 2025, 17(16), 7522; https://doi.org/10.3390/su17167522

Submission received: 17 July 2025 / Revised: 14 August 2025 / Accepted: 16 August 2025 / Published: 20 August 2025

(This article belongs to the Special Issue ESG Management, Performance, and Corporate Sustainability: Insights and Innovations)

Abstract

The innovation capability of seed enterprises reflects their core competitiveness and serves as a vital foundation for sustainable agricultural development and modernization. Therefore, evaluating this capability is of great importance. However, existing evaluation methods primarily focus on internal enterprise attributes, overlooking the complex inter-enterprise relationships and lacking sufficient feature fusion capabilities to capture latent information. To address these limitations, this paper proposes a Multi-Channel Graph Convolutional Network (MGCN) model that integrates enterprise attributes with three types of relational graphs. The model adopts a multi-channel architecture for feature extraction and employs a gated attention mechanism for cross-graph feature fusion, jointly considering node features and relation information to improve prediction accuracy. Experimental results demonstrate that MGCN achieves an average accuracy of 83.59% under five-fold cross-validation, outperforming several mainstream models such as Random Forest and traditional GCN. Case studies further reveal that MGCN not only captures key features of individual enterprises but also leverages features and label distribution from neighboring enterprises, facilitating more context-aware classification decisions. In conclusion, the MGCN model provides an effective method for the intelligent evaluation of innovation capability in seed enterprises and supports the formulation of sustainable strategic plans at both the national and enterprise level.

Keywords:

innovation capacity evaluation; graph neural network; multi-channel; seed enterprises sustainability

1. Introduction

As key players in agricultural innovation, seed enterprises play a vital role in driving technological progress in the industry. However, under the globalized economic landscape, they are faced with unprecedented competitive pressure [1]. Innovation capacity is a key determinant of enterprise competitiveness and is crucial for maintaining market position and ensuring sustainable development. Therefore, it is essential to evaluate the innovation capacity of seed enterprises. Evaluating seed enterprise capabilities can assist governments in formulating policies. At present, China is prioritizing support for 276 key seed enterprises with sustainable development potential. By continuously monitoring and assessing seed enterprises, it is possible to adjust national support policies and target beneficiaries accordingly. For seed enterprises, evaluating their innovation capabilities can help them to gain an understanding of themselves, develop sustainable strategic plans, strengthen overall enterprise capacity, and enhance competitiveness. Currently, enterprise evaluation methods are generally classified into three types: classical statistical models [2], machine learning models [3], and deep learning models [4].

Commonly used classical statistical models include the Balanced Scorecard (BSC) [5], Key Performance Indicators (KPI) [6], Analytic Hierarchy Process (AHP) [7], Data Envelopment Analysis (DEA) [8], and Grey Relational Analysis (GRA) [9], among others. Compared with earlier DuPont analysis [10], these approaches are no longer limited to financial indicators but evaluate enterprises from multiple dimensions, providing more comprehensive and scientific assessment results. For example, Xiang employed the DEA method to evaluate the innovation efficiency of listed companies, analyzing the relationship between innovation inputs and outputs, and proposed improvement suggestions [11]. Yu applied the AHP to evaluate the credit level of technology-based small and medium-sized enterprises, pointing out deficiencies in the existing credit index system, such as an excessive focus on financial capacity while neglecting technical and talent indicators [12].

Machine learning has become an increasingly important tool in enterprise evaluation, especially as the volume and complexity of enterprise data grow and the limitations of traditional statistical methods become more evident. Machine learning techniques can handle large-scale datasets and capture complex, non-linear relationships. Methods such as Support Vector Machine (SVM) [13] and Random Forest (RF) [14] have been introduced into the field of enterprise evaluation. For example, Zhang constructed a credit risk evaluation index system and proposed a model based on SVM to predict default behavior in small and medium-sized enterprises [15]. In some cases, researchers face incomplete data. To address the issue of missing evaluation labels, Akman applied Principal Component Analysis (PCA) and the k-means clustering algorithm to determine data labels and then employed algorithms such as RF to predict enterprise innovation capacity. This method integrates unsupervised and supervised machine learning techniques, offering a relatively objective approach for label generation while providing a fast and accurate evaluation procedure [16].

Deep learning is increasingly applied in enterprise evaluation due to its capabilities in automatic feature extraction and modeling complex, non-linear relationships in data. For example, Lian employed an artificial neural network with a multilayer perceptron (MLP) architecture to assess the credit risk of listed companies in China based on financial indicators, and found that fixed asset turnover was the most influential factor in bank lending decisions [17]. Shang proposed an enterprise performance evaluation model for cross-border e-commerce based on deep learning. The model integrates a multidimensional indicator system with recurrent neural networks, deep belief networks, and attention mechanisms, which together improve both prediction accuracy and the efficiency of performance classification [18]. Hosaka introduced a novel method that converts financial ratios into grayscale images and applies a convolutional neural network to predict corporate bankruptcy. This image-based representation of financial data improved the accuracy of corporate financial health assessment [19].

Graph neural networks (GNNs) effectively capture both node features and relational dependencies in graph-structured data, making them well-suited for tasks involving complex inter-entity interactions [20]. For example, Feng represented each enterprise’s individual features as nodes within separate graph structures, analyzing the relationships between these feature nodes and the information flow among them to perform corporate credit prediction [21]. However, early studies mainly focused on the internal attributes of individual enterprises, overlooking the complex inter-enterprise interactions that influence prediction outcomes. To address this limitation, Wei modeled all seed enterprises as nodes in a single graph, where each node encodes basic attributes and litigation information, while heterogeneous edges represent investment, stakeholder, and related connections. By leveraging hypergraph neural networks to capture shared risk factors across industry sectors, regions, and stakeholders, the accuracy of enterprise bankruptcy prediction was significantly improved [22]. Bi considered the contributions of different types of inter-enterprise relationships by applying attention mechanisms to assign weights to both neighboring enterprises and relationship types. The fused features are then used to determine the risk category of the target enterprise, enabling more effective risk assessment within the context of industrial supply chains [23]. Graph neural networks such as Graph Convolutional Networks (GCN) [24] and Graph Attention Networks (GAT) [25] clearly play a crucial role in these studies, as they effectively capture the topological information of graphs. However, when both node features and graph topology impact the task, these models still face challenges in adaptively balancing their relative importance. To address this issue, Wang proposed the AM-GCN model, which performs graph convolutions simultaneously on both the topology and feature graphs, adaptively adjusting their contributions to generate more robust and scalable embeddings [26].

Effectively leveraging various types of inter-enterprise relationships, rather than relying solely on intrinsic enterprise features, and accurately capturing their distinct impacts remain key challenges in evaluating the innovation capacity of seed enterprises. This paper proposes the MGCN model, which integrates multi-channel graph convolution with a gated attention mechanism [27] to fully exploit both the intrinsic attributes of seed enterprises and various types of inter-enterprise relationships. The model takes three types of relational graphs as input and processes them through four parallel GCN channels: three dedicated to extracting features from specific relations, and one designed to capture common patterns. Compared with approaches that merge all relations into a single heterogeneous graph, this architecture enables clearer separation and modeling of individual relationship effects, reducing information interference and enhancing representational capacity. Channel outputs are adaptively fused via a gated attention mechanism, allowing context-aware feature selection that emphasizes relationship information most relevant to innovation capability prediction. By integrating multi-source information and comprehensively modeling complex inter-enterprise connections, MGCN significantly improves the accuracy of innovation capability assessment for seed enterprises, providing a powerful tool to enhance enterprise competitiveness and promote sustainable agricultural development.

To validate the effectiveness of the proposed model, we conducted extensive experiments, including comparisons with nine baseline methods, ablation studies to evaluate the contribution of each model component, relational graph analysis to assess the impact of removing specific relationship types on prediction accuracy, feature analysis based on top-k correlated features to examine their influence on model performance, and case studies to explore the model’s decision-making logic and practical utility.

2. Materials and Methods

2.1. Data Collection and Preprocessing

Prior to the evaluation of seed enterprises’ innovation capacity, data collection and organization were conducted as essential preparatory steps. The data used in this study were mainly obtained from the China Seed Industry Big Data Platform and the Aiqicha website. These datasets included both internal enterprise information and external relational data, ensuring comprehensiveness and accuracy.

In total, 1000 enterprise records were collected, covering leading seed enterprises in China, along with large, medium, and small enterprises engaged in breeding. After data processing, 981 valid samples were retained. Based on innovation capacity, the samples were categorized into four levels: “Excellent”, “Good”, “Moderate”, and “Poor”, with corresponding sample sizes of 86, 304, 356, and 235, respectively.

The internal data of seed enterprises mainly include basic information, innovation outputs, qualifications, and honors, as detailed in Table 1.

The external relational data were primarily composed of shareholding and cooperation relationships among seed enterprises, while an additional set of innovation similarity data was constructed based on the enterprises’ primary innovation outputs. These various forms of relational information were subsequently utilized for the construction of enterprise relationship graphs, which served as the structural foundation for downstream modeling and analysis.

During data preprocessing, strict screening procedures were applied to ensure the representativeness and consistency of the research samples. Non-market-oriented seed-related entities, including agricultural research institutions, agricultural technology service centers, and agricultural technology extension centers, were excluded.

Some collected data on seed enterprises showed strong right-skewness, contained categorical variables, and had inconsistent measurement units, which could adversely affect model performance if left unprocessed. To ensure feature comparability and improve model learning efficiency, the following steps were applied:

Variables such as registered capital, paid-in capital, and number of insured employees exhibited large value ranges and significant right-skewness. To reduce skewness, a log transformation was applied using the following formula:

$$n' = \ln(n + 1) \tag{1}$$

where $n$ denotes any single value from registered capital, paid-in capital, or number of insured employees.

The categorical variable “province” was converted into a numerical format using one-hot encoding, whereby each province is represented as a separate binary feature.

The Z-score normalization method was applied to standardize the data to a distribution with a mean of 0 and a standard deviation of 1, thereby eliminating the differences in scales among different features. The formula is given by:

$$x = \frac{X - \mu}{\sigma} \tag{2}$$

where $X$ represents a feature in the dataset, $\mu$ is the mean of that feature, and $\sigma$ is its standard deviation.
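The three preprocessing steps can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' pipeline: the function name `preprocess`, the toy values, and the choice to standardize only the numeric columns (leaving the one-hot columns as 0/1) are assumptions made for the example.

```python
import numpy as np

def preprocess(skewed, province, other):
    """Apply Eq. (1), one-hot encoding, and Eq. (2) to toy enterprise data."""
    # Eq. (1): n' = ln(n + 1) for right-skewed columns such as registered capital.
    log_skewed = np.log1p(skewed)

    # One-hot encoding: one binary column per distinct province.
    provinces = sorted(set(province))
    one_hot = np.array([[1.0 if p == q else 0.0 for q in provinces] for p in province])

    # Eq. (2): x = (X - mu) / sigma, computed per numeric feature column.
    numeric = np.column_stack([log_skewed, other])
    mu = numeric.mean(axis=0)
    sigma = numeric.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant columns (assumption)
    standardized = (numeric - mu) / sigma

    return np.hstack([standardized, one_hot]), provinces

# Hypothetical values, for illustration only.
capital = np.array([100.0, 5000.0, 30000.0, 800.0])
province = ["Beijing", "Hainan", "Beijing", "Gansu"]
variety_count = np.array([2.0, 15.0, 40.0, 5.0])
features, provinces = preprocess(capital, province, variety_count)
```

Applying the log transform before standardization keeps a handful of very large enterprises from dominating the mean and standard deviation used in Equation (2).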

2.2. Construction of Inter-Enterprise Relationship Graphs

To more accurately evaluate the innovation capacity of seed enterprises, three inter-enterprise relationship graphs (the shareholding graph, the collaboration graph, and the innovation similarity graph) were constructed based on a curated dataset of 981 seed enterprises. These graphs capture diverse relational structures within the seed industry. Inter-enterprise relationships, such as equity holdings and strategic collaborations, are critical for the flow of innovation resources [28,29,30], and may exert an indirect but significant impact on innovation capacity. The innovation similarity graph is intended to reveal latent connections among enterprises, identify innovation clusters, and further enhance the model’s predictive performance.

In the shareholding graph, an edge is established between two seed enterprises if a shareholding relationship exists, reflecting ownership or investment ties that may facilitate the transfer of innovation resources, technologies, and strategic advantages. Given that this study focuses on evaluating the innovation capacity of seed enterprises, the number of variety approvals, variety registrations, and variety protections are considered key indicators of innovation performance [31,32]. Accordingly, in the collaboration relationship graph, an edge is created between two enterprises if they have jointly applied for any of the three aforementioned activities. Similarly, in the innovation similarity graph, cosine similarity is computed based on the three indicators. For each enterprise, those with an innovation similarity exceeding 80% are identified, from which the top three are selected to form edges with the focal enterprise. The cosine similarity between two enterprises (Enterprise A and Enterprise B) is calculated as follows:

$$c(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|} \tag{3}$$
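The edge rule for the innovation similarity graph (cosine similarity above 80%, then the top three candidates per enterprise) could be implemented roughly as follows. The function name `similarity_edges`, the handling of all-zero indicator vectors, and the tie-breaking order are assumptions for this sketch and are not specified in the text.

```python
import numpy as np

def similarity_edges(X, threshold=0.8, top_k=3):
    """Link each node to its top_k most similar nodes whose cosine similarity exceeds threshold."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # all-zero rows end up with similarity 0 (assumption)
    unit = X / norms
    sim = unit @ unit.T              # Eq. (3) for every pair of enterprises
    np.fill_diagonal(sim, -np.inf)   # an enterprise is never its own neighbor

    edges = set()
    for i in range(X.shape[0]):
        candidates = np.where(sim[i] > threshold)[0]
        best = candidates[np.argsort(-sim[i, candidates])][:top_k]
        for j in best:
            # Store each undirected edge once, as (smaller index, larger index).
            edges.add((min(i, int(j)), max(i, int(j))))
    return sorted(edges)

# Rows: hypothetical counts of [variety approvals, registrations, protections].
X = np.array([[10, 0, 0], [20, 1, 0], [0, 5, 5], [0, 4, 6], [0, 0, 0]], dtype=float)
edges = similarity_edges(X)
```

Because edges are added from each focal enterprise's perspective and then stored undirected, a node can end up with more than three neighbors when several other enterprises select it.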

Using the aforementioned methods, three types of graphs were constructed: the shareholding relationship graph (Graph-S), the collaboration relationship graph (Graph-C), and the innovation similarity relationship graph (Graph-I) of seed enterprises. These three graphs share an identical node set, with each node corresponding to the same enterprise across all graphs; however, they differ substantially in their edge configurations, which reflect different forms of relational information. The number of nodes and edges in each graph is shown in Table 2.

To provide a more intuitive and comprehensive understanding of their structural characteristics, the graphs were visualized using the spring layout algorithm, which generates graphical representations where the distances and spatial arrangements between nodes clearly reflect their connectivity within the network. It can clearly be observed that the edges in Graph-S and Graph-C are relatively sparse, especially in Figure 1a, where most nodes lack connecting edges or form isolated components. Therefore, Graph-S may have a limited impact on the subsequent evaluation of innovation capacity. Given that innovation resources can be transferred between shareholder and investee enterprises [30,33], thereby affecting their innovation capability, the shareholding graph is modeled as an undirected graph in this study.

2.3. Evaluation Model

2.3.1. Overall Framework

This study constructs an innovation capacity evaluation model for seed enterprises based on a multi-channel graph convolutional network. The overall architecture of the proposed MGCN model is illustrated in Figure 2.

The model takes as input three heterogeneous relationship graphs representing shareholding (Graph-S), collaboration (Graph-C), and innovation similarity (Graph-I) among seed enterprises. These graphs are processed through four parallel graph convolutional channels: three dedicated to extracting relationship-specific features, and one common feature processing channel that aggregates features across all graphs using shared convolutional weights. Each channel propagates innovation-related information along the edges of its corresponding graph, allowing each node to integrate features from its neighbors. This message-passing mechanism ensures that a node’s embedding reflects both its own attributes and the structural context of connected enterprises. As shown in the “Multi-channel Node Features Aggregation” section of Figure 2, each channel generates a feature vector for each node (e.g., $V_{3}$) via GCN-based aggregation, thereby encoding information related to the enterprise’s innovation capacity into vector representations. The Common Convolution module applies shared weights $W_{c}$ to capture relationally invariant patterns. All four outputs are then stacked and passed into a Gated Attention module, which generates gating signals and attention weights to adaptively regulate the contribution of each feature. This allows the model to emphasize informative features while suppressing irrelevant ones. Finally, as shown in the “Output” block, the fused embeddings are fed into a fully connected layer to predict innovation capacity. This architecture enables the dynamic integration of multiple heterogeneous relationships and allows the model to adaptively adjust to diverse feature compositions, enhancing its capability to handle complex and varied data and supporting an accurate and comprehensive assessment of seed enterprises’ innovation capacity.

2.3.2. Specific Feature Processing Channel for Single Graph

In this channel, Graph Convolutional Networks (GCNs) are used to update node features by aggregating information from each node and its neighbors. In standard GCNs, each neighboring node contributes equally, and through multi-layer propagation, nodes can indirectly capture information from more distant nodes. The update rule is defined as

$$H^{(l+1)} = \sigma\left(D^{-\frac{1}{2}} (A + I) D^{-\frac{1}{2}} H^{(l)} W^{(l)}\right) \tag{4}$$

Here, $l$ denotes the layer number of the GCN, $\sigma$ is a non-linear activation function, $A$ is the original adjacency matrix, and $I$ is the identity matrix used to introduce self-loops. $D$ is the degree matrix, where the diagonal elements represent the degree of each node, and $W^{(l)}$ is the trainable weight matrix at layer $l$. Based on this formulation, the embeddings obtained from the feature processing channels for the shareholding graph, the collaboration graph, and the innovation similarity graph are denoted as $H_{s}$, $H_{p}$, and $H_{o}$.
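A single propagation step of Equation (4) can be written directly in NumPy. The sketch below assumes ReLU as the activation $\sigma$ and computes the degrees from $A + I$, as in the standard GCN formulation; both choices, along with the toy graph and weights, are assumptions made for illustration.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, with degrees taken from A + I (assumption)."""
    A_hat = A + np.eye(A.shape[0])
    deg = A_hat.sum(axis=1)              # always >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, H, W):
    """One GCN update: ReLU(A_norm @ H @ W)."""
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy path graph 0 - 1 - 2 with one-hot node features.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A_norm = normalize_adjacency(A)
H0 = np.eye(3)
W0 = np.array([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]])
H1 = gcn_layer(A_norm, H0, W0)
```

Isolated nodes, which are common in the sparse Graph-S, keep only their self-loop after normalization, so their embedding depends on their own features alone.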

2.3.3. Common Feature Processing Channel for Multiple Graphs

To capture shared patterns across different graphs, a common feature processing channel applies a shared weight matrix $W_{c}^{(l)}$ to all graphs. The GCN update rule is applied independently to each graph using this shared weight. Taking the shareholding graph (Graph-S) as an example, the node embedding update is computed as

$$H_{cs}^{(l+1)} = \sigma\left(D_{s}^{-\frac{1}{2}} (A_{s} + I) D_{s}^{-\frac{1}{2}} H_{cs}^{(l)} W_{c}^{(l)}\right) \tag{5}$$

Using this method, embeddings $H_{cs}$, $H_{cp}$, and $H_{co}$ for the shareholding, collaboration, and innovation similarity graphs are obtained and combined via weighted fusion to produce the common channel output:

$$H_{c} = w_{1} H_{cs} + w_{2} H_{cp} + w_{3} H_{co} \tag{6}$$

The parameters $w_{1}$, $w_{2}$, $w_{3}$ are trainable weights bounded between 0 and 1, satisfying the condition that their sum equals 1.
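The common channel can be sketched as one shared-weight GCN step per graph followed by the convex combination in Equation (6). The text does not say how the constraint on $w_{1}$, $w_{2}$, $w_{3}$ is enforced; passing unconstrained parameters (`fusion_logits`, a hypothetical name) through a softmax is one common way to do it and is assumed here, as is the ReLU activation.

```python
import numpy as np

def normalize_adjacency(A):
    """D^{-1/2} (A + I) D^{-1/2}, as in Eq. (5)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def common_channel(A_norms, X, W_c, fusion_logits):
    """Apply the same weight W_c on every graph, then fuse the results as in Eq. (6)."""
    embeddings = [np.maximum(A @ X @ W_c, 0.0) for A in A_norms]   # H_cs, H_cp, H_co
    w = softmax(fusion_logits)       # each weight in (0, 1), weights sum to 1
    H_c = sum(w_k * H_k for w_k, H_k in zip(w, embeddings))
    return H_c, w

# Three toy graphs over the same 3 nodes: no edges, a path, and a triangle.
A_s = np.zeros((3, 3))
A_p = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A_o = np.ones((3, 3)) - np.eye(3)
A_norms = [normalize_adjacency(A) for A in (A_s, A_p, A_o)]
X = np.eye(3)
W_c = np.array([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]])
H_c, w = common_channel(A_norms, X, W_c, fusion_logits=np.zeros(3))
```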

2.3.4. Gated Attention for Feature Fusion

The four embeddings $H_{s}$, $H_{p}$, $H_{o}$, and $H_{c}$ are stacked to form $H_{z}$, which is input into a gated attention mechanism. This mechanism adaptively weighs the contribution of each channel through a combination of attention scores and gating signals. Specifically, linear projections are applied to derive query, key, value, and gate matrices:

$$Q = w_{q} H_{z}, \quad K = w_{k} H_{z}, \quad V = w_{v} H_{z}, \quad G = \sigma(w_{g} H_{z}) \tag{7}$$

Here, $w_{q}$, $w_{k}$, $w_{v}$, and $w_{g}$ represent the linear transformation weight matrices for the Query, Key, Value, and Gate, respectively. The function $\sigma$ denotes the Sigmoid activation function, which constrains the gating signal to a range between 0 and 1. Next, the attention scores are computed using the Query and Key:

$$\mathrm{Scores} = \frac{Q K^{T}}{\sqrt{D'}} \tag{8}$$

Here, $K^{T}$ is the transpose of the Key matrix, $D'$ is set to the dimensionality of the hidden layer in the gated attention mechanism, and $\sqrt{D'}$ is a scaling factor used to prevent the dot-product results from becoming too large. Attention scores are normalized via softmax to obtain $\mathrm{Attn}$. The gated value is computed by element-wise multiplying the Value matrix with the gate:

$$V_{g} = V \odot G \tag{9}$$

The final output is obtained by a weighted sum followed by a linear transformation:

$$\mathrm{output} = W_{f} \left(\sum_{x=1}^{X} \mathrm{Attn} \, V_{g}\right) \tag{10}$$

Here, $W_{f}$ is the weight matrix of the final linear transformation, and $\sum_{x=1}^{X} \mathrm{Attn} \, V_{g}$ sums the attended features over all sequence positions, combining them into one feature vector for each sample.
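Equations (7) to (10) can be traced for a single node with the following NumPy sketch. It treats the four channel embeddings as a sequence of length four and uses a row-vector convention (`H_z @ W` rather than `W H_z`). The dimensions and the random weights are placeholders, not values from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def row_softmax(S):
    e = np.exp(S - S.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def gated_attention(H_z, W_q, W_k, W_v, W_g, W_f):
    """H_z: (channels, d) stacked channel embeddings of one node."""
    Q, K, V = H_z @ W_q, H_z @ W_k, H_z @ W_v      # Eq. (7)
    G = sigmoid(H_z @ W_g)                         # gate values in (0, 1)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # Eq. (8), D' = hidden size
    attn = row_softmax(scores)                     # each row sums to 1
    V_g = V * G                                    # Eq. (9), element-wise gating
    pooled = (attn @ V_g).sum(axis=0)              # sum over the sequence positions
    return pooled @ W_f, attn                      # Eq. (10)

rng = np.random.default_rng(0)
d, hidden, out_dim = 8, 6, 5                       # placeholder sizes
H_z = rng.normal(size=(4, d))                      # H_s, H_p, H_o, H_c stacked
W_q, W_k, W_v, W_g = (rng.normal(size=(d, hidden)) for _ in range(4))
W_f = rng.normal(size=(hidden, out_dim))
output, attn = gated_attention(H_z, W_q, W_k, W_v, W_g, W_f)
```

The gate acts per feature dimension while the attention weights act per channel, so the two mechanisms filter the stacked embeddings along different axes.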

2.3.5. Multi-Objective Loss Function

To guide training, a multi-objective loss is defined with three components: classification loss, consistency constraint, and diversity constraint. These components work together to optimize the overall performance of the model.

The classification loss measures the difference between the model’s predictions and ground-truth labels. To address sample imbalance, Weighted Cross-Entropy Loss is used, with class weights based on the number of training samples per class. This helps the model focus more on minority classes during training, enhancing robustness to imbalanced data. The class weights are set as follows:

$$\tilde{w}_{n} = \frac{\frac{1}{\mathrm{support}_{n}}}{\sum_{k=1}^{N} \frac{1}{\mathrm{support}_{k}}} \tag{11}$$

Here, $N$ denotes the number of classes, $\mathrm{support}_{n}$ represents the number of samples in class $n$, and $\tilde{w}_{n}$ is the normalized weight for class $n$. The weighted cross-entropy loss is formulated as

$$L_{cls} = - \sum_{i \in \text{train nodes}} w_{y_{i}} \log(p_{i, y_{i}}) \tag{12}$$

where $w_{y_{i}}$ denotes the weight corresponding to the true class $y_{i}$ of node $i$ and $p_{i, y_{i}}$ represents the predicted probability that node $i$ belongs to its true class $y_{i}$, which is obtained from the output layer of the model.
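As a rough illustration of Equations (11) and (12), the snippet below computes inverse-frequency class weights and the resulting weighted cross-entropy. It plugs in the full-dataset class sizes from Section 2.1 (86, 304, 356, 235) purely as an example; the paper derives the weights from the training samples of each fold, so the actual values would differ slightly.

```python
import numpy as np

def class_weights(support):
    """Eq. (11): normalized inverse class frequencies."""
    inv = 1.0 / np.asarray(support, dtype=float)
    return inv / inv.sum()

def weighted_cross_entropy(probs, labels, weights):
    """Eq. (12): sum over nodes of -w[y_i] * log p[i, y_i]."""
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return float(-(weights[labels] * np.log(picked)).sum())

# Illustrative class sizes: Excellent, Good, Moderate, Poor.
weights = class_weights([86, 304, 356, 235])

# Four nodes, one per class, with a uniform (uninformative) prediction.
probs = np.full((4, 4), 0.25)
labels = np.array([0, 1, 2, 3])
loss = weighted_cross_entropy(probs, labels, weights)
```

The smallest class ("Excellent") receives the largest weight, so a misclassified minority sample contributes more to the loss than a misclassified majority sample.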

The consistency constraint encourages embeddings from different graphs in the common feature channel to be similar. For each embedding $H_{cs}$, $H_{cp}$, $H_{co}$, a similarity matrix $S = H H^{T}$ is computed and normalized. The consistency loss is

$$L_{c} = \|S_{cs} - S_{cp}\|_{F}^{2} + \|S_{cp} - S_{co}\|_{F}^{2} \tag{13}$$
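A minimal sketch of Equation (13) follows. The text does not state how the similarity matrix is normalized; L2-normalizing each embedding row before taking $H H^{T}$ (so that $S$ holds cosine similarities) is assumed here, and the function names are hypothetical.

```python
import numpy as np

def similarity(H):
    """S = H H^T on row-normalized embeddings (normalization scheme is an assumption)."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    Hn = H / np.where(norms == 0, 1.0, norms)
    return Hn @ Hn.T

def consistency_loss(H_cs, H_cp, H_co):
    """Eq. (13): squared Frobenius distances between consecutive similarity matrices."""
    S_cs, S_cp, S_co = similarity(H_cs), similarity(H_cp), similarity(H_co)
    return float(np.sum((S_cs - S_cp) ** 2) + np.sum((S_cp - S_co) ** 2))

rng = np.random.default_rng(1)
H = rng.normal(size=(5, 4))
same = consistency_loss(H, 2.0 * H, H)                       # rescaling leaves S unchanged
different = consistency_loss(H, rng.normal(size=(5, 4)), H)
```

Comparing similarity matrices rather than raw embeddings means the constraint asks the three graphs to agree on which enterprises resemble each other, not on the coordinates themselves.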

The diversity constraint keeps the embeddings from the specific and common channels of the same graph distinct, using the Hilbert-Schmidt Independence Criterion (HSIC) [34]:

$$\mathrm{HSIC}(H_{s}, H_{cs}) = \frac{1}{(n - 1)^{2}} \, \mathrm{trace}(K_{s} H K_{cs} H) \tag{14}$$

Here, trace denotes the trace of a matrix. $K_{s}$ and $K_{cs}$ are the Gram matrices defined as $K_{s} = H_{s} H_{s}^{T}$ and $K_{cs} = H_{cs} H_{cs}^{T}$, respectively. The matrix $H$ is the centering matrix given by $H = I - \frac{1}{n} E E^{T}$, where $I$ is the identity matrix and $E$ is a column vector of all ones. The total diversity loss is:

$$L_{d} = \mathrm{HSIC}(H_{s}, H_{cs}) + \mathrm{HSIC}(H_{p}, H_{cp}) + \mathrm{HSIC}(H_{o}, H_{co}) \tag{15}$$
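Equation (14) translates almost line by line into NumPy. In this sketch the centering matrix is named `C` to avoid a clash with the embedding matrices, which is a naming choice for the example only; the linear Gram matrices follow the definitions in the text.

```python
import numpy as np

def hsic(H_a, H_b):
    """Eq. (14) with linear Gram matrices K = H H^T."""
    n = H_a.shape[0]
    K_a = H_a @ H_a.T
    K_b = H_b @ H_b.T
    C = np.eye(n) - np.ones((n, n)) / n          # centering matrix I - (1/n) E E^T
    return float(np.trace(K_a @ C @ K_b @ C) / (n - 1) ** 2)

def diversity_loss(specific, common):
    """Eq. (15): sum of HSIC over the (specific, common) embedding pairs."""
    return sum(hsic(H_spec, H_com) for H_spec, H_com in zip(specific, common))

rng = np.random.default_rng(2)
H_s = rng.normal(size=(6, 3))
H_constant = np.tile(rng.normal(size=(1, 3)), (6, 1))   # every node has the same embedding
dependent = hsic(H_s, H_s)
independent = hsic(H_s, H_constant)
```

An embedding that is identical for every node carries no information about the other embedding, so its HSIC value is zero; minimizing $L_{d}$ pushes the specific and common channels toward statistically independent representations.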

The overall loss function is formulated as a weighted sum of the classification loss, the consistency constraint, and the diversity constraint:

$$L = L_{cls} + \alpha L_{c} + \beta L_{d} \tag{16}$$

Here, $\alpha$ and $\beta$ are hyperparameters used to control the weights of the consistency and diversity constraints, respectively. During training, $\alpha$ decreases while $\beta$ increases. Early on, a higher $\alpha$ helps the model stabilize faster by emphasizing consistency. Later, reducing $\alpha$ and increasing $\beta$ encourages more diversity in embeddings, improving the model’s generalization.

2.3.6. Seed Enterprise Innovation Capability Prediction

The $\mathrm{output}$ from the gated attention mechanism is used for the multi-class classification task. Specifically, it is fed into a fully connected layer to obtain the raw scores for each class:

$$\hat{y}_{logits} = W_{fc} \cdot \mathrm{output} + b_{fc} \tag{17}$$

where $\hat{y}_{logits}$ denotes the output of the fully connected layer, while $W_{fc}$ and $b_{fc}$ are the weight matrix and bias vector, respectively. The raw scores $\hat{y}_{logits}$ are then transformed into a probability distribution using the softmax function, which ensures that each class score is between 0 and 1, and the sum of all class scores equals 1. The softmax function is defined as

$$\hat{y}_{i} = \frac{\exp(\hat{y}_{logits, i})}{\sum_{j=1}^{N} \exp(\hat{y}_{logits, j})} \tag{18}$$

where $N$ denotes the number of classes, and $\hat{y}_{i}$ represents the predicted probability for the $i$-th class.

2.4. Evaluation Metrics

To comprehensively evaluate the performance of the model, 5-fold cross-validation was conducted, and the average values of various evaluation metrics were calculated. These metrics include accuracy, precision, recall, and F1-score. Specifically, precision, recall, and F1-score were computed using the macro-average method [35], resulting in Macro Precision, Macro Recall, and Macro F1-Score. Macro-averaging is performed by calculating the metric independently for each class and then taking the average of these values. This method assigns equal weight to all classes, regardless of the number of instances in each individual class, making it particularly suitable for imbalanced classification scenarios. The mathematical formulations of these evaluation metrics are presented as follows:

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \tag{19}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{20}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{21}$$

$$\mathrm{F1\text{-}Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{22}$$

TP refers to the number of instances correctly predicted as positive by the model; TN refers to the number of instances correctly predicted as negative; FP refers to the number of instances incorrectly predicted as positive; and FN refers to the number of instances incorrectly predicted as negative.
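For a four-class problem, the per-class TP, FP, and FN counts come from the confusion matrix, and the macro scores are plain averages of the per-class values. The following sketch shows one way to compute them; the function name and the toy labels are made up for illustration, and multi-class accuracy is taken as the fraction of correct predictions.

```python
import numpy as np

def macro_metrics(y_true, y_pred, n_classes):
    """Accuracy plus macro-averaged precision, recall, and F1 from a confusion matrix."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                                  # rows: true class, columns: predicted

    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp                           # predicted as class c but wrong
    fn = cm.sum(axis=1) - tp                           # class c samples that were missed
    zeros = np.zeros_like(tp)
    precision = np.divide(tp, tp + fp, out=zeros.copy(), where=(tp + fp) > 0)
    recall = np.divide(tp, tp + fn, out=zeros.copy(), where=(tp + fn) > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=zeros.copy(), where=denom > 0)

    return {
        "accuracy": tp.sum() / cm.sum(),
        "macro_precision": precision.mean(),
        "macro_recall": recall.mean(),
        "macro_f1": f1.mean(),
    }

y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 2, 3, 0]
scores = macro_metrics(y_true, y_pred, n_classes=4)
```

Here Macro F1 is the mean of the per-class F1 values, which matches the description of macro-averaging above; it generally differs from the F1 computed from Macro Precision and Macro Recall.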

3. Results

3.1. Experimental Setup

A series of experiments were designed and conducted in this study, including model comparison experiments, ablation studies on the model architecture, analysis of the effectiveness of relation graphs, and feature analysis. The proposed model was evaluated against nine representative baselines: LR [36], SVM [15], RF [37], MLP [38], GCN [24], GAT [25], GraphSAGE [39], GraphTransformer [40], and R-GCN [41]. Since LR, SVM, RF, and MLP cannot process graph data, only node features of seed enterprises were used as input in these experiments. GCN, GAT, GraphSAGE, GraphTransformer, and R-GCN support graph structures, so the three types of relationships were merged into a single graph as input.

Due to the presence of class imbalance in the data, targeted handling methods were applied to different models to enhance their robustness and overall learning performance. For the LR, SVM, and RF models, the SMOTETomek method was used on the training data. This hybrid approach addresses class imbalance by both increasing the number of minority class samples through interpolation and reducing noise by removing majority class samples near the decision boundary. For the GCN, GAT, GraphSAGE, GraphTransformer, R-GCN, and MGCN models, conventional sampling methods may disrupt the graph topology; therefore, a weighted cross-entropy loss function was employed, assigning weights to different classes to improve the models’ ability to handle imbalanced data. In addition, focal loss was also explored as an alternative loss function to further enhance the model’s sensitivity to hard-to-classify minority samples. The MLP model adopted both approaches simultaneously.

All experiments were conducted using 5-fold stratified cross-validation to mitigate potential bias from random data partitioning, ensuring each fold maintained a class distribution consistent with the overall dataset. To guarantee reproducibility, a fixed random seed of 33 was used throughout.
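The idea behind stratified splitting can be sketched without any machine learning library: shuffle the indices of each class with the fixed seed and deal them out to the folds in turn. This is only an illustration of the principle, not the authors' code. In practice a library routine such as scikit-learn's `StratifiedKFold` would typically be used, and its fold assignments would differ from those produced here.

```python
import numpy as np

def stratified_folds(labels, n_splits=5, seed=33):
    """Return (train_idx, test_idx) pairs whose class proportions mirror the full dataset."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(labels), dtype=int)
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])    # shuffle within the class
        fold_of[idx] = np.arange(len(idx)) % n_splits      # deal out round-robin
    return [(np.where(fold_of != k)[0], np.where(fold_of == k)[0]) for k in range(n_splits)]

# Class sizes reported in Section 2.1: Excellent, Good, Moderate, Poor.
labels = np.repeat([0, 1, 2, 3], [86, 304, 356, 235])
folds = stratified_folds(labels)
```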

To determine the optimal configuration for each model, an exhaustive grid search was carefully performed. The final hyperparameter settings used in our experiments are summarized in Table 3. The full range of hyperparameter values explored during the search process is provided in Appendix A.

3.2. Comparative Experiments

By comparing with various classical machine learning models (LR, SVM, RF), neural network models (MLP), and graph neural network models (GCN, GAT, GraphSAGE, GraphTransformer, R-GCN), we comprehensively evaluated the proposed MGCN model in terms of performance, stability, and node embedding quality. The results demonstrate its significant advantages and robustness in assessing the innovation capability of seed enterprises, establishing it as an effective and powerful technical tool.

In Table 4, the MGCN model achieved the best performance, with an accuracy (ACC) of 0.8359 ± 0.0273, Macro Precision of 0.8173 ± 0.0316, Macro Recall of 0.8277 ± 0.0536, and Macro F1-Score of 0.8186 ± 0.0398. These results indicate that the model achieves high accuracy while effectively identifying positive classes and maintaining strong overall performance. This superiority is attributed to the model’s multi-channel architecture, which processes different relation graphs separately to capture the varied impacts of each relationship type on innovation capability. Moreover, the gated attention mechanism enables efficient feature selection and fusion based on multi-channel embeddings, allowing the model to integrate node features and graph topology. This suggests that innovation capability predictions are influenced by both internal attributes and inter-enterprise relationships.

In comparison, RF performed best among the traditional machine learning models, achieving 78.80% accuracy. However, it cannot process graph data and thus ignores topological structure information, limiting its effectiveness. LR, a simple linear classifier, performed worse due to its limited capacity to model complex data. For SVM, the linear kernel outperformed the other kernels tested, suggesting that the relationship between features and labels tends to be close to linear. By using the kernel trick to project data into higher-dimensional space and finding a margin-maximizing hyperplane, SVM is more effective than LR. However, it is sensitive to outliers and may underperform when the sample size greatly exceeds the number of features, as it relies on samples to define the decision boundary.

In Table 4, graph neural networks clearly outperform traditional models due to their ability to exploit structural information. GCN achieves 80.63% accuracy by aggregating fixed-weight neighborhood features, but this rigidity limits its ability to capture complex graph variations. GAT improves F1 score to 79.51% by assigning attention weights dynamically, though it is more sensitive to noisy neighbors and still processes only a single homogeneous graph. GraphSAGE attains 80.33% accuracy by sampling neighbors and using max pooling, which boosts efficiency but fails to distinguish edge types, weakening relational modeling. GraphTransformer utilizes multi-head attention to capture global graph patterns, achieving an accuracy of 79.92%. However, due to the simple graph structure and limited number of nodes and features, the advantages of global attention are limited and prone to introducing noise, which affects performance improvement. R-GCN explicitly models multiple relation types through relation-specific transformations, achieving a high recall of 82.49%, which highlights the importance of relation awareness. However, due to the lack of a gating fusion mechanism, it lacks flexibility in integrating multi-relational information. In contrast, MGCN attains higher accuracy (83.59%) and F1 score (81.86%). Its multi-channel design independently processes each relation’s subgraph, allowing it to better capture the distinct features and structural differences displayed by each relation. The dynamic fusion of features and topology enables MGCN to integrate multi-relational information more flexibly.

All models exhibited varying degrees of performance fluctuation in their prediction results (Table 4), mainly due to dataset differences across 5-fold cross-validation. To further assess model stability, three rounds of 5-fold cross-validation with different random seeds were conducted, generating 15 results per model. In Figure 3, the box represents the interquartile range (IQR) of the data, with the line inside indicating the median, reflecting the central tendency of the dataset. A small square within the box marks the mean value, allowing comparison with the median to assess data skewness. The whiskers show the range of non-outlier data. The overlaid scatter points represent all individual data values, helping to visualize the spread and distribution of the data.
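The quantities drawn in such a box plot can be computed as sketched below. The 1.5 × IQR whisker rule is the usual plotting default and is assumed here, since the text does not state which rule Figure 3 uses; the 15 accuracy values are hypothetical stand-ins for three rounds of 5-fold results.

```python
import statistics

def box_stats(values, whisker=1.5):
    """Summary statistics behind a box plot (whisker rule assumed: 1.5 x IQR)."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inliers = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "mean": statistics.mean(data),
        "whisker_low": min(inliers),   # smallest non-outlier value
        "whisker_high": max(inliers),  # largest non-outlier value
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# 3 rounds x 5 folds = 15 hypothetical accuracy values for one model.
toy_accuracies = [0.84, 0.82, 0.86, 0.70, 0.83, 0.85, 0.80, 0.84,
                  0.87, 0.81, 0.85, 0.82, 0.84, 0.86, 0.83]
stats = box_stats(toy_accuracies)
```

A mean well below the median, as in this toy sample, signals a left-skewed distribution caused by a low outlier.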

In Figure 3, the scatter plots display the individual results from multiple runs of each model, while the height and spread of the box plots reflect the models’ central tendency and robustness, highlighting those that are both efficient and stable. MGCN consistently shows the highest medians and the narrowest interquartile ranges across accuracy, precision, recall, and F1-score, indicating superior performance with low variability. GCN, R-GCN, and GraphTransformer also exhibit relatively high medians and compact distributions, suggesting stable results attributed to their effective graph representation learning mechanisms. In contrast, SVM, RF, and MLP have wider box plots and more dispersed scatter points, especially in recall and F1-score, reflecting greater fluctuations and lower stability in their performance. A possible reason is that these models cannot exploit graph relational information, which makes them more sensitive to data variability.

To demonstrate model effectiveness in feature processing, t-SNE was used to reduce dimensionality and visualize the standardized input features and final-layer node embeddings from various neural network models. In Figure 4, nodes are colored by innovation capability labels, where Classes 1, 2, 3, and 4 correspond to innovation levels of “Poor”, “Moderate”, “Good”, and “Excellent”, respectively.

In Figure 4a, the original input features produce a highly fragmented distribution in the embedding space, forming multiple small and scattered clusters. Nodes from different classes are intermixed without clear boundaries. In Figure 4b, the node embeddings learned by the Multi-Layer Perceptron (MLP) model exhibit a certain degree of class separation, with improved intra-class similarity compared to the original input features. However, there is still significant overlap between different classes—particularly between classes 2 and 3. This indicates that the MLP, lacking access to graph structural information, is limited in its ability to model complex inter-enterprise relationships, resulting in suboptimal node representations in multi-relational contexts. Graph-based models generally produce better results, showing clearer class separation compared to non-graph models. However, overlap between classes 2 and 3 remains a common issue across several methods. Overall, the MGCN model demonstrates the best class separability, with more distinct and compact clusters. R-GCN and GraphSAGE also achieve relatively good performance in clustering. Additionally, nodes belonging to class 4 consistently form well-defined clusters across different graph models, suggesting that the innovation capability represented by this class is more easily distinguishable from others.

3.3. Ablation Experiments of MGCN Components

The contributions of the main components of the MGCN model to the overall performance were evaluated by removing each component individually. Specifically, the following variants were tested:

MGCN-noF: the specific feature processing channels were removed;

MGCN-noC: the common feature processing channel was removed;

MGCN-noG: the gated attention mechanism was replaced with average weighting;

MGCN-noLc: the consistency constraint was removed from the loss function;

MGCN-noLd: the diversity constraint was removed from the loss function.

Table 5 presents the experimental results of these variants as well as the complete MGCN model. To provide a more intuitive comparison of the models’ performance, the results were visualized using a bar chart (Figure 5). In Figure 5, the complete MGCN model outperforms all its variants across all evaluation metrics, indicating that each component contributes positively to the overall performance. Notably, the MGCN-noG variant shows the most significant performance drop, highlighting the powerful role of the gated attention mechanism in dynamically selecting and integrating features.
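The difference between gated fusion and the average weighting used in MGCN-noG can be illustrated with a small sketch. This is a generic softmax-gated fusion written for illustration only, not the formulation defined in the paper's method section; `gate_vector`, the toy channel embeddings, and all function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(channel_embeddings, weights):
    """Weighted sum of per-channel embeddings for one node."""
    dim = len(channel_embeddings[0])
    return [sum(w * emb[d] for w, emb in zip(weights, channel_embeddings))
            for d in range(dim)]

def gated_fusion(channel_embeddings, gate_vector):
    """Score each channel against a (hypothetical) learned gate vector, then fuse."""
    scores = [sum(g * x for g, x in zip(gate_vector, emb))
              for emb in channel_embeddings]
    weights = softmax(scores)
    return fuse(channel_embeddings, weights), weights

def average_fusion(channel_embeddings):
    """Baseline in the spirit of MGCN-noG: every channel gets the same weight."""
    n = len(channel_embeddings)
    return fuse(channel_embeddings, [1.0 / n] * n)

# Four toy channel embeddings for a single node (hypothetical values).
channels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
gated, gate_weights = gated_fusion(channels, gate_vector=[2.0, 0.0])
averaged = average_fusion(channels)
```

In this toy case the gate favors the channels whose first component is large, whereas the average treats all four channels identically regardless of content.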

The performance of the MGCN-noF variant also drops considerably, indicating that capturing structural information specific to each relational graph is essential. Without these dedicated channels, the model fails to fully exploit the heterogeneity across different relation types, thereby limiting its representational capacity. Removing the common feature channel (MGCN-noC) results in a smaller but still noticeable decline in performance. This suggests that although the overlap among the three relational graphs is limited, the common channel still provides valuable integrative information that complements the specific channels. Their combined effect enables the MGCN model to balance the trade-off between relation-specific information and generalization.

The MGCN-noLc and MGCN-noLd variants, which remove the consistency and diversity constraints, respectively, also exhibit slight performance declines. The consistency constraint ensures that the common channel learns coherent and stable features across different relational graphs, while the diversity constraint encourages the specific channels to capture distinctive and discriminative features. These constraints contribute to a more effective optimization process, enhancing the model’s overall robustness and generalization capability.

In summary, the ablation study results clearly demonstrate that each component of the MGCN model plays a critical and indispensable role in improving the accuracy of innovation capability prediction for seed enterprises. The specific and common feature processing channels enhance the model’s feature extraction from both specificity and commonality perspectives. The gated attention mechanism significantly improves feature utilization efficiency by dynamically adjusting feature fusion and emphasizing key features. The consistency and diversity constraints optimize the loss function to enhance the aggregation of node features in both channels from the perspectives of commonality and specificity, further boosting the model’s performance. These findings not only validate the effectiveness of the model architecture but also provide guidance for future improvements and optimizations.

3.4. Analysis of the Effectiveness of Relational Graphs

To investigate the impact of different relational graphs on innovation capability evaluation, we conducted an analysis by removing, respectively, the shareholding graph (MGCN-noGS), the cooperation graph (MGCN-noGC), and the innovation similarity graph (MGCN-noGI) from the MGCN model.

As shown in Table 6, the model using the complete set of relational graphs outperforms all its variants across all evaluation metrics, indicating that each relational graph plays a critical role in enhancing the accuracy of innovation capability evaluation. In particular, the MGCN-noGI model exhibits the most significant performance degradation, highlighting the importance of the innovation similarity graph in identifying clusters of enterprises with similar innovation capabilities. The innovation similarity graph is constructed based on the number of variety approvals, registrations, and protections, which indirectly verifies the importance of these features. The performance drop in the MGCN-noGS model is relatively smaller, likely due to the limited number of edges in the shareholding graph, which reduces the amount of information propagated through the graph. Nonetheless, shareholding relationships still have a certain impact on innovation prediction. The performance degradation in MGCN-noGC is more notable, suggesting that cooperation relationships facilitate the flow of innovation resources among enterprises, underscoring their importance in improving prediction accuracy.

Moreover, to provide a clearer and more intuitive understanding of the classification effectiveness of each model, the outcomes from the third fold of the five-fold cross-validation are presented in the form of confusion matrices, thereby facilitating a detailed analysis of performance variations and classification accuracy across different categories.

In Figure 6a, the MGCN model achieves an accuracy of 84.18%, demonstrating strong predictive performance. Misclassifications are primarily concentrated around the diagonal of the confusion matrix, suggesting that most errors occur between similar or neighboring classes. Notably, no misclassification is observed between class 1 and class 4 or between class 2 and class 4, which highlights clear distinctions between these categories and a reduced likelihood of confusion.
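Both observations, the overall accuracy and the concentration of errors next to the diagonal, can be read off a confusion matrix programmatically. The sketch below assumes rows are true classes and columns are predicted classes; the matrix values are hypothetical and do not reproduce Figure 6.

```python
def accuracy_from_confusion(cm):
    """Share of samples on the diagonal (rows: true class, columns: predicted)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def adjacent_error_share(cm):
    """Fraction of misclassifications that land in a neighboring class."""
    errors = adjacent = 0
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            if i != j:
                errors += count
                if abs(i - j) == 1:
                    adjacent += count
    return adjacent / errors if errors else 0.0

# Hypothetical 4-class confusion matrix (classes 1 to 4 in order).
toy_cm = [
    [40, 5, 1, 0],
    [4, 50, 6, 0],
    [0, 7, 45, 3],
    [0, 0, 2, 20],
]
acc = accuracy_from_confusion(toy_cm)
near_diagonal = adjacent_error_share(toy_cm)
```

Since the capability levels are ordinal, a high adjacent-error share means the model is mostly off by one level rather than confusing distant classes.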

For the MGCN-noGS model (Figure 6b), the accuracy drops slightly to 82.14%, indicating a modest decline in overall predictive performance compared to the full MGCN model. Notably, the classification performance for class 3 deteriorates, with more instances being misclassified into adjacent categories. This observation suggests that the inclusion of the shareholding graph contributes positively to distinguishing between classes 2 and 3, thereby enhancing the model’s ability to capture nuanced relational information that aids in more accurate categorization.

In the MGCN-noGC model (Figure 6c), the accuracy decreases to 80.10%. Although classification performance for classes 1 and 2 remains relatively strong, misclassifications between classes 2 and 3 increase significantly—particularly with many samples labeled as “Good” (class 3) being predicted as “Moderate” (class 2). This indicates that cooperation relationships are especially important for distinguishing between these two categories, and that inter-enterprise cooperation may help improve innovation capabilities in mid-level seed enterprises.

The MGCN-noGI model (Figure 6d) achieved an accuracy of 76.53%, with overall classification performance inferior to that of the other models. In particular, its ability to distinguish between class 1 and class 2 significantly declined, leading to frequent misclassifications between these two categories. This indicates that some seed enterprises in class 1 and class 2 are relatively similar. The similarity relationships—constructed based on the number of variety approvals, registrations, and protections—focus on the similarity of innovation outputs among seed enterprises, which helps the model better distinguish between class 1 and class 2. Moreover, a number of class 4 enterprises were misclassified as class 3, indicating the model’s limited capacity to identify enterprises with outstanding innovation capabilities when similarity-based innovation output signals are removed.

In summary, the shareholding graph, cooperation graph, and innovation similarity graph all have significant impacts on the model’s performance. Among them, the innovation similarity graph proves to be the most important, contributing the most to predictive accuracy. It is particularly effective when enterprises share similar features, as it helps highlight the differences in their innovation outcomes. Although the cooperation and shareholding graphs have relatively smaller contributions, they remain indispensable. These relationship graphs capture the complex inter-enterprise connections from different perspectives, and their combined effect enhances the predictive capability of the model.

3.5. Analysis of Enterprise Features

In the evaluation of innovation capability in seed enterprises, it is important to recognize that not all features contribute equally to the final prediction results. Therefore, to gain a more comprehensive and deeper understanding of the role and significance of each individual feature, Pearson correlation coefficients were computed, allowing for an objective assessment of the correlation between each feature and the categorized levels of innovation capability, as illustrated in Figure 7.

As shown in Figure 7, features such as the number of approved varieties, number of insured employees, number of protected varieties, and registered capital exhibit strong correlations with innovation capability, indicating their significant influence in the evaluation process. The “province” feature, to some extent, implicitly reflects geographic location and regional policy support. For example, Hunan Province, compared to more remote areas, is more suitable for crop cultivation and field trials, which may explain its stronger correlation with innovation performance. In contrast, features from certain provinces show weaker or even negative correlations with innovation capability. The number of transformants, a result of transgenic technology, also serves as an important indicator of enterprise innovation capability [42]. However, because transformant development is highly complex and difficult, the number of transformants currently developed by enterprises remains very limited, resulting in a low Pearson correlation coefficient that does not fully reflect their potential value in evaluating innovation capability.
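The Pearson coefficient between a feature column and the capability level can be computed as in the sketch below. The feature values and labels are toy numbers chosen for illustration, and returning 0.0 for a constant column is a convention adopted here, since the coefficient is undefined in that case.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant column; reported as 0 by convention here
    return cov / math.sqrt(var_x * var_y)

# Toy data: capability levels and three hypothetical feature columns.
levels = [1, 2, 3, 4]
rising = [2, 4, 6, 8]    # grows with the level
falling = [8, 6, 4, 2]   # shrinks with the level
u_shaped = [1, 0, 0, 1]  # no linear trend

r_rising = pearson(rising, levels)
r_falling = pearson(falling, levels)
r_u_shaped = pearson(u_shaped, levels)
```

The U-shaped column shows why a low coefficient does not rule out usefulness: Pearson correlation only measures linear association.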

To further validate the impact of feature selection on model performance, we conducted a series of controlled experiments using subsets of features ranked by the absolute value of their Pearson correlation coefficients. Specifically, we selected the top 10 (MGCN-F10), top 20 (MGCN-F20), top 30 (MGCN-F30), and top 40 (MGCN-F40) most relevant features for testing, thereby examining how varying the quantity of features affects the predictive accuracy of the model.
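Selecting the top-k features by absolute correlation amounts to a sort followed by a slice. In the sketch below, the feature names and coefficient values are hypothetical placeholders rather than the values plotted in Figure 7.

```python
def top_k_features(correlations, k):
    """Names of the k features with the largest absolute Pearson coefficient."""
    ranked = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
    return ranked[:k]

# Hypothetical coefficients for illustration only.
toy_correlations = {
    "approved_varieties": 0.62,
    "insured_employees": 0.48,
    "province_A": -0.30,
    "transformants": 0.05,
    "registered_capital": 0.41,
}

top3 = top_k_features(toy_correlations, 3)
top4 = top_k_features(toy_correlations, 4)
```

Ranking by absolute value keeps strongly negative features such as the hypothetical `province_A`, which a signed ranking would push to the bottom.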

As shown in Table 7, the MGCN model using all features achieved the best performance, indicating that a more comprehensive feature input leads to higher model accuracy. The performance of MGCN-F10, which used only the top 10 features, still reached an accuracy of 79.31%. As the number of selected features increased, model performance improved steadily. Additionally, it was observed that performance improvement was more pronounced when the number of features was relatively low; however, the rate of improvement diminished as more features were added. This suggests that features with a high correlation with innovation capability play a dominant role in prediction, while features with a lower Pearson correlation, though individually less informative, provide supplementary information that enhances the model’s generalization ability and predictive accuracy.

3.6. Case Analysis

Understanding the innovation capabilities of seed enterprises is crucial for both government policy formulation and enterprise strategic planning. Such evaluations can help governments identify strong and weak enterprises, allocate resources effectively, and implement targeted support, while enabling enterprises to develop strategies that foster sustainable innovation and enhance competitiveness. The proposed Multi-Channel Graph Convolutional Network (MGCN) model provides an accurate and comprehensive assessment of innovation capabilities, offering reliable technical support for these policies and strategic initiatives.

To better illustrate the practical value and interpretability of the proposed model, we conduct a case study on representative enterprises from each innovation capability category, including both correctly and incorrectly classified cases. A summary of the selected cases is presented in Table 8.

To gain deeper insight into how the model makes decisions across different scenarios, we visualize the two-hop neighborhoods of eight representative nodes. As shown in Figure 8, in each subgraph, the target node is represented as a blank node, while all other nodes are colored according to their ground-truth labels: Label 1 (Poor), Label 2 (Moderate), Label 3 (Good), and Label 4 (Excellent). Edges in the subgraphs correspond to three types of relationships: Graph-S (blue) for shareholding, Graph-C (green) for cooperation, and Graph-I (orange) for innovation output similarity. These visualizations help us understand how the model aggregates information from heterogeneous relations and multi-hop neighbors.
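A two-hop neighborhood over several relation types can be extracted with a breadth-first search, as sketched below. Treating every edge as undirected is an assumption made for simplicity (shareholding, for instance, could be modeled as directed), and the node names, relation tags, and edges are hypothetical.

```python
from collections import defaultdict, deque

def build_adjacency(edges):
    """edges: iterable of (u, v, relation); each edge is treated as undirected."""
    adj = defaultdict(set)
    for u, v, rel in edges:
        adj[u].add((v, rel))
        adj[v].add((u, rel))
    return adj

def k_hop_neighbors(adj, target, k=2):
    """Return {node: hop distance} for all nodes within k hops of target."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond k hops
        for nbr, _rel in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    del dist[target]  # the target itself is not a neighbor
    return dist

# Toy multi-relation edge list: S = shareholding, C = cooperation, I = innovation similarity.
toy_edges = [("A", "B", "S"), ("B", "C", "C"), ("C", "D", "I"), ("A", "E", "I")]
neighborhood = k_hop_neighbors(build_adjacency(toy_edges), "A", k=2)
```

Here node `D` is three hops from `A` and is therefore excluded from the subgraph.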

In graph-structured data, the connections between nodes play a critical role in both information propagation and pattern recognition. By examining the model’s prediction outcomes, we observe a clear trend: when a target node is surrounded by a large proportion of neighbors from a particular class, the model tends to assign the target node to that same class.
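This trend can be checked for any target node by tabulating the labels of its neighbors. The helper below is a diagnostic sketch, not part of the MGCN model, and the neighbor labels are hypothetical.

```python
from collections import Counter

def neighbor_label_share(neighbor_labels):
    """Proportion of each ground-truth label among a node's neighbors."""
    counts = Counter(neighbor_labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

def dominant_label(neighbor_labels):
    """Most frequent label among the neighbors."""
    return Counter(neighbor_labels).most_common(1)[0][0]

# Hypothetical labels of one target node's neighbors.
toy_neighbor_labels = [2, 2, 2, 3, 1]
share = neighbor_label_share(toy_neighbor_labels)
majority = dominant_label(toy_neighbor_labels)
```

Comparing `majority` with the model's prediction across many nodes would quantify how often the prediction follows the neighborhood.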

For instance, as shown in Figure 8c, the feature representation of Enterprise 3 is relatively inconspicuous, lacking strong signals indicative of either extremely high or extremely low innovation capability. In such cases, information from neighboring nodes becomes a key reference for the model’s decision-making. Since Enterprise 3 is connected via cooperation and innovation similarity relations to several highly innovative enterprises, the model classifies it as belonging to the “Good” innovation capability category. This outcome demonstrates that when node-level features are insufficient or ambiguous, the surrounding graph structure provides valuable support for the model’s inference.

In most cases, the model is able to correctly identify the class of the target node, particularly when the node’s features are distinctive and consistent with those of its neighbors. For example, in Figure 8d, Enterprise 4 is accurately classified. This enterprise is characterized by outstanding innovation capability: it is large in scale, has a substantial number of employees, holds 196 approved new varieties, and has been designated as a national-level enterprise technology center. With a well-recognized leadership position in the industry and strong innovation achievements, the enterprise presents highly distinctive features and is surrounded by neighbors with consistent labels, making it easy for the model to make a correct prediction.

However, classification based on graph structure is not always reliable. In some cases, the model may over-rely on the label distribution of neighboring nodes while neglecting the intrinsic features of the target node, leading to misclassification. For example, in Figure 8e, Enterprise 5 is incorrectly classified as Label 2 (Moderate), although its ground-truth label is Label 1 (Poor). Upon examining the enterprise’s profile, we find that it has published only two scientific papers and lacks other key indicators of innovation such as new variety development or patents, suggesting weak innovation capability. Nevertheless, in the graph, it maintains a cooperation relationship with one Label 2 node and shares innovation similarity with several Label 3 nodes, which likely misled the model due to neighborhood influence. A similar case appears in Figure 8g, where Enterprise 7 is misclassified as Label 2 (Moderate) instead of its true label, Label 3 (Good). Structural analysis shows that this node is surrounded predominantly by nodes of Label 2, which likely exerted a strong influence on the model’s decision, preventing it from accurately identifying the enterprise’s true level of innovation.

These visualizations reveal how the model performs relational reasoning within a multi-relational graph and how structural information contributes to classification decisions. At the same time, they highlight potential risks, such as over-dependence on local neighborhood signals, underscoring the importance of balancing structural context with node features in graph neural network applications. Although a few misclassifications occur, the overall results indicate that the model reliably captures key innovation patterns and provides informative assessments of enterprise innovation capabilities.

In practice, target enterprises for evaluation can first be identified, and their relationship graphs can be constructed based on relevant data, or an industry-wide graph covering all seed enterprises can be directly built. The MGCN model is then applied to assess the innovation capabilities of these enterprises. The resulting innovation capability ratings can help governments optimize support policies, allocate resources more effectively, and track the progress of key enterprises. Enterprises can also use the assessment results for self-positioning, formulate targeted improvement measures, and plan strategies for sustainable enhancement of their innovation capabilities.

4. Discussion

Seed enterprises play a crucial role in agricultural innovation and sustainable development, contributing to food security and industry growth [43]. Accurately evaluating their innovation capability is essential for guiding resource allocation and fostering industrial development. However, most existing methods primarily focus on the intrinsic characteristics of individual enterprises, while neglecting the complex inter-enterprise relationships that significantly influence innovation performance. Moreover, they often suffer from limited feature integration, failing to capture the multifaceted factors embedded in enterprise networks.

To address these challenges, we propose a Multi-Channel Graph Convolutional Network (MGCN) model that effectively integrates multi-source relational data and intrinsic enterprise features. Unlike traditional models that rely on manually crafted features or single-graph structures, MGCN leverages three types of enterprise relational graphs. By leveraging graph-based data structures, the proposed model effectively overcomes the inability of traditional approaches to represent the heterogeneous and structurally complex interactions among enterprises, which often results in the omission of critical relational features during modeling [44]. The model employs four parallel channels to process node and structural features independently and uses a gated attention mechanism for dynamic cross-graph feature fusion. This multi-channel architecture [45] enables the extraction of diverse local and multi-level factors influencing innovation capability, while gated attention helps in the integration of node features and graph structural features, as validated by our ablation studies. Prior studies have demonstrated the advantages of attention mechanisms in multimodal fusion and graph-structured learning [46,47].

Traditional approaches such as logistic regression (LR) and support vector machines (SVM) typically rely on manually constructed features and exhibit limited capacity for modeling relational structures [48]. These methods often lack effective mechanisms for feature integration [49], which constrains their performance, especially in the high-dimensional, heterogeneous, and unstructured data environments characteristic of agricultural big data. Although such models may demonstrate reasonable interpretability and stability on small to medium-sized datasets [50], their generalization and representational power remain insufficient in more complex settings. Graph neural networks (GNNs), by contrast, offer inherent advantages in modeling structured data and have shown promise in capturing the intricate relational patterns among enterprises. However, most existing GNN-based models rely on a single graph structure, limiting their ability to capture the multiple types of relationships in real-world enterprise networks [51]. In comparison, the proposed MGCN model integrates multi-channel node feature embeddings and deep feature fusion to effectively leverage diverse information sources associated with seed enterprises. This enables MGCN to address several key limitations of both traditional machine learning methods and existing GNNs in the context of modeling agricultural enterprises. Specifically, under five-fold cross-validation, MGCN obtains an average accuracy of 83.59% ± 2.73%, surpassing traditional models such as logistic regression (77.17%), SVM (77.37%), random forest (78.80%), and multilayer perceptron (76.05%). It also outperforms classical GNN models, including GCN (80.63%), GAT (81.35%), GraphSAGE (80.33%), GraphTransformer (79.92%), and R-GCN (81.45%). The improvement over R-GCN, the strongest baseline, is approximately 2.14 percentage points. These results underscore MGCN’s advantage in integrating multi-relational structures and capturing complex inter-enterprise dynamics, which are crucial for evaluating innovation capability in real-world agricultural networks.
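The margins quoted in this paragraph follow directly from the reported accuracies, as the short check below shows. The accuracy values are taken from the text; the variable names are illustrative.

```python
# Mean accuracies (%) under five-fold cross-validation, as reported in the text.
baseline_accuracy = {
    "LR": 77.17, "SVM": 77.37, "RF": 78.80, "MLP": 76.05,
    "GCN": 80.63, "GAT": 81.35, "GraphSAGE": 80.33,
    "GraphTransformer": 79.92, "R-GCN": 81.45,
}
mgcn_accuracy = 83.59

# Margin of MGCN over each baseline, in percentage points.
margins = {name: round(mgcn_accuracy - acc, 2)
           for name, acc in baseline_accuracy.items()}
strongest = max(baseline_accuracy, key=baseline_accuracy.get)
```

The smallest margin is the one over R-GCN and the largest is the one over MLP.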

While the proposed MGCN model exhibits superior predictive performance, it comes with increased computational cost due to its multi-channel architecture and gated attention mechanism. Specifically, the average running time per fold reaches 95.95 s, significantly higher than those of GAT (6.08 s), GraphSAGE (7.89 s), and R-GCN (21.39 s). This increase is primarily attributed to the added model capacity and cross-graph fusion components. To mitigate potential overfitting risks associated with such complexity, we employ Dropout regularization and validate model robustness through five-fold cross-validation. Experimental results show stable performance across all folds, suggesting that the added complexity brings net performance gains without evident overfitting. Nevertheless, future work will explore more lightweight architectures to further balance accuracy and efficiency.

To deepen understanding of MGCN’s behavior, we conducted case analyses on correctly and incorrectly classified seed enterprises. The results show that the model not only leverages target enterprise features but also strongly depends on neighboring nodes’ features and predicted labels. Frequently, a target node is assigned the category dominant among its neighbors, highlighting the influence of local relational structure. While this reflects effective graph topology utilization, it also reveals potential vulnerability to misclassification when neighbors’ labels are mixed or noisy. These insights suggest future directions to refine neighborhood aggregation and enhance robustness against label noise.

The MGCN model also has limitations related to data and interpretability. First, it is based on static graphs, which represent the enterprise relationships at a fixed point in time and are simpler to construct and analyze. Static graphs effectively capture the overall relational structure and are less computationally intensive, making them a practical choice when temporal data are limited or unavailable. However, static graphs cannot model the temporal evolution of enterprise relationships, such as changing collaborations, equity transfers, or shifting innovation trajectories, which are common in real-world settings. Dynamic graphs offer the advantage of explicitly capturing such temporal patterns, potentially improving the accuracy and timeliness of innovation capability evaluation [52]. Due to current data constraints and modeling complexity, this study uses static graphs, with future research planned to incorporate dynamic graph neural networks—using temporal convolutional or sequential modeling layers—to better represent temporal dynamics.

Moreover, current node features are mainly derived from public statistical data and lack important innovation-related indicators like technology commercialization rates and research team composition. Expanding feature dimensionality and diversity could improve characterization depth. Finally, despite GNNs’ power in modeling complex structures, their “black-box” nature limits interpretability, hindering adoption in policy and management. Future studies may integrate explainability methods [53] to enhance transparency and trustworthiness, facilitating broader practical application.

5. Conclusions

As a fundamental driver of agricultural productivity, the innovation capability of seed enterprises plays a critical role in promoting sustainable agricultural development. Accordingly, precise evaluation of this capability is key to identifying their developmental stages and informing targeted resource allocation. Existing approaches tend to emphasize enterprise-level attributes in isolation, overlooking the complex relational dynamics among enterprises that critically shape innovation outcomes. In addition, their limited capacity for feature integration constrains their ability to capture the high-order, multi-source information embedded within enterprise networks.

To address these limitations, we propose a Multi-Channel Graph Convolutional Network (MGCN) model that integrates internal enterprise attributes with three types of inter-enterprise relational graphs—shareholding, collaboration, and innovation similarity. The model adopts a multi-channel architecture and a gated attention mechanism to enable cross-graph feature fusion, effectively combining node-level features with structural information. Experimental results show that MGCN achieves an average accuracy of 83.59% under five-fold cross-validation, outperforming nine baseline models, including Random Forest, MLP, and several GCN variants. Ablation studies further confirm that the gated attention mechanism contributes most significantly to performance improvements. Among the three types of relational graphs, innovation similarity exerts the most significant influence on prediction accuracy. It facilitates the formation of meaningful innovation clusters and enhances the model’s ability to capture and infer complex relational patterns. Beyond quantitative metrics, the case analysis demonstrates that MGCN not only learns the intrinsic features of individual enterprises but also leverages the structural context and label distribution of neighboring nodes. The model tends to classify target nodes into the predominant category within their local subgraphs. This observation highlights the critical role of graph topology in prediction, while also revealing potential vulnerabilities—such as sensitivity to noisy or inconsistent neighbor labels. In practical applications, such evaluations can guide governments in optimizing support policies, allocating resources more effectively, and monitoring the progress of key enterprises, while enterprises themselves can leverage the results to identify weaknesses, plan targeted improvements, and design strategies for sustainable innovation and enhanced competitiveness.

Although the MGCN model demonstrates strong performance, it still has some limitations. First, it relies on static graphs and cannot capture the temporal evolution of enterprise relationships, such as changing collaborations or equity structures. Integrating dynamic graph models could better reflect real-world variations. Second, node features are primarily derived from publicly available data, which may not fully capture certain innovation-related aspects, such as team composition or technology commercialization activities. Finally, the interpretability of MGCN remains limited, potentially constraining its adoption in policymaking and management practices. Future work will explore the integration of temporal modeling, enriched feature sets, and explainability techniques.

Author Contributions

All authors contributed to the study’s conception and design. Conceptualization, S.P. and K.W.; Methodology, S.T. and K.W.; Software, S.T.; Validation, S.P.; Formal Analysis, S.P.; Investigation, F.Y. and S.P.; Resources, K.W.; Data Curation, F.Y.; Writing—Original Draft Preparation, S.T.; Writing—Review and Editing, S.T. and F.Y.; Visualization, S.T.; Supervision, K.W.; Project Administration, K.W.; Funding Acquisition, F.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Biological Breeding-National Science and Technology Major Project (2023ZD0406104).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A

In this study, the hyperparameters of all models were optimized using grid search. To ensure a fair comparison, we designed a reasonable and consistent search space for each model. The specific hyperparameters and their corresponding search spaces are detailed in Table A1.

Table A1. Hyperparameter search spaces used for all models in this study.

Model	Hidden Units	Dropout Rate	Learning Rate	Weight Decay	Loss Function	Other Hyperparameters
LR	-	-	-	-	-	penalty: ‘l1’, ‘l2’, ‘elasticnet’; C: 0.01, 0.1, 1, 10, 100; solver: ‘lbfgs’, ‘newton-cg’, ‘sag’, ‘liblinear’; max_iter: 100, 300, 500, 1000; l1_ratio: 0.3, 0.5, 0.7
SVM	-	-	-	-	-	kernel: ‘rbf’, ‘poly’, ‘linear’; C: 0.01, 0.1, 1, 10, 100; gamma: 0.001, 0.01, 0.1, 1, 10, 100; degree: 2, 3, 4
RF	-	-	-	-	-	n_estimators: 50, 100, 200, 500; max_depth: 6, 7, 8, 9, 10; min_samples_split: 2, 3, 4, 5; min_samples_leaf: 2, 3, 4, 5
MLP	16, 32, 64	0.4, 0.5, 0.6	0.01, 0.001, 0.0001	1 × 10⁻², 1 × 10⁻³, 1 × 10⁻⁵	focal, cross entropy	-
GCN	16, 32, 64	0.4, 0.5, 0.6	0.01, 0.001, 0.0001	1 × 10⁻², 1 × 10⁻³, 1 × 10⁻⁵	focal, cross entropy	-
GAT	16, 32, 64	0.4, 0.5, 0.6	0.01, 0.001, 0.0001	1 × 10⁻², 1 × 10⁻³, 1 × 10⁻⁵	focal, cross entropy	heads: 2, 4, 8
GraphSAGE	16, 32, 64	0.4, 0.5, 0.6	0.01, 0.001, 0.0001	1 × 10⁻², 1 × 10⁻³, 1 × 10⁻⁵	focal, cross entropy	aggr: ‘mean’, ‘max’
GraphTransformer	16, 32, 64	0.4, 0.5, 0.6	0.01, 0.001, 0.0001	1 × 10⁻², 1 × 10⁻³, 1 × 10⁻⁵	focal, cross entropy	heads: 2, 4, 8
R-GCN	16, 32, 64	0.4, 0.5, 0.6	0.01, 0.001, 0.0001	1 × 10⁻², 1 × 10⁻³, 1 × 10⁻⁵	focal, cross entropy	-
MGCN	16, 32, 64	0.4, 0.5, 0.6	0.01, 0.001, 0.0001	1 × 10⁻², 1 × 10⁻³, 1 × 10⁻⁵	focal, cross entropy	-
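
The MGCN row of Table A1 can be enumerated exhaustively with the standard library. The sketch below is illustrative only and is not the authors' code: the table does not say how the hidden-unit choices map onto the two hidden layers, so a single `hidden_units` entry is assumed, and `train_and_score` is a hypothetical callback standing in for training plus cross-validated scoring.

```python
from itertools import product

# Search space for MGCN, copied from Table A1.
# Assumption: one shared "hidden_units" setting rather than one per layer.
MGCN_SPACE = {
    "hidden_units": [16, 32, 64],
    "dropout": [0.4, 0.5, 0.6],
    "learning_rate": [0.01, 0.001, 0.0001],
    "weight_decay": [1e-2, 1e-3, 1e-5],
    "loss": ["focal", "cross_entropy"],
}


def iter_grid(space):
    """Yield every combination in the search space as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def grid_search(space, train_and_score):
    """Return (best_config, best_score) for a user-supplied scoring function."""
    best_config, best_score = None, float("-inf")
    for config in iter_grid(space):
        score = train_and_score(config)  # hypothetical: e.g. mean CV accuracy
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# 3 * 3 * 3 * 3 * 2 = 162 configurations for MGCN.
print(len(list(iter_grid(MGCN_SPACE))))
```

With 162 configurations and five-fold cross-validation, each model in this row would need 810 training runs under these assumptions, which is worth keeping in mind when extending the search space.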

Figure 1. Seed enterprise relationship graph visualization. (a) Seed enterprise shareholding relationship graph (Graph-S). (b) Seed enterprise cooperation relationship graph (Graph-C). (c) Seed enterprise innovation similarity relationship graph (Graph-I).

Figure 2. The structure diagram of the prediction model for the innovation ability of seed enterprises based on multi-channel graph convolutional networks. Three heterogeneous graphs—shareholding (Graph-S), collaboration (Graph-C), and innovation similarity (Graph-I)—are input into four parallel GCN channels: three for relationship-specific features and one shared channel with common convolution. The resulting node embeddings are fused via a gated attention mechanism and passed to a fully connected layer for innovation capacity prediction.
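
The data flow described in the Figure 2 caption can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: each channel is reduced to a single GCN layer, the common channel is assumed to share one weight matrix across the three graphs and average the outputs, and the gated attention is assumed to be a softmax attention over channels multiplied by a sigmoid gate. All parameter names (`w_specific`, `w_common`, `w_att`, `q`, `w_gate`, `w_out`) are hypothetical.

```python
import numpy as np


def normalize_adj(adj):
    """Return D^-1/2 (A + I) D^-1/2, the propagation matrix of a standard GCN."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, x, w):
    """One graph convolution followed by ReLU."""
    return np.maximum(a_norm @ x @ w, 0.0)


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(n_feat, n_hidden, n_class, n_graphs=3, seed=0):
    """Small random weights; every parameter name here is hypothetical."""
    rng = np.random.default_rng(seed)

    def w(*shape):
        return rng.normal(scale=0.1, size=shape)

    return {
        "w_specific": [w(n_feat, n_hidden) for _ in range(n_graphs)],
        "w_common": w(n_feat, n_hidden),
        "w_att": w(n_hidden, n_hidden),
        "q": w(n_hidden),
        "w_gate": w(n_hidden, n_hidden),
        "w_out": w(n_hidden, n_class),
    }


def mgcn_forward(x, adjs, params):
    """Four channels -> gated attention fusion -> fully connected classifier."""
    a_norms = [normalize_adj(a) for a in adjs]
    # Three relation-specific channels, each with its own weights.
    specific = [gcn_layer(a, x, w) for a, w in zip(a_norms, params["w_specific"])]
    # One common channel: shared weights on every graph, outputs averaged (assumption).
    common = np.mean([gcn_layer(a, x, params["w_common"]) for a in a_norms], axis=0)
    channels = np.stack(specific + [common], axis=1)              # (N, 4, H)
    # Attention weight per node and channel.
    scores = np.tanh(channels @ params["w_att"]) @ params["q"]    # (N, 4)
    alpha = softmax(scores, axis=1)
    # Element-wise sigmoid gate on each channel embedding (assumption).
    gate = 1.0 / (1.0 + np.exp(-(channels @ params["w_gate"])))   # (N, 4, H)
    fused = (alpha[:, :, None] * gate * channels).sum(axis=1)     # (N, H)
    return softmax(fused @ params["w_out"], axis=1), alpha


rng = np.random.default_rng(1)
n_nodes, n_feat = 6, 48  # 48 features per node as in Table 2; 6 nodes is a toy size


def random_graph(n, p):
    """Random symmetric adjacency matrix without self-loops (toy data)."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T


x = rng.normal(size=(n_nodes, n_feat))
adjs = [random_graph(n_nodes, p) for p in (0.2, 0.4, 0.6)]
probs, alpha = mgcn_forward(x, adjs, init_params(n_feat, 32, 4))
print(probs.shape, alpha.shape)  # (6, 4) (6, 4)
```

Returning `alpha` alongside the class probabilities makes it possible to inspect how much weight each node places on the shareholding, collaboration, innovation-similarity, and common channels.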

Figure 3. Box-and-scatter plots showing the results of 15 runs for each model. (a) ACC. (b) Macro Precision. (c) Macro Recall. (d) Macro F1-Score.

Figure 4. Visualization of node embedding. (a) Original Data. (b) MLP. (c) GCN. (d) GAT. (e) GraphSAGE. (f) GraphTransformer. (g) R-GCN. (h) MGCN. 1: Poor innovation level. 2: Moderate innovation level. 3: Good innovation level. 4: Excellent innovation level.

Figure 5. Comparison of ablation study results for MGCN. (a) ACC. (b) Macro Precision. (c) Macro Recall. (d) Macro F1-Score.

Figure 6. Confusion matrix results for effectiveness analysis of relational graphs. (a) MGCN. (b) MGCN-noGS. (c) MGCN-noGC. (d) MGCN-noGI. 1: Poor innovation level. 2: Moderate innovation level. 3: Good innovation level. 4: Excellent innovation level.

Figure 7. Feature correlation coefficients.

Figure 8. Two-hop subgraph visualization of Enterprises 1–8. (a) Enterprise 1. (b) Enterprise 2. (c) Enterprise 3. (d) Enterprise 4. (e) Enterprise 5. (f) Enterprise 6. (g) Enterprise 7. (h) Enterprise 8. Label 1: Poor. Label 2: Moderate. Label 3: Good. Label 4: Excellent. Graph-S: shareholding relationship. Graph-C: cooperation relationship. Graph-I: innovation output similarity.
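
A two-hop subgraph of the kind shown in Figure 8, together with the label distribution of its nodes, can be extracted with a breadth-first search. The sketch below uses a small hypothetical graph and hypothetical labels; it only illustrates the neighborhood statistic discussed in the case analysis and is not the model's decision rule.

```python
from collections import Counter, deque


def k_hop_nodes(adjacency, source, k=2):
    """Return all nodes within k hops of source (source excluded), via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return {n for n in dist if n != source}


def neighborhood_labels(adjacency, labels, source, k=2):
    """Return (majority_label, Counter) over labeled nodes within k hops."""
    counts = Counter(
        labels[n] for n in k_hop_nodes(adjacency, source, k) if n in labels
    )
    majority = counts.most_common(1)[0][0] if counts else None
    return majority, counts


# Hypothetical toy graph: "t" is the target enterprise.
adjacency = {
    "t": ["a", "b"],
    "a": ["t", "c"],
    "b": ["t"],
    "c": ["a", "d"],
    "d": ["c"],
}
labels = {"a": "Moderate", "b": "Moderate", "c": "Good", "d": "Excellent"}

majority, counts = neighborhood_labels(adjacency, labels, "t")
print(majority, dict(counts))
```

In this toy graph, node `d` is three hops from the target and is therefore excluded, so the two-hop neighborhood holds two Moderate nodes and one Good node.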

Table 1. Description of internal feature data in seed enterprises.

Type	Features
Basic Information	year of establishment; registered province; registered capital; paid-in capital; number of insured employees.
Innovation Output	number of variety approval applications; number of variety registration applications; number of plant variety protection applications; number of transformants; number of patents; number of published papers.
Qualifications and Honors	number of national-level awards; designation as a national leading enterprise in a dominant variety cluster; designation as a national supplementary enterprise in a weak link cluster; designation as a national breakthrough enterprise in a bottleneck cluster; status as a national enterprise technology center; status as a provincial enterprise technology center; recognition as a “Little Giant” enterprise with Specialization, Refinement, Distinctiveness, and Innovation.

Table 2. Statistics of the three seed enterprise relationship graphs: shareholding (Graph-S), cooperation (Graph-C), and innovation similarity (Graph-I).

Graph	Nodes	Edges	Classes	Features
Graph-S	981	89	4	48
Graph-C	981	678	4	48
Graph-I	981	2261	4	48
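
The edge counts in Table 2 imply very different levels of connectivity. As a rough worked example, and assuming each graph is undirected with no duplicate edges (the table does not state this), the average degree is 2E/N and the density is 2E/(N(N-1)):

```python
# Node and edge counts copied from Table 2.
N_NODES = 981
EDGES = {"Graph-S": 89, "Graph-C": 678, "Graph-I": 2261}


def average_degree(n_nodes, n_edges):
    """Average degree of an undirected graph: each edge touches two nodes."""
    return 2 * n_edges / n_nodes


def density(n_nodes, n_edges):
    """Fraction of possible undirected edges that are present."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


for name, n_edges in EDGES.items():
    print(
        f"{name}: average degree {average_degree(N_NODES, n_edges):.2f}, "
        f"density {density(N_NODES, n_edges):.6f}"
    )
```

Under these assumptions the average degree is about 0.18 for Graph-S, 1.38 for Graph-C, and 4.61 for Graph-I, so the innovation similarity graph supplies by far the most neighborhood information per node.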

Table 3. Parameter settings.

Model	Setting
LR	max_iter = 500; multi_class = multinomial; penalty = ‘l2’; solver = lbfgs.
SVM	C = 10; kernel = linear; decision_function_shape = ovr.
RF	n_estimators = 200; max_depth = 10; min_samples_leaf = 2; min_samples_split = 2.
MLP	hidden layer = 2; hidden1 = 16, hidden2 = 16; optimizer = Adam; learning rate = 0.001; weight decay = 1 × 10⁻⁴; activation function = relu; epochs = 200; loss = ‘focal’.
GCN	hidden layer = 2; hidden1 = 64, hidden2 = 16; activation function = relu; dropout rate = 0.5; optimizer = Adam; learning rate = 0.01; weight decay = 1 × 10⁻⁵; epochs = 200; loss = ‘cross entropy’.
GAT	hidden layer = 2; hidden1 = 16, hidden2 = 16; heads = 4; activation function = relu; dropout rate = 0.6; optimizer = Adam; learning rate = 0.001; weight decay = 1 × 10⁻⁵; epochs = 300; loss = ‘focal’.
GraphSAGE	hidden layer = 2; hidden1 = 32, hidden2 = 32; aggr = ‘max’; activation function = relu; dropout rate = 0.4; optimizer = Adam; learning rate = 0.01; weight decay = 1 × 10⁻⁵; epochs = 200; loss = ‘cross entropy’.
GraphTransformer	hidden layer = 2; hidden1 = 16, hidden2 = 16; heads = 4; activation function = relu; dropout rate = 0.5; optimizer = Adam; learning rate = 0.001; weight decay = 1 × 10⁻³; epochs = 300; loss = ‘cross entropy’.
R-GCN	hidden layer = 2; hidden1 = 16, hidden2 = 32; activation function = relu; dropout rate = 0.6; optimizer = Adam; learning rate = 0.01; weight decay = 1 × 10⁻³; epochs = 400; loss = ‘cross entropy’.
MGCN	hidden layer = 2; hidden1 = 32, hidden2 = 64; activation function = relu; dropout rate = 0.6; optimizer = Adam; learning rate = 0.01; weight decay = 1 × 10⁻³; epochs = 200; loss = ‘cross entropy’.
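
Table 3 lists two loss options, ‘focal’ and ‘cross entropy’. For readers unfamiliar with the difference, the sketch below implements both for a single sample using the standard focal loss form, -(1 - p_t)^γ log(p_t). The focusing parameter γ = 2 and the absence of class weights are assumptions; the settings used in the experiments are not given in the table.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the true class."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0):
    """Cross entropy scaled by (1 - p_t)^gamma; gamma = 2 is an assumed default."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Four classes, matching the four innovation levels; the logits are toy values.
easy = [4.0, 0.0, 0.0, 0.0]   # confidently correct for class 0
hard = [0.0, 1.0, 0.0, 0.0]   # leaning toward the wrong class

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross entropy {ce:.4f}, focal {fl:.4f}, ratio {fl / ce:.4f}")
```

The ratio printed for the easy sample is much smaller than for the hard one, which is the intended effect: well-classified samples are down-weighted so that training focuses on harder, often minority-class, cases.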

Table 4. Prediction results of the innovation ability of seed enterprises by different methods.

Method	ACC	Macro Precision	Macro Recall	Macro F1-Score
LR	0.7717 ± 0.0304	0.7736 ± 0.0384	0.7568 ± 0.0367	0.7580 ± 0.0317
SVM	0.7737 ± 0.0299	0.7556 ± 0.0306	0.7645 ± 0.0373	0.7556 ± 0.0315
RF	0.7880 ± 0.0387	0.7790 ± 0.0295	0.8043 ± 0.0389	0.7852 ± 0.0343
MLP	0.7605 ± 0.0255	0.7492 ± 0.0322	0.7389 ± 0.0311	0.7412 ± 0.0308
GCN	0.8063 ± 0.0191	0.7824 ± 0.0330	0.7718 ± 0.0314	0.7729 ± 0.0278
GAT	0.8135 ± 0.0286	0.7850 ± 0.0303	0.8206 ± 0.0309	0.7951 ± 0.0279
GraphSAGE	0.8033 ± 0.0302	0.7800 ± 0.0257	0.7990 ± 0.0445	0.7840 ± 0.0316
GraphTransformer	0.7992 ± 0.0142	0.7981 ± 0.0102	0.7986 ± 0.0378	0.7941 ± 0.0219
R-GCN	0.8145 ± 0.0166	0.7859 ± 0.0225	0.8249 ± 0.0212	0.7974 ± 0.0239
MGCN	0.8359 ± 0.0273	0.8173 ± 0.0316	0.8277 ± 0.0536	0.8186 ± 0.0398
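
The four columns of Table 4 can be computed from true and predicted labels as sketched below. Macro F1 is taken here as the unweighted mean of per-class F1 scores, which is one common definition and is an assumption about the paper's metric. The eight label pairs are the hand-picked cases from Table 8, used only as toy input; the resulting numbers are unrelated to the test-set results in Table 4.

```python
def macro_metrics(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n_classes = len(labels)
    return {
        "acc": sum(1 for t, p in pairs if t == p) / len(pairs),
        "macro_precision": sum(precisions) / n_classes,
        "macro_recall": sum(recalls) / n_classes,
        "macro_f1": sum(f1s) / n_classes,
    }


LABELS = ["Poor", "Moderate", "Good", "Excellent"]
# True and predicted labels of Enterprises 1-8, copied from Table 8.
y_true = ["Poor", "Moderate", "Good", "Excellent",
          "Poor", "Moderate", "Good", "Excellent"]
y_pred = ["Poor", "Moderate", "Good", "Excellent",
          "Moderate", "Good", "Moderate", "Good"]

for name, value in macro_metrics(y_true, y_pred, LABELS).items():
    print(f"{name}: {value:.4f}")
```

Because every class contributes equally to the macro averages, a model that performs poorly on a small class is penalized more than it would be under plain accuracy.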

Table 5. Ablation results for the components of the MGCN model structure.

Method	ACC	Macro Precision	Macro Recall	Macro F1
MGCN	0.8359 ± 0.0273	0.8173 ± 0.0316	0.8277 ± 0.0536	0.8186 ± 0.0398
MGCN-noF	0.8084 ± 0.0300	0.8000 ± 0.0367	0.8179 ± 0.0279	0.8036 ± 0.0290
MGCN-noC	0.8166 ± 0.0288	0.7967 ± 0.0323	0.8214 ± 0.0256	0.8043 ± 0.0295
MGCN-noG	0.8003 ± 0.0341	0.7926 ± 0.0323	0.8137 ± 0.0347	0.7994 ± 0.0317
MGCN-noLc	0.8094 ± 0.0049	0.7979 ± 0.0139	0.8157 ± 0.0203	0.8015 ± 0.0056
MGCN-noLd	0.8145 ± 0.0372	0.8074 ± 0.0379	0.8148 ± 0.0382	0.8070 ± 0.0375

Table 6. Results of the effectiveness analysis of the relational graphs.

Method	ACC	Macro Precision	Macro Recall	Macro F1
MGCN	0.8359 ± 0.0273	0.8173 ± 0.0316	0.8277 ± 0.0536	0.8186 ± 0.0398
MGCN-noGS	0.8155 ± 0.0253	0.7930 ± 0.0290	0.8165 ± 0.0338	0.7997 ± 0.0297
MGCN-noGC	0.7911 ± 0.0249	0.7807 ± 0.0235	0.8125 ± 0.0284	0.7883 ± 0.0248
MGCN-noGI	0.7533 ± 0.0337	0.7561 ± 0.0287	0.7370 ± 0.0399	0.7433 ± 0.0331
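
The contribution of each graph can be read off Table 6 as the drop in mean accuracy when that graph is removed. The short calculation below uses only the values in the table; it compares means and ignores the reported standard deviations, so small differences should be interpreted with care.

```python
# Mean accuracy values copied from Table 6.
FULL_ACC = 0.8359
ABLATED_ACC = {
    "MGCN-noGS": 0.8155,  # without the shareholding graph
    "MGCN-noGC": 0.7911,  # without the cooperation graph
    "MGCN-noGI": 0.7533,  # without the innovation similarity graph
}

# Drop in mean accuracy relative to the full model.
drops = {name: round(FULL_ACC - acc, 4) for name, acc in ABLATED_ACC.items()}
largest = max(drops, key=drops.get)

for name, drop in drops.items():
    print(f"{name}: -{drop:.4f}")
print("largest drop:", largest)
```

The drops are 0.0204, 0.0448, and 0.0826 for Graph-S, Graph-C, and Graph-I respectively, which is consistent with innovation similarity being the most influential relation.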

Table 7. Prediction results of the innovation ability of seed enterprises based on different numbers of features.

Method	ACC	Macro Precision	Macro Recall	Macro F1
MGCN-F10	0.7931 ± 0.0258	0.7784 ± 0.0299	0.7808 ± 0.0332	0.7775 ± 0.0301
MGCN-F20	0.8084 ± 0.0385	0.7966 ± 0.0475	0.7874 ± 0.0391	0.7869 ± 0.0399
MGCN-F30	0.8186 ± 0.0142	0.7989 ± 0.0145	0.8110 ± 0.0222	0.8023 ± 0.0155
MGCN-F40	0.8257 ± 0.0190	0.8108 ± 0.0245	0.8246 ± 0.0307	0.8136 ± 0.0254
MGCN	0.8359 ± 0.0273	0.8173 ± 0.0316	0.8277 ± 0.0536	0.8186 ± 0.0398

Table 8. Case Details.

Enterprise	True Label	Predicted Label	Correct Prediction
Enterprise 1	Poor	Poor	TRUE
Enterprise 2	Moderate	Moderate	TRUE
Enterprise 3	Good	Good	TRUE
Enterprise 4	Excellent	Excellent	TRUE
Enterprise 5	Poor	Moderate	FALSE
Enterprise 6	Moderate	Good	FALSE
Enterprise 7	Good	Moderate	FALSE
Enterprise 8	Excellent	Good	FALSE

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
