Article

Go Source Code Vulnerability Detection Method Based on Graph Neural Network

School of Cyber Science and Engineering, Sichuan University, Chengdu 610207, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(12), 6524; https://doi.org/10.3390/app15126524
Submission received: 1 May 2025 / Revised: 1 June 2025 / Accepted: 6 June 2025 / Published: 10 June 2025

Abstract

With the widespread application of the Go language, the demand for vulnerability detection in Go programs is increasing. Existing detection models and methods have deficiencies in extracting source code features of Go programs and mainly focus on detecting concurrency vulnerabilities. In response to these issues, we propose a Go program vulnerability detection method based on a graph neural network (GNN). The core of this approach is to utilize GraphSAGE to extract the global structure and deep semantic information of each concurrent function, maximizing the learning of concurrency vulnerability features. To capture contextual information of fine-grained code fragments in source code, we employ taint analysis to extract taint propagation chains and use a Transformer model with a multi-head attention mechanism, based on lexical analysis, to extract fine-grained vulnerability features. We integrate graph-level and token-level features to maximize the detection of various complex types of vulnerabilities in Go source code. Experimental results on a real-world vulnerability dataset demonstrate that our model outperforms existing detection methods and tools, achieving an F1-score of 91.35%. Furthermore, ablation experiments confirm that the proposed feature fusion method effectively extracts deep vulnerability features.

1. Introduction

Under the von Neumann architecture, software vulnerabilities are inevitable. As software scales grow larger and functional modules become increasingly complex, vulnerabilities continue to emerge. The rise of various open-source communities has also made code sharing more common. During code replication, developers may inadvertently introduce vulnerabilities, making code reuse a significant cause of software vulnerabilities [1]. According to the National Vulnerability Database (NVD), over 18,000 security vulnerabilities were disclosed in 2021, with more than 3000 classified as high-risk [2]. Conducting vulnerability detection on source code can help security professionals identify software vulnerabilities early and ensure system security.
Deep learning-based vulnerability detection methods are generally categorized into sequence-based and graph-based approaches [3]. Sequence-based models are adept at learning code semantics, but they can only capture the surface structure of source code text, making it difficult to extract structural and semantic features of the source code [4]. As shown in [5], existing LSTM-based methods often suffer from poor accuracy. Although GNN-based techniques can capture global structures and deep semantic information and are more effective than text-based approaches [6,7], they still have three major issues:
1. Different forms of code representation in these methods retain only partial information (syntax or semantics) [8], making it difficult to integrate the contextual semantic information of the original code [3].
2. These methods often represent a function as a single graph, where each node corresponds to a statement, neglecting fine-grained information within the statements [4].
3. Directly feeding raw source code, which contains a large amount of redundant code, into a graph neural network significantly increases the training time [9].
Additionally, while deep learning-based static analysis techniques have proven effective in detecting vulnerabilities in programming languages, especially mainstream ones like C/C++, Java, and PHP, relatively few studies have specifically focused on the unique characteristics of the Go language. Existing Go program detection methods, such as GCatch, GFix [10], and GFUZZ [11], primarily focus on detecting concurrency errors and overlook other types of vulnerabilities in Go programs, such as SQL injection, cross-site scripting (XSS), and insecure file uploads.
To address these challenges, we propose GoVulDect, a hybrid semantic-based Go source code vulnerability detection model that comprehensively captures Go source code information for vulnerability detection. We use graph random walk networks [12] to extract deep semantic and structural information of each concurrent vulnerability in Go source code. Additionally, we introduce lexical analysis on top of taint analysis and leverage a Transformer model with a multi-head attention mechanism to learn the contextual semantic information of various types of vulnerabilities in Go source code. Finally, we concatenate information from both dimensions and use an XGBoost classifier [13] for classification and detection, minimizing feature omission and enabling the detection of various complex types of vulnerabilities in Go programs. To reduce time overhead, we employ program pruning techniques during preprocessing to extract essential code segments and adopt a synchronized feature extraction strategy.
Overall, this paper makes the following contributions:
(1) Go Program Pruning Method for Structural Integrity: We propose a Go program pruning method that ensures structural integrity. It pre-filters code lines closely related to Go vulnerabilities and retains original structural relationships using differential comparison techniques, enabling a more comprehensive extraction of vulnerability context information.
(2) Hybrid Semantic-Based Graph Neural Network Vulnerability Detection Framework: We propose a hybrid semantic-based GNN Go source code vulnerability detection framework. The system employs graph random walk networks and Transformer models with multi-head attention mechanisms to extract both graph-level and token-level features of Go source code. The integration of both dimensions enhances detection effectiveness.
(3) Validation on Real-World Vulnerabilities: To validate the effectiveness of our method, we conducted comprehensive experiments on a real-world dataset. We selected one vulnerability detection tool and three GNN-based detection models (RATS, CSGVD [14], VDoTR [15], and AMPLE [16]) for comparison. Experimental results show that our method achieves an F1-score of 91.35% for Go source code vulnerability detection.
The rest of the paper is organized as follows: Section 2 reviews related work, and Section 3 introduces the preliminaries. Section 4 presents the methodology, and Section 5 discusses the results. Finally, the conclusions and future work directions are summarized in Section 6.

2. Related Work

2.1. Go Program Vulnerability Detection

Memory corruption attacks targeting C/C++, a mainstream programming language family, have become a major threat to computer systems. The recently developed Go programming language is designed to prevent such attacks through its robust static type system, appropriate compiler optimizations, and runtime boundary checks [17]. In fact, Go is considered one of the best languages for developing secure systems [18] and has been progressively deployed in many popular applications and codebases. While Go prevents memory corruption and includes garbage collection to provide temporal memory safety, it is still prone to vulnerabilities when interacting with other languages. For instance, when Go interacts with C’s memory management, it may introduce use-after-gc errors and more complex double free errors. Additionally, Go is vulnerable to a new type of supply chain attack targeting source code, known as the Trojan Source attack [1].
To address Go program vulnerabilities, researchers have been continuously improving Go source code vulnerability detection methods to enhance program security. Traditional source code vulnerability detection approaches can be categorized into rule-based program analysis [19] and pattern-based machine learning [20]. (1) Rule-based methods are inspired by traditional error detection techniques [21]. (2) Pattern-based methods employ conventional machine learning techniques to automatically learn vulnerability patterns from previously collected training samples [8].
However, both approaches heavily rely on the expertise of developers and security professionals, often resulting in high false-positive and false-negative rates. Additionally, they struggle to detect unknown vulnerabilities and require extensive manual verification [14]. Traditional Go vulnerability detection methods primarily depend on manual review and automated static analysis tools. However, these approaches typically demand substantial human effort and time while being limited to detecting concurrency-related errors, making it difficult to identify complex and subtle vulnerabilities. In fact, an analysis of the dataset we collected shows that about 98.25% of Go vulnerabilities are not related to concurrency, underscoring the need for more comprehensive detection approaches.
Deep learning (DL) has recently been introduced into the field of vulnerability detection due to its ability to process large amounts of software code and vulnerability data [22,23,24,25]. DL models automatically capture the structural representation of programs from training samples and use this information for detection [26,27]. In vulnerability detection, deep learning-based approaches can be divided into two main categories: (1) Sequence-based methods [23,26,27]: These approaches represent source code or its structural features as lexical token sequences [28] and apply natural language processing (NLP) techniques [26] to detect vulnerabilities by learning sequential features. (2) Graph-based methods [5,22,25,29]: These approaches transform source code into heterogeneous graph structures and utilize graph neural networks (GNNs) to capture local structures and dependencies [24].
In recent years, deep learning-based vulnerability detection techniques have made some progress, but relatively few models have been specifically designed for Go programs. We propose representing Go source code as graphs (e.g., abstract syntax trees (ASTs), control flow graphs (CFGs), etc.) and analyzing them using GNNs to extract Go program features for vulnerability detection. However, solely using GNN-based models has drawbacks—it can capture global structures and deep semantic information but often fails to retain contextual information.

2.2. Taint Analysis-Based Vulnerability Detection

Taint analysis is a program analysis technique used to track the flow of sensitive data [30]. It has been widely applied in fields such as vulnerability detection [31], cryptographic key misuse detection [32], and privacy leakage detection [33]. Additionally, taint analysis has been used to analyze various programming languages, including Java [34] and C [35], frameworks such as Android [36] and iOS, and microservices [37]. In vulnerability detection, research on taint analysis is mainly divided into static taint analysis and dynamic taint analysis.
In the field of static taint analysis, PATA [38] introduced a path-aware taint analysis model capable of accurately identifying repeatedly occurring variables based on execution path information. Fluffy [39] proposed a bi-modal taint analysis method that allows machine learning models to predict whether a taint flow is expected or unexpected based on the natural language information embedded within it. LATTE [40] combined large language models with static binary taint analysis, making it more cost-effective for vulnerability detection. However, these approaches have certain limitations. PATA integrates dynamic taint analysis, leading to significant performance overhead, and its complex path constraints hinder the effectiveness of fuzz testing. Fluffy is limited to JavaScript and relies on manually labeled evaluation data, which is prone to human error. LATTE struggles to analyze complex nested or jump-based code fragments, especially when public information about such vulnerabilities is lacking, making it difficult for large language models to analyze them effectively.
In the field of dynamic taint analysis, Spectre [41] applies dynamic taint analysis at the system level to detect vulnerability fragments associated with Spectre-type attacks. Another approach [42] introduced an efficient container tagging scheme based on a simplified ordered binary decision diagram, accelerating container tag execution efficiency in areas such as protocol reverse engineering and fuzz testing. AirTaint [43] integrates basic block-level taint abstraction with assembly-level instrumentation, enabling faster and more efficient high-level dynamic taint analysis. However, these dynamic taint analysis techniques generally come with high performance overhead.
Compared with the above-mentioned approaches, our proposed taint analysis strategy introduces several innovations and distinctions at the application level. Existing methods such as PATA, Fluffy, and LATTE rely on techniques like path constraint modeling, natural language inference, and large language model reasoning. However, these approaches are not well-suited to the structural characteristics of Go programs, suffering from high performance overhead, limited domain adaptability, and heavy reliance on manual effort. In contrast, our method is specifically tailored for Go language programs, integrating program structure pruning and syntax-based analysis. By leveraging Go’s explicit API calls and well-defined static structure, we optimize the taint analysis process for this context.

3. Preliminaries

A well-designed preprocessing step can significantly reduce training overhead, improve detection efficiency, and enhance accuracy. To maximize the retention of vulnerability-related information during slicing and enable comprehensive vulnerability mining, we adopt a Go program pruning method that ensures structural integrity. This method allows us to extract highly relevant vulnerability-related code while preserving the original structural relationships of the code.

3.1. Preprocessing

Go source code contains extensive semantic information, which cannot be fully captured by simply using graph structures. Additionally, large projects often consist of thousands of lines of code, whereas vulnerabilities are usually concentrated within just a few lines. Therefore, we first perform preprocessing operations on Go source code to reduce the amount of code and thus reduce the training overhead.
(1) Filter Go files and remove redundant code: We retain only files with the .go extension from the project since large projects often contain multiple programming languages. Our model focuses solely on detecting vulnerabilities in Go code. We also remove comments, test files, and import statements, as they are unrelated to actual vulnerability detection and only increase the code volume.
(2) Extract each API function and construct an AST: Most security vulnerabilities are related to function calls, so extracting each function is essential for comprehensive vulnerability analysis. The Go language provides the go/ast and go/parser packages, which allow us to parse Go source code into an abstract syntax tree (AST). By traversing the AST, we can accurately locate each function call.
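The function-extraction step can be illustrated with a minimal sketch that uses only the standard go/parser and go/ast packages. The names CallSite, calleeName, CollectCalls, and the sample snippet are hypothetical and chosen for illustration; this is not the authors' implementation, and it resolves call targets purely syntactically.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// CallSite records one call expression found inside a function body.
// (Hypothetical type, introduced only for this sketch.)
type CallSite struct {
	Caller string // name of the enclosing function
	Callee string // textual call target, e.g. "db.Query"
	Line   int    // source line of the call
}

// calleeName renders the two most common call shapes: f(...) and x.f(...).
func calleeName(fun ast.Expr) string {
	switch f := fun.(type) {
	case *ast.Ident:
		return f.Name
	case *ast.SelectorExpr:
		if x, ok := f.X.(*ast.Ident); ok {
			return x.Name + "." + f.Sel.Name
		}
		return f.Sel.Name
	}
	return "<expr>"
}

// CollectCalls parses src and lists every call made inside each function.
// Mode 0 means comments are not retained in the AST.
func CollectCalls(src string) ([]CallSite, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "snippet.go", src, 0)
	if err != nil {
		return nil, err
	}
	var calls []CallSite
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			if call, ok := n.(*ast.CallExpr); ok {
				calls = append(calls, CallSite{
					Caller: fn.Name.Name,
					Callee: calleeName(call.Fun),
					Line:   fset.Position(call.Pos()).Line,
				})
			}
			return true
		})
	}
	return calls, nil
}

// sample is an illustrative snippet; parsing alone does not require DB or log to be declared.
const sample = `package p

func handler(db DB, id string) {
	q := "SELECT * FROM t WHERE id=" + id
	db.Query(q)
	log(q)
}
`

func main() {
	calls, err := CollectCalls(sample)
	if err != nil {
		panic(err)
	}
	for _, c := range calls {
		fmt.Printf("%s calls %s at line %d\n", c.Caller, c.Callee, c.Line)
	}
}
```

Because only the parser is involved, the snippet does not need to type-check; resolving which concrete function a selector such as db.Query refers to would require the type information discussed in Section 4.2.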

3.2. Graph-Level Features

To comprehensively extract graph-level features and global structures, we apply the following operations during the preprocessing stage before extracting graph-level features.
(1) Identify potential concurrent functions and generate slicing sequences: Due to Go’s concurrency features, vulnerabilities in Go programs are often concurrency-related. We perform precise control flow and data flow analysis on the AST to determine control dependencies and data dependencies, allowing us to identify concurrency patterns in the program. These slicing sequences help capture execution paths while significantly reducing code size.
(2) Generate sliced code based on slicing sequences: The Go language provides interfaces and type systems, which facilitate module interaction. However, existing slicing methods often disrupt source code structure, altering semantic dependencies and affecting vulnerability detection accuracy. To address this, we design a Go program pruning method that ensures structural integrity. This method performs differential analysis (diff operation) between the slice and the original source code, supplementing the sliced code structure based on the source code structure. As a result, the pruned code maintains semantic completeness.
(3) Standardize variable naming: To improve the accuracy and consistency of program analysis, we perform standardized renaming of user-defined variables and functions. This step facilitates better comprehension and modeling of source code, significantly reduces the token count, and enhances the efficiency of neural network training, as well as the precision of automated vulnerability detection tools. Specifically, we adopt a one-to-one mapping strategy by replacing user-defined variables with symbolic names (e.g., “VAR1”, “VAR2”) and renaming functions similarly (e.g., “FUN1”, “FUN2”). Variable renaming is performed with respect to lexical scope and lifetime information to ensure that variables with the same name in different scopes are not mistakenly conflated, thereby avoiding semantic ambiguity. Moreover, given that Go supports closures, goroutines, and cross-file function calls, we carefully model inter-procedural variables. We track the propagation and scope transitions of such variables across functions to ensure their original semantic context is maintained throughout the abstraction and renaming process.
(4) Complete sliced code structures to ensure independent execution: For Go programs, we need to restore goroutines, their corresponding channels, and deferred function calls to maintain logical consistency. Algorithm 1 illustrates the process, where the SliceCode S is derived through Steps 1, 2, and 3.
Algorithm 1: Go graph-level feature sliced code completion algorithm.
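The scope-aware renaming in step (3) can be sketched with go/types, which resolves every identifier to the object it denotes, so two variables that share a name in different scopes receive different symbolic names. This is an illustrative sketch under stated assumptions, not the authors' tool: NormalizeNames and the sample snippet are hypothetical, the snippet must not import packages (no importer is configured), and type names, constants, main, and init are left unchanged.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
	"go/types"
)

// NormalizeNames rewrites user-defined variables to VARn and functions to FUNn.
// Names are assigned per resolved object, in order of first appearance.
func NormalizeNames(src string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "snippet.go", src, 0)
	if err != nil {
		return "", err
	}
	info := &types.Info{
		Defs: make(map[*ast.Ident]types.Object),
		Uses: make(map[*ast.Ident]types.Object),
	}
	// Assumption: the snippet has no imports, so a nil importer is sufficient.
	if _, err := new(types.Config).Check("snippet", fset, []*ast.File{file}, info); err != nil {
		return "", err
	}
	names := make(map[types.Object]string)
	vars, funcs := 0, 0
	ast.Inspect(file, func(n ast.Node) bool {
		id, ok := n.(*ast.Ident)
		if !ok || id.Name == "_" {
			return true
		}
		obj := info.Defs[id]
		if obj == nil {
			obj = info.Uses[id]
		}
		if obj == nil || obj.Pkg() == nil {
			return true // package clause, builtins, predeclared types
		}
		name, seen := names[obj]
		if !seen {
			switch obj.(type) {
			case *types.Var:
				vars++
				name = fmt.Sprintf("VAR%d", vars)
			case *types.Func:
				if id.Name == "main" || id.Name == "init" {
					return true // keep program entry points recognisable
				}
				funcs++
				name = fmt.Sprintf("FUN%d", funcs)
			default:
				return true // type names, constants, and labels are left alone
			}
			names[obj] = name
		}
		id.Name = name
		return true
	})
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// sample reuses the parameter name "a" in two unrelated scopes.
const sample = `package p

func add(a, b int) int {
	sum := a + b
	return sum
}

func twice(a int) int {
	return add(a, a)
}
`

func main() {
	out, err := NormalizeNames(sample)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

In the sample, add becomes FUN1 and the two unrelated parameters named a receive distinct names (VAR1 and VAR4), which is the conflation the renaming step is meant to avoid. A useful property of this kind of normalization is that two snippets differing only in identifier choice map to the same text.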

3.3. Semantic Features

Although we preserve rich semantic information during graph-level feature processing, some contextual information loss is inevitable during slicing. This loss may hinder the detection of certain vulnerabilities with complex trigger conditions. To address this, we extract token sequences to capture the contextual semantic features of vulnerable code. Taint analysis enables us to track the flow of tainted data, so we leverage taint analysis as the preprocessing step for token-level feature extraction.
After constructing the AST, we start from defined taint sources, analyze the AST, and trace taint propagation along the data flow graph. This allows us to identify and extract taint propagation chains, which include all key code lines involved in taint propagation. We maintain a queue to record tainted variables and, finally, arrange tainted code lines in sequence, standardize variable names, and complete sliced code structures.
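The queue-based propagation described above can be sketched as a simple worklist over def-use facts. The sketch below is a simplification made for illustration: each statement is reduced to a hypothetical Stmt record (one defined variable, a list of used variables), the analysis is flow-insensitive, and sanitizers are not modeled. Stmt, TaintChain, and the demo statements are assumptions, not part of the original method.

```go
package main

import (
	"fmt"
	"sort"
)

// Stmt is a simplified statement: Def is computed from Uses.
// Def is empty for statements that only consume values, such as a sink call.
type Stmt struct {
	Line int
	Def  string
	Uses []string
}

// uses reports whether statement st reads variable v.
func uses(st Stmt, v string) bool {
	for _, u := range st.Uses {
		if u == v {
			return true
		}
	}
	return false
}

// TaintChain returns, in source order, the lines touched by taint that
// originates from the given source variables.
func TaintChain(stmts []Stmt, sources []string) []int {
	tainted := make(map[string]bool)
	queue := append([]string(nil), sources...)
	for _, s := range sources {
		tainted[s] = true
	}
	hit := make(map[int]bool)
	for len(queue) > 0 {
		v := queue[0]
		queue = queue[1:]
		for _, st := range stmts {
			if !uses(st, v) {
				continue
			}
			hit[st.Line] = true
			// A newly tainted definition is queued exactly once.
			if st.Def != "" && !tainted[st.Def] {
				tainted[st.Def] = true
				queue = append(queue, st.Def)
			}
		}
	}
	lines := make([]int, 0, len(hit))
	for l := range hit {
		lines = append(lines, l)
	}
	sort.Ints(lines)
	return lines
}

// demo models a small handler; the comments show the statement each record stands for.
var demo = []Stmt{
	{Line: 1, Def: "id", Uses: []string{"req"}},  // id := req.FormValue("id")
	{Line: 2, Def: "limit"},                      // limit := 10
	{Line: 3, Def: "q", Uses: []string{"id"}},    // q := "SELECT ... " + id
	{Line: 4, Def: "n", Uses: []string{"limit"}}, // n := limit * 2
	{Line: 5, Uses: []string{"q"}},               // db.Query(q)
}

func main() {
	fmt.Println(TaintChain(demo, []string{"req"})) // prints [1 3 5]
}
```

Lines 2 and 4 are dropped because they never read a tainted variable, which mirrors how the taint propagation chain keeps only the code lines relevant to the flow from source to sink.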

4. Methodology

4.1. System Framework

Existing detection models and methods for Go source code fail to comprehensively extract lexical, syntactic, and semantic features, limiting their ability to identify various types of vulnerabilities. To address this issue, we propose a hybrid semantic-based graph neural network vulnerability detection method for Go programs, named GoVulDect.
Figure 1 illustrates the detailed design of GoVulDect, which consists of three main modules: (1) Graph-Level Feature Extraction Module: This module represents potentially concurrent Go functions as code property graphs (CPGs) and utilizes GraphSAGE, a graph neural network based on random walks, to extract graph-level features that incorporate multiple types of semantic information. (2) Token-Level Feature Extraction Module: This module extracts token sequences using taint analysis and SpanBERT, a pre-trained model, to embed them into vectors. A Transformer model with multi-head attention is then employed to extract fine-grained token-level features. (3) Detection Module: This module fuses the source code features extracted by the previous two modules. Specifically, the graph-level features obtained via GraphSAGE are concatenated with the token-level features extracted by a Transformer-based model, forming a comprehensive representation of the source code. This fused feature vector is then passed to a pre-trained XGBoost classifier [13] to detect vulnerabilities in Go source code. XGBoost, as a scalable end-to-end tree boosting system, allows us to efficiently handle large-scale imbalanced data.
In contrast to existing hybrid detection approaches such as HyVulDect, which are primarily designed for traditional programming languages like C/C++, GoVulDect is specifically tailored to the concurrency semantics and structural characteristics of the Go language. During preprocessing, we enhance the static modeling of goroutines and channels to more accurately reconstruct potential concurrent execution paths. For feature extraction, GoVulDect incorporates a pretrained SpanBERT model with a span boundary objective (SBO) to generate semantically rich token representations. In addition, GraphSAGE is employed to capture global contextual features from the structural graphs of source code, while the Transformer further models deep semantic dependencies over long token distances. These design choices significantly improve the model’s capacity to represent and detect vulnerability patterns in Go source code. Detailed explanations of each component are provided in the following sections.

4.2. Graph-Level Feature Extraction

(1) Code Representation. Vulnerabilities often arise from improper function calls and parameter references. Although normal and vulnerable code may differ by only a few lines, control flow and data flow dependencies reveal clear distinctions. To comprehensively extract these dependencies, we transform preprocessed Go source code into a code property graph (CPG). The CPG is a unified graph representation that integrates abstract syntax trees (ASTs), control flow graphs (CFGs), and data flow graphs (DFGs) to comprehensively model both syntactic and semantic aspects of programs. In a CPG, nodes represent key program entities such as functions, variables, operators, and control structures. Each node is enriched with type information, scope, and source code location. Edges capture relationships between these entities, including call edges, data flow dependencies, and control flow dependencies.
First, we use Go’s official standard packages go/parser and go/ast to parse the source code and generate the corresponding abstract syntax tree (AST). Then, we perform type checking and static semantic analysis on the AST using the go/types package to resolve the types of variables, functions, and expressions. Based on the type-annotated AST, we utilize the golang.org/x/tools/go/ssa package to convert the code into static single assignment (SSA) intermediate representation. As the intermediate representation adopted by the Go compiler, SSA simplifies control and data flow analysis, enhances concurrency-related analysis, and significantly improves the compiler’s optimization capabilities. The control flow graphs (CFGs) and data flow graphs (DFGs) constructed based on SSA are further used to model control and data dependencies within the program.
Although the initial CPG representation contains static structural information, control flow, and data flow semantics, further optimization is necessary. We remove redundant nodes and prune unnecessary parts that do not affect analysis results. Figure 2 shows an optimized CPG example derived from the Go code in Figure 3.
Finally, to convert Go programs into semantic vector representations suitable for neural network input, we employed a pretrained SpanBERT model to embed nodes in the source code, generating high-dimensional semantic vectors for each code fragment. In this study, we observed that most nodes had effective feature lengths no greater than 20 after vectorizing the nodes and edges in the graph. Therefore, to ensure both representational completeness and consistency in vector dimensions, we set the feature vector length to 20. A detailed introduction to SpanBERT is provided in the next section, while this section focuses on the GraphSAGE model.
(2) Graph-Level Feature Extraction. Treating code solely as text overlooks critical control dependencies and data dependencies. To extract deep semantic information more effectively from Go source code, we construct a detection model based on the GraphSAGE network, which employs random walk sampling on graphs.
GraphSAGE is particularly well-suited for handling large-scale and complex projects because it learns node embeddings by iteratively sampling a fixed number of neighboring nodes rather than requiring access to every node during training. This makes it capable of efficiently learning from graphs of varying sizes and structures. Additionally, when aggregating features, GraphSAGE allows the model to select different aggregation functions based on specific tasks. This approach enhances the understanding of local graph structure features, efficiently captures control and data dependencies, and provides a novel method for obtaining global graph features and contextual information, potentially improving classification performance. The overall framework of GraphSAGE is illustrated in Figure 4.
At each iteration $k$, we randomly sample a fixed-size neighborhood $S_v^k$ for each node $v$. Then, we apply a mean aggregation function to merge the feature vectors of the neighboring nodes and update the representation of node $v$. The mean aggregation function is defined as follows:
$$h_{N(v)}^{(k)} = \frac{1}{|S_v^k|} \sum_{u \in S_v^k} h_u^{(k-1)}$$
where $h_{N(v)}^{(k)}$ denotes the aggregated feature vector of node $v$'s neighborhood at the $k$-th layer. Specifically, we average the feature vectors $h_u^{(k-1)}$ of all sampled neighbors $u \in S_v^k$ to obtain a representation of the local neighborhood of node $v$ at the current iteration.
Subsequently, node $v$ updates its own representation by incorporating the aggregated neighborhood feature vector $h_{N(v)}^{(k)}$, using the following update function:
$$h_v^{(k)} = \sigma\left( W^{(k)} \cdot \mathrm{CONCAT}\left( h_v^{(k-1)}, h_{N(v)}^{(k)} \right) \right)$$
where $h_v^{(k)}$ represents the updated representation of node $v$ at the $k$-th layer, $\sigma$ is the nonlinear activation function, and $W^{(k)}$ is the learnable weight matrix. The operator $\mathrm{CONCAT}(\cdot)$ denotes concatenation, which combines the current features of node $v$ with the aggregated features from its neighbors.
To prevent excessive feature scale growth and maintain consistency, we normalize each node’s feature vector after every update. Specifically, we apply L2 normalization as follows:
$$h_v^{(k)} = \frac{h_v^{(k)}}{\left\| h_v^{(k)} \right\|_2}$$
The final graph representation $h_G$ integrates the entire graph’s structural and feature information without relying on expensive matrix operations or requiring storage of the complete graph structure. Therefore, in practical applications, GraphSAGE efficiently and accurately learns to distinguish the graph patterns of vulnerable and benign Go code, enabling effective detection of potential security vulnerabilities in Go source code.
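The three steps of one GraphSAGE layer (sampled mean aggregation, concatenation with a linear map and nonlinearity, and L2 normalization) can be traced in a small sketch. Several choices here are assumptions made for illustration: ReLU stands in for the unspecified activation $\sigma$, neighbors are sampled uniformly without replacement, the weights are fixed by hand rather than learned, and the readout that produces $h_G$ is not covered. The names SageLayer, sampleNeighbors, and the demo data are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// sampleNeighbors draws at most k neighbours uniformly without replacement.
func sampleNeighbors(rng *rand.Rand, nbrs []int, k int) []int {
	if len(nbrs) <= k {
		return nbrs
	}
	perm := rng.Perm(len(nbrs))
	out := make([]int, k)
	for i := 0; i < k; i++ {
		out[i] = nbrs[perm[i]]
	}
	return out
}

// SageLayer applies one mean-aggregator layer.
// h holds d-dimensional node features, adj the adjacency lists, and w a
// weight matrix with 2*d columns (self features followed by aggregated ones).
func SageLayer(h [][]float64, adj [][]int, w [][]float64, k int, seed int64) [][]float64 {
	rng := rand.New(rand.NewSource(seed))
	d := len(h[0])
	out := make([][]float64, len(h))
	for v := range h {
		// Step 1: mean of the sampled neighbours' previous-layer features.
		agg := make([]float64, d)
		s := sampleNeighbors(rng, adj[v], k)
		for _, u := range s {
			for j := range agg {
				agg[j] += h[u][j]
			}
		}
		if len(s) > 0 {
			for j := range agg {
				agg[j] /= float64(len(s))
			}
		}
		// Step 2: CONCAT(h_v, h_N(v)), linear map, then ReLU (assumed sigma).
		cat := append(append(make([]float64, 0, 2*d), h[v]...), agg...)
		hv := make([]float64, len(w))
		norm := 0.0
		for i, row := range w {
			sum := 0.0
			for j, x := range cat {
				sum += row[j] * x
			}
			hv[i] = math.Max(0, sum)
			norm += hv[i] * hv[i]
		}
		// Step 3: L2 normalisation (skipped for an all-zero vector).
		if norm = math.Sqrt(norm); norm > 0 {
			for i := range hv {
				hv[i] /= norm
			}
		}
		out[v] = hv
	}
	return out
}

// Demo data: a path graph 0 - 1 - 2 with 2-dimensional features.
var (
	demoH   = [][]float64{{1, 0}, {0, 1}, {1, 1}}
	demoAdj = [][]int{{1}, {0, 2}, {1}}
	// demoW adds the self features to the aggregated features.
	demoW = [][]float64{{1, 0, 1, 0}, {0, 1, 0, 1}}
)

func main() {
	for v, row := range SageLayer(demoH, demoAdj, demoW, 2, 1) {
		fmt.Printf("node %d: %.4f\n", v, row)
	}
}
```

With k = 2 every node in the demo graph keeps all of its neighbors, so the result does not depend on the seed; sampling only matters once a node has more than k neighbors, which is what bounds the per-node cost on large graphs.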

4.3. Semantic Feature Extraction

(1) Code Representation. To extract contextual semantic features, we first apply lexical analysis to convert the preprocessed Go source code into a sequence of tokens, providing a more fine-grained representation of the source code, as illustrated in Figure 5. We then utilize SpanBERT to embed the tokens extracted from the code slices.
SpanBERT is an extension and optimization of the BERT model specifically designed to enhance the modeling of span-level semantic information in text, making it particularly suitable for extracting semantic features from complex program code. The model learns relationships between different words in the text and maps each word to a high-dimensional vector that captures both its semantic meaning and contextual information. Words with similar meanings are located closer together in the embedding space. As a self-supervised pre-trained model, SpanBERT introduces a novel span boundary objective (SBO), which strengthens the model’s ability to represent span boundaries. This is especially beneficial for identifying structural elements in code, such as function calls and control blocks, which are critical for understanding code semantics. The inclusion of SBO also enables more efficient access to span-level information during fine-tuning, allowing for more comprehensive extraction of both local and global structural features in Go source code. Figure 6 illustrates how SpanBERT extracts and represents features from Go code.
Formally, a line of code is decomposed into a token sequence $X = [x_1, x_2, \ldots, x_n]$, where each token $x_i$ represents a lexical unit. For each token $x_i$, we integrate positional embeddings and segment embeddings to obtain the final input representation of Go code. The position embedding $P$ helps the model understand structural elements such as loops and conditional statements, while the segment embedding $S$ enhances the model’s ability to distinguish different code blocks and maintain strong contextual awareness in complex programs:
$$H_0 = [x_1 \oplus P_1 \oplus S,\ x_2 \oplus P_2 \oplus S,\ \ldots,\ x_n \oplus P_n \oplus S]$$
Next, the SpanBERT model learns code representations through two pre-training tasks: masked language modeling (MLM) and the span boundary objective (SBO). After pre-training, the model produces deep bidirectional representations. During forward propagation, the model generates a series of hidden layer states $H_l$. Notably, the Transformer model integrates a self-attention mechanism, allowing it to capture long-range dependencies between tokens.
$$H_l = \mathrm{Transformer}(H_{l-1}), \quad l = 1, \ldots, L$$
Ultimately, the final layer output representation of the code line is denoted as $H_L$. After passing through $L$ layers of Transformer networks, it aggregates the contextual semantic information of the entire Go code slice. The same process is applied to other preprocessed code slices, as described in Section 3.3.
(2) Semantic Feature Extraction. While various neural network architectures are available for natural language processing (NLP), many suffer from the vanishing gradient (VG) problem, which can lead to ineffective training. To better capture the contextual semantic information in taint propagation chains, we take the final representation $H_L$ of the token sequence $X$ embedded by SpanBERT. We use multiple $H_L$ vectors as input, and employ a Transformer model with a multi-head self-attention mechanism to extract contextual semantic features from the code slice.
The Transformer model’s self-attention mechanism addresses the challenge of long-range dependencies in code while mitigating the vanishing gradient and exploding gradient problems. By attending to different parts of the token sequence simultaneously, the model can effectively learn complex relationships between different code components, improving vulnerability detection accuracy.
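To make the idea of attending to several parts of the token sequence at once concrete, the following sketch computes standard scaled dot-product self-attention over a few token vectors, split across heads. It is deliberately stripped down and rests on assumptions: the learned query, key, value, and output projections are replaced by the identity, the feature dimension must be divisible by the number of heads, and the demo vectors are made up. It illustrates the mechanism only and does not reproduce the model configuration used in the paper.

```go
package main

import (
	"fmt"
	"math"
)

// softmax turns scores into a probability distribution (numerically stable form).
func softmax(xs []float64) []float64 {
	m := xs[0]
	for _, x := range xs {
		if x > m {
			m = x
		}
	}
	out := make([]float64, len(xs))
	sum := 0.0
	for i, x := range xs {
		out[i] = math.Exp(x - m)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

// attentionWeights returns softmax(Q K^T / sqrt(d_h)) for one head, where the
// head sees feature columns [lo, hi) and Q = K = X (identity projections).
func attentionWeights(x [][]float64, lo, hi int) [][]float64 {
	n := len(x)
	scale := math.Sqrt(float64(hi - lo))
	w := make([][]float64, n)
	for i := 0; i < n; i++ {
		scores := make([]float64, n)
		for j := 0; j < n; j++ {
			for c := lo; c < hi; c++ {
				scores[j] += x[i][c] * x[j][c]
			}
			scores[j] /= scale
		}
		w[i] = softmax(scores)
	}
	return w
}

// MultiHeadSelfAttention runs each head on its own slice of the feature
// dimension and writes the head outputs side by side (concatenation).
func MultiHeadSelfAttention(x [][]float64, heads int) [][]float64 {
	n, d := len(x), len(x[0])
	dh := d / heads // assumption: d is divisible by heads
	out := make([][]float64, n)
	for i := range out {
		out[i] = make([]float64, d)
	}
	for h := 0; h < heads; h++ {
		lo, hi := h*dh, (h+1)*dh
		w := attentionWeights(x, lo, hi)
		for i := 0; i < n; i++ {
			for j := 0; j < n; j++ {
				for c := lo; c < hi; c++ {
					out[i][c] += w[i][j] * x[j][c] // V = X
				}
			}
		}
	}
	return out
}

// demoX holds three made-up 4-dimensional token vectors.
var demoX = [][]float64{
	{1, 0, 0, 1},
	{0, 1, 1, 0},
	{1, 1, 0, 0},
}

func main() {
	for i, row := range MultiHeadSelfAttention(demoX, 2) {
		fmt.Printf("token %d: %.3f\n", i, row)
	}
}
```

Every output vector is a weighted average of all token vectors, so a token at the end of a taint propagation chain can draw directly on a token at its start, regardless of the distance between them.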

5. Experiments and Results

5.1. Dataset

We collect a real-world Go program vulnerability dataset [44] from two sources: the GitHub Security Advisory Database and open-source projects on GitHub. The GitHub Security Advisory Database contains source code vulnerabilities associated with CWE identifiers. For open-source projects, we focus on high-star repositories, from which we extract CVE vulnerability files, patch files, and diff files based on commit information.
In the raw vulnerability dataset, we collected a total of 630 CWE-labeled vulnerabilities and, after preprocessing, segmented them into 129,978 Go code snippets, with an equal number of positive and negative samples (64,989 each). During data splitting, we strictly followed a project-level division strategy to ensure that code from the same GitHub project does not appear in both the training and test sets. The final dataset was divided into training, validation, and test sets in an 8:1:1 ratio. Table 1 shows examples of selected CWE vulnerability samples in our dataset.
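The project-level division can be sketched as follows. This is only one plausible way to do it, under the assumption that each snippet is tagged with its source project; the function name, the greedy assignment, and the toy data are hypothetical and are not taken from the paper. Because whole projects are assigned to a split, the 8:1:1 ratio is met only approximately in general.

```python
import random
from collections import defaultdict


def project_level_split(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split (project, snippet) pairs so that no project spans two splits."""
    by_project = defaultdict(list)
    for project, snippet in samples:
        by_project[project].append(snippet)

    projects = sorted(by_project)
    random.Random(seed).shuffle(projects)

    targets = [r * len(samples) for r in ratios]
    splits = ([], [], [])            # train, validation, test
    owners = (set(), set(), set())   # projects assigned to each split
    idx = 0
    for project in projects:
        # Move to the next split once the current one has reached its target size.
        while idx < 2 and len(splits[idx]) >= targets[idx]:
            idx += 1
        splits[idx].extend(by_project[project])
        owners[idx].add(project)
    return splits, owners


# Toy data: 20 hypothetical projects with 5 snippets each.
samples = [(f"proj{p}", f"proj{p}-s{i}") for p in range(20) for i in range(5)]
(train, val, test), owners = project_level_split(samples)
```

Checking that the three `owners` sets are pairwise disjoint is a cheap guard against leakage between training and test data.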

5.2. Experimental Setup and Evaluation Metrics

(1) Experimental Setup. We conducted experiments on a machine equipped with an NVIDIA RTX 2080 Ti GPU, an Intel(R) i9 CPU, and 128 GB of RAM. The complete experimental environment is detailed in Table 2.
(2) Model Hyperparameter Settings. For neural networks, hyperparameter selection is critical as it directly impacts model performance, training speed, and generalization ability. Since hyperparameters are set before training and cannot be adjusted automatically, careful selection and tuning are necessary. Table 3 lists the hyperparameters used in our GraphSAGE model. To determine appropriate values, we performed a random search over the hyperparameter space and selected the configuration that yielded the best validation performance across multiple trials. In addition, XGBoost was used with its default parameters: 100 trees, a maximum depth of 6, a learning rate of 0.3, no early stopping, and a scale_pos_weight of 1.
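A random search of the kind described above can be sketched in a few lines. The parameter names follow Table 3, but the candidate values, the number of trials, and the scoring function are placeholders invented for this example; in practice `evaluate` would train the model and return its validation score.

```python
import random

# Hypothetical search space: only the parameter names come from Table 3.
SEARCH_SPACE = {
    "num_samples": [10, 25, 50],
    "embedding_size": [32, 64, 128],
    "num_layers": [1, 2, 3],
    "learning_rate": [0.1, 0.01, 0.001],
    "dropout": [0.1, 0.3, 0.5],
}


def random_search(evaluate, space, trials, seed=0):
    """Sample `trials` configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_validation_score(config):
    # Stand-in for "train the model and measure validation performance".
    return -abs(config["embedding_size"] - 64) - abs(config["num_layers"] - 2)


best_config, best_score = random_search(toy_validation_score, SEARCH_SPACE, trials=50)
```

Fixing the seed makes the search reproducible, which matters when the selected configuration is reported alongside the results.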
(3) Evaluation Metrics. We use four widely adopted evaluation metrics to comprehensively measure the vulnerability detection capability of our model:
  • $\mathrm{Accuracy}\ (A) = \dfrac{TP + TN}{TP + TN + FP + FN}$,
  • $\mathrm{Precision}\ (P) = \dfrac{TP}{TP + FP}$,
  • $\mathrm{True\ Positive\ Rate}\ (TPR) = \dfrac{TP}{TP + FN}$,
  • $\mathrm{F1\_score}\ (F1) = \dfrac{2 \times P \times TPR}{P + TPR}$.
These metrics are calculated using true positives ($TP$), false positives ($FP$), true negatives ($TN$), and false negatives ($FN$).
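The four metrics translate directly into code. The sketch below is illustrative only; the function name and the confusion-matrix counts in the usage line are made up for the example and are not results from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, true positive rate, and F1 from raw counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    tpr = tp / (tp + fn)  # also called recall
    f1 = 2 * precision * tpr / (precision + tpr)
    return {"accuracy": accuracy, "precision": precision, "tpr": tpr, "f1": f1}


# Hypothetical counts, chosen only to exercise the formulas.
metrics = detection_metrics(tp=90, fp=10, tn=85, fn=15)
```

With these counts, accuracy is 175/200 = 0.875, precision is 90/100 = 0.9, TPR is 90/105, and F1 is 180/205, which is the same value the harmonic-mean formula gives.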
To further assess classification performance, we also employ ROC curves and AUC values. Since AUC is insensitive to class distribution, it serves as a robust metric for evaluating model performance. An AUC score closer to 1 indicates better classification performance.
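One way to see why AUC does not depend on the class ratio is to compute it as the fraction of (positive, negative) pairs in which the positive sample receives the higher score. The following sketch uses that pairwise definition; the function name and the toy scores are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy scores: three of the four positive/negative pairs are ranked correctly.
balanced = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
# Duplicating every negative sample changes the class ratio but not the AUC.
skewed = roc_auc([1, 1, 0, 0, 0, 0], [0.9, 0.4, 0.5, 0.1, 0.5, 0.1])
```

Both calls return 0.75, because the pairwise win rate is normalized by the number of pairs rather than by the size of either class.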

5.3. Results and Analysis

(1) Model Classification Performance. To comprehensively evaluate the classification performance and generalization ability of the GoVulDect model, we analyzed the trends of training loss and validation accuracy during the training and validation phases, as shown in Figure 7. The training loss continuously decreased and stabilized around 0.1 after approximately 500 steps, indicating convergence. Meanwhile, the validation accuracy steadily increased and eventually plateaued at around 95%, without any noticeable fluctuation or decline. Because the validation accuracy stays flat rather than degrading while the training loss settles, the curves suggest that the model converged without clear signs of overfitting and generalizes stably.
To further assess the binary classification capability of the model, we tested it across various sample categories. As presented in Table 4, the GoVulDect model achieves precision, recall, and F1-scores above 94% for both benign and vulnerable samples, with an overall accuracy of 95%. These results indicate that the model demonstrates strong classification performance and is capable of effectively distinguishing between vulnerable and non-vulnerable code. Moreover, the close values of the “Macro AVG” and “Weighted AVG” metrics suggest that the model performs consistently across classes, without exhibiting significant bias toward any particular category.
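The difference between the two averages is only in the weighting: the macro average treats both classes equally, while the weighted average scales each class by its sample size. The sketch below recomputes both from the per-class values reported in Table 4; the dictionary layout and function names are assumptions for this example.

```python
# Per-class values as reported in Table 4.
REPORT = {
    "benign": {"precision": 0.96, "recall": 0.94, "f1": 0.95, "support": 1646},
    "vulnerable": {"precision": 0.94, "recall": 0.96, "f1": 0.95, "support": 1604},
}


def macro_avg(report, key):
    """Unweighted mean over classes."""
    return sum(row[key] for row in report.values()) / len(report)


def weighted_avg(report, key):
    """Mean over classes, weighted by the number of samples in each class."""
    total = sum(row["support"] for row in report.values())
    return sum(row[key] * row["support"] for row in report.values()) / total


macro_precision = macro_avg(REPORT, "precision")
weighted_precision = weighted_avg(REPORT, "precision")
```

Because the two classes are almost the same size (1646 versus 1604 samples), both averages round to 0.95, which is why the two rows in Table 4 agree.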
To verify whether the model can accurately identify different types of vulnerabilities, we generate a multi-class confusion matrix, as shown in Figure 8. Notably, the model achieves a detection accuracy of over 81% for CWE-79 and CWE-200, demonstrating that GoVulDect exhibits strong classification capability and high accuracy when handling various types of vulnerabilities.
(2) Effectiveness of Feature Fusion. To validate the effectiveness of our feature fusion approach, we visualized the distribution of token-level features (bottom), graph-level features (middle), and their fused features (top) in both 2D (left) and 3D (right) spaces, as illustrated in Figure 9. The results clearly show that the fused features (top) exhibit the best separability, indicating that feature fusion significantly improves the model’s performance.
To further confirm the effectiveness of feature fusion, we conducted an ablation study, comparing models using only graph-level features (GoVulDect-Graph), only token-level features (GoVulDect-Tokens), and our full GoVulDect model. Table 5 presents the results.
The results indicate that GoVulDect-Graph achieves a precision of 90.58%, slightly outperforming GoVulDect-Tokens (88.78%). However, the full GoVulDect model, which integrates both graph and token-level features, achieves a significantly higher precision of 94.77%. This demonstrates that feature fusion effectively improves vulnerability detection, as it captures both structural information and contextual semantics of vulnerable code.
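The three ablation variants differ only in which feature view reaches the classifier. Since the conclusions describe the fusion as simple concatenation, a minimal sketch might look like the following. The graph width of 64 matches `embedding_size` in Table 3, whereas the token-vector width of 768, the function name, and the mode names are assumptions made for this example.

```python
GRAPH_DIM = 64    # embedding_size from Table 3
TOKEN_DIM = 768   # assumed token-vector width; not stated in this section


def build_features(graph_vec, token_vec, mode="fused"):
    """Return the classifier input for one sample under a given ablation mode."""
    if len(graph_vec) != GRAPH_DIM or len(token_vec) != TOKEN_DIM:
        raise ValueError("unexpected feature width")
    if mode == "graph":    # corresponds to GoVulDect-Graph
        return list(graph_vec)
    if mode == "tokens":   # corresponds to GoVulDect-Tokens
        return list(token_vec)
    if mode == "fused":    # full model: concatenate both views
        return list(graph_vec) + list(token_vec)
    raise ValueError(f"unknown mode: {mode}")


# Placeholder vectors standing in for real embeddings.
fused = build_features([0.0] * GRAPH_DIM, [1.0] * TOKEN_DIM)
```

Keeping the graph features first and the token features second gives the classifier a fixed column layout across all samples.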
(3) Comparative Experiments. The code property graph (CPG) integrates multiple semantic representations, making it a powerful method for representing source code. To verify the advantage of CPG in vulnerability detection, we compared the ROC curves of the GraphSAGE model when applied to different code representations, as shown in Figure 10.
The results indicate that the CPG-based model achieves the highest AUC score of 0.984, demonstrating that CPG captures a more comprehensive and rich set of graph structural features. Additionally, we observe that the AUC scores of the control flow graph (CFG) and control dependence graph (CDG) are lower than those of the data dependence graph (DDG). This suggests that data flow plays a more significant role in vulnerability detection compared to control flow.
To achieve optimal classification performance, we compared several widely used classification models, including traditional machine learning algorithms (e.g., k-nearest neighbors (KNN), support vector machine (SVM), random forest (RF)), neural network-based models (e.g., multi-layer perceptron (MLP), bidirectional LSTM (BiLSTM)), and ensemble learning methods such as XGBoost. These classifiers were selected to represent a diverse set of learning paradigms and are commonly used in software vulnerability detection tasks as baseline models. Although some classifiers such as KNN and SVM may be considered less advanced in modern deep learning research, we include them to ensure a complete and comprehensive evaluation of the effectiveness of our proposed graph-based feature representation. It is worth noting that SVM, especially when using the RBF kernel, suffers from scalability issues due to its quadratic complexity with respect to the number of training samples [45]. In our study, SVM is used solely as a baseline model for comparison purposes. For practical deployment and large-scale performance, more scalable classifiers such as XGBoost and BiLSTM are preferred.
As shown in Table 6, XGBoost achieves the highest score on every evaluation metric, with an accuracy of 94%. This demonstrates that XGBoost provides the most effective classification results in our vulnerability detection task. Furthermore, the consistently strong performance across different classifiers supports the robustness and general applicability of the learned graph features.
To comprehensively evaluate GoVulDect’s vulnerability detection performance, we compared it with one commonly used detection tool and three state-of-the-art detection models. The results are summarized in Table 7 and Table 8.
We observe the following key findings: (1) RATS, a multi-language static vulnerability analysis tool, achieves the lowest performance across all metrics. This is because RATS relies solely on expert-defined vulnerability patterns, which leads to high false negatives due to the lack of adaptability to novel vulnerabilities. (2) CSGVD, a code semantic graph-based vulnerability detection model, outperforms RATS, but its performance remains the lowest among the four learning-based models. This is because CSGVD primarily captures shallow sequential local semantic features, limiting its effectiveness. (3) VDoTR introduces circular gated graph neural networks (CircleGGNNs) to embed node feature vectors and employs 1D convolutional layers for vulnerability classification. By capturing richer structural information, VDoTR outperforms CSGVD across all evaluation metrics. (4) AMPLE incorporates edge-aware graph convolutional networks to aggregate heterogeneous edge information into node representations. To handle long-range dependencies among distant nodes, AMPLE employs kernel-scaled representation techniques, significantly enhancing its ability to analyze complex code structures. As a result, AMPLE outperforms VDoTR in all aspects. (5) GoVulDect achieves over 91% in all detection metrics, surpassing all existing methods and tools. This demonstrates the effectiveness of our model in Go vulnerability detection and highlights the significance of our contributions.
To provide a more comprehensive evaluation, we also compare the training and detection time of GoVulDect with the models listed above. The results are summarized in Table 9.
The key observations are as follows: (1) GoVulDect achieves the shortest training and detection times among all learning-based models. This performance gain is mainly attributed to three key aspects. First, during preprocessing, we effectively eliminate redundant and non-essential code. Second, we apply program slicing techniques to accurately extract code segments that are highly relevant to vulnerability detection. Together, these two steps significantly reduce the volume of code that needs to be processed, thereby lowering computational overhead. Finally, we design a parallel architecture that enables the synchronized extraction of graph-level and token-level features, further improving the overall efficiency of both training and inference. (2) RATS, as a static vulnerability analysis tool, does not require training and achieves the shortest detection time. However, its accuracy is significantly lower than learning-based models. (3) CSGVD does not preprocess raw source code, and its training process requires separate training of the PE-BL module before node embedding, resulting in higher time consumption. (4) VDoTR requires the longest training time because it captures global and complex graph structures during training. (5) AMPLE applies graph simplification techniques, reducing the number of graph nodes, which significantly decreases both training and detection time.
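The parallel extraction of the two feature views mentioned in observation (1) can be illustrated with a small sketch. The two extractor functions below are trivial placeholders, not the actual graph or token pipelines, and the use of a thread pool is an assumption; pure-Python work would not speed up under threads, so a real pipeline would rely on GPU or native code, or use separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_graph_features(code_slice):
    # Placeholder for CPG construction and GraphSAGE embedding (hypothetical).
    return [float(len(code_slice))]


def extract_token_features(code_slice):
    # Placeholder for taint slicing and Transformer encoding (hypothetical).
    return [float(len(code_slice.split()))]


def extract_in_parallel(code_slice):
    """Run both extractors concurrently and return the concatenated features."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        graph_future = pool.submit(extract_graph_features, code_slice)
        token_future = pool.submit(extract_token_features, code_slice)
        # result() blocks until each branch finishes, so the two views stay in sync.
        return graph_future.result() + token_future.result()


features = extract_in_parallel("x := y + 1")
```

Waiting on both futures before concatenating is what keeps the two branches synchronized per sample.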

6. Conclusions

At present, relatively few deep learning-based detection models have been specifically designed for Go language vulnerability detection. Most existing Go vulnerability detection tools focus only on concurrency errors and do not address other types of vulnerabilities. Therefore, we propose GoVulDect, a fine-grained, hybrid semantic-based graph neural network system for Go source code vulnerability detection.
First, we extract each function and generate slices, then represent them as code property graphs (CPGs) and use GraphSAGE to extract graph-level structural features. Although we strive to retain as much graph structure information as possible, some loss of local and context-based semantic information is inevitable. To address this, we apply taint analysis to extract vulnerability slices and capture fine-grained token-level features. Finally, we use XGBoost to classify the fused features, enabling vulnerability detection. The fused features not only incorporate global control dependencies, data dependencies, and other semantic information of the source code but also preserve contextual semantics and local details of vulnerable code. Experiments on a real-world CVE vulnerability dataset show that GoVulDect achieves an F1-score of over 91%, outperforming all of the detection tools and models compared in this study.
For future work, we plan to address several important limitations and explore further improvements. (1) Enhancing semantic feature representation: The current token-level semantic extraction module has limited capacity in representing complex semantics. With the rapid advancement of large language models (LLMs), we aim to adopt more powerful pretrained models to improve semantic understanding and representation. (2) Improving feature fusion between graph and semantic views: At present, graph and token-level features are fused via simple concatenation, which may underutilize their complementary nature. We plan to explore more integrated fusion strategies, such as co-training frameworks inspired by GraphCodeBERT, to enable joint learning and deeper feature interaction. (3) Handling multi-label vulnerability attribution: In real-world scenarios, a single code segment may correspond to multiple CWE types. For example, a buffer overflow vulnerability may involve both missing boundary checks (CWE-119) and use-after-free issues (CWE-416). Our current model assumes single-label classification, which limits its applicability. We plan to incorporate multi-label classification techniques to better capture such complex cases.

Author Contributions

Conceptualization, L.Y.; methodology, L.Y.; software, L.Y.; validation, L.Y.; formal analysis, Y.F.; investigation, Q.Z.; resources, L.Y.; data curation, Y.F.; writing—original draft preparation, L.Y.; writing—review and editing, Y.F. and Y.X.; visualization, Q.Z.; supervision, Z.L.; project administration, Q.Z.; funding acquisition, Z.L. and Y.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Laboratory of Data Protection and Intelligent Management, Ministry of Education, Sichuan University (SCUSACXYD202401).

Data Availability Statement

The data used in this paper are collected through our own experiments and are not yet publicly available. However, data may be obtained from the authors upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Boucher, N.; Anderson, R. Trojan source: Invisible vulnerabilities. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23), Anaheim, CA, USA, 9–11 August 2023; pp. 6507–6524.
  2. Zhang, C.; Liu, B.; Xin, Y.; Yao, L. Cpvd: Cross project vulnerability detection based on graph attention network and domain adaptation. IEEE Trans. Softw. Eng. 2023, 49, 4152–4168.
  3. Liu, R.; Wang, Y.; Xu, H.; Sun, J.; Zhang, F.; Li, P.; Guo, Z. Vul-LMGNNs: Fusing language models and online-distilled graph neural networks for code vulnerability detection. Inf. Fusion 2025, 115, 102748.
  4. Qiu, F.; Liu, Z.; Hu, X.; Xia, X.; Chen, G.; Wang, X. Vulnerability detection via multiple-graph-based code representation. IEEE Trans. Softw. Eng. 2024, 50, 2178–2199.
  5. Wang, H.; Ye, G.; Tang, Z.; Tan, S.H.; Huang, S.; Fang, D.; Feng, Y.; Bian, L.; Wang, Z. Combining graph-based learning with automated data collection for code vulnerability detection. IEEE Trans. Inf. Forensics Secur. 2020, 16, 1943–1958.
  6. Xu, Y.; Fang, Y.; Liu, Z.; Zhang, Q. PWAGAT: Potential Web attacker detection based on graph attention network. Neurocomputing 2023, 557, 126725.
  7. Xu, Y.; Zhang, Q.; Deng, H.; Liu, Z.; Yang, C.; Fang, Y. Unknown web attack threat detection based on large language model. Appl. Soft Comput. 2025, 173, 112905.
  8. Cao, S.; Sun, X.; Bo, L.; Wei, Y.; Li, B. Bgnn4vd: Constructing bidirectional graph neural-network for vulnerability detection. Inf. Softw. Technol. 2021, 136, 106576.
  9. Guo, W.; Fang, Y.; Huang, C.; Ou, H.; Lin, C.; Guo, Y. HyVulDect: A hybrid semantic vulnerability mining system based on graph neural network. Comput. Secur. 2022, 121, 102823.
  10. Liu, Z.; Zhu, S.; Qin, B.; Chen, H.; Song, L. Automatically detecting and fixing concurrency bugs in go software systems. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Virtual, 19–23 April 2021; pp. 616–629.
  11. Liu, Z.; Xia, S.; Liang, Y.; Song, L.; Hu, H. Who goes first? Detecting go concurrency bugs via message reordering. In Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February–4 March 2022; pp. 888–902.
  12. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 2017, 30, 1025–1035.
  13. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  14. Tang, W.; Tang, M.; Ban, M.; Zhao, Z.; Feng, M. CSGVD: A deep learning approach combining sequence and graph embedding for source code vulnerability detection. J. Syst. Softw. 2023, 199, 111623.
  15. Fan, Y.; Wan, C.; Fu, C.; Han, L.; Xu, H. VDoTR: Vulnerability detection based on tensor representation of comprehensive code graphs. Comput. Secur. 2023, 130, 103247.
  16. Wen, X.C.; Chen, Y.; Gao, C.; Zhang, H.; Zhang, J.M.; Liao, Q. Vulnerability detection with graph simplification and enhanced graph representation learning. In Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), Melbourne, Australia, 14–20 May 2023; pp. 2275–2286.
  17. Mergendahl, S.; Burow, N.; Okhravi, H. Cross-Language Attacks. In Proceedings of the NDSS, Chengdu, China, 18–20 December 2022; pp. 1–18.
  18. Jackson, J. Microsoft: Rust Is the Industry’s ‘Best Chance’ at Safe Systems Programming, 20 October 2020. Available online: https://thenewstack.io/microsoft-rust-is-the-industrys-best-chance-at-safe-systems-programming/ (accessed on 1 June 2025).
  19. Xu, Z.; Chen, B.; Chandramohan, M.; Liu, Y.; Song, F. Spain: Security patch analysis for binaries towards understanding the pain and pills. In Proceedings of the 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE), Buenos Aires, Argentina, 20–28 May 2017; pp. 462–472.
  20. Scandariato, R.; Walden, J.; Hovsepyan, A.; Joosen, W. Predicting vulnerable software components via text mining. IEEE Trans. Softw. Eng. 2014, 40, 993–1006.
  21. Zhou, T.; Sun, X.; Xia, X.; Li, B.; Chen, X. Improving defect prediction with deep forest. Inf. Softw. Technol. 2019, 114, 204–216.
  22. Li, Y.; Wang, S.; Nguyen, T.N. Vulnerability detection with fine-grained interpretations. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece, 23–28 August 2021; pp. 292–303.
  23. Li, Z.; Zou, D.; Xu, S.; Jin, H.; Zhu, Y.; Chen, Z. Sysevr: A framework for using deep learning to detect software vulnerabilities. IEEE Trans. Dependable Secur. Comput. 2021, 19, 2244–2258.
  24. Wu, F.; Wang, J.; Liu, J.; Wang, W. Vulnerability detection with deep learning. In Proceedings of the 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 13–16 December 2017; pp. 1298–1302.
  25. Zhou, Y.; Liu, S.; Siow, J.; Du, X.; Liu, Y. Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. Adv. Neural Inf. Process. Syst. 2019, 32, 10197–10207.
  26. Li, Z.; Zou, D.; Xu, S.; Ou, X.; Jin, H.; Wang, S.; Deng, Z.; Zhong, Y. Vuldeepecker: A deep learning-based system for vulnerability detection. arXiv 2018, arXiv:1801.01681.
  27. Lin, G.; Wen, S.; Han, Q.L.; Zhang, J.; Xiang, Y. Software vulnerability detection using deep neural networks: A survey. Proc. IEEE 2020, 108, 1825–1848.
  28. Nie, X.; Li, N.; Wang, K.; Wang, S.; Luo, X.; Wang, H. Understanding and tackling label errors in deep learning-based vulnerability detection (experience paper). In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, Seattle, WA, USA, 17–21 July 2023; pp. 52–63.
  29. Li, X.; Xin, Y.; Zhu, H.; Yang, Y.; Chen, Y. Cross-domain vulnerability detection using graph embedding and domain adaptation. Comput. Secur. 2023, 125, 103017.
  30. Wang, C.; Ko, R.; Zhang, Y.; Yang, Y.; Lin, Z. Taintmini: Detecting flow of sensitive data in mini-programs with static taint analysis. In Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), Melbourne, Australia, 14–20 May 2023; pp. 932–944.
  31. Chen, S.; Lin, Z.; Zhang, Y. SelectiveTaint: Efficient data flow tracking with static binary rewriting. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Vancouver, BC, Canada, 11–13 August 2021; pp. 1665–1682.
  32. Zhang, L.; Chen, J.; Diao, W.; Guo, S.; Weng, J.; Zhang, K. CryptoREX: Large-scale analysis of cryptographic misuse in IoT devices. In Proceedings of the 22nd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2019), Beijing, China, 23–25 September 2019; pp. 151–164.
  33. Lu, H.; Zhao, Q.; Chen, Y.; Liao, X.; Lin, Z. Detecting and measuring aggressive location harvesting in mobile apps via data-flow path embedding. Proc. ACM Meas. Anal. Comput. Syst. 2023, 7, 18.
  34. Huang, W.; Dong, Y.; Milanova, A. Type-based taint analysis for Java web applications. In Proceedings of the Fundamental Approaches to Software Engineering: 17th International Conference, FASE 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, 5–13 April 2014; Proceedings 17. Springer: Berlin/Heidelberg, Germany, 2014; pp. 140–154.
  35. Fu, X.; Cai, H. Scaling application-level dynamic taint analysis to enterprise-scale distributed systems. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Companion Proceedings, Seoul, Republic of Korea, 5–11 October 2020; pp. 270–271.
  36. Zhang, J.; Tian, C.; Duan, Z. Fastdroid: Efficient taint analysis for android applications. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), Montreal, QC, Canada, 25–31 May 2019; pp. 236–237.
  37. Zhong, Z.; Liu, J.; Wu, D.; Di, P.; Sui, Y.; Liu, A.X. Field-based static taint analysis for industrial microservices. In Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice, Pittsburgh, PA, USA, 21–29 May 2022; pp. 149–150.
  38. Liang, J.; Wang, M.; Zhou, C.; Wu, Z.; Jiang, Y.; Liu, J.; Liu, Z.; Sun, J. Pata: Fuzzing with path aware taint analysis. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 23–25 May 2022; pp. 1–17.
  39. Chow, Y.W.; Schäfer, M.; Pradel, M. Beware of the unexpected: Bimodal taint analysis. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, Seattle, WA, USA, 17–21 July 2023; pp. 211–222.
  40. Liu, P.; Sun, C.; Zheng, Y.; Feng, X.; Qin, C.; Wang, Y.; Li, Z.; Sun, L. Harnessing the power of llm to support binary taint analysis. arXiv 2023, arXiv:2310.08275.
  41. Qi, Z.; Feng, Q.; Cheng, Y.; Yan, M.; Li, P.; Yin, H.; Wei, T. SpecTaint: Speculative Taint Analysis for Discovering Spectre Gadgets. In Proceedings of the NDSS, Virtual, 21–25 February 2021; pp. 1–14.
  42. Jia, Z.; Yang, C.; Zhao, X.; Li, X.; Ma, J. Design and implementation of an efficient container tag dynamic taint analysis. Comput. Secur. 2023, 135, 103528.
  43. Sang, Q.; Wang, Y.; Liu, Y.; Jia, X.; Bao, T.; Su, P. Airtaint: Making dynamic taint analysis faster and easier. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 20–22 May 2024; pp. 3998–4014.
  44. Available online: https://github.com/advisories (accessed on 20 May 2025).
  45. Chen, C.S.; Noorizadegan, A.; Young, D.L.; Chen, C. On the selection of a better radial basis function and its shape parameter in interpolation problems. Appl. Math. Comput. 2023, 442, 127713.
Figure 1. GoVulDect framework. (Module 1: Graph-Level Feature Extraction Module, which extracts graph-level features of potentially concurrent Go function slices. Module 2: Token-Level Feature Extraction Module, which extracts token-level features of slices using taint analysis. Module 3: Detection Module, which classifies and detects vulnerabilities in Go source code.)
Figure 2. Go program code property graph after preprocessing.
Figure 3. Sample Go code.
Figure 4. Sampling and aggregation process of graph random walk network.
Figure 5. Go code lexical analysis.
Figure 6. Examples of SpanBERT pre-trained model extraction and representation.
Figure 7. Loss and accuracy curves during training and validation. (The green curve represents the training loss of the model as the number of training epochs increases, while the red curve represents the model’s validation accuracy.)
Figure 8. The confusion matrix of the model’s classification results for multiple types of data. (The darker the color, the higher the model’s accuracy in distinguishing vulnerabilities).
Figure 9. Scatter plot comparison of fused features. (Yellow and purple represent different vulnerability features, respectively.)
Figure 10. Different code representation comparison experiment. ((left) displays the ROC curves of GraphSAGE under several different code representations, while in the (right), the blue bar represents the CFG, the orange bar represents the CPG, and the green bar represents the DFG.)
Table 1. Top 10 CWE types by frequency in the dataset.
| CWE-ID | Description | Number |
|---|---|---|
| CWE-79 | Improper neutralization of input during web page generation (“Cross-Site Scripting”) | 67 |
| CWE-22 | Improper limitation of a pathname to a restricted directory (“Path Traversal”) | 45 |
| CWE-400 | Uncontrolled resource consumption | 39 |
| CWE-20 | Improper input validation | 35 |
| CWE-287 | Improper authentication | 24 |
| CWE-284 | Improper access control | 21 |
| CWE-200 | Exposure of sensitive information to an unauthorized user | 19 |
| CWE-601 | URL redirection to an untrusted site (“Open Redirect”) | 17 |
| CWE-863 | Incorrect authorization | 17 |
| CWE-352 | Cross-Site Request Forgery (CSRF) | 14 |
Table 2. Experimental environment configuration details.
| Component | Configuration |
|---|---|
| Operating System | Ubuntu 18.04.1 |
| Programming Language | Python |
| Major Python Libraries | Pyg-lib == 0.3.1+pt21cu121; Torch == 2.0.1+cu118; Scikit-learn == 1.3.2; Transformers == 4.35.2 |
Table 3. GraphSAGE model hyperparameters.
| Parameter | Description | Value |
|---|---|---|
| num_samples | Neighbor sampling size | 25 |
| aggregator_type | Aggregation function | mean |
| embedding_size | Node embedding size | 64 |
| num_layers | Number of graph convolution layers | 2 |
| l2_reg | L2 regularization strength | 0.0001 |
| learning_rate | Learning rate | 0.01 |
| dropout | Dropout rate | 0.3 |
| epochs | Training epochs | 500 |
Table 4. Classification report of GoVulDect.
| Class | Precision | Recall | F1-Score | Sample Size |
|---|---|---|---|---|
| Benign Samples | 0.96 | 0.94 | 0.95 | 1646 |
| Vulnerable Samples | 0.94 | 0.96 | 0.95 | 1604 |
| Accuracy | / | / | 0.95 | 3250 |
| Macro AVG | 0.95 | 0.95 | 0.95 | 3250 |
| Weighted AVG | 0.95 | 0.95 | 0.95 | 3250 |
Table 5. Comparison of ablation experiment results.
| Model | Precision | Recall | F1-Score |
|---|---|---|---|
| GoVulDect-Graph | 0.9058 | 0.9039 | 0.9043 |
| GoVulDect-Tokens | 0.8878 | 0.8857 | 0.8859 |
| GoVulDect | 0.9477 | 0.9513 | 0.9489 |
Table 6. Comparison of the results of different classifiers.
| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| MLP | 0.91 | 0.90 | 0.90 | 0.90 |
| RF | 0.91 | 0.91 | 0.90 | 0.90 |
| KNN | 0.89 | 0.88 | 0.88 | 0.88 |
| SVM | 0.90 | 0.89 | 0.89 | 0.89 |
| BiLSTM | 0.93 | 0.92 | 0.92 | 0.92 |
| XGBoost | 0.94 | 0.95 | 0.94 | 0.95 |
Table 7. Comparison of existing methods and tools.
| Model/Tool | Architecture | Target Language | Supported Vulnerabilities |
|---|---|---|---|
| RATS | Rule-based static analysis | C, C++, Perl, PHP, Python | Buffer overflow, TOCTOU, etc. |
| CSGVD | PE-BL + Residual GCN + M-BFA + MLP | C/C++ | / |
| VDoTR | Third-order tensor representation + CircleGGNN + 1-D convolution | C/C++ | CWE-120, CWE-119, CWE-469, CWE-476 |
| AMPLE | Graph simplification + Edge-aware GCN + Kernel-scaled representation | C/C++ | / |
| GoVulDect | SpanBERT + GraphSAGE + Transformer + XGBoost | Go | Concurrency vulnerabilities, SQL injection, XSS, etc. |
“/” indicates that the original paper did not provide detailed information.
Table 8. Comparison of existing methods and tools.
| Model/Tool | Dataset Size | Precision * | Recall * | F1-Score * |
|---|---|---|---|---|
| RATS | | 0.5291 | 0.5400 | 0.5288 |
| CSGVD | 27,318 | 0.7205 | 0.7442 | 0.6733 |
| VDoTR | 93,539 | 0.7947 | 0.7800 | 0.7762 |
| AMPLE | 219,829 | 0.8872 | 0.8860 | 0.8859 |
| GoVulDect | 129,978 | 0.9164 | 0.9137 | 0.9135 |
* indicates that Precision, Recall, and F1-Score are evaluated on the dataset used in this study.
Table 9. Performance comparison results of different models.
| Model/Tool | Training Time (s) | Detection Time (s) |
|---|---|---|
| RATS | / | 0.012 |
| CSGVD | 7589.810 | 62.063 |
| VDoTR | 17,822.211 | 79.180 |
| AMPLE | 5125.166 | 17.121 |
| GoVulDect | 4221.945 | 3.116 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yuan, L.; Fang, Y.; Zhang, Q.; Liu, Z.; Xu, Y. Go Source Code Vulnerability Detection Method Based on Graph Neural Network. Appl. Sci. 2025, 15, 6524. https://doi.org/10.3390/app15126524
