Nabil: A Text-to-SQL Model Based on Brain-Inspired Computing Techniques and Large Language Modeling

Zhou, Feng; Hu, Shijing; Du, Xiaozheng; Li, Nan; Zhou, Tongming; Zhao, Yanni; Shang, Sitong; Ling, Xufeng; Zhu, Huaizhong

doi:10.3390/electronics14193910

by Feng Zhou ¹,², Shijing Hu ², Xiaozheng Du ²,³,*, Nan Li ³,*, Tongming Zhou ², Yanni Zhao ¹,⁴, Sitong Shang ¹, Xufeng Ling ¹ and Huaizhong Zhu

¹ School of Artificial Intelligence, Shanghai Normal University Tianhua College, No. 1661 Shengxin North Road, Shanghai 201815, China
² School of Computer Science, Fudan University, Shanghai 200438, China
³ Business Analysis BU, GienTech Technology Co., Ltd., Shanghai 200232, China
⁴ School of Aerospace Engineering and Applied Mechanics, Tongji University, No. 100 Zhangwu Road, Shanghai 200092, China
* Authors to whom correspondence should be addressed.

Electronics 2025, 14(19), 3910; https://doi.org/10.3390/electronics14193910

Submission received: 30 July 2025 / Revised: 28 September 2025 / Accepted: 29 September 2025 / Published: 30 September 2025

(This article belongs to the Topic Advanced Operation, Control, and Planning of Intelligent Energy Systems)

Abstract

Human-database interaction is inevitable in intelligent system applications, and accurately converting user-entered natural language into database query language is a critical step. To improve the accuracy, generalization, and robustness of text-to-SQL, we propose Nabil (a model for natural language conversion query language based on brain-inspired computing technology and a large language model). This model first leverages the spatiotemporal encoding capabilities of spiking neural networks to capture semantic features of natural language, then fuses these features with those generated by a large language model. Finally, a champion model is designed to select the optimal query from multiple candidate SQLs. Experiments were conducted on three database engines, DuckDB, MySQL, and PostgreSQL, and the model’s effectiveness was verified on benchmark datasets such as BIRD. The results show that Nabil outperforms existing baseline methods in both execution accuracy and effective efficiency scores. Furthermore, our proposed normalization and syntax tree abstraction algorithms further enhance the champion model’s discriminative capabilities, providing new insights for text-to-SQL research.

Keywords: intelligent systems; text-to-SQL; brain-inspired computing; large language model; human-computer interaction

1. Introduction

In modern intelligent system applications, users submit various queries through conversations and visual interfaces. The intelligent system converts the natural language in these queries into database query language, then returns the data and displays it to the user [1]. In intelligent system applications, such as intelligent human–computer interaction, data analysis, and decision-making systems, issues such as data usability, data efficiency, cross-database access, real-time interaction, semantic understanding, personalized analysis, and multi-source fusion querying inevitably arise [2]. If the intelligent system fails to accurately convert natural language into SQL during text-to-SQL conversion, it returns incorrect information to the user, resulting in a poor user experience. The research findings in the text-to-SQL field offer new insights and hope for solving these problems.

Brain-inspired neuron mechanisms not only effectively reduce the overfitting of traditional neural networks but also accurately capture the temporal dimension of natural language while using less power [3]. To improve the accuracy, generalization, and robustness of intelligent systems converting natural language into database query language, we developed a novel Text-to-SQL model, Nabil, based on brain-inspired computing technology and large language models. Figure 1 shows an application scenario for the Nabil model.

As shown in Figure 1, the Nabil model is deployed in computing centers within industrial environments, and the user side and the application side access it through the network layer. User input includes voice and natural language text; voice is converted to text by third-party software before being passed to the model. The Nabil model consists of a semantic encoding layer, a multimodal feature fusion layer, a candidate SQL generation layer, and a champion model, and it outputs the SQL corresponding to the received text. Because the model can address issues such as natural language ambiguity, database and knowledge graph isolation, SQL dialect normalization, and slow manual querying, it can be widely applied in various fields, including automobile transportation, shipping, water transport, finance, education, and healthcare. For example, users can use natural language to plan routes and query traffic conditions in real time; passengers can query flight status, delays, and gate information; crew members can query routes, weather, and navigation status in real time; users can query risk indicators, market trends, and unusual transactions; students can use spoken language to query personalized learning paths and recommended content; and doctors can query disease diagnosis information and assistance recommendations. Each of these areas has its own specific constraints, such as safety regulations in air transport, high uncertainty in medicine, and the GDPR.

In this research, we first used a brain-inspired spiking neural network to construct a natural language semantic encoding layer. We then used the spiking neural network together with a large language model to construct a multimodal feature fusion layer. To improve the accuracy and efficiency of the SQL generated by the large language model, we reduced the grammatical and semantic search space induced by the prompt during the design process. To select the best result from the multiple candidate SQLs generated, we designed a champion model. This champion model first parses the SQL into an abstract syntax tree (AST) that is independent of the specific database engine and then re-serializes the AST into an equivalent SQL expression based on the syntax and characteristics of the target engine. For each set of SQL candidates to be screened, the champion model uses the large language model to judge equivalence and runs empirical comparisons on three mainstream database engines (DuckDB, MySQL, and PostgreSQL). All verifications use identical test data, and the best SQL is selected on that basis. In implementing the Nabil model, our main innovations are as follows:

A brain-inspired natural language semantic encoding algorithm was proposed. This algorithm effectively improves the model’s generalization capabilities by sparsifying spike trains.
A spatiotemporal feature fusion algorithm was proposed. By dynamically adjusting the weights of brain-inspired encoding features and those of a large language model, it achieves cross-modal fusion of features from brain-inspired spiking neural networks and large language models.
A candidate SQL generation algorithm was proposed. By manipulating prompt templates, examples, and structures, it reduces the ineffective search space of the large language model and improves the efficiency of SQL generation.
A champion model was proposed. By parsing SQL queries into abstract syntax trees and automatically aligning dialects using regular expressions, the same test data can be run on multiple engines, further enhancing the model’s credibility.

In Section 1, we describe our research background. In Section 2, we describe related work. In Section 3, we describe the techniques and algorithms we referenced, used, and designed during our research. In Section 4, we describe the datasets, evaluation metrics, and experimental results. In Section 5, we summarize our research work and point out its limitations.

2. Related Works

There are many research papers related to SQL. For example, Guo, J. et al. proposed the IRNet method using the Spider dataset to improve the accuracy of cross-domain and complex text-to-SQL conversion [4], with an accuracy of up to 46.7% on the Spider dataset. Sen, J. et al. proposed the Athena++ system to expand the functions of the NLIDB system [5], handling complex business intelligence (BI) queries using the Spider and FIBEN datasets. The experimental results show that the system achieves 78.89% on the Spider dataset and 88.33% on the FIBEN dataset. Liu, J. et al. utilized the WikiSQL dataset to efficiently convert natural language questions into SQL queries [6], proposing an NL2SQL method that leverages a fine-tuned model and an embedded pre-trained artificial neural network. The experimental results show that the model’s accuracy is nearly 81% on the WikiSQL dev dataset. Marcus, R. et al. proposed the Bao system to improve the adaptability and efficiency of query optimization [7]. Cao, R. et al. utilized the Spider dataset to enhance the accuracy of heterogeneous graph encoding in Text-to-SQL tasks [8], proposing a line graph enhanced Text-to-SQL (LGESQL) model and a graph pruning task that improves the discriminative ability of the encoder. The experimental results show that the accuracy of the framework reaches 72.0% using Electra and 62.8% using GloVe.

Sioulas, P. et al. used a multi-query workload dataset based on the TPC-DS architecture to reduce the processing cost of the database under high analytical query loads and improve throughput [9], proposing the RouLette intelligent engine. Ahkouk, K. et al. utilized the NL2SQL benchmark dataset to address the obstacles that ordinary users encounter when interacting with the database and passed RoBERTa embeddings and data-independent knowledge vectors to the LSTM-based sub-model [10]. The experimental results show that the model achieves a test set execution accuracy of 76.7%. Zhao, C. et al. used the Squall dataset (re-split into training and test sets) and synthetic datasets as new benchmarks to study the ability of Text-to-SQL parsers to match compound operations of columns with domain-specific phrases [11]. A method consisting of schema pruning and schema expansion was proposed. In the new Squall data split test, the relative accuracy of the underlying parser was improved by 13.8%. Hui, B. et al. used the Spider dataset to allow Text-to-SQL parsers to model question syntax and proposed the S²SQL method to inject syntax into the question-schema graph encoder [12]. Yu, X. et al. proposed a hybrid optimizer to improve the planning quality of complex SQL queries [13]. The experimental results show that the tail latency is reduced by 65%, and the total latency is reduced by 25% compared to PostgreSQL. Gan, Y. et al. utilized the Spider dataset to enhance the model’s combinatorial generalization ability in text-to-SQL tasks and proposed a clause-level combinatorial example generation method [14].

Fu H. et al. proposed the CatSQL framework to improve the running speed and accuracy of the NL2SQL technique and evaluated it on cross-domain and single-domain benchmark datasets [15]. Chen, Z. et al. studied the automatic error correction model of Text-to-SQL to enhance the accuracy of Text-to-SQL parsing, thereby meeting the needs of practical applications [16]. In order to avoid the context separation of token-level editing, a clause-level editing model was proposed. Gu, Z. et al. utilized the Spider dataset to enhance the generalization ability of text-to-SQL translation in cases involving limited text and proposed the SC-Prompt framework [17]. Giaquinto, R. et al. used a pre-training dataset containing SQL code, text, and tables to achieve the goal of improving model performance in the Text-to-SQL generation task [18]. Lee, K. et al. used Microsoft SQL Server to evaluate the impact of cardinality estimation on complex SQL queries, studying queries such as aggregation, subqueries, grouping, and outer joins [19]. Ba, J. et al. proposed query plan guidance (QPG) to fully automatically test errors in database systems and applied it to three database systems: SQLite, TiDB, and CockroachDB [20]. Chen T. et al. proposed the LOGER optimizer to generate robust and efficient query execution plans using the Join Order Benchmark (JOB), TPC-DS, and Stack Overflow datasets [21]. The experimental results show that LOGER outperforms existing learning query optimizers and is 2.07 times faster than PostgreSQL on the JOB dataset.

Li R. et al. proposed the CardOOD general learning framework [22], which focuses on offline training algorithms to improve the out-of-distribution (OOD) problem of query-driven estimators. The framework extends traditional robust learning and transfer learning techniques, utilizes self-supervised learning strategies, and trains one-time models to address out-of-distribution (OOD) problems. Fan, Y. et al. proposed the CycleSQL iterative framework to enhance the interpretability and accuracy of the end-to-end NL2SQL translation model [23], utilizing five datasets, including the Spider benchmark. The experimental results show that the accuracy of the RESDSQL model’s validation set increases to 82.0%, and the accuracy of the Spider test set increases to 81.6%. Fan, J. et al. proposed the ZeroNL2SQL framework to achieve efficient conversion of zero-shot NL2SQL [24], decomposing the NL2SQL task into subtasks and utilizing LLM and SLM simultaneously. Liu, C. et al. utilized the Table QA test set to enhance the utilization of data table structure information in the NL2SQL conversion process and proposed a method to integrate table structure information [25]. This method associates questions with corresponding data table column names to obtain BERT input data and text features, constructs an external vector based on the table structure, and combines it with the features generated by BERT to enhance feature extraction. The experimental results show that the execution accuracy of this method reaches 90.48%. Kim, H. et al. utilized the BIRD and Spider benchmark datasets, along with the FLEX method based on the large language model (LLM), to conduct a more detailed and accurate evaluation of the Text-to-SQL system [26]. The innovation of this method lies in the use of complex evaluation criteria and comprehensive context information.

Mao, W. et al. utilized four benchmark datasets (Spider-test, DK, Spider-dev, and Realistic) to enhance the performance of LLM-based Text-to-SQL methods [27]. They proposed the DART-SQL framework, which comprises two components: execution-oriented optimization and question rewriting. Xie, X. et al. proposed the OpenSearch-SQL method and utilized the BIRD dataset to enhance the performance of multi-agent collaborative LLM in Text-to-SQL tasks [28], addressing issues such as model hallucination, incomplete frameworks, and instruction-following failures. The method divides the Text-to-SQL task into four modules: preprocessing, extraction, generation, and optimization, and introduces a consistency alignment module to reduce the difficulty of LLM tasks through task decomposition. The experimental results show that OpenSearch-SQL achieves an execution accuracy of 69.3% on the BIRD development set and 72.28% on the test set, with an R-VES score of 69.36%. To improve the reliability of text-to-SQL conversion, Chen, K. et al. utilized the BIRD benchmark dataset and proposed the Reliable Text-to-SQL (RTS) framework [29].

In the study of SQL equivalence verification, Castelein et al. proposed a search-based SQL query test data generation method to automatically generate data that covers all important test targets of SQL queries [30]. The experimental results demonstrate that this method effectively covers 98.6% of queries and can handle complex elements, including joins, subqueries, and string operations, within SQL queries. Chu et al. proposed a new formal framework for automatically verifying the semantic equivalence of SQL queries [31]. To automatically determine the equivalence of two SQL queries, the UDP (U-expression Decision Procedure) algorithm was proposed. Zhou, Q. et al. proposed a new symbolic representation-based automatic query equivalence verification method for SQL queries in databases [32]. This method converts SQL queries into symbolic representations (SRs) and then verifies query equivalence using SMT solvers such as Z3. Zhou, Q. et al. proposed a new symbolic-based method to verify query equivalence under bag semantics [33]. The core idea is to transform the query equivalence proof under bag semantics into proving that there is a bijective identity map between the tuples returned by two queries under all valid inputs. Wang S. et al. proposed QED (Query Equivalence Decider), a powerful tool for determining SQL query equivalence that aims to overcome the limitations of existing tools [34], particularly when dealing with complex features in SQL queries. He, Y., Zhao, P. et al. proposed an automated SQL query equivalence checking tool VeriEQL. This tool employs symbolic reasoning to verify whether two SQL queries are equivalent across all possible input databases using SMT [35]. The tool supports complex SQL queries and can handle a variety of operations, including SELECT, JOIN, GROUP BY, aggregate functions, WITH clauses, and conditional expressions (such as IF and CASE WHEN).

As the related work above and the SQL-related research listed in Table 1 show, current research on text-to-SQL has yielded significant results, and research on SQL query optimization and query equivalence verification has also achieved notable outcomes. However, text-to-SQL research still faces open problems, including limited expressive power due to natural language ambiguity, the isolation of databases and knowledge graphs, and models abstaining in cases where they should have produced a correct prediction. To address these issues and further improve the accuracy and efficiency of text-to-SQL, we built the Nabil model based on brain-inspired computing technology and a large language model. Furthermore, to accurately select the optimal SQL from multiple generated candidate SQLs, we built the champion model based on a normalization algorithm, syntax trees, and a large language model. Ultimately, we achieved state-of-the-art performance for the Nabil model in text-to-SQL research.

3. Methodology

In this section, we describe the technologies used in developing Nabil, a Text-to-SQL model based on brain-inspired computing technology and a large language model, as well as the algorithms and models we developed.

3.1. Preliminaries

A common challenge facing current AI research is extremely high energy consumption, driven by the exponential growth in data processing requirements. Because current digital computing systems based on the von Neumann architecture [36] separate computation from storage, they consume a significant amount of time and energy in moving data. Consequently, they are struggling to meet this demand, and environmental and other costs are becoming increasingly significant. Biological brains, by contrast, store and process information simultaneously, which gives them extremely high parallelism and energy efficiency. Brain-inspired computing is therefore well-suited for handling unstructured data, noise, and uncertainty, as well as for efficient and low-power data processing, and it is crucial to address this challenge by leveraging the efficient information processing mechanisms of biological brains. Furthermore, because brain-inspired neurons more closely resemble the mechanisms of the human brain, they can effectively reduce the overfitting of traditional neural networks.

Since the Transformer architecture ignited the pre-training and fine-tuning paradigm, large language models have repeatedly advanced the state of the art (SOTA) in natural language, multimodal, and other fields, relying on large parameter counts, large-scale corpora, and RLHF alignment technology. As a result, they have become the most promising technical path in the field of general artificial intelligence. Among them, DeepSeek, Qwen, and ChatGLM are representative of the open-source family [37]. These three models each have their own characteristics in terms of infrastructure, data scale, alignment strategy, and ecosystem construction. DeepSeek combines sparse expert networks with high-compression activation technology to achieve mathematical reasoning performance comparable to that of GPT-4 while reducing inference FLOPs by approximately 60%. Qwen, reconstructed based on the Llama-2 framework, significantly improves the accuracy of chain reasoning while maintaining high throughput by combining dynamic few-shot retrieval and generation. ChatGLM combines ring attention with segmented position encoding so that inference overhead grows nearly linearly with context length in ultra-long-context settings.

SQLGlot was created to meet data engineering teams’ demand for automatic cross-dialect SQL translation and lightweight AST optimization tools. DuckDB was designed to deliver Snowflake-like columnar OLAP performance in notebooks and server processes, addressing the latency bottleneck of cloud warehouses, where data must be stored on disk before querying [38]. Because DuckDB uses vectorized column storage and adaptive parallel execution, JSON can be mapped to a zero-copy table in the same process. PostgreSQL supports cloud-native and large-scale analysis scenarios, and it reduces synchronization delays by improving parallel aggregation, incremental sorting, SQL/JSON standard functions, and logical replication pipelines. MySQL supports high-concurrency Internet workloads through iterative OLTP improvements, near real-time analysis, HeatWave column storage, and GPU vector execution.

3.2. Nabil Model

Our text-to-SQL model, Nabil, designed based on brain-inspired computing technology and a large language model, consists of a natural language (NL) semantic encoding layer built with a brain-inspired spiking neural network, a multimodal feature fusion layer built with both the brain-inspired spiking neural network and the large language model, a candidate SQL generation layer, and a champion screening layer. When describing the algorithm process of the NL semantic encoding layer, multimodal feature fusion layer, and candidate SQL generation layer, we use circles to represent the input modules of the algorithm, ovals to represent the output of the algorithm, rounded rectangles to describe the various processing modules, and arrows to represent data flows. Since natural language data is a type of time series data, the spiking neural network structure we constructed for the NL semantic encoding layer is illustrated in Figure 2.

In the spiking neural network structure of Figure 2, the standard embedding method is first used to obtain the word vector of the input natural language. Next, pulse coding is used to convert the continuous embedding vectors into pulse trains. Then, Leaky Integrate-and-Fire (LIF) neurons are used to capture the temporal information of the language. The temporal feature sequences are aggregated to obtain a stable feature representation. Finally, a stable and refined pulse feature sequence is output. Because a multi-layered structure enhances feature extraction, the spiking recurrent neuron layer is stacked, with each layer consisting of LIF recurrent neurons. In the study of the Nabil model, a multimodal feature fusion layer was designed to align and fuse the fine-grained spatiotemporal features encoded by the spiking neural network with the high-level semantic features encoded by the LLM in the same space, thereby obtaining a more expressive and robust joint semantic feature. The structure of the multimodal feature fusion layer is shown in Figure 3.
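The LIF dynamics used in the semantic encoding layer can be illustrated with a minimal sketch. It assumes a common discrete-time LIF form with a hard reset; the function names and the values of `tau`, `v_threshold`, and `v_reset` are illustrative assumptions, not the settings used in Nabil, and the firing rate stands in for temporal pooling.

```python
def lif_encode(currents, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Run one LIF neuron over a sequence of input currents; return its spike train."""
    v = v_reset
    spikes = []
    for x in currents:
        # Leaky integration: the potential decays toward v_reset and is charged by x.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset  # hard reset after firing (assumed reset rule)
        else:
            spikes.append(0)
    return spikes


def firing_rate(spikes):
    """Assumed temporal pooling: average the spike train over time."""
    return sum(spikes) / len(spikes) if spikes else 0.0


strong = lif_encode([1.5] * 6)  # input large enough to cross the threshold
weak = lif_encode([0.5] * 6)    # input that never reaches the threshold
print(strong, firing_rate(strong))
print(weak, firing_rate(weak))
```

A weak input produces no spikes at all, which is the kind of sparsity in the spike train that the encoding layer relies on.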

In the multimodal feature fusion layer of Figure 3, we use a linear transformation to align the pulse coding features output by the SNN with the high-level semantic features produced by the LLM, concatenate them in the fusion module, and pass them through an MLP so that they interact fully, yielding multimodal joint semantic features. The detailed structure of the candidate SQL generation layer is shown in Figure 4.
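The align, concatenate, and MLP steps of the fusion layer can be sketched in plain Python as follows. The parameter names (`W_snn`, `W_llm`, `W1`, `W2`), the single hidden layer, and the ReLU activation are assumptions for illustration; the actual dimensions and trained weights are not given in the text.

```python
def matvec(weights, vector, bias):
    """Affine map: weights @ vector + bias, using plain lists."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def relu(vector):
    return [max(0.0, value) for value in vector]


def fuse(snn_features, llm_features, params):
    """Project both feature vectors into a shared space, concatenate, apply an MLP."""
    snn_aligned = matvec(params["W_snn"], snn_features, params["b_snn"])
    llm_aligned = matvec(params["W_llm"], llm_features, params["b_llm"])
    joint = snn_aligned + llm_aligned  # list concatenation, not element-wise addition
    hidden = relu(matvec(params["W1"], joint, params["b1"]))
    return matvec(params["W2"], hidden, params["b2"])


# Toy parameters chosen by hand so the arithmetic is easy to follow.
params = {
    "W_snn": [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], "b_snn": [0.0, 0.0],
    "W_llm": [[1.0, 0.0], [0.0, 1.0]], "b_llm": [0.0, 0.0],
    "W1": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
    "b1": [0.0, 0.0, 0.0, 0.0],
    "W2": [[1.0, 1.0, 1.0, 1.0]], "b2": [0.0],
}
joint_features = fuse([1.0, 0.0, 1.0], [0.5, -0.5], params)
print(joint_features)
```

In a trained model the projection matrices would be learned, so the relative contribution of the two feature sources is determined by training rather than fixed by hand as it is here.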

In the candidate SQL generation layer of Figure 4, the input is the multimodal semantic features fused from the SNN and the LLM. The prompt optimization module reduces the invalid search space of the large language model by controlling prompt templates, examples, and structure. The LLM candidate SQL generation module batch-generates multiple SQL candidates based on the optimized prompts, outputting one SQL statement for each prompt. The candidate SQL ranking module sorts the candidate SQL statements by confidence and outputs the top-K high-confidence candidates for subsequent selection by the champion model. To select the optimal SQL statement, we first generate minimally discriminative test data using a large language model, then employ a hybrid approach validated by multiple database engines. In this way, we achieve optimal SQL selection while ensuring accuracy and avoiding the high load of pure search. The champion model structure we designed is shown in Figure 5.
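The ranking step of the candidate SQL generation layer can be sketched as a top-K selection over (SQL, confidence) pairs. How the confidence is computed is not specified in the text, so the scores below are placeholders, and the whitespace-based de-duplication is an added assumption rather than part of the described module.

```python
import heapq
import re


def rank_candidates(candidates, k):
    """Return the k highest-confidence SQL strings, keeping one copy of each duplicate.

    candidates: list of (sql, confidence) pairs; the confidence source is assumed.
    """
    best = {}
    for sql, confidence in candidates:
        # Treat candidates that differ only in spacing or a trailing semicolon as equal.
        key = re.sub(r"\s+", " ", sql).strip().rstrip(";").strip()
        if key not in best or confidence > best[key][1]:
            best[key] = (sql, confidence)
    top = heapq.nlargest(k, best.values(), key=lambda item: item[1])
    return [sql for sql, _ in top]


candidates = [
    ("SELECT a FROM t", 0.4),
    ("SELECT  a FROM t;", 0.9),
    ("SELECT b FROM t", 0.7),
    ("SELECT c FROM t", 0.1),
]
print(rank_candidates(candidates, k=2))
```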

As shown in Figure 5, the top-K high-confidence SQL candidates and reference SQL output by the SQL candidate generation layer serve as input to the champion model. The champion model enables the large language model to generate test data, providing both judgment results and justification. The test data is first parsed into an abstract syntax tree, then translated and formatted. SQL is generated based on the column and table information obtained through scan checks. In the DuckDB engine, judgment results are output after the database is created, the table is created, and the query is executed. In the PostgreSQL and MySQL engines, judgment results are output after creating the table, repairing the SQL, and executing the query. During execution, if the judgment results from the three engines and the large language model are consistent, the champion SQL is directly output. If an engine error occurs, the champion SQL is output based on the judgment result of the large language model.

3.3. Nabil Model Algorithm

The main steps of the proposed Nabil model are shown in Algorithm 1.

Algorithm 1: Nabil Model Algorithm. Source: author’s contribution
Input: Q = Natural Language { $ω_{1}, ω_{2}, \dots, ω_{n}$ }
Output: The best SQL
embeddings = Embedding(Q)  # $\in \mathbb{R}^{n \times d}$
spike_sequence = PulseCodingLayer(embeddings)
snn_features = MultilayerSpikingRNN(spike_sequence)  # Multilayer LIF neurons
nl_semantic_features = TemporalPooling(snn_features)
llm_features = LLM_Encoder(Q)  # LLM outputs semantic features, $\in \mathbb{R}^{n \times d}$
snn_aligned = LinearProjection_SNN(nl_semantic_features)
llm_aligned = LinearProjection_LLM(llm_features)
joint_features = FusionModule(snn_aligned, llm_aligned)  # Concatenation + MLP fusion
prompts = PromptOptimization(joint_features)
candidate_sqls = []
for prompt in prompts:
    sql = LLM_GenerateSQL(prompt)
    candidate_sqls.append(sql)
topk_candidates = CandidateSQLRanking(candidate_sqls, joint_features, k = TopK)
champion_sql = None
test_data = LLM_GenerateTestData(reference_sql, topk_candidates)
results = []  # one [DuckDB, MySQL, PostgreSQL] result triple per candidate
for sql in topk_candidates:
    res_duck = DuckDB_Execute(sql, test_data)
    res_mysql = MySQL_Execute(sql, test_data)
    res_pg = Postgres_Execute(sql, test_data)
    results.append([res_duck, res_mysql, res_pg])
for i, sql in enumerate(topk_candidates):
    if results[i][0] == results[i][1] == results[i][2] == LLM_JudgeEquivalence(reference_sql, sql):
        champion_sql = sql
        break
if champion_sql is None:
    champion_sql = LLM_JudgeEquivalence(reference_sql, topk_candidates)
return champion_sql

In Algorithm 1, the Nabil model first uses a spiking neural network to encode natural language time series and aligns and fuses them with the semantic features of a large language model. Subsequently, an optimized prompt is used to drive the LLM to batch generate candidate SQL statements, and the top-K candidates are ranked based on confidence. Finally, the optimal SQL statement is obtained by hybrid equivalence verification across multiple database engines and assisted by a large language model.
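The selection logic at the end of Algorithm 1 can be made concrete with a small runnable sketch. To stay self-contained it substitutes Python's built-in `sqlite3` for the three engines used in the paper (DuckDB, MySQL, and PostgreSQL), and `llm_judge` is a hypothetical stand-in for the large language model's equivalence judgment; both are assumptions for illustration only.

```python
import sqlite3


def run_query(sql, setup_statements):
    """Run sql on a fresh in-memory database; return its rows, or None on engine error."""
    conn = sqlite3.connect(":memory:")
    try:
        for statement in setup_statements:
            conn.execute(statement)
        # Sort by repr so that row order does not affect the comparison.
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None
    finally:
        conn.close()


def select_champion(reference_sql, candidates, setup_statements, llm_judge):
    """Pick the first candidate that matches the reference on the test data
    and is also accepted by the (stubbed) LLM judge."""
    expected = run_query(reference_sql, setup_statements)
    for sql in candidates:
        result = run_query(sql, setup_statements)
        if result is not None and result == expected and llm_judge(reference_sql, sql):
            return sql
    # Fallback when no candidate passes engine verification: rely on the judge alone.
    for sql in candidates:
        if llm_judge(reference_sql, sql):
            return sql
    return None


# Test data chosen so that "> 15" and ">= 15" return different rows.
setup = [
    "CREATE TABLE t (id INTEGER, score INTEGER)",
    "INSERT INTO t VALUES (1, 10), (2, 15), (3, 30)",
]
reference = "SELECT id FROM t WHERE score > 15"
candidates = [
    "SELEC id FROM t",                          # engine error
    "SELECT id FROM t WHERE score >= 15",       # wrong on the boundary row
    "SELECT id FROM t WHERE NOT (score <= 15)", # equivalent to the reference
]
print(select_champion(reference, candidates, setup, lambda ref, sql: True))
```

The boundary row with `score = 15` is what lets the test data separate the second candidate from the reference, which is the role the discriminative test data plays in the champion model.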

3.4. Nabil Model Normalization Module Algorithm

To enable SQL from any source to be executed directly in the three engines while altering the logical semantics of the original query as little as possible, we designed a SQL dialect normalization algorithm for the Nabil model. The specific process is shown in Algorithm 2.

Algorithm 2: Nabil Model Normalization Module Algorithm. Source: author’s contribution
Input: SQL
Output: Different SQL that can be executed in multiple engines
function normalize(sql_input, target):
    # 1. Pre-cleaning to remove obvious parsing obstacles
    sql = strip_semicolon(sql_input)
    sql = drop_dollar_prefix(sql)
    sql = rename_numeric_alias(sql)
    # 2. Abstract Syntax Tree
    for read_style in [target, generic]:
        try:
            sql = sqlglot_transpile(sql, read_style, target, keep_identifiers = True)
            break
        except TranspileError:
            continue
    # 3. SQL dialect gap repair
    if target == "mysql":
        sql = mysql_remove_nulls(sql)
        sql = mysql_remove_range(sql)
        sql = mysql_bool_to_int(sql)
        sql = mysql_fix_funcs(sql)
    elif target == "postgres":
        sql = pg_concat_to_pipe(sql)
        sql = pg_explicit_cast(sql)
        sql = pg_cast_in_agg(sql)
    elif target == "duckdb":
        sql = duckdb_minor_fix(sql)
    # 4. Output the final SQL
    return compress_whitespace(sql)

In Algorithm 2, the primary purpose of pre-cleaning is to remove the trailing semicolon and temporary column prefixes and to change purely numeric aliases to legal identifiers, thereby avoiding parsing errors. In the syntax tree abstraction step, the SQL is first parsed with the target dialect, and parsing falls back to the more relaxed “generic” dialect if that fails. In SQL dialect gap repair, the operations for the MySQL engine are deleting NULLS LAST/FIRST and window RANGE BETWEEN clauses, changing Boolean literals to integers, and replacing function names that are missing in MySQL. The operations for the PostgreSQL engine are replacing CONCAT with ||, adding explicit type conversions for text, numeric, comparison, and addition operations, and uniformly using NUMERIC in aggregate functions. Because DuckDB uses a syntax close to PostgreSQL and has good compatibility, the operation for the DuckDB engine is to retain duckdb_minor_fix as an extension hook. Finally, the algorithm compresses redundant spaces, removes leading and trailing blanks, and returns SQL that can be directly executed in the target engine and is equivalent to the original query logic.
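A few of the repair steps named in Algorithm 2 can be sketched with regular expressions. The function bodies below are hypothetical implementations written for illustration (the paper gives only the function names), they cover the MySQL branch only, and they skip the AST step entirely.

```python
import re


def strip_semicolon(sql):
    """Remove surrounding blanks and any trailing semicolons."""
    return sql.strip().rstrip(";").rstrip()


def mysql_remove_nulls(sql):
    """Drop NULLS FIRST / NULLS LAST, which MySQL does not accept."""
    return re.sub(r"\s+NULLS\s+(?:FIRST|LAST)\b", "", sql, flags=re.IGNORECASE)


def mysql_bool_to_int(sql):
    """Rewrite Boolean literals as integers (naive: also touches string literals)."""
    sql = re.sub(r"\bTRUE\b", "1", sql, flags=re.IGNORECASE)
    return re.sub(r"\bFALSE\b", "0", sql, flags=re.IGNORECASE)


def compress_whitespace(sql):
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", sql).strip()


def normalize_for_mysql(sql_input):
    """Apply pre-cleaning, MySQL gap repair, and whitespace compression in order."""
    sql = strip_semicolon(sql_input)
    sql = mysql_remove_nulls(sql)
    sql = mysql_bool_to_int(sql)
    return compress_whitespace(sql)


raw = "SELECT name  FROM users WHERE active = TRUE ORDER BY age DESC NULLS LAST ;"
print(normalize_for_mysql(raw))
```

Dropping NULLS LAST makes the statement executable but can change where NULL rows appear in the ordering, which is one reason the algorithm only aims to preserve the original semantics as far as possible.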

3.5. Nabil Model Syntax Tree Abstraction Process

To ensure the reliability and semantic consistency of large-scale SQL in the automatic conversion process, we need to parse the SQL string of a specific database into a dialect-independent syntax tree and then safely rewrite it into logically equivalent SQL that other databases can execute directly. The structure of the syntax tree abstraction process in the proposed Nabil model is shown in Figure 6.

As Figure 6 shows, the syntax tree abstraction process first converts the original SQL into a syntax tree that is independent of any database dialect and then regenerates equivalent SQL from this syntax tree according to the grammatical rules of the target database. This automatically handles differences such as keyword case, function aliases, and quotation mark styles while keeping the order of operations and the query logic intact. If a dialect conflict is encountered during parsing or generation, an error is thrown so that the process can promptly fall back to a more relaxed parsing strategy and report the problem to the upstream and downstream components. As a result, a single input SQL query yields multiple SQL queries, each of which can be executed directly in its target database.
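The idea of regenerating SQL from a dialect-independent tree can be illustrated with a toy example. The sketch below is not the Nabil implementation: the node classes (Column, Literal, Concat, Select) and the render functions are hypothetical, the tree is built by hand rather than parsed, and only two dialect differences are covered, namely identifier quoting and string concatenation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Column:
    name: str


@dataclass
class Literal:
    value: str


@dataclass
class Concat:
    parts: List[object]  # Column, Literal, or nested Concat nodes


@dataclass
class Select:
    expr: object
    table: str


SUPPORTED = ("mysql", "postgres", "duckdb")


def render_expr(node, dialect: str) -> str:
    """Render one expression node using the target dialect's rules."""
    if isinstance(node, Column):
        quote = "`" if dialect == "mysql" else '"'
        return f"{quote}{node.name}{quote}"
    if isinstance(node, Literal):
        return "'" + node.value.replace("'", "''") + "'"
    if isinstance(node, Concat):
        parts = [render_expr(p, dialect) for p in node.parts]
        if dialect == "mysql":
            return "CONCAT(" + ", ".join(parts) + ")"
        return " || ".join(parts)
    raise ValueError(f"unsupported node: {node!r}")


def render(select: Select, dialect: str) -> str:
    """Regenerate SQL for one target dialect; unknown dialects raise an error."""
    if dialect not in SUPPORTED:
        raise ValueError(f"unknown dialect: {dialect}")
    return f"SELECT {render_expr(select.expr, dialect)} FROM {select.table}"


# One tree, several dialect-specific outputs.
tree = Select(Concat([Column("first_name"), Literal(" "), Column("last_name")]), "users")
outputs = {dialect: render(tree, dialect) for dialect in SUPPORTED}
for dialect, sql in outputs.items():
    print(dialect, "->", sql)
```

Raising an error for an unknown dialect or node mirrors the fallback behavior described above: the caller can catch the error and retry with a more permissive strategy.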

4. Experimental Results and Analysis

In this section, we describe the datasets and evaluation metrics used in the experiments and report the performance comparison, ablation comparison, and large language model comparison experiments. We used SpikingJelly to construct the brain-inspired spiking neural network and DeepSeek, Qwen, and ChatGLM as the large language models. The system ran Ubuntu 22.04.5 on an NVIDIA GeForce RTX 3070 Ti Laptop GPU (NVIDIA Corporation, Santa Clara, United States) with 8 GB of video memory, and the development language was Python 3.11.11.

The DeepSeek version used was DeepSeek V3.1 (671 B parameters), the Qwen version was Qwen3-Coder-480B-A35B-Instruct (480 B parameters), and the ChatGLM version was GLM-Z1-32B-0414 (32 B parameters). All three large language models were accessed via API calls, so their resource overhead consisted primarily of API call fees, whereas the resource overhead of SpikingJelly came primarily from local hardware resources.

4.1. Datasets

In our research on the Nabil model, we used the BIRD (BIG Bench for Large-Scale Database Grounded Text-to-SQL Evaluation) dataset. This cross-domain dataset pairs each question with an SQL statement, for a total of 12,751 text-to-SQL pairs. It comprises 95 large databases spanning 37 domains, including blockchain, healthcare, and education, and occupies 33.4 GB of storage space [37,39].

In the BIRD dataset, ‘database_description’ contains CSV files describing the database schema and its values for model exploration or reference. ‘sqlite’ holds the database content within BIRD. ‘data’ stores the text-to-SQL pairs, together with their oracle knowledge evidence, as a JSON file (‘dev.json’). Table 2 describes the fields in this JSON file.

As the fields in Table 2 show, the BIRD dataset is characterized by complex and diverse data structures. It is a challenging Text-to-SQL benchmark built on complex real-world databases and a key dataset for evaluating the generalization, robustness, and practical applicability of large models.
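As a small illustration of how the fields in Table 2 might be consumed, the sketch below parses a dev.json-style record and pairs the question with its evidence. The record is invented for illustration and is not taken from BIRD, and the function load_pairs as well as the way the evidence is appended to the question are assumptions rather than the procedure used in our experiments.

```python
import json

# Hypothetical record in the dev.json layout (invented, not from BIRD).
raw = """
[
  {
    "db_id": "toy_school",
    "question": "How many students are adults?",
    "evidence": "adults refers to age >= 18",
    "SQL": "SELECT COUNT(*) FROM student WHERE age >= 18"
  }
]
"""


def load_pairs(text: str):
    """Return (db_id, model_input, gold_sql) tuples from dev.json-style text."""
    pairs = []
    for record in json.loads(text):
        # Append the evidence so the model sees the external knowledge.
        model_input = f"{record['question']} Evidence: {record['evidence']}"
        pairs.append((record["db_id"], model_input, record["SQL"]))
    return pairs


pairs = load_pairs(raw)
print(pairs[0])
```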

4.2. Evaluation Metrics

During our experiments, we used two evaluation metrics: EX (Execution Accuracy) and VES (Valid Efficiency Score). EX is the proportion of examples in the evaluation set for which the predicted SQL statement returns the correct result. It is calculated as shown in Formula (1).

$$EX = \frac{\sum_{n=1}^{N} l(V_{n}, \hat{V}_{n})}{N}$$

(1)

In Formula (1) [39], $N$ is the number of examples in the evaluation set, $V_{n}$ is the result set returned by the $n$-th ground-truth SQL statement, $\hat{V}_{n}$ is the result set returned by the corresponding predicted SQL statement, and $l$ is the indicator function defined in Formula (2).

$$l(V, \hat{V}) = \begin{cases} 1, & V = \hat{V} \\ 0, & V \neq \hat{V} \end{cases}$$

(2)
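Formulas (1) and (2) can be sketched in a few lines of code. The example below is illustrative only: it uses an in-memory SQLite database from the Python standard library rather than the engines used in our experiments, the toy schema and queries are invented, and treating a prediction that fails to execute as incorrect is an assumption.

```python
import sqlite3


def indicator(gold_rows, pred_rows) -> int:
    """l(V, V_hat): 1 if the two result sets are equal, otherwise 0."""
    return 1 if set(gold_rows) == set(pred_rows) else 0


def execution_accuracy(conn, pairs) -> float:
    """EX over (gold_sql, predicted_sql) pairs run on one connection."""
    hits = 0
    for gold_sql, pred_sql in pairs:
        gold_rows = conn.execute(gold_sql).fetchall()
        try:
            pred_rows = conn.execute(pred_sql).fetchall()
        except sqlite3.Error:
            continue  # assumption: a failing prediction scores 0
        hits += indicator(gold_rows, pred_rows)
    return hits / len(pairs)


# Toy database (hypothetical schema and rows, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ann", 20), ("Bo", 17), ("Cy", 22)],
)

sql_pairs = [
    # Different text, same result set: counts as correct.
    ("SELECT name FROM student WHERE age >= 18",
     "SELECT name FROM student WHERE age > 17"),
    # Different result (3 vs. 2): counts as incorrect.
    ("SELECT COUNT(*) FROM student",
     "SELECT COUNT(*) FROM student WHERE age > 18"),
    # Misspelled column: execution error, counts as incorrect.
    ("SELECT name FROM student", "SELECT nam FROM student"),
]

ex = execution_accuracy(conn, sql_pairs)
print(f"EX = {ex:.3f}")
```

Comparing result sets rather than SQL text is what lets two differently written but equivalent queries both count as correct.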

VES measures the efficiency of valid SQL statements, that is, predicted SQL statements whose execution results are consistent with those of the ground-truth SQL statements. It is calculated as shown in Formula (3).

$$VES = \frac{\sum_{n=1}^{N} l(V_{n}, \hat{V}_{n}) \cdot R(Y_{n}, \hat{Y}_{n})}{N}, \quad R(Y_{n}, \hat{Y}_{n}) = \sqrt{\frac{E(Y_{n})}{E(\hat{Y}_{n})}}$$

(3)

In Formula (3) [39], $Y_{n}$ is the ground-truth SQL statement, $\hat{Y}_{n}$ is the predicted SQL statement, $R(Y_{n}, \hat{Y}_{n})$ is the execution efficiency of the predicted SQL statement relative to the ground-truth SQL statement, and $E(\cdot)$ is a function that measures the absolute execution efficiency of an SQL statement in a specified environment. In this experiment, we primarily calculated the execution efficiency based on the runtime in the current hardware environment.
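The following sketch works through Formula (3) on a handful of made-up measurements. It assumes that E is the measured runtime, so that R is the square root of the ground-truth runtime divided by the predicted runtime; the runtimes and correctness flags are placeholders, not measurements from our experiments.

```python
import math


def relative_efficiency(gold_runtime: float, pred_runtime: float) -> float:
    """R(Y, Y_hat) = sqrt(E(Y) / E(Y_hat)), with E taken as runtime."""
    return math.sqrt(gold_runtime / pred_runtime)


def valid_efficiency_score(records) -> float:
    """VES over (is_correct, gold_runtime, pred_runtime) records."""
    total = 0.0
    for is_correct, gold_runtime, pred_runtime in records:
        if is_correct:  # l(V, V_hat) = 1 only for correct predictions
            total += relative_efficiency(gold_runtime, pred_runtime)
    return total / len(records)


# Placeholder runtimes in milliseconds (invented for illustration).
records = [
    (True, 40, 10),   # correct and 4x faster: contributes sqrt(4) = 2
    (True, 20, 20),   # correct, same speed: contributes 1
    (False, 10, 10),  # incorrect: contributes 0
    (True, 10, 40),   # correct but 4x slower: contributes sqrt(1/4) = 0.5
]

ves = valid_efficiency_score(records)
print(f"VES = {ves:.3f}")  # (2 + 1 + 0 + 0.5) / 4 = 0.875
```

Because incorrect predictions contribute zero, VES can never reward a fast but wrong query, while a correct query that runs faster than the ground truth contributes more than 1.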

4.3. Performance Comparison Experiment

In our previous Text-to-SQL research, we proposed NL2PY2SQL, an algorithm based on a large language model [37]. This algorithm first converts natural language into Python code and then uses the Python code to generate SQL, and it achieved promising results in experiments. To validate the effectiveness of the proposed Nabil model, we compared it on the BIRD dataset with NL2PY2SQL, SuperSQL, SFT_CodeS, DAILSQL_SC, and C3_SQL, using EX and VES as metrics. The results are shown in Figure 7.

In Figure 7a, the Nabil model outperforms the runner-up in EX by approximately 4.4 percentage points, demonstrating its superior generalization and robustness when handling the complex real-world database scenarios in BIRD. Figure 7b shows that the Nabil model also leads in VES, with a significant advantage over the other models, demonstrating its accuracy, efficiency, and reliability in real-world applications.

These results demonstrate that the Nabil model has a significant advantage in both execution accuracy and valid efficiency score on the BIRD real-world database benchmark. They also indicate that the Nabil model, with its design advantages in complex semantic understanding, multimodal fusion, and SQL generation, has strong potential for practical engineering applications, making it one of the best-performing Text-to-SQL solutions for BIRD scenarios.

4.4. Ablation Comparison Experiment

The proposed Nabil model consists of an NL semantic encoding layer, a multimodal feature fusion layer, a candidate SQL generation layer, and a champion screening layer. To understand the contribution of each layer, we conducted four sets of ablation experiments, each comparing the complete Nabil model with a variant from which one layer was removed: the NL semantic encoding layer (first set), the multimodal feature fusion layer (second set), the candidate SQL generation layer (third set), and the champion screening layer (fourth set). The ablation experiments also used EX and VES as metrics. The results are shown in Figure 8.

Figure 8a shows that removing the NL semantic encoding layer results in a 7.2% decrease in execution accuracy, demonstrating that the brain-inspired spiking neural network’s ability to capture fine-grained semantic features is crucial for the Nabil model. Figure 8b shows that removing the candidate SQL generation layer and the multimodal feature fusion layer results in drops of 9.6% and 9.3% in VES, respectively. This demonstrates the significance of prompt search space optimization and multimodal fusion in improving the overall efficiency of the Nabil model.

The ablation results demonstrate the importance of each key module in the Nabil model. Among them, the NL semantic encoding layer and the champion screening layer have the most significant impact on execution accuracy and valid efficiency score, with maximum performance drops of approximately 7.2% and 12.7%, respectively. This demonstrates that the proposed brain-inspired spiking neural network feature encoding and the champion screening strategy based on SQL equivalence verification are crucial for improving the overall performance of the Text-to-SQL model. Furthermore, the multimodal feature fusion layer and the candidate SQL generation layer significantly improve the model’s generalization and execution efficiency, highlighting the value of the feature fusion mechanism and the prompt optimization strategy proposed in this study.

4.5. Large Language Model Comparison Experiment

In this paper, the large language model is primarily used in the candidate SQL generation layer and the champion screening layer. In the candidate SQL generation layer, the large language model is responsible for generating prompt templates based on semantic features and batch-generating multiple candidate SQL statements (Top-K). In the champion screening layer, the large language model is responsible for generating minimum discriminant test data and assisting in equivalence determination (when database engine execution results are inconsistent, the LLM judgment is used as a reference).

Our multi-layered prompt design consists of structured prompts, few-shot prompts, schema-aware prompts, and task-decomposition prompts. Structured prompts explicitly require the large language model to output standard SQL. Few-shot prompts include NL-SQL examples from the training set in the input. Schema-aware prompts incorporate database schema information into the prompt. Task-decomposition prompts generate SQL step by step, followed by correction and verification.
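The four prompt types can be combined into a single prompt, as the following minimal sketch suggests. The function build_prompt, the wording of every instruction line, and the example schema and question are hypothetical; they illustrate the layering only and are not the templates used in our experiments.

```python
def build_prompt(question: str, schema: str, examples) -> str:
    """Assemble a layered prompt from hypothetical template pieces."""
    lines = [
        # Structured prompt: constrain the output format.
        "You are a text-to-SQL assistant. Return exactly one standard SQL query and nothing else.",
        "",
        # Schema-aware prompt: include the database schema.
        "Database schema:",
        schema,
        "",
    ]
    # Few-shot prompt: add NL-SQL example pairs.
    for example_question, example_sql in examples:
        lines += [f"Question: {example_question}", f"SQL: {example_sql}", ""]
    lines += [
        # Task-decomposition prompt: ask for stepwise construction and checking.
        "Work step by step: choose the tables, then the filters, then write and verify the query.",
        f"Question: {question}",
        "SQL:",
    ]
    return "\n".join(lines)


schema = "CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)"
examples = [("How many users are there?", "SELECT COUNT(*) FROM users")]
prompt = build_prompt("List the names of users older than 30.", schema, examples)
print(prompt)
```

Keeping each layer as a separate block of lines makes it straightforward to drop or swap one prompt type when comparing configurations.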

To ensure fairness and reproducibility, we standardized the settings for comparative experiments involving three large language models (DeepSeek-V3.1, Qwen3-Coder-480B-A35B-Instruct, and GLM-Z1-32B-0414). All experiments were conducted using the same dataset and database engine, consistent prompt templates (structured, few-shot, schema-aware, task decomposition), and uniform generation parameters (temperature = 0.2, top_p = 0.9, max_tokens = 1024, candidate number n = 5). To ensure statistical significance and reproducibility, each model was independently run three times, and the mean and standard deviation of the execution accuracy (EX) and valid efficiency score (VES) were calculated. The experimental results are shown in Figure 9.
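The standardized settings and the aggregation over three runs can be expressed compactly. In the sketch below, the dictionary keys mirror the generation parameters listed above, while the per-run scores are placeholder numbers rather than results from our experiments, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Uniform generation parameters shared by all three models.
GENERATION_PARAMS = {
    "temperature": 0.2,
    "top_p": 0.9,
    "max_tokens": 1024,
    "n": 5,  # number of candidate SQL statements per question
}


def summarize(scores):
    """Return the mean and sample standard deviation of per-run scores."""
    return mean(scores), stdev(scores)


# Placeholder EX values for three independent runs (not real results).
placeholder_ex_runs = [0.60, 0.62, 0.61]
ex_mean, ex_std = summarize(placeholder_ex_runs)
print(f"EX = {ex_mean:.3f} ± {ex_std:.3f}")
```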

Figure 9 shows significant differences in the performance of the three large language models on the BIRD dataset. DeepSeek-V3.1 (671 B parameters) achieved the highest scores in both EX and VES, demonstrating stronger capabilities in complex query generation and stability. Qwen3-Coder-480B-A35B-Instruct (480 B parameters) came in second, maintaining good performance at a lower parameter size. GLM-Z1-32B-0414 (32 B parameters) lagged significantly behind in both metrics, indicating its limited ability to handle complex database scenarios. Overall, larger models and stronger semantic modeling capabilities are associated with better accuracy and efficiency in Text-to-SQL tasks, but this also comes at the cost of higher resource consumption.

In this section, we provide a detailed description and analysis of the Nabil model’s experimental process from the perspectives of dataset content, evaluation metrics, performance comparison experiments, ablation comparison experiments, and large language model comparison experiments. The experimental results demonstrate that the Nabil model has considerable application value in text-to-SQL engineering.

5. Conclusions

In this paper, to improve the accuracy, generalization, and robustness of Text-to-SQL models, we propose Nabil, a Text-to-SQL model based on brain-inspired computing techniques and large language models. The NL semantic encoding layer of the Nabil model uses a brain-inspired spiking neural network (SNN) to capture fine-grained semantic features. The multimodal feature fusion layer fuses the spiking features output by the SNN with the high-level semantic features output by the large language model (LLM). The candidate SQL generation layer narrows the invalid search space of the LLM by controlling prompt templates, examples, and structure. Finally, the champion screening layer selects the optimal SQL based on SQL equivalence verification. Performance comparison and ablation experiments demonstrate that the proposed Nabil model exhibits superior generalization and robustness in complex, real-world database scenarios, achieving the best performance among the compared methods. However, due to limited functional support and the inevitable hallucinations of large language models, the proportion of executable verifications still leaves considerable room for improvement. In future work, we plan to study Text-to-SQL models and components with more comprehensive functionality. In addition, given the low power consumption of brain-inspired computing, we plan to carry out related research on Text-to-SQL at the edge.

Author Contributions

F.Z. and S.H. wrote the main manuscript text; N.L. and X.D. provided the idea; T.Z., Y.Z., S.S., X.L. and H.Z. prepared the data and figures. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by special funding for the joint training of postdoctoral researchers from the Fudan University Postdoctoral Mobile Station and China Electronics Jinxin Digital Technology Group Co., Ltd.

Data Availability Statement

The dataset is available at https://bird-bench.github.io/ (accessed on 25 March 2025).

Acknowledgments

Thanks to China Electronics Jinxin Digital Technology Group Co., Ltd. for its strong support of the project.

Conflicts of Interest

Authors Xiaozheng Du and Nan Li were employed by the company GienTech Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Wang, J.; Liu, S.; Chen, Z.; Shen, T.; Wang, Y.; Yin, R.; Liu, H.; Liu, C.; Shen, C. Ultrasensitive electrospinning fibrous strain sensor with synergistic conductive network for human motion monitoring and human-computer interaction. J. Mater. Sci. Technol. 2025, 213, 213–222.
2. Meng, Q.; Yan, Z.; Abbas, J.; Shankar, A.; Subramanian, M. Human–computer interaction and digital literacy promote educational learning in pre-school children: Mediating role of psychological resilience for kids’ mental well-being and school readiness. Int. J. Hum.-Comput. Interact. 2025, 41, 16–30.
3. Mehonic, A.; Kenyon, A.J. Brain-inspired computing needs a master plan. Nature 2022, 604, 255–260.
4. Guo, J.; Zhan, Z.; Gao, Y.; Xiao, Y.; Lou, J.G.; Liu, T.; Zhang, D. Towards complex text-to-sql in cross-domain database with intermediate representation. arXiv 2019, arXiv:1905.08205.
5. Sen, J.; Lei, C.; Quamar, A.; Özcan, F.; Efthymiou, V.; Dalmia, A.; Stager, G.; Mittal, A.; Saha, D.; Sankaranarayanan, K. Athena++ natural language querying for complex nested sql queries. Proc. VLDB Endow. 2020, 13, 2747–2759.
6. Liu, J.; Cui, Q.; Cao, H.; Shi, T.; Zhou, M. Auto-conversion from Natural Language to Structured Query Language using Neural Networks Embedded with Pre-training and Fine-tuning Mechanism. In Proceedings of the 2020 Chinese Automation Congress (CAC), Shanghai, China, 6–8 November 2020; IEEE: New York City, NY, USA; pp. 6651–6654.
7. Marcus, R.; Negi, P.; Mao, H.; Tatbul, N.; Alizadeh, M.; Kraska, T. Bao: Making learned query optimization practical. In Proceedings of the 2021 International Conference on Management of Data, Xi’an, China, 20–25 June 2021; pp. 1275–1288.
8. Cao, R.; Chen, L.; Chen, Z.; Zhao, Y.; Zhu, S.; Yu, K. LGESQL: Line graph enhanced text-to-SQL model with mixed local and non-local relations. arXiv 2021, arXiv:2106.01093.
9. Sioulas, P.; Ailamaki, A. Scalable multi-query execution using reinforcement learning. In Proceedings of the 2021 International Conference on Management of Data, Xi’an, China, 20–25 June 2021; pp. 1651–1663.
10. Ahkouk, K.; Machkour, M.; Ennaji, M. Data agnostic RoBERTa-based natural language to SQL query generation. In Proceedings of the IEEE 6th International Conference for Convergence in Technology (I2CT), Pune, India, 2–4 April 2021.
11. Zhao, C.; Su, Y.; Pauls, A.; Platanios, E.A. Bridging the generalization gap in text-to-SQL parsing with schema expansion. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 5568–5578.
12. Hui, B.; Geng, R.; Wang, L.; Qin, B.; Li, B.; Sun, J.; Li, Y. S²SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers. arXiv 2022, arXiv:2203.06958.
13. Yu, X.; Chai, C.; Li, G.; Liu, J. Cost-based or learning-based? A hybrid query optimizer for query plan selection. Proc. VLDB Endow. 2022, 15, 3924–3936.
14. Gan, Y.; Chen, X.; Huang, Q.; Purver, M. Measuring and improving compositional generalization in text-to-sql via component alignment. arXiv 2022, arXiv:2205.02054.
15. Fu, H.; Liu, C.; Wu, B.; Li, F.; Tan, J.; Sun, J. Catsql: Towards real world natural language to sql applications. Proc. VLDB Endow. 2023, 16, 1534–1547.
16. Chen, Z.; Chen, S.; White, M.; Mooney, R.; Payani, A.; Srinivasa, J.; Su, Y.; Sun, H. Text-to-SQL error correction with language models of code. arXiv 2023, arXiv:2305.13073.
17. Gu, Z.; Fan, J.; Tang, N.; Cao, L.; Jia, B.; Madden, S.; Du, X. Few-shot text-to-sql translation using structure and content prompt learning. Proc. ACM Manag. Data 2023, 1, 1–28.
18. Giaquinto, R.; Zhang, D.; Kleiner, B.; Li, Y.; Tan, M.; Bhatia, P.; Nallapati, R.; Ma, X. Multitask pretraining with structured knowledge for text-to-SQL generation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 11067–11083.
19. Lee, K.; Dutt, A.; Narasayya, V.; Chaudhuri, S. Analyzing the impact of cardinality estimation on execution plans in microsoft SQL server. Proc. VLDB Endow. 2023, 16, 2871–2883.
20. Ba, J.; Rigger, M. Testing database engines via query plan guidance. In Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), Melbourne, Australia, 14–20 May 2023; IEEE: New York City, NY, USA; pp. 2060–2071.
21. Chen, T.; Gao, J.; Chen, H.; Tu, Y. Loger: A learned optimizer towards generating efficient and robust query execution plans. Proc. VLDB Endow. 2023, 16, 1777–1789.
22. Li, R.; Zhao, K.; Yu, J.X.; Wang, G. CardOOD: Robust Query-driven Cardinality Estimation under Out-of-Distribution. arXiv 2024, arXiv:2412.05864.
23. Fan, Y.; Ren, T.; Huang, C.; He, Z.; Wang, X.S. Grounding Natural Language to SQL Translation with Data-Based Self-Explanations. arXiv 2024, arXiv:2411.02948.
24. Fan, J.; Gu, Z.; Zhang, S.; Zhang, Y.; Chen, Z.; Cao, L.; Li, G.; Madden, S.; Du, X.; Tang, N. Combining small language models and large language models for zero-shot NL2SQL. Proc. VLDB Endow. 2024, 17, 2750–2763.
25. Liu, C.; Liao, W.; Xu, Z. Research on natural language query to SQL method with fused table structure. In Proceedings of the 2024 5th International Conference on Computer Engineering and Application (ICCEA), Hangzhou, China, 12–14 April 2024; IEEE: New York City, NY, USA; pp. 564–567.
26. Kim, H.; Jeon, T.; Choi, S.; Choi, S.; Cho, H. FLEX: Expert-level False-Less EXecution Metric for Reliable Text-to-SQL Benchmark. arXiv 2024, arXiv:2409.19014.
27. Mao, W.; Wang, R.; Guo, J.; Zeng, J.; Gao, C.; Han, P.; Liu, C. Enhancing Text-to-SQL Parsing through Question Rewriting and Execution-Guided Refinement. In Proceedings of the Findings of the Association for Computational Linguistics ACL 2024, Bangkok, Thailand, 11–16 August 2024; pp. 2009–2024.
28. Xie, X.; Xu, G.; Zhao, L.; Guo, R. OpenSearch-SQL: Enhancing Text-to-SQL with Dynamic Few-shot and Consistency Alignment. arXiv 2025, arXiv:2502.14913.
29. Chen, K.; Chen, Y.; Koudas, N.; Yu, X. Reliable Text-to-SQL with Adaptive Abstention. Proc. ACM Manag. Data 2025, 3, 1–30.
30. Castelein, J.; Aniche, M.; Soltani, M.; Panichella, A.; van Deursen, A. Search-based test data generation for SQL queries. In Proceedings of the 40th International Conference on Software Engineering, Gothenburg, Sweden, 27 May–3 June 2018; pp. 1220–1230.
31. Chu, S.; Murphy, B.; Roesch, J.; Cheung, A.; Suciu, D. Axiomatic foundations and algorithms for deciding semantic equivalences of SQL queries. arXiv 2018, arXiv:1802.02229.
32. Zhou, Q.; Arulraj, J.; Navathe, S.; Harris, W.; Xu, D. Automated verification of query equivalence using satisfiability modulo theories. Proc. VLDB Endow. 2019, 12, 1276–1288.
33. Zhou, Q.; Arulraj, J.; Navathe, S.B.; Harris, W.; Wu, J. SPES: A symbolic approach to proving query equivalence under bag semantics. In Proceedings of the 2022 IEEE 38th International Conference on Data Engineering (ICDE), Kuala Lumpur, Malaysia, 9–12 May 2022; IEEE: New York City, NY, USA; pp. 2735–2748.
34. Wang, S.; Pan, S.; Cheung, A. QED: A Powerful Query Equivalence Decider for SQL. Proc. VLDB Endow. 2024, 17, 3602–3614.
35. He, Y.; Zhao, P.; Wang, X.; Wang, Y. VeriEQL: Bounded Equivalence Verification for Complex SQL Queries with Integrity Constraints. Proc. ACM Program. Lang. 2024, 8, 1071–1099.
36. Zhang, Y.; Qu, P.; Ji, Y.; Zhang, W.; Gao, G.; Wang, G.; Song, S.; Li, G.; Chen, W.; Zheng, W.; et al. A system hierarchy for brain-inspired computing. Nature 2020, 586, 378–384.
37. Du, X.; Hu, S.; Zhou, F.; Wang, C.; Nguyen, B.M. FI-NL2PY2SQL: Financial Industry NL2SQL Innovation Model Based on Python and Large Language Model. Future Internet 2025, 17, 12.
38. Du, X.; Guo, X.; Zhou, F.; Gu, M.; Lu, Z.; Wang, C. FinDS2: A Novel Data Synthesis System for Fintech Product Risks. In Proceedings of the 2024 IEEE 11th International Conference on Cyber Security and Cloud Computing (CSCloud), Shanghai, China, 28–30 June 2024; IEEE: New York City, NY, USA; pp. 73–78.
39. Li, J.; Hui, B.; Qu, G.; Yang, J.; Li, B.; Li, B.; Wang, B.; Qin, B.; Geng, R.; Huo, N.; et al. Can llm already serve as a database interface? a big bench for large-scale database grounded text-to-sqls. Adv. Neural Inf. Process. Syst. 2023, 36, 42330–42357.

Figure 1. Nabil model application scenarios. Source: author’s contribution.

Figure 2. The spiking neural network structure of the NL semantic encoding layer. Source: author’s contribution.

Figure 3. The structure of the multimodal feature fusion layer. Source: author’s contribution.

Figure 4. The structure of the candidate SQL generation layer. Source: author’s contribution.

Figure 5. The champion model structure. Source: author’s contribution.

Figure 6. The structure of the Nabil model syntax tree abstraction process. Source: author’s contribution.

Figure 7. Performance comparison experiment results. (a) Performance comparison results using the EX metric. (b) Performance comparison results using the VES metric.

Figure 8. Ablation comparison experiment results. (a) Ablation comparison results using the EX metric. (b) Ablation comparison results using the VES metric.

Figure 9. Comparison results of large language models. (a) Comparison results using the EX metric. (b) Comparison results using the VES metric.

Table 1. Related Research Statistics. Source: author’s contribution.

Researcher	Research Content	Advantages	Disadvantages
Guo, J. [4]	Text-to-SQL	Modular and interpretable	Limited expressive capabilities
Chen, T. [21]	Query optimization	Strong robustness	Only supports SPJ queries
Mao, W. [27]	Text-to-SQL	Reduce natural language ambiguity	Risk of excessive rewriting
Xie, X. [28]	Text-to-SQL	Few-shot automatic expansion	Limited SQL-like capabilities
Chen, K. [29]	Text-to-SQL	Improve accuracy and reliability	Excessive abstention
Wang, S. [34]	SQL equivalence verification	Fast reasoning speed	Does not support complex queries
He, Y. [35]	SQL equivalence verification	Considering integrity constraints	Does not support complex queries

Table 2. Description of each field in datasets. Source: [37,39].

Field	Description
db_id	Database Name
question	Question curated by human crowdsourcing based on the database descriptions and content.
evidence	External knowledge evidence annotated by experts to assist models or SQL annotators.
SQL	SQL annotated by crowdsourced annotators, with reference to the database descriptions and content, that accurately answers the question.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

