Article

Complex Table Question Answering with Multiple Cells Recall Based on Extended Cell Semantic Matching

School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2025, 9(10), 265; https://doi.org/10.3390/bdcc9100265
Submission received: 21 August 2025 / Revised: 9 October 2025 / Accepted: 10 October 2025 / Published: 20 October 2025

Abstract

Tables, as a form of structured or semi-structured data, are widely found in documents, reports, and data manuals. Table-based question answering (TableQA) plays a key role in table document analysis and understanding. Existing approaches to TableQA can be broadly categorized into content-matching methods and end-to-end generation methods based on encoder–decoder deep neural networks. Content-matching methods return one or more table cells as answers, thereby preserving the original data and making them more suitable for downstream tasks. End-to-end methods, especially those leveraging large language models (LLMs), have achieved strong performance on various benchmarks. However, the variability in LLM-generated expressions and their heavy reliance on prompt engineering limit their applicability where answer fidelity to the source table is critical. In this work, we propose CBCM (Cell-by-Cell semantic Matching), a fine-grained cell-level matching method that extends the traditional row- and column-matching paradigm to improve accuracy and applicability in TableQA. Furthermore, based on the public IM-TQA dataset, we construct a new benchmark, IM-TQA-X, specifically designed for the multi-row and multi-column cell recall task, a scenario underexplored in existing state-of-the-art content-matching methods. Experimental results show that CBCM improves overall accuracy by 2.5% over the latest row- and column-matching method RGCNRCI (Relational Graph Convolutional Networks based Row and Column Intersection), and boosts accuracy in the multi-row and multi-column recall task from 4.3% to 34%.

1. Introduction

Tables are integral to various sectors such as finance, healthcare, and engineering due to their ability to present data concisely and efficiently [1]. Table Question Answering (TQA) technology facilitates the extraction of valuable information from numerous tables through natural language interaction [2,3,4,5]. Developing table question answering models is essential in document data analysis and processing, capturing significant interest in artificial intelligence research. Current methodologies for TQA primarily fall into two categories: methods based on table content matching and end-to-end generation methods using encoder-decoder neural networks [6,7].
Despite advancements, significant research gaps exist within the domain of TQA. Fine-grained table content matching methods, which use cells as search objects, can efficiently pinpoint answers; however, they struggle with complex table datasets that contain cross-row and cross-column cells [1,8,9,10,11,12]. This discrepancy highlights a critical research question regarding how best to address complex table layouts that traditional seq2seq architectures, designed mainly for simpler tables [13], fail to handle adequately.
Moreover, the emergence of transformers introduced table serialization methods, which concatenate questions and tables into sequences for encoding [14,15,16,17,18]. While transformers offer a robust framework, inherent constraints such as input sequence length limitations and impaired interpretability during encoding detract from their effectiveness [19]. The challenge of maintaining structural integrity during serialization poses another research question: How can the table serialization method be adapted to better preserve the structural nuances of complex tables?
Coarse-grained matching methods present a more efficient alternative, leveraging row-and-column intersecting points as answers [10,20], yet inadequately manage scenarios involving complex multi-line cell retrieval [15,17,21,22]. This deficiency surfaces a further research problem: How can accuracy be improved for table question types that require more nuanced recall, avoiding the inadvertent inclusion of irrelevant data points in the results? The empirical example using the IM-TQA [10] dataset illustrates this dilemma vividly.
The advent of Large Language Models (LLMs) such as GPT [23] and Ernie Bot has paved the way for end-to-end solutions that generate answer texts directly [24,25]. LLM approaches outperform existing methods but introduce challenges such as difficulty in verifying the correctness of diverse text expressions and variability induced by prompt wording [26]. This poses a significant hypothesis: Could refining semantic matching within cell-based matching approaches offer more reliable and interpretable results across diverse question types?
By focusing on table content matching-based methods, this paper seeks to address these research issues directly. We propose a method centered on cell semantic matching, evaluating each cell iteratively to ascertain its relevance as a correct answer to a given question, and subsequently returning all relevant answer cells. We hypothesize that this method offers superior interpretability and robustness compared to existing coarse-grained approaches, and effectively handles varied question distribution scenarios.
Our study aims to bridge the identified gaps by refining current methodologies and enhancing the efficiency and accuracy of TQA models. This pursuit is crucial for advancing document data analysis and ensuring applicability in practical, real-world scenarios.
The remainder of this paper is organized as follows: Section 2 reviews related work on table question answering and table cell semantic matching. Section 3 presents the methodology of the proposed CBCM model. Validation experiments are discussed in Section 4. Section 5 includes the discussion, while conclusions are summarized in Section 6. (All the code and data can be obtained from https://github.com/s-dq/CBCM_tableqa (accessed on 20 August 2025)).

2. Related Work

(1)
Table question answering
Early table question answering used semantic parsing: the question is encoded and decoded into a structured query statement, which is then executed on the structured table data to obtain the answer. However, this method applies only to structured tables. After the emergence of the transformer, the table is flattened into a sequence by reading cells from left to right and top to bottom, spliced with the question, and then encoded and decoded, with the model outputting a probability for each cell [15,17]. Note that this left-to-right, top-to-bottom flattening rule causes structural disorder when the table contains cross-row and cross-column cells. Adding further embedding layers such as col_id and row_id preserves the table structure information more completely [18]. To address the excessive text length produced by serializing large tables, a sparse attention mechanism was designed that raises the input length limit by reducing the amount of computation [19].
To streamline the matching procedure, row and column matching is employed rather than cell matching: the table question answering task is transformed into finding the rows and columns related to the question, and the intersection of those rows and columns is the answer [10,20,27]. Building on row and column matching, the semantic information in the row and column text is exploited to provide additional evidence for the text classification task and to improve the accuracy of row and column matching [10].
(2)
Text classification
In table question answering, identifying the target cells can, to some extent, be treated as a cell text classification task. Because text is unstructured data, applying mathematical modeling methods to text classification first requires converting the text into structured features through methods such as TF-IDF [28], word2vec [29], and BERT [30].
Once the texts are vectorized, the next step is to choose a suitable classification algorithm. Traditional machine learning methods such as logistic regression, KNN, SVM, and CRF can be used for text classification. Compared with these statistical machine learning algorithms [31,32,33], deep learning methods such as CNNs, DNNs, and RNNs [34,35] rely on their ability to model complex, nonlinear relationships in data and have achieved better results on multiple tasks [36].

3. CBCM Table QA Method

3.1. Cell Type Classification in Table Question Answering

The cells in a table are divided into five types [10]: row attribute, column attribute, row index, column index, and pure data.
Row attribute and column attribute cells are traditional table headers that describe the other cells in the same row or column, respectively; for example, the yellow and red cells in Figure 1.
Row index and column index cells are individual cells used to index data records in the row or column direction; for example, the blue and green cells in Figure 1.
Pure data cells are the core of the table. They neither describe nor index other cells, and their meaning must be understood with the help of the aforementioned attribute or index cells. A minimal sketch of this typology follows.
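To make the typology concrete, the following minimal Python sketch models the five cell types and a table cell record; the class and field names are illustrative assumptions for this paper, not taken from the released code.

from dataclasses import dataclass
from enum import Enum

class CellType(Enum):
    ROW_ATTRIBUTE = "row attribute"        # describes the other cells in its row
    COLUMN_ATTRIBUTE = "column attribute"  # describes the other cells in its column
    ROW_INDEX = "row index"                # indexes a data record in the row direction
    COLUMN_INDEX = "column index"          # indexes a data record in the column direction
    PURE_DATA = "pure data"                # core data, interpreted via attribute/index cells

@dataclass
class Cell:
    row: int            # 0-based row coordinate
    col: int            # 0-based column coordinate
    text: str           # the cell's textual content
    cell_type: CellType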

3.2. Definition of Table Types in Table Question Answering

According to the composition and distribution of cell types, tables are divided into four types [10]: vertical tables (TT1), horizontal tables (TT2), hierarchical tables (TT3), and complex tables (TT4).
Vertical tables (TT1) arrange data in the vertical direction. The first row is the column title, and the other rows are data tuples, as shown in Figure 1A.
Horizontal tables (TT2) arrange data in the horizontal direction. The first column is the row title, and the other columns are data tuples, as shown in Figure 1B.
Hierarchical tables (TT3) arrange data in both vertical and horizontal directions, and the title shows a multi-level hierarchical structure, as shown in Figure 1C.
Complex tables (TT4) mix attribute cells, index cells, and pure data cells; Figure 1D shows an example of a complex table.

3.3. Question Classification in Table Question Answering

According to the distribution of answer cells, we divide questions into three types: single cell query (TQ1), single line query (TQ2), and multi-line query (TQ3).
For a single cell query (TQ1), the answer consists of a single cell; for example, the answer to the question in Figure 2A is the red cell in the table.
For a single line query (TQ2), the answer consists of multiple cells within a single row or column; for example, the answer to the question in Figure 2B is the red cells in the table.
For a multi-line query (TQ3), the answer cells occupy more than one row and more than one column; for example, the answer to the question in Figure 2C is the red cells in the table.
Among these, the multi-line query type includes the questions, illustrated in Figure 2, that cannot be solved by row and column matching. We propose an identification rule, R1: for any given cell in the table, if answer cells appear simultaneously in that cell's row and in its column, then the cell itself must also belong to the answer set. If R1 is satisfied, the row and column matching method can be used for question answering; if not, it cannot. The logical expression is given below, followed by a code sketch.
R1: ∀(i, j) ∈ {(r, c)}: (∃(i, k) ∈ A ∧ ∃(m, j) ∈ A, i ≠ m, k ≠ j) ⇒ (i, j) ∈ A
where {(r, c)} denotes the table with row index r and column index c, (i, j) is the table cell in the i-th row and j-th column, and A is the set of answer cells.
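As a minimal sketch, assuming answer cells are given as (row, column) coordinate pairs, the R1 check can be implemented as follows; the function name and data layout are illustrative, not taken from the released code.

def satisfies_r1(answers):
    """Check rule R1: every cell lying at the intersection of an
    answer row and an answer column must itself be an answer cell."""
    rows = {i for i, _ in answers}
    cols = {j for _, j in answers}
    return all((i, j) in answers for i in rows for j in cols)

# A rectangular answer block is recoverable by row/column intersection ...
assert satisfies_r1({(0, 0), (0, 1), (1, 0), (1, 1)})
# ... but an L-shaped multi-line answer set (TQ3) is not:
assert not satisfies_r1({(0, 0), (0, 1), (1, 0)})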

3.4. Extended Cell-by-Cell Semantic Matching

The CBCM algorithm consists of two stages: first, the extended cell semantic expressions are built; second, semantic classification is performed to determine whether a cell is the answer cell. Specifically, the entire procedure can be organized into three steps, and the overall framework is illustrated in Figure 3. The pseudocode is illustrated in Algorithm 1.
Algorithm 1 CBCM Algorithm
Require: Table T, Question Q
Ensure: Answer Cells A
 1: Step 1: Cell Semantic List Generation
 2: Initialize empty text list TL ← {}
 3: for each cell C_{i,j} in table T do
 4:   Generate description text D_{i,j} for cell C_{i,j} using predefined text generation rules
 5:   Append D_{i,j} to TL
 6: end for
 7: Step 2: Concatenate Texts
 8: Initialize empty text question list TQL ← {}
 9: for each description text D_{i,j} in TL do
10:   Concatenate D_{i,j} with question Q to form text TQ_{i,j}
11:   Append TQ_{i,j} to TQL
12: end for
13: Step 3: Text Binary Classification
14: Initialize empty answer set A ← {}
15: for each text TQ_{i,j} in TQL do
16:   Perform binary classification on TQ_{i,j}
17:   if the classification result is "yes" then
18:     Append cell C_{i,j} to answer set A
19:   end if
20: end for
21: return A
Step 1: Generate a cell semantic list from the table. For any given table, a cell semantic text generation rule produces a description text for each table cell, converting the table into a text list (TL).
Step 2: Concatenate cell semantics and question text. The question text is concatenated to each cell description text to generate a new text list (TQL).
Step 3: Text binary classification. A binary classification task is performed on the new text list (TQL) to determine whether the cell represented by each description text answers the question; all cells classified as "yes" are returned as the result. A compact sketch of the three steps follows.
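The pipeline can be sketched in a few lines of Python, assuming a describe(cell, table) helper that implements the cell semantic text generation rule (e.g., CT3 in Section 4.2) and a trained classify(text) binary classifier; both helpers are hypothetical placeholders for components described elsewhere in the paper.

def cbcm_answer(table, question, describe, classify):
    # Step 1: generate one description text per cell (the list TL).
    tl = [(cell, describe(cell, table)) for cell in table.cells]
    # Step 2: concatenate each description with the question (the list TQL).
    tql = [(cell, desc + " " + question) for cell, desc in tl]
    # Step 3: keep every cell whose concatenated text is classified "yes".
    return [cell for cell, text in tql if classify(text)]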

4. Experiment

To validate and improve the accuracy of our table question-answering method, we first created a dataset specifically designed for the complex table QA task. Subsequently, we conducted experiments concentrating on two aspects: cell semantic representation styles and cell semantic classification.

4.1. IM-TQA-X Benchmark Dataset

Based on the public IM-TQA dataset [10], we constructed a new benchmark IM-TQA-X for complex multi-line cell recall question answering validation.
We find all complex multi-line cell recall questions by applying the R1 rule defined in Section 3.3. We then identify all the related tables (CQ) and all the questions posed on those tables (each table carries about four questions), and organize these CQ tables and questions into the test set. The new benchmark forces algorithms to give greater consideration to complex multi-line cell recall in table question answering. The construction process is shown in Figure 4, and the pseudocode is given in Algorithm 2, followed by a short code sketch.
It should be noted that the CBCM method considers the answer distribution of all question types. In the original IM-TQA dataset, the overall proportion of multi-line queries (TQ3) is very small, and their proportion in the test set is also very low. To reflect the effectiveness of the CBCM method, more multi-line queries (TQ3) were added to the test set; even after this adjustment, they account for only about 6%, as shown in Table 1. In addition, both the row-and-column matching method and the cell matching method are trained via text binary classification, so the question type does not affect training.
Algorithm 2 IM-TQA-X Benchmark Construction
Require: IM-TQA dataset D
Ensure: New benchmark IM-TQA-X
 1: Step 1: Identify Complex Multi-line Questions
 2: Initialize empty set CQ ← {}
 3: Initialize empty test set TestSet ← {}
 4: for each question q_i in dataset D do
 5:   if question q_i violates the R1 rule (Section 3.3) then
 6:     Add q_i to CQ
 7:   end if
 8: end for
 9: Step 2: Collect Relevant Tables and Questions
10: for each complex question cq_i in CQ do
11:   Find the table t_i related to cq_i
12:   Identify all questions q_{i,j} related to table t_i
13:   for each question q_{i,j} do
14:     Add tuple (t_i, q_{i,j}) to TestSet
15:   end for
16: end for
17: Step 3: Adjust Question Distribution
18: Calculate the initial proportion of multi-line queries in TestSet
19: if the proportion of multi-line queries < the desired proportion then
20:   Add more multi-line queries (TQ3) to TestSet until the desired proportion is met
21: end if
22: Note: Training Consideration
23: for each method (row-column matching, cell matching) do
24:   Use the text binary classification method for training
25: end for
26: return IM-TQA-X = TestSet
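Under the assumption that each question record carries its table id and answer-cell coordinates, Steps 1 and 2 of Algorithm 2 reduce to a short filter; satisfies_r1 is the sketch from Section 3.3, and the record keys are illustrative, not the dataset's actual schema.

def build_im_tqa_x_test_set(questions):
    # Step 1: complex multi-line questions are those whose answers violate R1.
    cq_tables = {q["table_id"] for q in questions
                 if not satisfies_r1({tuple(c) for c in q["answer_cells"]})}
    # Step 2: keep every question asked on any of those tables.
    return [q for q in questions if q["table_id"] in cq_tables]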

4.2. Table Question Answering Performance Validation

To validate the performance of the proposed CBCM algorithm, the latest row and column matching method, RGCNRCI [10], is used as the benchmark. The RGCNRCI algorithm effectively handles implicit structures and multi-type tables: it achieves an overall question-answering accuracy of 53.4% on the IM-TQA dataset, surpassing existing baselines (a 3.8% improvement over RCI-AIT), and it substantially improves performance on complex tables, reaching 32.0% accuracy (an 8.2% improvement over RCI-AIT), thereby addressing the limitations of traditional methods on complex table structures.
The proposed CBCM algorithm comprises two key components: cell semantic expression and cell semantic classification. For comprehensive validation, four table cell context representation methods have been devised as potential options to convey the semantics of table cells. To eliminate variations arising from the use of different semantic classification algorithms, a fully connected neural network is employed as the cell semantic classifier.
The four cell semantic expressions are constructed in order from the largest to the smallest number of related cells used; for any given table cell, they are defined as follows (a sketch of CT3 appears after this list).
  • Full row and column text (CT1): all cells in the current row and all cells in the current column are concatenated in sequence.
  • Full row and column attribute index (CT2): all attribute and index cells in the current row and all attribute and index cells in the current column are concatenated in sequence.
  • Nearest neighbor attribute and index (CT3): the nearest row attribute cell and row index cell to the left, and the nearest column attribute cell and column index cell above, are concatenated in sequence.
  • Nearest neighbor attribute or index (CT4): the nearest row attribute or row index cell to the left is used, and the nearest column attribute or column index cell above is appended.
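As an example, here is a sketch of the CT3 expression, assuming the grid-of-Cell layout and CellType labels sketched in Section 3.1; the traversal order, the joining convention, and the inclusion of the cell's own text are our assumptions about the generation rule.

def ct3_text(grid, i, j):
    """Nearest neighbor attribute and index (CT3) for cell (i, j)."""
    parts = []
    # Nearest row attribute and row index cells to the left, in that order.
    for wanted in (CellType.ROW_ATTRIBUTE, CellType.ROW_INDEX):
        for k in range(j - 1, -1, -1):
            if grid[i][k].cell_type == wanted:
                parts.append(grid[i][k].text)
                break
    # Nearest column attribute and column index cells above, in that order.
    for wanted in (CellType.COLUMN_ATTRIBUTE, CellType.COLUMN_INDEX):
        for k in range(i - 1, -1, -1):
            if grid[k][j].cell_type == wanted:
                parts.append(grid[k][j].text)
                break
    parts.append(grid[i][j].text)  # append the cell's own content (assumed)
    return " ".join(parts)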
After constructing the table cell semantics and concatenating them with the question text, a feature string is generated for each table cell. These feature strings are encoded by ERNIE [37] to produce cell semantic vectors, which are then input into a classification model that computes the probability of "yes" or "no", thereby indicating which cells are potential answers; a minimal sketch of this classification head follows. Two concatenation orders of the question text and cell semantic text are also considered: question text + cell semantic text, and cell semantic text + question text.
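A minimal sketch of the classification head, assuming each feature string has already been encoded into a 768-dimensional pooled vector by the ERNIE encoder; the module name and the use of PyTorch are our assumptions, not the paper's released implementation.

import torch
import torch.nn as nn

class CellAnswerClassifier(nn.Module):
    """Fully connected head mapping a cell semantic vector to yes/no logits."""
    def __init__(self, hidden_size: int = 768):
        super().__init__()
        self.fc = nn.Linear(hidden_size, 2)  # index 0: "no", index 1: "yes"

    def forward(self, cell_vec: torch.Tensor) -> torch.Tensor:
        return self.fc(cell_vec)

# Usage: a cell is predicted as an answer if P("yes") exceeds P("no").
probs = CellAnswerClassifier()(torch.randn(768)).softmax(dim=-1)
is_answer = bool(probs[1] > probs[0])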
A total of eight test scenarios are constructed, as detailed in Table 2. The experimental results for all scenarios are presented in Table 3. The experiment parameters are illustrated in the Appendix A.
The experimental results demonstrate that the proposed CBCM algorithm outperforms the benchmark model RGCNRCI, particularly in handling complex table structures. The summary of the research findings indicates that Experiment 3, employing the cell semantic expression CT3, delivers the highest overall accuracy at 50.5%. This approach preserves the most relevant row and column attributes and indices, ensuring that critical semantic information is retained while maintaining high relevance to the target table cell.
Moreover, the order of text concatenation has only a minor impact on the experimental results, without exhibiting any consistent pattern or trend. This suggests that the CBCM algorithm is relatively robust to variations in concatenation order, allowing flexibility in text processing.
When analyzing performance by table type, RGCNRCI showed superiority in vertical and hierarchical tables due to its effectiveness in encoding entire rows and columns semantically. However, it exhibited limitations in complex tables where excessive irrelevant information might be incorporated during semantic encoding. In contrast, CBCM excels in complex tables by focusing on cell-centric semantic matching, which reduces noise and enhances relevant information extraction.
Examining the results by question type reveals that CBCM outperforms RGCNRCI in multi-row and multi-column question scenarios. This is attributed to CBCM’s design, which is based on cell semantic matching and can handle queries across diverse table structures, providing robust solutions for a broader range of question types. While RGCNRCI maintains a slight edge in single-row and single-column questions, CBCM provides competitive accuracy across all question types.
Overall, CBCM’s advantages lie in its ability to effectively capture and utilize relevant semantic information in complex table environments, providing a versatile and comprehensive solution for a wide array of table-related questions. These findings underscore the innovation in CBCM’s semantic representation and classification approach, highlighting its potential for development in intelligent question-answering systems.

5. Discussion

5.1. Ablation Analysis

To further validate the performance of CBCM, we examined the impact of the text binary classification method on table question-answering. As shown in Figure 5, the 768-dimensional vector derived from the ERNIE model serves as the input for the semantic classification algorithms. When selecting machine learning algorithms for text classification, we considered the large feature dimension and the interdependence of features. Thus, we employed three machine learning methods for text binary classification: Support Vector Machine [38], Random Forest [39], and k-Nearest Neighbor [26].
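The ablation setup can be sketched with scikit-learn as below, assuming X is an (N, 768) matrix of ERNIE vectors and y the per-cell 0/1 answer labels; the random arrays stand in for the real features, and all hyperparameters are illustrative defaults rather than the values used in the paper.

import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(1000, 768)), rng.integers(0, 2, 1000)
X_test, y_test = rng.normal(size=(200, 768)), rng.integers(0, 2, 200)

models = {
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=100),
    "k-Nearest Neighbor": KNeighborsClassifier(n_neighbors=5),
}
for name, model in models.items():
    model.fit(X_train, y_train)               # train on cell semantic vectors
    print(name, model.score(X_test, y_test))  # text classification accuracy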
The new benchmark dataset established in Section 4.1 is used. Two indicators, text classification accuracy and table question answering accuracy, are used to validate classifier performance: text classification accuracy measures how often each individual cell is correctly judged as an answer or not, while table question answering accuracy measures how often all answer cells of a question are obtained correctly.
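The two indicators differ in granularity, as the following sketch shows; it assumes per-cell predictions and gold labels are grouped per question as equal-length 0/1 lists, an illustrative layout rather than the paper's data format.

def text_classification_accuracy(preds, golds):
    """Fraction of individual cell decisions that are correct."""
    pairs = [(p, g) for q in preds for p, g in zip(preds[q], golds[q])]
    return sum(p == g for p, g in pairs) / len(pairs)

def table_qa_accuracy(preds, golds):
    """Fraction of questions whose every cell decision is correct."""
    return sum(preds[q] == golds[q] for q in preds) / len(preds)

# One wrong cell out of many sinks the whole question:
preds = {"q1": [1, 0, 0, 1], "q2": [1, 0, 0, 0]}
golds = {"q1": [1, 0, 0, 1], "q2": [1, 0, 0, 1]}
print(text_classification_accuracy(preds, golds))  # 0.875
print(table_qa_accuracy(preds, golds))             # 0.5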
The experimental results are shown in Table 4. The linear fully connected classifier achieves better results than the other machine learning methods. Although the other algorithms are also effective at text binary classification, in our task every cell in the entire table must be judged as an answer or not, and a question is answered correctly only when all of its cells are judged correctly, which leads to a low final question-answering accuracy.

5.2. Complexity Analysis

In our proposed table question-answering (QA) implementation based on cell semantic matching, fine-grained semantic expressions are used to match and classify cells, offering a structured solution to complex table QA tasks. While this approach brings significant accuracy gains, particularly for tables with cross-row and cross-column dependencies, it is not without limitations. One key limitation is the increased time complexity: the proposed CBCM (Cell-by-Cell semantic Matching) algorithm runs in O(m × n) time, where m and n are the numbers of rows and columns in the table, a considerable increase over simpler row- and column-based approaches, which run in O(m + n). For example, a table with 50 rows and 20 columns requires 1000 classifier calls under CBCM but only 70 under row and column matching. As a result, while CBCM handles complex tables effectively, it may become computationally expensive for large tables, limiting its scalability and real-time performance.

6. Conclusions

This paper proposes a robust table question-answering method based on extended cell semantic matching, which transforms table question answering into a text classification task. Building on the public table question answering dataset IM-TQA, a new benchmark dataset (IM-TQA-X) is designed to compel algorithms to address more complex multi-line cell recall situations. Systematic validation demonstrated that the proposed CBCM algorithm offers broader applicability and higher answer accuracy compared to current state-of-the-art table content matching methods.
However, the CBCM algorithm’s performance heavily relies on prior knowledge, posing challenges in real-world applications. Specifically, it requires accurate and comprehensive information about the table structure, cell contents, and cell types. While we assume that these upstream tasks are completed, acquiring such data in practice can be non-trivial. This dependence on external knowledge introduces a bottleneck in system deployment, as inaccuracies or gaps in table structure or cell categorization may lead to suboptimal results. Future research should aim to reduce this dependency, possibly through unsupervised learning or by enhancing the system’s robustness to incomplete or noisy table data.

Author Contributions

Conceptualization, H.C. and D.S.; methodology, D.S.; writing—original draft preparation, D.S.; writing—review and editing, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China grant number 72301250.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (72301250). The conclusions herein are those of the authors and do not necessarily reflect the views of the sponsoring agencies.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Implementation Details for TQA Experiments

The parameters used in the comparative RGCNRCI experiment are the same as those in the original GitHub repository [10]. The algorithm-related parameters are as follows:
  • model_type: bert-base-chinese
  • num_train_epochs: 3
  • learning_rate: 2 × 10⁻⁵
  • train_instances: 28,438/19,827
The fourth parameter (train_instances) gives the total number of training items for the row model and the column model, respectively; it differs from the original repository because the dataset was changed.

References

  1. Rahul, K.; Banyal, R.K.; Arora, N. A systematic review on big data applications and scope for industrial processing and healthcare sectors. J. Big Data 2023, 10, 133. [Google Scholar] [CrossRef]
  2. Zheng, Y.; Wang, H.; Dong, B.; Wang, X.; Li, C. HIE-SQL: History Information Enhanced Network for Context-Dependent Text-to-SQL Semantic Parsing. In Proceedings of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022. [Google Scholar]
  3. Hui, B.; Geng, R.; Wang, L.; Qin, B.; Li, Y.; Li, B.; Sun, J.; Li, Y. S2SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers. In Proceedings of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022. [Google Scholar]
  4. Wu, J.; Xu, Y.; Karlsson, B.F.; Okumura, M. A Table Question Alignment based Cell-Selection Method for Table-Text QA. J. Nat. Lang. Process. 2024, 31, 189–211. [Google Scholar] [CrossRef]
  5. Khurana, U.; Suneja, S.; Samulowitz, H. Table Retrieval using LLMs and Semantic Table Similarity. In Proceedings of the Companion Proceedings of the ACM on Web Conference 2025, Sydney, NSW, Australia, 28 April 2025; pp. 1072–1076. [Google Scholar]
  6. Wu, J.; Xu, Y.; Gao, Y.; Lou, J.G.; Karlsson, B.; Okumura, M. TACR: A Table Alignment-based Cell Selection Method for HybridQA. In Proceedings of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023. [Google Scholar]
  7. Liu, Y.; Zhang, Y.; Wang, Y.; Hou, F.; Yuan, J.; Tian, J.; Zhang, Y.; Shi, Z.; Fan, J.; He, Z. A survey of visual transformers. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 7478–7498. [Google Scholar] [CrossRef] [PubMed]
  8. Wang, L.; Zhang, A.; Wu, K.; Sun, K.; Li, Z.; Wu, H.; Zhang, M.; Wang, H. DuSQL: A Large-Scale and Pragmatic Chinese Text-to-SQL Dataset. In Proceedings of the Association for Computational Linguistics, Online, 5–10 July 2020. [Google Scholar]
  9. Yu, T.; Zhang, R.; Yang, K.; Yasunaga, M.; Wang, D.; Li, Z.; Ma, J.; Li, I.; Yao, Q.; Roman, S.; et al. Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task. In Proceedings of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018. [Google Scholar]
  10. Zheng, M.; Hao, Y.; Jiang, W.; Lin, Z.; Lyu, Y.; She, Q.; Wang, W. IM-TQA: A Chinese Table Question Answering Dataset with Implicit and Multi-type Table Structures. In Proceedings of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023. [Google Scholar]
  11. Katsis, Y.; Chemmengath, S.; Kumar, V.; Bharadwaj, S.; Canim, M.; Glass, M.; Gliozzo, A.; Pan, F.; Sen, J.; Sankaranarayanan, K.; et al. AIT-QA: Question Answering Dataset over Complex Tables in the Airline Industry. In Proceedings of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022. [Google Scholar]
  12. Cheng, Z.; Dong, H.; Wang, Z.; Jia, R.; Guo, J.; Gao, Y.; Han, S.; Lou, J.G.; Zhang, D. HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation. In Proceedings of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022. [Google Scholar]
  13. Shi, H.; Xie, Y.; Goncalves, L.; Gao, S.; Zhao, J. WikiDT: Visual-Based Table Recognition and Question Answering Dataset. In International Conference on Document Analysis and Recognition; Springer: Berlin/Heidelberg, Germany, 2024; pp. 406–437. [Google Scholar]
  14. Szałata, A.; Hrovatin, K.; Becker, S.; Tejada-Lapuerta, A.; Cui, H.; Wang, B.; Theis, F.J. Transformers in single-cell omics: A review and new perspectives. Nat. Methods 2024, 21, 1430–1443. [Google Scholar] [CrossRef] [PubMed]
  15. Dargahi Nobari, A.; Rafiei, D. Dtt: An example-driven tabular transformer for joinability by leveraging large language models. Proc. ACM Manag. Data 2024, 2, 1–24. [Google Scholar] [CrossRef]
  16. Zhang, H.; Wang, Y.; Wang, S.; Cao, X.; Zhang, F.; Wang, Z. Table Fact Verification with Structure-Aware Transformer. In Proceedings of the Association for Computational Linguistics, Online, 5–10 July 2020. [Google Scholar]
  17. Yin, P.; Neubig, G.; Yih, W.T.; Riedel, S. TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data. In Proceedings of the Association for Computational Linguistics, Online, 5–10 July 2020. [Google Scholar]
  18. Yang, J.; Gupta, A.; Upadhyay, S.; He, L.; Goel, R.; Paul, S. TableFormer: Robust Transformer Modeling for Table-Text Encoding. In Proceedings of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022. [Google Scholar]
  19. Eisenschlos, J.; Gor, M.; Müller, T.; Cohen, W. MATE: Multi-view Attention for Table Transformer Efficiency. In Proceedings of the Association for Computational Linguistics, Online, 1–6 August 2021. [Google Scholar]
  20. Glass, M.; Canim, M.; Gliozzo, A.; Chemmengath, S.; Kumar, V.; Chakravarti, R.; Sil, A.; Pan, F.; Bharadwaj, S.; Fauceglia, N.R. Capturing Row and Column Semantics in Transformer Based Question Answering over Tables. In Proceedings of the Association for Computational Linguistics, Online, 1–6 August 2021. [Google Scholar]
  21. Zhu, F.; Lei, W.; Huang, Y.; Wang, C.; Zhang, S.; Lv, J.; Feng, F.; Chua, T.S. TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance. In Proceedings of the Association for Computational Linguistics, Online, 1–6 August 2021. [Google Scholar]
  22. Zhao, Y.; Nan, L.; Qi, Z.; Zhang, R.; Radev, D. ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples. In Proceedings of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022. [Google Scholar]
  23. Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 6 December 2020. [Google Scholar]
  24. Nan, L.; Hsieh, C.; Mao, Z.; Lin, X.V.; Verma, N.; Zhang, R.; Kryściński, W.; Schoelkopf, H.; Kong, R.; Tang, X.; et al. FeTaQA: Free-form Table Question Answering. Trans. Assoc. Comput. Linguist. 2022, 10, 35–49. [Google Scholar] [CrossRef]
  25. Guan, C.; Huang, M.; Zhang, P. MFORT-QA: Multi-hop Few-shot Open Rich Table Question Answering. In Proceedings of the 2024 10th International Conference on Computing and Artificial Intelligence, Bali, Indonesia, 26–29 April 2024. [Google Scholar]
  26. Zhuang, Y.; Yu, Y.; Wang, K.; Sun, H.; Zhang, C. Toolqa: A dataset for LLM question answering with external tools. Adv. Neural Inf. Process. Syst. 2023, 36, 50117–50143. [Google Scholar]
  27. Kumar, V.; Gupta, Y.; Chemmengath, S.; Sen, J.; Chakrabarti, S.; Bharadwaj, S.; Pan, F. Multi-Row, Multi-Span Distant Supervision For Table+Text Question Answering. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023. [Google Scholar]
  28. Salton, G.; Buckley, C. Term-weighting approaches in automatic text retrieval. Inf. Process. Manag. 1988, 24, 513–523. [Google Scholar] [CrossRef]
  29. Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.; Dean, J. Distributed representations of words and phrases and their compositionality. Adv. Neural Inf. Process. Syst. 2013, 26. [Google Scholar]
  30. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019. [Google Scholar]
  31. Sowmya, B.; Srinivasa, K. Large scale multi-label text classification of a hierarchical dataset using rocchio algorithm. In Proceedings of the 2016 International Conference on Computation System and Information Technology for Sustainable Solutions (CSITSS), Bengaluru, India, 6–8 October 2016. [Google Scholar]
  32. Chen, K.; Zhang, Z.; Long, J.; Zhang, H. Turning from TF-IDF to TF-IGM for term weighting in text classification. Expert Syst. Appl. 2016, 66, 245–260. [Google Scholar] [CrossRef]
  33. Lodhi, H.; Saunders, C.; Shawe-Taylor, J.; Cristianini, N.; Watkins, C. Text classification using string kernels. In Proceedings of the 14th International Conference on Neural Information Processing Systems, Denver, CO, USA, 1 January 2000. [Google Scholar]
  34. Chen, J.; Yan, S.; Wong, K.C. Verbal aggression detection on Twitter comments: Convolutional neural network for short-text sentiment analysis. Neural Comput. Appl. 2020, 32, 10809–10818. [Google Scholar] [CrossRef]
  35. Kowsari, K.; Heidarysafa, M.; Brown, D.E.; Meimandi, K.J.; Barnes, L.E. RMDL: Random Multimodel Deep Learning for Classification. In Proceedings of the 2nd International Conference on Information System and Data Mining, Lakeland, FL, USA, 9–11 April 2018; pp. 19–28. [Google Scholar]
  36. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  37. Zhang, Z.; Han, X.; Liu, Z.; Jiang, X.; Sun, M.; Liu, Q. ERNIE: Enhanced Language Representation with Informative Entities. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019. [Google Scholar]
  38. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Their Appl. 1998, 13, 18–28. [Google Scholar] [CrossRef]
  39. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Figure 1. Various types of tables.
Figure 2. Examples of various types of questions.
Figure 3. CBCM Algorithm Workflow.
Figure 4. IM-TQA-X Benchmark Build Process.
Figure 5. Table cells semantic classification experiments with different methods.
Table 1. IM-TQA-X Benchmark Dataset Statistics.

| | Train Set: Table Number | Train Set: Question Number | Test Set: Table Number | Test Set: Question Number |
| Total | 907 | 3771 | 183 | 768 |
Classification by table type (according to Section 3.2):
| Complex (TT4) | 223 | 1014 | 66 | 306 |
| Vertical (TT1) | 224 | 849 | 45 | 174 |
| Horizontal (TT2) | 229 | 933 | 38 | 129 |
| Hierarchical (TT3) | 231 | 1075 | 34 | 159 |
Classification by question type (according to Section 3.3):
| Single cell query (TQ1) | – | 2112 | – | 404 |
| Single line query (TQ2) | – | 1630 | – | 317 |
| Multi-line query (TQ3) | – | 29 | – | 47 |
Table 2. Test Scenarios and Experiment Details.

| Test Number | Cell Semantics | Text Concatenation Order | Text Classification Method |
| 1 | CT1 | Question Text + Cell Semantic Text | Fully connected classification |
| 2 | CT2 | Question Text + Cell Semantic Text | Fully connected classification |
| 3 | CT3 | Question Text + Cell Semantic Text | Fully connected classification |
| 4 | CT4 | Question Text + Cell Semantic Text | Fully connected classification |
| 5 | CT1 | Cell Semantic Text + Question Text | Fully connected classification |
| 6 | CT2 | Cell Semantic Text + Question Text | Fully connected classification |
| 7 | CT3 | Cell Semantic Text + Question Text | Fully connected classification |
| 8 | CT4 | Cell Semantic Text + Question Text | Fully connected classification |
Table 3. Experimental results on text semantic representation.

| Model | Experiment | Overall Accuracy | Complex | Vertical | Horizontal | Hierarchical | Single Cell | Single Line | Multi Line |
| CBCM | 1 | 40.4% | 19.9% | 62.6% | 46.5% | 50.3% | 46.3% | 37.5% | 8.5% |
| CBCM | 2 | 38.3% | 18.6% | 56.3% | 50.4% | 46.5% | 43.1% | 36.3% | 10.6% |
| CBCM | 3 | 50.5% | 45.4% | 59.8% | 58.2% | 44.0% | 55.9% | 46.1% | 34.0% |
| CBCM | 4 | 48.3% | 44.1% | 58.0% | 54.3% | 40.9% | 54.5% | 43.2% | 29.8% |
| CBCM | 5 | 38.9% | 16.3% | 62.1% | 49.6% | 48.4% | 45.8% | 35.6% | 2.1% |
| CBCM | 6 | 41.7% | 24.2% | 56.9% | 51.2% | 50.9% | 49.0% | 37.5% | 6.4% |
| CBCM | 7 | 49.3% | 41.2% | 60.9% | 59.8% | 44.7% | 55.7% | 44.8% | 25.5% |
| CBCM | 8 | 49.1% | 41.5% | 57.5% | 55.8% | 49.1% | 55.4% | 44.5% | 25.5% |
| RGCNRCI | – | 48.0% | 27.8% | 70.1% | 48.1% | 62.9% | 52.7% | 48.6% | 4.3% |
Columns Complex through Hierarchical classify results by table type; columns Single Cell through Multi Line classify results by question type. The bold denotes the best results.
Table 4. Table cells semantic classification performance.

| Text Binary Classification Method | Text Classification Accuracy | Table Question Answering Accuracy |
| SVM | 71.72% | 1.69% |
| Random Forest | 78.80% | 2.21% |
| k-Nearest Neighbor | 81.64% | 3.25% |
| Linear Classifier | 97.00% | 49.00% |
The bold denotes the best results.
