Query Optimization for Hybrid Plans in Row–Column Dual Store HTAP Databases

Shi, Xiaojun; Shen, Chaoyuan; Qiao, Lianpeng; Hu, Tianze; Wang, Guoren

doi:10.3390/app16115296

Open AccessArticle

Query Optimization for Hybrid Plans in Row–Column Dual Store HTAP Databases

by

Xiaojun Shi

,

Chaoyuan Shen

,

Lianpeng Qiao

,

Tianze Hu

and

Guoren Wang

^*

School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(11), 5296; https://doi.org/10.3390/app16115296

Submission received: 17 April 2026 / Revised: 20 May 2026 / Accepted: 21 May 2026 / Published: 25 May 2026

(This article belongs to the Special Issue AI-Based Data Science and Database Systems, 2nd Edition)

Download

Browse Figures

Versions Notes

Abstract

As data volumes grow and business requirements become increasingly complex, Hybrid Transactional/Analytical Processing (HTAP) technologies, capable of handling both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) workloads on a single platform, have gained prominence. HTAP databases typically maintain dual data storage formats and dual query engines: one row-oriented for OLTP, and another column-oriented for OLAP. Query plans, known as hybrid plans, can be segmented and pushed down to execute on these different formats. However, existing HTAP solutions still face challenges in optimizing these hybrid plans, struggling to explore the vast space of potential execution strategies effectively. To address these issues, this study introduces a learning-based query optimizer for row–column dual store HTAP database systems, which automatically generates multiple high-quality query optimizer hints (HINTs) to derive candidate plans. To balance plan generation efficiency with plan quality, a lightweight, learning-based algorithm using Monte Carlo Tree Search (MCTS) for generating hybrid access HINTs is proposed. Moreover, a Transformer-based neural network model coupled with a hybrid plan feature representation method is developed to select the candidate execution plan with the lowest predicted execution time. This work focuses on latency-oriented hybrid-plan selection for analytical queries in a row–column dual-store HTAP architecture; the current evaluation does not cover full mixed OLTP/OLAP workload scheduling, transactional interference, or concurrency control, which are left as future work. Experimental results on AlloyDB Omni, a recent row–column dual-store HTAP database, using the real-world IMDB dataset and JOB benchmark demonstrate that our system reduces execution time by 75.02% compared to the Cost-Based Optimizer (CBO) and by 62.23% compared to the state-of-the-art row-store-based learning query optimizer in this evaluated analytical-query setting.

Keywords:

HTAP database; hybrid query plan; query optimization; Monte Carlo Tree Search; Transformer; machine learning

1. Introduction

With the emergence of large-scale real-time analytical applications in modern society, many business scenarios require the ability to simultaneously handle high-concurrency transactional requests and perform real-time analysis on the latest data. In traditional data processing workflows, Online Transaction Processing (OLTP) systems handle day-to-day business transactions—such as order processing and inventory management—which are typically high-frequency, low-latency operations requiring fast responses and data consistency guarantees. However, to conduct data analysis and support decision-making, data from OLTP systems must be extracted through an Extract-Transform-Load (ETL) process and loaded into Online Analytical Processing (OLAP) systems.

Hybrid Transactional/Analytical Processing (HTAP) technology provides a more efficient alternative. As illustrated in Figure 1, unlike the traditional loosely coupled architecture combining separate OLTP databases and OLAP data warehouses, HTAP technology adopts a unified architecture that simultaneously supports OLTP and OLAP on a single platform. This eliminates the time-consuming ETL process and enables rapid analysis of the latest transactional data, thereby improving overall system throughput and query response time [1,2]. Representative HTAP databases include Oracle [3], SQL Server [4], DB2 BLU [5], SAP HANA [6], TiDB [7], PolarDB-IMCI [8], RateupDB [9], and AlloyDB [10].

Current mainstream HTAP databases such as TiDB [7] and Oracle [3] adopt a row–column co-existing data storage approach [2]. Unlike previous storage approaches based on pure row, pure column, or hybrid models such as PAX [11], the row–column co-existing approach goes a step further: it uses row storage as the primary format and column storage as the auxiliary. In this storage mode, write-intensive OLTP queries can directly access row storage to ensure low latency, while OLAP analytical queries can quickly retrieve data from column store replicas. The column store replicas are periodically updated incrementally from the row store to maintain consistency. Moreover, some HTAP databases support placing column store replicas directly in memory (In-Memory Column Store, IMCS) [3,4,5,8,10,12,13,14,15,16], further accelerating query response times by eliminating disk I/O.

To fully exploit the potential of hybrid physical layouts, many HTAP systems [4,7,10] support executing a single query across different storage formats to generate hybrid plans. Despite this capability, current HTAP systems still have shortcomings in fully leveraging the combined advantages of row and column storage [17].

1.1. Motivation

For an SQL query, a database supporting hybrid queries can generate a hybrid plan based on its cost model. A HINT [18] is an extended syntax in SQL queries used to guide the database system in generating specific query execution plans. Under different HINTs, the query plan and execution performance of the same SQL query can differ significantly. As illustrated in Figure 2, under five different access-path HINTs, a single SQL query generates five distinct execution plans, including row plans, column plans, and hybrid plans. The default plan (generated without a HINT) corresponds to one specific hybrid plan, but it is not necessarily optimal.

Several key observations motivate our work:

1.: Unified cost model lacking. Current HTAP database query optimization systems [3,4,8,10] primarily rely on row-based cost models to generate execution plans. They lack a unified framework that can simultaneously quantify the costs of row-level and column-level operators. As shown in Figure 2, the plan estimated to have the lowest cost by the optimizer does not necessarily have the shortest execution time.
2.: Complex hybrid plan generation space. For a query involving m tables, the number of possible hybrid access HINTs is $2^{m}$ , representing an exponentially large search space. Current HTAP systems [3,4,7,8,10] either do not support hybrid plans, directly route queries to row or column storage [8], or use simple cost models to determine access paths without comprehensively considering computational operators [7].
3.: Limitations of cost-based optimizers. While learning-based query optimizers such as DQ [19], ReJOIN [20], RTOS [21], and Neo [22] have shown promise, these representative learning-based optimizers mainly focus on conventional single-engine or row-store plans. To the best of our knowledge, existing learning-based query optimization studies mainly focus on cardinality estimation, cost estimation, join ordering, plan ranking, or query plan representation in conventional single-engine database settings. Learning-based optimization for hybrid plan generation in row–column dual-store HTAP databases remains relatively underexplored [23,24].

To make the discussion above more concrete, Table 1 contrasts the proposed method with representative families of existing query-optimization techniques along three dimensions: main idea, limitation in a row–column dual-store HTAP setting, and the specific difference of our work.

1.2. Solution Overview

To address these challenges, we propose a machine learning-based hybrid plan query optimization system. The system models hybrid access HINT generation as a tree search problem and employs Monte Carlo Tree Search (MCTS) to efficiently explore good hybrid access HINTs, achieving a balance between exploration and exploitation. For each candidate HINT on the tree, a neural network quickly computes its benefit score. We further adopt an enhanced Transformer-based [25] neural network that captures both global query features and local hybrid physical plan features, enabling accurate execution time prediction for hybrid plans.

1.3. Contributions

The main contributions of this paper are as follows:

1.: We design and implement a learning-based hybrid-plan query optimization framework for row–column dual-store HTAP databases, addressing a setting that, to the best of our knowledge, has received limited attention from prior learning-based query optimizers.
2.: We propose a hybrid access HINT generation method combining MCTS with neural networks, which can quickly and effectively explore good full-length hybrid access HINTs, achieving a balance between exploration and exploitation.
3.: We adapt an existing Transformer-based plan-representation model [26] to candidate hybrid plans, and integrate it with MCTS-based hybrid HINT generation. We do not claim the Transformer architecture itself as a new general-purpose query-plan representation model: components such as height encoding, tree-aware attention, and the global query node are inspired by prior neural query-plan representation studies; the contribution lies in adapting and integrating these techniques in the row–column dual-store HTAP setting.
4.: Experimental results on AlloyDB Omni, using the real-world IMDB dataset and JOB benchmark, demonstrate that the proposed method reduces execution time by 75.02% compared with the cost-based optimizer and by 62.23% compared with the state-of-the-art row-store-based learning query optimizer in the evaluated analytical-query setting.

1.4. Scope and Terminology

Throughout this paper, a hybrid plan refers to a complete physical execution plan that may combine row-store and column-store access operators. A hybrid path (or hybrid access path) refers to an access path or partial execution path within a hybrid plan, specifying for each base relation whether it is accessed from row storage or column storage. A hybrid HINT (or hybrid access HINT) is the optimizer hint, generated by our MCTS-based search process, that instructs the DBMS to follow a specific hybrid access pattern. These three terms are used consistently in this sense throughout the paper. The optimization target of this work is latency-oriented hybrid-plan selection for analytical queries; mixed OLTP/OLAP workload scheduling, transactional interference, and concurrency control are out of scope and are discussed as future work.

1.5. Problem Statement

We state the optimization problem here for clarity; the detailed search-space construction (hybrid access HINT tree) and the role of each module are introduced in Section 2.

For a query q over a schema S with relations

T = {t_{1}, \dots, t_{m}}

, let

P_{q}^{S}

denote the set of all candidate execution plans that the underlying HTAP optimizer can generate, and let

t (p)

denote the end-to-end execution latency of a plan

p \in P_{q}^{S}

. The objective is to identify the plan with minimum execution latency:

p^{*} = arg min_{p \in P_{q}^{S}} t (p) .

(1)

Intuitively,

P_{q}^{S}

is exponentially large because each of the m base relations can be accessed from either row or column storage, yielding

2^{m}

different hybrid access patterns. To make the problem tractable, we reformulate it through hybrid access HINTs: let

O (q, S, T_{r}, T_{c})

denote the plan produced by the cost-based optimizer when relations in

T_{r}

are forced to row storage and relations in

T_{c}

to column storage. The reformulated objective is then

min_{(T_{r}, T_{c})} t (O (q, S, T_{r}, T_{c})),

(2)

where

(T_{r}, T_{c})

ranges over all partitions of T. The MCTS-based search (Section 3) explores promising

(T_{r}, T_{c})

partitions, and the Transformer-based predictor (Section 4) approximates the unknown latency function

t (\cdot)

so that the best-predicted candidate plan can be selected. Equation (2) captures only execution latency; system-level factors such as memory budget, transactional interference, data-freshness constraints, and concurrency-control overhead can be incorporated as constraints in future work.

The remainder of this paper is organized as follows. Section 2 presents the system overview. Section 3 describes the hybrid path and candidate plan generation module. Section 4 details the execution time prediction model. Section 5 presents the experimental evaluation. Section 6 reviews related work, and Section 7 concludes the paper.

2. System Overview

In this section, we first explain key preliminary concepts, then present the overall system architecture.

2.1. Preliminaries

2.1.1. Hybrid Access HINT

A hybrid access HINT refers to extended syntax that forces an SQL query to use access paths with different storage formats. Formally, given the current schema S with m relations represented by

T = {t_{1}, t_{2}, \dots, t_{m}}

, let

T_{r}

and

T_{c}

be two disjoint subsets of T. For any query q, a hybrid access HINT is denoted as

H_{q}^{(T_{r}, T_{c})}

, indicating that tables in

T_{r}

are read from row storage and tables in

T_{c}

are read from column storage. For example, in AlloyDB, the HINT is expressed as:

\begin{matrix} / * + & ColumnarScan (a), ColumnarScan (b), \\ NoColumnarScan (c), NoColumnarScan (d) * / \end{matrix}

while in TiDB it is:

\begin{matrix} / * + & READ_FROM_STORAGE (TIFLASH [a, b], TIKV [c, d]) * / \end{matrix}

2.1.2. Hybrid Access HINT Tree

Given m relations under schema S for query q, a full-length hybrid access HINT

H_{q}^{(T_{r}, T_{c})}

where

(T_{r}, T_{c})

is a partition of T has

2^{m}

possible values. Exhaustively exploring this space is prohibitively expensive. Therefore, we construct a hybrid access HINT tree

T_{q}

to accelerate the search.

The tree has height

m + 2

. Leaf nodes are called terminal nodes; all other nodes are search nodes. The root node represents the empty set. Each search node can be expanded by adding a table from q (with ordering constraints) or by stopping the search. Search nodes represent sets of tables assigned to column storage (

T_{c}

), while terminal nodes represent complete full-length hybrid access HINTs. The tree structure is shown in the upper-left part of Figure 3.

2.1.3. Problem Definition

The formal problem statement was introduced in Section 1.5; we recall it here in the notation just established and connect it to the search-space construction.

For any query q under schema S, let

P_{q}^{S}

denote the set of all possible execution plans and

t (p)

denote the execution time of plan p. The optimization objective for hybrid plan query optimization in HTAP databases is:

p^{*} = \arg \min_{p \in P_{q}^{S}} t (p)

(3)

Since directly solving Equation (3) is impractical, we reformulate the problem using HINTs. The optimizer with a hybrid access HINT applied is denoted as

O (q, S, T_{r}, T_{c})

. Our optimization objective becomes:

\min_{T_{r}, T_{c}} t (O (q, S, T_{r}, T_{c}))

(4)

Intuitively, instead of searching the full plan space

P_{q}^{S}

, we search over

(T_{r}, T_{c})

partitions on the hybrid access HINT tree

T_{q}

: each terminal node corresponds to one full-length HINT

H_{q}^{(T_{r}, T_{c})}

that, when fed to the optimizer

O

, yields one candidate plan. Two complementary learned components then make this search tractable in practice: the MCTS-based hybrid HINT generator (Section 3) chooses a small number of promising

(T_{r}, T_{c})

partitions, and the Transformer-based predictor (Section 4) approximates the unknown latency function

t (\cdot)

to rank the resulting candidate plans.

2.2. System Architecture

As shown in Figure 3, the system consists of three key modules:

Hybrid path and candidate plan generation. For each incoming query, a hybrid access HINT tree is constructed, and Monte Carlo Tree Search is employed to search for good HINTs. The database query optimizer then generates complete candidate plans from the selected HINTs.

Execution time prediction. A Transformer-based neural network with attention mechanisms predicts the execution time of each candidate plan. The plan with the minimum predicted execution time is selected as the final plan.

Incremental model training and update. After executing the selected plan, the actual execution time is used as training feedback to incrementally improve both the HINT estimator and the Transformer model asynchronously, without blocking the online workload.

2.3. Overall Algorithm

Algorithm 1 shows the overall workflow. The system first uses MCTS with the hybrid access HINT estimator to generate candidate partitions, then selects the top-H HINTs. For each candidate HINT, the optimizer generates a complete hybrid plan. The TreeTransformer model estimates the execution time of each candidate, and the plan with the best predicted performance is selected. During training, the selected plan is executed asynchronously, and feedback updates both models.

Algorithm 1: Hybrid Access HINT Query Optimizer

3. Hybrid Path and Candidate Plan Generation

The goal of this module is to select H good hybrid access HINTs to generate H candidate plans using the cost-based optimizer.

3.1. Monte Carlo Tree Search for HINT Exploration

Since the full hybrid access path search space is large, we apply Monte Carlo Tree Search (MCTS) on the HINT tree

T_{q}

. MCTS follows an exploration–exploitation strategy: it focuses on paths that lead to high estimated query performance (exploitation) while also visiting less-explored directions (exploration). The overall module architecture is shown in Figure 4.

3.1.1. Node Utility on the HINT Tree

The node utility

U (υ)

is computed from two factors:

Node benefit score $B (υ)$ . Each node

υ

has a benefit score representing the expected performance of full-length hybrid access paths passing through

υ

. For leaf nodes,

B (υ)

is directly computed by the hybrid access HINT estimator (Section 3.2). For non-leaf nodes, let

L (υ)

denote the set of leaf nodes that have been estimated and contain

υ

on their search path:

B (υ) = \frac{1}{| L (υ) |} \sum_{υ^{'} \in L (υ)} B (υ^{'})

(5)

Intuitively,

B (υ)

summarizes what is currently known about the sub-tree rooted at

υ

: a non-leaf node inherits the average estimated performance of all full-length hybrid HINTs that have so far been simulated through it, so promising sub-trees accumulate higher scores as more leaves under them are evaluated.

Node exploration degree $F (υ)$ . This counts the number of times node

υ

has been visited, ensuring that less-explored nodes are also considered.

Following the Upper Confidence Bound (UCB) approach [27], the node utility is defined as:

U (υ) = B (υ) + γ \sqrt{\frac{ln (F (υ_{f}))}{F (υ)}}

(6)

where

υ_{f}

is the parent of

υ

and

γ

controls the exploration degree.

Intuitively, the first term exploits sub-trees that have so far yielded fast hybrid plans, while the second term grows whenever a sibling is repeatedly visited and

υ

is not, so under-explored branches are eventually revisited. The hyper-parameter

γ

trades off exploitation against exploration: small

γ

collapses MCTS toward greedy selection of the current best sub-tree, while large

γ

pushes the search to cover more of the

2^{m}

hybrid access patterns.

3.1.2. MCTS Workflow

The MCTS workflow consists of five steps:

Selection. Starting from the root, iteratively select the child with the highest utility until reaching a node with unexpanded children.
Expansion. Randomly select an unexpanded child and recursively expand to a leaf node.
Simulation. Use the hybrid access HINT estimator to evaluate the leaf node’s corresponding hybrid access path.
Update. Backpropagate the leaf node’s utility to update ancestor nodes.
Termination. Repeat until the search budget b is exhausted.

After termination, the system selects the top-H hybrid access HINTs with the highest utility for candidate plan generation.

3.2. Hybrid Access HINT Estimator

This subsection constructs a lightweight estimator to predict the performance of a query under a given hybrid access HINT. It takes as input the query features and hybrid access path and outputs the estimated performance via a recurrent neural network.

3.2.1. Query Encoding

Following prior work [21,28,29], we encode query q considering join predicates (

E_{q}^{J}

) and filter predicates (

E_{q}^{F}

). Specifically,

E_{q}^{J}

is an

m \times m

matrix where

E_{q}^{J} [i, j] = 1

if tables i and j are joined in q.

E_{q}^{F}

is a length-m vector where

E_{q}^{F} [i]

represents the selectivity of filters on table i (default 1 if no filter exists). The query encoding

E_{q}

is obtained by flattening

E_{q}^{J}

and concatenating

E_{q}^{F}

.

For example, consider a query SELECT * FROM a, b, c WHERE a.id = b.id AND b.id = c.id AND b.attr2 BETWEEN B1 AND B2 with selectivity 0.6 on table b. Then

E_{q}^{J} [0, 1] = E_{q}^{J} [1, 0] = E_{q}^{J} [1, 2] = E_{q}^{J} [2, 1] = 1

, and

E_{q}^{F} [1] = 0.6

, as illustrated in Figure 4.

3.2.2. Hybrid Access Path Representation

Each hybrid access path is a length-

(m + 1)

vector

H_{q}^{(T_{r}, T_{c})} = [t_{1}, t_{2}, \dots, t_{c}, \dots, t_{m}, t_{m + 1}]

, where elements before

t_{c}

represent column-store tables (

T_{c}

),

t_{c}

is the transition point, and elements after

t_{c}

represent row-store tables (

T_{r}

). For example, a full column plan is represented as

[a, b, c, d, t_{c}]

, while a hybrid plan with

{a, d}

in column and

{b, c}

in row is

[a, d, t_{c}, b, c]

.

We use Long Short-Term Memory (LSTM) networks to capture the sequential features of each path. Each table’s embedding

e (t_{i})

is learned from training data. The LSTM encodes the path as:

(h_{i}, m_{i}) = LSTMUnit (e (t_{i}), h_{i - 1}, m_{i - 1})

. The final path representation is

E_{H} = h_{m + 1}

.

3.2.3. Node Utility Prediction

Given the query encoding

E_{q}

and path representation

E_{H}

, the estimator uses a fully connected network to predict the node benefit score:

B (υ) = FC (Concat (E_{q}, E_{H}))

(7)

Intuitively,

E_{q}

encodes what the query joins and filters, while

E_{H}

encodes how the partial hybrid path so far assigns base relations to row or column storage; the fully connected layer learns a fast, lightweight mapping from this pair to an estimated benefit score, which MCTS uses as a cheap surrogate to decide which leaves of

T_{q}

to expand.

4. Execution Time Prediction

Although prior work [26] has adapted Transformers to query plan representation, it does not accommodate hybrid plans. This section presents our enhanced Transformer-based neural network for predicting the execution time of each candidate hybrid plan.

4.1. Why a Transformer-Based Predictor?

Candidate hybrid plans are not flat feature vectors. Each plan is a tree of physical operators whose nodes carry heterogeneous features—operator type, row/column access decisions, predicate descriptors, estimated cardinalities, join conditions, and per-table statistics—and whose execution cost depends on long-range, cross-operator interactions (e.g., the choice of a columnar scan deep in the tree affects the cost of an upstream hash join). Lightweight models such as linear regression, random forests, or multilayer perceptrons are easier to train and faster to evaluate, but they treat plan features as fixed-length vectors and therefore struggle to model structural dependencies among plan nodes and the interaction between row-store and column-store sub-trees. Tree-RNN and Tree-CNN models capture parent–child dependencies but, as discussed in Section 6.3, suffer from forgetting effects or limited receptive fields. We therefore adopt a Transformer-based predictor with tree-aware attention and a global query node, which (i) jointly models all plan nodes via self-attention, (ii) uses structural biases to respect the tree topology, and (iii) summarizes the entire hybrid plan through a single global representation. We do not claim this Transformer architecture itself as new: height encoding, tree-biased attention, and the global query node are inspired by prior neural query-plan representation work [25,26,30]; our contribution is to adapt these techniques to candidate hybrid plans in a row–column dual-store HTAP database and to couple the predictor with MCTS-based hybrid HINT generation.

4.2. Feature Extraction and Encoding

A query execution plan node typically contains operators, tables, predicates, and join conditions:

Operators describe the operation at each plan node (e.g., Merge Join, Index Scan, Columnar Scan, Vectorized Hash Join), treated as categorical variables with approximately 30 types.
Predicates describe filter conditions as <column, comparison operator, value> triples [31]. The column and operator are categorical; the value is range-normalized to $[0, 1]$ .
Joins are join conditions treated as categorical variables (finite domain given the schema).
Tables are relation identifiers, also treated as categorical variables.

4.2.1. Learned Embeddings for Categorical Variables

Instead of one-hot encoding, which suffers from dimension explosion and difficulty handling new categories, we use fixed-size learned embeddings [26] for each categorical variable. New categories can be accommodated by simply adding a randomly initialized embedding without affecting existing learned embeddings.

4.2.2. Predicate Encoding

The predicate representation

E_{p}

concatenates the column embedding

E_{c}

, operator embedding

E_{o}

, and normalized value

v_{norm}

:

E_{p} = [E_{c} ∥ E_{o} ∥ v_{norm}]

(8)

4.2.3. Statistical Information Encoding

Histogram encoding. Equal-height histograms from database statistics are reorganized into N uniform buckets using linear interpolation. For a predicate

score > 35

on a histogram with boundaries

(20, 30, 40, 50, 60)

and

N = 4

buckets, the encoding would be

E_{h} = [0, 0.5, 1, 1]

.

Sampling encoding. Random tuple samples are maintained from each table. For each predicate, each sample tuple is checked for satisfaction, producing an m-bit bitmap

E_{s}

indicating the percentage of satisfying tuples.

4.2.4. Final Node Embedding

The components—operators (

E_{o}

), joins (

E_{j}

), tables (

E_{t}

), predicates (

E_{p}

), histograms (

E_{h}

), and samples (

E_{s}

)—are concatenated and passed through a linear layer:

E = Linear (E_{o} ∥ E_{j} ∥ E_{t} ∥ E_{p} ∥ E_{h} ∥ E_{s})

(9)

Intuitively, plan nodes are heterogeneous: a Hash Join carries join conditions, a Columnar Scan carries a table and predicates, and an aggregation may carry none of these. Concatenating the per-component embeddings and projecting them through a shared linear layer maps every node, regardless of operator type, into a single d-dimensional space, so the subsequent attention layers can compare and combine plan nodes without operator-specific branching logic.

For nodes without predicates or join conditions (e.g., a Hash node), zero padding is used. This produces uniform-size embeddings for all node types, forming a vector tree as input to the tree attention network.

4.3. Tree Attention Network

Building on prior Transformer work [25,26], we adapt the architecture to accept hybrid query plans. Figure 5 shows the model architecture, which introduces three key components: height encoding, tree-biased attention, and a global query node.

4.3.1. Height Encoding

The height of a node is defined as the length of the longest path from the node to any leaf. All leaf nodes have height 0, and the root has the maximum height. Height information implicitly encodes parent–child dependencies, as parent nodes always have greater height than their descendants. A learned height embedding

E_{h}

is added to each node embedding:

x_{i}^{'} = x_{i} + E_{h}

(10)

4.3.2. Tree-Biased Attention

In standard self-attention (Equation (11)), every node can attend to all other nodes. However, in query plans, information should flow primarily bottom-up. We introduce tree-biased attention to control information flow using the tree structure.

Given the standard attention mechanism:

Attention (Q, K, V) = softmax (\frac{Q K^{T}}{\sqrt{d_{k}}}) V

(11)

We compute pairwise distances

d_{i j}

between plan nodes. If node j is reachable from node i (i.e., j is a descendant of i),

d_{i j}

is the tree distance; otherwise,

d_{i j}

is marked as unreachable. Each distance value is associated with a learnable scalar bias

b_{d}

:

A_{i j}^{'} = A_{i j} + b_{d}

(12)

This allows the model to learn larger biases for small distances (e.g., direct parent–child) and small negative values to discourage attention between distant or unreachable nodes.

Intuitively,

b_{d}

injects the tree topology directly into self-attention: an operator can give more weight to its immediate children (which dominate its runtime) and less weight to nodes in an unrelated sub-tree (which usually have weak cost coupling). Compared with vanilla self-attention, this biases the model toward physically meaningful information flow—bottom-up cost propagation along the plan tree—without removing the ability to model long-range interactions across distant operators.

4.3.3. Global Query Node

To capture both global context (the query, which is invariant across the plan search space) and local context (the plan), we introduce a global query node

E_{Q}

. Using the query encoding

E_{q}

from Section 3.2.1, we project it to the same dimension as plan node embeddings:

E_{Q} = Linear (E_{q})

(13)

The global query node is connected to all plan nodes and has learnable node and height embeddings. Its output vector serves as the representation of the entire query plan. This concept is analogous to the [CLS] token in BERT [30].

4.4. Query Time Prediction

The global query node output

O_{Q}

is fed into a two-layer fully connected network for time prediction:

\begin{matrix} t^{'} & = ReLU (W_{1} O_{Q} + b_{1}) \\ t & = Sigmoid (W_{2} t^{'} + b_{2}) \end{matrix}

(14)

where

W_{1}, W_{2}, b_{1}, b_{2}

are learnable parameters, and

t \in [0, 1]

is the normalized execution time. The training loss is mean squared error:

Loss = \frac{1}{N} \sum_{i = 1}^{N} {(t_{i} - t_{i}^{'})}^{2}

(15)

Intuitively, the Sigmoid output produces a bounded latency estimate on a normalized scale, which keeps the regression stable across the wide latency range of analytical queries; mean squared error then penalizes large prediction gaps more heavily than small ones, encouraging the predictor to avoid catastrophic mis-ranking of candidate plans rather than merely matching their average latency.

4.5. Model Training and Update

The system maintains a memory pool M collecting training instances

D_{i} = (q, p, t (p))

. For each training cycle,

b_{s}

triples are sampled from M to form a batch. Both the HINT estimator and the tree attention network are trained asynchronously to avoid blocking online workloads.

The hyperparameters and training settings of the Transformer predictor used in our experiments are summarized in Table 2. The configuration follows the conventions of prior tree-Transformer plan-representation work [26]. To mitigate overfitting we apply dropout inside the attention and feed-forward sub-layers, use weight decay in the optimizer, and select the final model checkpoint according to its loss on a held-out validation split rather than relying on a fixed number of epochs.

Table 3 reports the size and composition of the training data used to evaluate the predictor on the JOB workload. For each query, multiple candidate hybrid plans are generated by the MCTS-based HINT search and executed on the target DBMS to collect ground-truth latencies, which are then normalized to

[0, 1]

before training.

4.5.1. Training Convergence

During training, the validation loss generally stabilized within the 30 online training rounds used in our experiments, and the final checkpoint was selected according to the lowest validation loss to reduce overfitting; this validation-based checkpointing also serves as our early-stopping mechanism in lieu of a fixed-patience criterion.

4.5.2. Cold-Start and Generalization

The proposed method requires an initial set of executed plans to train the predictor. In cold-start scenarios—when a new schema or workload is deployed—the system can initially fall back on the native cost-based optimizer or use a small bootstrap workload to populate the memory pool M. As more (query, plan, latency) triples are collected, the predictor is incrementally updated, and the MCTS-based search benefits from increasingly accurate node-utility estimates. For unseen queries that share tables and join graphs with the training distribution, the learned table embeddings and tree-aware attention enable reasonable generalization; for queries whose structure differs substantially from the training distribution, additional online execution is required before the predictor stabilizes. A comprehensive online-learning and drift-adaptation strategy is beyond the scope of this paper and is left as future work.

5. Experimental Evaluation

Evaluation Scope.

The experiments in this section evaluate analytical-query latency under hybrid row–column access. They do not evaluate concurrent OLTP/OLAP execution, transactional interference, resource isolation, or freshness maintenance. The results should therefore be interpreted as evidence for latency-oriented hybrid-plan selection rather than as a complete evaluation of end-to-end HTAP workload management.

5.1. Experimental Setup

5.1.1. Environment

Our system is deployed on AlloyDB Omni 15.2.0 (Google LLC, Mountain View, CA, USA) [32], a locally deployable version of AlloyDB [10] compatible with PostgreSQL 15 (PostgreSQL Global Development Group) [33]. It supports both disk-based row storage and in-memory column storage with hybrid plan query optimization. The database and optimizer system are deployed on separate servers connected via a 10 Gb/s network. Both servers use Intel Xeon Gold 6148 CPUs @ 2.40 GHz (Intel Corporation, Santa Clara, CA, USA) with 1 TB memory. The optimizer server additionally uses an NVIDIA GeForce RTX 4090 GPU (NVIDIA Corporation, Santa Clara, CA, USA). The learning components are implemented in PyTorch 1.13 (Meta Platforms, Inc., Menlo Park, CA, USA) on Python 3.9 (Python Software Foundation, Wilmington, DE, USA).

5.1.2. Dataset and Workload

We use the IMDB (Internet Movie Database) dataset, a real-world database containing information about movies, actors, directors, and production companies (Table 4). The JOB (Join Order Benchmark) [34] is an OLAP benchmark using IMDB. JOB contains 33 query structures with 2–6 variants each (e.g., 1a, 1b, 1c), totaling 113 queries. Variants share the same tables and joins but differ in filter conditions. The number of relations per query ranges from 4 to 17.

5.1.3. Baseline Methods

We compare against four baselines:

Row Plan: AlloyDB Omni with all tables forced to row storage via HINTs, equivalent to traditional PostgreSQL plans.
Column Plan: AlloyDB Omni with all tables forced to column storage via HINTs.
Cost-based Hybrid Plan: AlloyDB Omni’s default cost-based hybrid plan optimizer.
HybridQO: A learning-based row plan optimizer [35] adapted for hybrid plan optimization. It searches join order prefixes, completes them with the cost-based optimizer, and uses a neural network to evaluate plan performance and uncertainty.

5.1.4. Evaluation Metrics

We use query latency (execution time) as the primary metric and GMRL (Geometric Mean Relative Latency) [21] for normalized per-query performance:

GMRL = {(\prod_{i = 1}^{n} \frac{E_{cur} (q_{i})}{E_{\max} (q_{i})})}^{\frac{1}{n}}

(16)

where lower GMRL indicates better performance.

5.1.5. Experimental Settings

The database working memory and cache are set to the dataset size with row plan warm-up for all queries, eliminating disk I/O effects. Row and column plans are executed three times with averaged results. Learning methods (HybridQO and ours) run for 30 iterations with the last round’s results used. MCTS budget is 75 ms. Neural networks are trained every 10 queries using 128 samples from the experience pool. The timeout is set to 1.5× the maximum baseline execution time, and execution times are normalized to

[0, 1]

.

5.2. Overall Query Latency

Figure 6 shows the total latency after completing all 113 queries. Our hybrid access HINT query optimizer outperforms all baselines:

vs. Row Plan: 71.32% time saved (3.49× speedup, from 214.99 s to 61.65 s).
vs. Column Plan: 96.83% time saved (31.57× speedup, from 1946.40 s to 61.65 s).
vs. Cost-based Hybrid Plan: 75.02% time saved (4.00× speedup, from 246.84 s to 61.65 s).
vs. HybridQO: 62.23% time saved (2.65× speedup, from 163.24 s to 61.65 s).

Key findings include: (1) hybrid plans leverage column store advantages such as SIMD operations, parallelization, and reduced I/O; (2) the column plan performs worst due to expensive tuple reconstruction (column-to-row conversion); (3) the cost-based hybrid plan performs worse than even the row plan due to its inability to unify row and column costs; and (4) HybridQO performs reasonably but is limited because it was not designed for hybrid plans.

5.3. Per-Query Latency Analysis

Figure 7 presents 20 queries on a log scale. Our optimizer achieves excellent performance on nearly all queries, with only query 22c being suboptimal (the time prediction module did not correctly identify the default plan as superior). HybridQO only improves three queries (15c, 26c, 30b), with query 26c alone contributing 99% of its total improvement.

5.4. Dominance Proportions

Table 5 presents the dominance proportions. Our optimizer achieves 76.11% dominance (proportion of queries where it is the best or tied for best) and 30.97% strict dominance (proportion where it is strictly the best). This demonstrates both strong exploitation (reusing existing good plans, 76.11%) and strong exploration (discovering new better plans, 30.97%). In contrast, HybridQO has only 1.77% strict dominance, indicating very limited exploration capability for hybrid plans.

5.5. Stability and Complex Query Handling

Table 6 and Figure 8a show that our optimizer has the lowest mean (545.59 ms), lowest standard deviation (1066.33), lowest median (125.16 ms), and lowest maximum (7550.71 ms), demonstrating the best stability across all methods. Figure 8b further confirms that at the 50th, 75th, 90th, 99th, and 99.5th percentiles, our optimizer consistently achieves the lowest latency. Notably, the time savings ratio increases with percentile, indicating stronger advantages on complex queries.

5.6. End-to-End Query Time

Table 7 compares end-to-end times. Despite introducing additional search time (7.86 s) and inference time (2.80 s), our optimizer achieves a total of 75.15 s versus AlloyDB Omni’s 249.83 s—a 69.92% reduction (3.32× speedup). The plan generation time is nearly identical (2.85 s vs. 2.99 s), as HINTs reduce the optimizer’s plan exploration space. The additional overhead of 10.66 s is vastly offset by the 185.19 s execution time saving, yielding a net gain of 174.53 s.

6. Related Work

6.1. HTAP Databases with Row–Column Co-Existing Storage

The concept of HTAP was first proposed by Gartner [36], predicting that HTAP technology would be widely adopted for real-time data analysis. In a strict sense, an HTAP database system should satisfy real-time transaction and analytical processing requirements without requiring data ETL processes. To simultaneously support both OLTP and OLAP performance, mainstream HTAP databases have adopted row–column co-existing storage [2].

Table 8 summarizes key technologies in representative HTAP database systems across five dimensions: transaction processing, analytical processing, data synchronization, query optimization, and resource scheduling.

6.2. Hybrid Plans in Row–Column HTAP Databases

As shown in Figure 9, the query-processing component of such an HTAP database features a hybrid operator query optimizer on the replica node. The row execution engine can receive, process, and return row-format tuples. For column operators in the physical plan, the row engine pushes them down to the column execution engine, which processes data in columnar format using SIMD instructions and vectorized execution. The processed columnar data then undergoes a column-to-row conversion before being returned to the row engine and finally to the user.

Current hybrid plan generation approaches include heuristic methods based on column defaults (e.g., Oracle 21c [39], SQL Server [4]) and cost function-based optimization (e.g., TiDB [7]). The cost function approach compares costs across storage formats:

\begin{matrix} C_{opt_scan} & = min (C_{col_scan}, C_{row_scan}, C_{index_scan}) \\ C_{row_scan} & = S_{tuple} \cdot N_{tuple} \cdot f_{scan} + N_{reg} \cdot f_{seek} \\ C_{col_scan} & = \sum_{j = 1}^{m} (S_{{col}_{j}} \cdot N_{tuple} \cdot f_{scan} + N_{{reg}_{j}} \cdot f_{seek}) \end{matrix}

(17)

However, these cost formulas are based on independence and uniform distribution assumptions without considering multi-column correlations or data skew.

6.3. Learning-Based Query Optimization

Learning-based query optimizers have made significant advances. DQ [19] and ReJOIN [20] use reinforcement learning to find optimal join orders. Neo [22] is the first end-to-end learning-based optimizer, using tree convolutional neural networks to estimate full plan latency. RTOS [21] applies graph neural networks to train reinforcement learning agents for join order selection. Bao [43] sits atop PostgreSQL, using HINTs to enable or disable sets of join and scan operations. Balsa [28] uses a similar architecture to Neo with modified training. Lero [44] formalizes the problem as a learning-to-rank task. LEON [29] is another learning-to-rank approach that prunes possible physical plans in a dynamic programming fashion.

For query plan representation, existing approaches include: (1) flattened methods such as AVGDL [45], which serialize plans into sequences but fail to capture tree structure; (2) Tree-RNNs such as Tree-LSTM [46] used by RTOS [21] and E2E-Cost [31], which aggregate information bottom-up but suffer from forgetting problems [47]; (3) Tree-CNNs [48] used by Neo [22] and Bao [43], which capture parent–child dependencies via triangular kernels but have limited receptive fields; and (4) feature vector methods such as AIMeetsAI [49] and LQPP [50], which encode features using predefined rules but fail to capture structural dependencies.

However, the above representative methods do not directly target hybrid row–column execution optimization and are therefore not directly applicable to hybrid plan selection in row–column dual-store HTAP databases. The most recent work on query optimization in HTAP databases [24] does not involve machine learning. More recent learned-optimizer studies, including learning-to-rank optimizers such as Lero [44], robust plan-generation models such as LOGER [51], and unified learned-optimizer services for lakehouse-style engines [52], also focus on conventional single-engine or row-store settings and do not target hybrid row–column plan selection in row–column dual-store HTAP systems.

7. Conclusions

This paper investigated the hybrid query optimization problem in HTAP databases with row–column co-existing configurations. We designed and implemented a hybrid access HINT query optimizer that combines Monte Carlo Tree Search with neural networks to efficiently generate high-quality hybrid access HINTs and produce superior candidate plans. The system extracts and encodes both local features of hybrid plans and global features of queries, then employs an enhanced TreeTransformer to evaluate candidate plans and select the best predicted hybrid execution plan.

The system was experimentally validated on AlloyDB Omni, a recent row–column co-existing HTAP database, using the real-world IMDB dataset and the JOB benchmark. In this evaluation setting, the proposed method improves latency-oriented hybrid-plan selection for analytical queries: it reduces execution time by 75.02% compared with the cost-based optimizer and by 62.23% compared with a state-of-the-art row-store-based learning optimizer, and it also achieves the best stability and the strongest performance on complex queries among all baselines considered.

We emphasize the scope of these claims. The current study targets analytical, latency-oriented hybrid-plan selection in a single row–column dual-store HTAP architecture; it does not fully evaluate mixed OLTP/OLAP workloads, transactional interference, concurrency, or system-level resource constraints such as memory and freshness budgets. The Transformer-based representation reuses height encoding, tree-aware attention, and a global query node from prior plan-representation work, and our contribution lies in adapting and integrating these components with MCTS-based hybrid HINT generation rather than in proposing a new general-purpose plan-representation architecture. Future work will (i) extend the optimization objective to incorporate concurrency and resource constraints, (ii) evaluate the framework under realistic mixed OLTP/OLAP workloads on additional HTAP systems, (iii) investigate online-learning strategies for cold-start and workload-drift scenarios, and (iv) study the trade-off between the proposed Transformer predictor and lightweight alternatives such as gradient-boosted trees or compact MLPs.

Author Contributions

Conceptualization, X.S.; methodology, C.S.; software, C.S.; validation, L.Q.; formal analysis, T.H.; investigation, L.Q.; writing—original draft preparation, X.S.; writing—review and editing, X.S.; supervision, G.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The IMDB dataset and JOB benchmark used in this study are publicly available. Reproducibility materials are released at https://github.com/HR10108/HtapQO (accessed on 20 May 2026). The repository includes the prototype implementation of the proposed optimizer, the MCTS-based hybrid HINT search procedure, the Transformer-based runtime estimator, the SQL and workload files used in the evaluation, representative configuration examples, and step-by-step instructions for building and running the prototype. The following components are intentionally not included: the underlying DBMS runtime assets (AlloyDB Omni binaries and other non-distributable storage- and execution-engine artifacts), the generated experimental result files, and locally trained model checkpoints. Researchers can regenerate these artifacts by following the released configurations and procedures on their own AlloyDB Omni deployment.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

HTAP	Hybrid Transactional/Analytical Processing
OLTP	Online Transaction Processing
OLAP	Online Analytical Processing
DBMS	Database Management System
CBO	Cost-Based Optimizer
MCTS	Monte Carlo Tree Search
HINT	Query optimizer hint/directive
LSTM	Long Short-Term Memory
IMCS	In-Memory Column Store
ETL	Extract-Transform-Load
JOB	Join Order Benchmark
GMRL	Geometric Mean Relative Latency
UCB	Upper Confidence Bound
CPU	Central Processing Unit
GPU	Graphics Processing Unit
SIMD	Single Instruction Multiple Data
I/O	Input/Output
MVCC	Multi-Version Concurrency Control
2PC	Two-Phase Commit
PAX	Partition Attributes Across
MLP	Multilayer Perceptron
RNN	Recurrent Neural Network
CNN	Convolutional Neural Network
ML	Machine Learning

References

Özcan, F.; Tian, Y.; Tözün, P. Hybrid Transactional/Analytical Processing: A Survey. In Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017; pp. 1771–1775. [Google Scholar]
Li, G.; Zhang, C. HTAP Databases: What is New and What is Next. In Proceedings of the 2022 International Conference on Management of Data, Philadelphia, PA, USA, 12–17 June 2022; pp. 2483–2488. [Google Scholar]
Lahiri, T.; Chavan, S.; Colgan, M.; Das, D.; Ganesh, A.; Gleeson, M.; Hase, S.; Holloway, A.; Kamp, J.; Lee, T.-H.; et al. Oracle Database In-Memory: A Dual Format In-Memory Database. In Proceedings of the 2015 IEEE 31st International Conference on Data Engineering, Seoul, Republic of Korea, 13–17 April 2015; pp. 1253–1258. [Google Scholar]
Larson, P.-Å.; Birka, A.; Hanson, E.N.; Huang, W.; Nowakiewicz, M.; Papadimos, V. Real-Time Analytical Processing with SQL Server. Proc. VLDB Endow. 2015, 8, 1740–1751. [Google Scholar] [CrossRef]
Raman, V.; Attaluri, G.; Barber, R.; Chainani, N.; Kalmuk, D.; KulandaiSamy, V.; Leenstra, J.; Lightstone, S.; Liu, S.; Lohman, G.M.; et al. DB2 with BLU Acceleration: So Much More than Just a Column Store. Proc. VLDB Endow. 2013, 6, 1080–1091. [Google Scholar] [CrossRef]
Sikka, V.; Färber, F.; Lehner, W.; Cha, S.K.; Peh, T.; Bornhövd, C. Efficient Transaction Processing in SAP HANA Database: The End of a Column Store Myth. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, Scottsdale, AZ, USA, 20–24 May 2012; pp. 731–742. [Google Scholar]
Huang, D.; Liu, Q.; Cui, Q.; Fang, Z.; Ma, X.; Xu, F.; Shen, L.; Tang, L.; Zhou, Y.; Huang, M.; et al. TiDB: A Raft-Based HTAP Database. Proc. VLDB Endow. 2020, 13, 3072–3084. [Google Scholar] [CrossRef]
Wang, J.; Li, T.; Song, H.; Yang, X.; Zhou, W.; Li, F.; Yan, B.; Wu, Q.; Liang, Y.; Ying, C.; et al. PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba. Proc. ACM Manag. Data 2023, 1, 1–25. [Google Scholar] [CrossRef]
Lee, R.; Zhou, M.; Li, C.; Hu, S.; Teng, J.; Li, D.; Zhang, X. The Art of Balance: A RateupDB Experience of Building a CPU/GPU Hybrid Database Product. Proc. VLDB Endow. 2021, 14, 2999–3013. [Google Scholar] [CrossRef]
Google Cloud. AlloyDB: A Fully Managed PostgreSQL-Compatible Database Service. Available online: https://cloud.google.com/alloydb (accessed on 1 March 2023).
Ailamaki, A.; DeWitt, D.J.; Hill, M.D.; Skounakis, M. Weaving Relations for Cache Performance. In Proceedings of the 27th International Conference on Very Large Data Bases, Roma, Italy, 11–14 September 2001; pp. 169–180. [Google Scholar]
Chen, J.; Ding, Y.; Liu, Y.; Li, F.; Zhang, L.; Zhang, M.; Wei, K.; Cao, L.; Zou, D.; Liu, Y.; et al. ByteHTAP: ByteDance’s HTAP System with High Data Freshness and Strong Data Consistency. Proc. VLDB Endow. 2022, 15, 3411–3424. [Google Scholar] [CrossRef]
Li, G.; Dong, H.; Zhang, C. Cloud Databases: New Techniques, Challenges, and Opportunities. Proc. VLDB Endow. 2022, 15, 3758–3761. [Google Scholar] [CrossRef]
Makreshanski, D.; Giceva, J.; Barthels, C.; Alonso, G. BatchDB: Efficient Isolated Execution of Hybrid OLTP+OLAP Workloads for Interactive Applications. In Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017; pp. 37–50. [Google Scholar]
Nakamura, M.; Tabaru, T.; Ujibashi, Y.; Hashida, T.; Kawaba, M.; Harada, L. Extending PostgreSQL to Handle OLXP Workloads. In Proceedings of the Fifth International Conference on Innovative Computing Technology (INTECH 2015), Galicia, Spain, 20–22 May 2015; pp. 40–44. [Google Scholar]
StoneDB. A Real-Time HTAP Database. Available online: https://github.com/stoneatom/stonedb (accessed on 1 March 2023).
Zhang, C.; Li, G.; Lv, T. HyBench: A New Benchmark for HTAP Databases. Proc. VLDB Endow. 2023, 16, 4043–4056. [Google Scholar] [CrossRef]
Nippon Telegraph and Telephone Corporation. ossc-db/pg_hint_plan: Extension Adding Support for Optimizer Hints in PostgreSQL. Available online: https://github.com/ossc-db/pg_hint_plan (accessed on 30 April 2024).
Krishnan, S.; Yang, Z.; Goldberg, K.; Hellerstein, J.M.; Stoica, I. Learning to Optimize Join Queries with Deep Reinforcement Learning. arXiv 2018, arXiv:1808.03196. [Google Scholar]
Marcus, R.; Papaemmanouil, O. Deep Reinforcement Learning for Join Order Enumeration. In Proceedings of the First International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, Houston, TX, USA, 10 June 2018; pp. 1–4. [Google Scholar]
Yu, X.; Li, G.; Chai, C.; Tang, N. Reinforcement Learning with Tree-LSTM for Join Order Selection. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 1297–1308. [Google Scholar]
Marcus, R.; Negi, P.; Mao, H.; Zhang, C.; Alizadeh, M.; Kraska, T.; Papaemmanouil, O.; Tatbul, N. Neo: A Learned Query Optimizer. Proc. VLDB Endow. 2019, 12, 1705–1718. [Google Scholar] [CrossRef]
Lehmann, C.; Sulimov, P.; Stockinger, K. Is Your Learned Query Optimizer Behaving as You Expect? A Machine Learning Perspective. arXiv 2023, arXiv:2309.01551. [Google Scholar] [CrossRef]
Song, H.; Zhou, W.; Li, F.; Peng, X.; Cui, H. Rethink Query Optimization in HTAP Databases. Proc. ACM Manag. Data 2023, 1, 1–27. [Google Scholar] [CrossRef]
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008. [Google Scholar]
Zhao, Y.; Cong, G.; Shi, J.; Miao, C. QueryFormer: A Tree Transformer Model for Query Plan Representation. Proc. VLDB Endow. 2022, 15, 1658–1670. [Google Scholar] [CrossRef]
Browne, C.B.; Powley, E.; Whitehouse, D.; Lucas, S.M.; Cowling, P.I.; Rohlfshagen, P.; Tavener, S.; Perez, D.; Samothrakis, S.; Colton, S. A Survey of Monte Carlo Tree Search Methods. IEEE Trans. Comput. Intell. AI Games 2012, 4, 1–43. [Google Scholar] [CrossRef]
Yang, Z.; Chiang, W.-L.; Luan, S.; Mittal, G.; Luo, M.; Stoica, I. Balsa: Learning a Query Optimizer without Expert Demonstrations. In Proceedings of the 2022 International Conference on Management of Data, Philadelphia, PA, USA, 12–17 June 2022; pp. 931–944. [Google Scholar]
Chen, X.; Chen, H.; Liang, Z.; Liu, S.; Wang, J.; Zeng, K.; Su, H.; Zheng, K. LEON: A New Framework for ML-Aided Query Optimization. Proc. VLDB Endow. 2023, 16, 2261–2273. [Google Scholar] [CrossRef]
Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar]
Sun, J.; Li, G. An End-to-End Learning-Based Cost Estimator. Proc. VLDB Endow. 2019, 13, 307–319. [Google Scholar] [CrossRef]
AlloyDB Omni Overview. Available online: https://cloud.google.com/alloydb/docs/omni (accessed on 8 May 2024).
PostgreSQL Global Development Group. PostgreSQL. Available online: https://www.postgresql.org/ (accessed on 19 April 2024).
Leis, V.; Gubichev, A.; Mirchev, A.; Boncz, P.; Kemper, A.; Neumann, T. How Good Are Query Optimizers, Really? Proc. VLDB Endow. 2015, 9, 204–215. [Google Scholar] [CrossRef]
Yu, X.; Chai, C.; Li, G.; Liu, J. Cost-Based or Learning-Based? A Hybrid Query Optimizer for Query Plan Selection. Proc. VLDB Endow. 2022, 15, 3924–3936. [Google Scholar] [CrossRef]
Pezzini, M.; Feinberg, D.; Rayner, N.; Edjlali, R. Hybrid Transaction/Analytical Processing Will Foster Opportunities for Dramatic Business Innovation; Gartner: Stamford, CT, USA, 2014; pp. 4–20. [Google Scholar]
Oracle. MySQL HeatWave: Real-Time Analytics for MySQL Database Service. Available online: https://www.oracle.com/mysql/heatwave/ (accessed on 1 April 2024).
SingleStore. The Single Database for All Data-Intensive Applications. Available online: https://www.singlestore.com/ (accessed on 19 April 2024).
Oracle. Oracle 21c: Automating Management of In-Memory Objects. Available online: https://docs.oracle.com/en/database/oracle/oracle-database/23/inmem/ (accessed on 19 April 2024).
Appuswamy, R.; Karpathiotakis, M.; Porobic, D.; Ailamaki, A. The Case for Heterogeneous HTAP. In Proceedings of the 8th Biennial Conference on Innovative Data Systems Research (CIDR 2017), Chaminade, CA, USA, 8–11 January 2017. [Google Scholar]
Raza, A.; Chrysogelos, P.; Anadiotis, A.C.; Ailamaki, A. Adaptive HTAP through Elastic Resource Scheduling. arXiv 2020, arXiv:2004.02380. [Google Scholar] [CrossRef]
Sirin, U.; Dwarkadas, S.; Ailamaki, A. Performance Characterization of HTAP Workloads. In Proceedings of the 2021 IEEE 37th International Conference on Data Engineering (ICDE), Chania, Greece, 19–22 April 2021; pp. 1808–1819. [Google Scholar]
Marcus, R.; Negi, P.; Mao, H.; Tatbul, N.; Alizadeh, M.; Kraska, T. Bao: Making Learned Query Optimization Practical. In Proceedings of the 2021 International Conference on Management of Data, Virtual Event, 20–25 June 2021; pp. 1275–1288. [Google Scholar]
Zhu, R.; Chen, W.; Ding, B.; Chen, X.; Pfadler, A.; Wu, Z.; Zhou, J. Lero: A Learning-to-Rank Query Optimizer. Proc. VLDB Endow. 2023, 16, 1466–1479. [Google Scholar] [CrossRef]
Yuan, H.; Li, G.; Feng, L.; Sun, J.; Han, Y. Automatic View Generation with Deep Learning and Reinforcement Learning. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 1501–1512. [Google Scholar]
Tai, K.S.; Socher, R.; Manning, C.D. Improved Semantic Representations from Tree-Structured Long Short-Term Memory Networks. arXiv 2015, arXiv:1503.00075. [Google Scholar]
Pascanu, R.; Mikolov, T.; Bengio, Y. On the Difficulty of Training Recurrent Neural Networks. In Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA, 16–21 June 2013; pp. 1310–1318. [Google Scholar]
Mou, L.; Li, G.; Zhang, L.; Wang, T.; Jin, Z. Convolutional Neural Networks over Tree Structures for Programming Language Processing. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30, pp. 1287–1293. [Google Scholar]
Ding, B.; Das, S.; Marcus, R.; Wu, W.; Chaudhuri, S.; Narasayya, V.R. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In Proceedings of the 2019 International Conference on Management of Data, Amsterdam, The Netherlands, 30 June–5 July 2019; pp. 1241–1258. [Google Scholar]
Akdere, M.; Çetintemel, U.; Riondato, M.; Upfal, E.; Zdonik, S.B. Learning-Based Query Performance Modeling and Prediction. In Proceedings of the 2012 IEEE 28th International Conference on Data Engineering, Arlington, VA, USA, 1–5 April 2012; pp. 390–401. [Google Scholar]
Chen, T.; Gao, J.; Chen, H.; Tu, Y. LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution Plans. Proc. VLDB Endow. 2023, 16, 1777–1789. [Google Scholar] [CrossRef]
Gunasekaran, K.P.; Tiwari, K.; Acharya, R. Towards Query Optimizer as a Service (QOaaS) in a Unified LakeHouse Ecosystem: Can One Optimizer Serve All Engines? arXiv 2024, arXiv:2407.02129. [Google Scholar]

Figure 1. Comparison between the traditional OLTP/OLAP separated architecture and the unified row–column co-existing HTAP architecture.

Figure 2. Query plans and execution performance of a single SQL query under different HINTs. Dashed boxes group sub-plans that share the same storage-format access pattern; different colors distinguish row-store and column-store access; the asterisk (*) marks the plan selected by the default cost-based optimizer.

Figure 3. System architecture of the hybrid plan query optimizer. Solid arrows denote the forward query/data flow during inference, while dashed arrows denote feedback signals used for online training; different module colors distinguish the search, prediction, and execution components. The ellipsis (…) indicates further candidate plans omitted for clarity, and the asterisk (*) marks the plan finally selected by our optimizer.

Figure 4. Monte Carlo Tree Search and hybrid access HINT estimator. Dashed boxes group logically related modules; solid arrows denote forward data flow during prediction; dashed lines indicate feedback used to update node statistics on the HINT tree; different colors distinguish the search, encoding, and estimation stages. The ellipsis (…) indicates intermediate tree nodes or feature vectors omitted for clarity and does not affect the technical content.

Figure 5. Tree attention network model and execution time prediction architecture.

Figure 6. Overall execution time comparison across different methods. Speedup ratios are annotated with the multiplication sign “×” (e.g., “3.49×”); the slight overlap of bar labels does not affect scientific understanding.

Figure 7. Per-query execution time comparison. The slight overlap of x-axis tick labels is purely a layout artifact (queries are plotted on a log scale) and does not affect scientific understanding.

Figure 8. Stability and tail-latency analysis across methods. Boxes show the interquartile range (IQR), the central line marks the median, and whiskers extend to 1.5 × IQR; open circles denote outliers beyond the whiskers. The slight overlap of legend or tick labels does not affect scientific understanding.

Figure 9. Architecture of a row–column co-existing HTAP database based on disk row store and in-memory column store. Solid arrows denote query and data flow; the dashed arrow denotes the incremental refresh of the in-memory column store from the row store; different module colors distinguish the storage, execution, and refresh components.

Table 1. Comparison between existing query optimization approaches and the proposed method.

Approach	Main Idea	Limitation for Row–Column Dual-Store HTAP	Difference of Our Work
Traditional cost-based optimization	Uses statistics and handcrafted cost models	Hard to accurately model hybrid row/column access choices	Learns execution behavior directly from candidate hybrid plans
Learned cardinality/cost estimation	Improves optimizer components using machine learning	Usually targets conventional plans or single-engine settings	Predicts execution time for hybrid row/column access alternatives
Learning-to-rank optimization	Learns relative quality of candidate plans	Often relies on plans generated by native optimizers	Combines MCTS-based HINT search with a learned predictor
HTAP query optimization	Optimizes analytical queries over mixed storage/workloads	Often requires system-specific rules and cost models	Focuses on hybrid plan selection in row–column dual-store architecture
Proposed method	Generates and evaluates hybrid row–column plans	Current work focuses on analytical queries	Provides a learning-based hybrid-plan selection framework

Table 2. Hyperparameters and training settings of the Transformer predictor.

Item	Value
Number of Transformer layers	8
Number of attention heads	8
Hidden dimension	128
Feed-forward dimension	512
Dropout rate	0.1
Batch size	128
Optimizer	AdamW
Learning rate	$1 \times 10^{- 3}$
Weight decay	$1 \times 10^{- 4}$
Max training rounds	30 (per online iteration)
Model selection	Lowest validation loss
Train/validation/test split	70/15/15
Random seed	42

Table 3. Training data statistics for the Transformer predictor.

Item	Value
Benchmark	IMDB/JOB
Number of query templates	33
Number of query instances	113
Average number of candidate hybrid plans per query	up to $H = 5$ retained from MCTS
Total generated candidate plans (MCTS rollouts)	≈1500
Retained executed (query, plan, latency) triples	≈565 (up to 5 per query)
Training samples	70% (≈395 triples)
Validation samples	15% (≈85 triples; used for checkpoint selection)
Testing samples	15% (≈85 triples)
Evaluation metrics	query latency, GMRL, plan regret

Table 4. Workload statistics.

Statistic	Value
Dataset	IMDB
Attributes	108
Relations	21
Queries	113
Dataset Size (GB)	3.6

Table 5. Dominance and strict dominance proportions of different methods.

Method	Dominance (%)	Strict Dominance (%)
Our work	76.11	30.97
Hybrid plan	50.44	4.42
HybridQO	47.79	1.77
Row plan	15.04	15.04
Column plan	1.77	1.77

Table 6. Descriptive statistics of execution time (ms) across five methods.

Statistic	Row Plan	Column Plan	Hybrid Plan	HybridQO	Our Work
Count	113	113	113	113	113
Mean	1902.53	17,224.78	2184.42	1444.61	545.59
Std Dev	8301.75	42,218.32	8843.31	3564.76	1066.33
Min	15.53	45.05	0.12	0.12	0.12
25th %ile	108.32	680.18	57.36	56.61	56.61
Median	222.65	1434.84	181.77	181.77	125.16
75th %ile	1138.32	5459.22	1117.25	1117.25	447.45
Max	86,063.15	160,000.00	87,490.30	26,763.19	7550.71
GMRL	0.1297	0.7692	0.0863	0.0835	0.0565

Table 7. End-to-end time comparison between AlloyDB Omni and our optimizer (seconds).

AlloyDB Omni			Our Optimizer
Plan Gen.	Execution	Total	Search	Plan Gen.	Inference	Execution	Total
2.99	246.84	249.83	7.86	2.85	2.80	61.65	75.15

Table 8. Overview of key technologies in HTAP databases [2].

Task Type	Key Technology	Corresponding HTAP Databases
Transaction Processing	MVCC + Log	Oracle Dual-Format [3], SQL Server [4], DB2 BLU [5], MySQL HeatWave [37], SAP HANA [6]
Transaction Processing	2PC + Raft + Log	TiDB [7]
Analytical Processing	In-memory delta + column scan	Oracle [3], SQL Server [4], DB2 [5], MySQL HeatWave [37], SAP HANA [6]
	Log-based delta + column scan	TiDB [7]
	Column scan	SingleStore [38]
Data Synchronization	In-memory delta merge	Oracle [3], SQL Server [4], DB2 [5], MySQL HeatWave [37], SAP HANA [6]
	Log-based delta merge	TiDB [7]
	Rebuild from primary key row store	SingleStore [38], Oracle [3]
Query Optimization	In-memory column selection	Oracle 21c [39], MySQL HeatWave [37]
	Hybrid row/column scan	TiDB [7], SQL Server [4]
	CPU/GPU acceleration	RateupDB [9], Caldera [40]
Resource Scheduling	Freshness scheduling	RDE [41]
Resource Scheduling	Workload scheduling	SAP HANA [6], Siper [42]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shi, X.; Shen, C.; Qiao, L.; Hu, T.; Wang, G. Query Optimization for Hybrid Plans in Row–Column Dual Store HTAP Databases. Appl. Sci. 2026, 16, 5296. https://doi.org/10.3390/app16115296

AMA Style

Shi X, Shen C, Qiao L, Hu T, Wang G. Query Optimization for Hybrid Plans in Row–Column Dual Store HTAP Databases. Applied Sciences. 2026; 16(11):5296. https://doi.org/10.3390/app16115296

Chicago/Turabian Style

Shi, Xiaojun, Chaoyuan Shen, Lianpeng Qiao, Tianze Hu, and Guoren Wang. 2026. "Query Optimization for Hybrid Plans in Row–Column Dual Store HTAP Databases" Applied Sciences 16, no. 11: 5296. https://doi.org/10.3390/app16115296

APA Style

Shi, X., Shen, C., Qiao, L., Hu, T., & Wang, G. (2026). Query Optimization for Hybrid Plans in Row–Column Dual Store HTAP Databases. Applied Sciences, 16(11), 5296. https://doi.org/10.3390/app16115296

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Query Optimization for Hybrid Plans in Row–Column Dual Store HTAP Databases

Abstract

1. Introduction

1.1. Motivation

1.2. Solution Overview

1.3. Contributions

1.4. Scope and Terminology

1.5. Problem Statement

2. System Overview

2.1. Preliminaries

2.1.1. Hybrid Access HINT

2.1.2. Hybrid Access HINT Tree

2.1.3. Problem Definition

2.2. System Architecture

2.3. Overall Algorithm

3. Hybrid Path and Candidate Plan Generation

3.1. Monte Carlo Tree Search for HINT Exploration

3.1.1. Node Utility on the HINT Tree

3.1.2. MCTS Workflow

3.2. Hybrid Access HINT Estimator

3.2.1. Query Encoding

3.2.2. Hybrid Access Path Representation

3.2.3. Node Utility Prediction

4. Execution Time Prediction

4.1. Why a Transformer-Based Predictor?

4.2. Feature Extraction and Encoding

4.2.1. Learned Embeddings for Categorical Variables

4.2.2. Predicate Encoding

4.2.3. Statistical Information Encoding

4.2.4. Final Node Embedding

4.3. Tree Attention Network

4.3.1. Height Encoding

4.3.2. Tree-Biased Attention

4.3.3. Global Query Node

4.4. Query Time Prediction

4.5. Model Training and Update

4.5.1. Training Convergence

4.5.2. Cold-Start and Generalization

5. Experimental Evaluation

5.1. Experimental Setup

5.1.1. Environment

5.1.2. Dataset and Workload

5.1.3. Baseline Methods

5.1.4. Evaluation Metrics

5.1.5. Experimental Settings

5.2. Overall Query Latency

5.3. Per-Query Latency Analysis

5.4. Dominance Proportions

5.5. Stability and Complex Query Handling

5.6. End-to-End Query Time

6. Related Work

6.1. HTAP Databases with Row–Column Co-Existing Storage

6.2. Hybrid Plans in Row–Column HTAP Databases

6.3. Learning-Based Query Optimization

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI