1. Introduction
With the emergence of large-scale real-time analytical applications in modern society, many business scenarios require the ability to simultaneously handle high-concurrency transactional requests and perform real-time analysis on the latest data. In traditional data processing workflows, Online Transaction Processing (OLTP) systems handle day-to-day business transactions—such as order processing and inventory management—which are typically high-frequency, low-latency operations requiring fast responses and data consistency guarantees. However, to conduct data analysis and support decision-making, data from OLTP systems must be extracted through an Extract-Transform-Load (ETL) process and loaded into Online Analytical Processing (OLAP) systems.
Hybrid Transactional/Analytical Processing (HTAP) technology provides a more efficient alternative. As illustrated in
Figure 1, unlike the traditional loosely coupled architecture combining separate OLTP databases and OLAP data warehouses, HTAP technology adopts a unified architecture that simultaneously supports OLTP and OLAP on a single platform. This eliminates the time-consuming ETL process and enables rapid analysis of the latest transactional data, thereby improving overall system throughput and query response time [
1,
2]. Representative HTAP databases include Oracle [
3], SQL Server [
4], DB2 BLU [
5], SAP HANA [
6], TiDB [
7], PolarDB-IMCI [
8], RateupDB [
9], and AlloyDB [
10].
Current mainstream HTAP databases such as TiDB [
7] and Oracle [
3] adopt a row–column co-existing data storage approach [
2]. Unlike previous storage approaches based on pure row, pure column, or hybrid models such as PAX [
11], the row–column co-existing approach goes a step further: it uses row storage as the primary format and column storage as the auxiliary. In this storage mode, write-intensive OLTP queries can directly access row storage to ensure low latency, while OLAP analytical queries can quickly retrieve data from column store replicas. The column store replicas are periodically updated incrementally from the row store to maintain consistency. Moreover, some HTAP databases support placing column store replicas directly in memory (In-Memory Column Store, IMCS) [
3,
4,
5,
8,
10,
12,
13,
14,
15,
16], further accelerating query response times by eliminating disk I/O.
To fully exploit the potential of hybrid physical layouts, many HTAP systems [
4,
7,
10] support executing a single query across different storage formats to generate
hybrid plans. Despite this capability, current HTAP systems still have shortcomings in fully leveraging the combined advantages of row and column storage [
17].
1.1. Motivation
For an SQL query, a database supporting hybrid queries can generate a hybrid plan based on its cost model. A HINT [
18] is an extended syntax in SQL queries used to guide the database system in generating specific query execution plans. Under different HINTs, the query plan and execution performance of the same SQL query can differ significantly. As illustrated in
Figure 2, under five different access-path HINTs, a single SQL query generates five distinct execution plans, including row plans, column plans, and hybrid plans. The default plan (generated without a HINT) corresponds to one specific hybrid plan, but it is not necessarily optimal.
Several key observations motivate our work:
- 1.
Unified cost model lacking. Current HTAP database query optimization systems [
3,
4,
8,
10] primarily rely on row-based cost models to generate execution plans. They lack a unified framework that can simultaneously quantify the costs of row-level and column-level operators. As shown in
Figure 2, the plan estimated to have the lowest cost by the optimizer does not necessarily have the shortest execution time.
- 2.
Complex hybrid plan generation space. For a query involving
m tables, the number of possible hybrid access HINTs is
, representing an exponentially large search space. Current HTAP systems [
3,
4,
7,
8,
10] either do not support hybrid plans, directly route queries to row or column storage [
8], or use simple cost models to determine access paths without comprehensively considering computational operators [
7].
- 3.
Limitations of cost-based optimizers. While learning-based query optimizers such as DQ [
19], ReJOIN [
20], RTOS [
21], and Neo [
22] have shown promise, these representative learning-based optimizers mainly focus on conventional single-engine or row-store plans. To the best of our knowledge, existing learning-based query optimization studies mainly focus on cardinality estimation, cost estimation, join ordering, plan ranking, or query plan representation in conventional single-engine database settings. Learning-based optimization for hybrid plan generation in row–column dual-store HTAP databases remains relatively underexplored [
23,
24].
To make the discussion above more concrete,
Table 1 contrasts the proposed method with representative families of existing query-optimization techniques along three dimensions: main idea, limitation in a row–column dual-store HTAP setting, and the specific difference of our work.
1.2. Solution Overview
To address these challenges, we propose a machine learning-based hybrid plan query optimization system. The system models hybrid access HINT generation as a tree search problem and employs Monte Carlo Tree Search (MCTS) to efficiently explore good hybrid access HINTs, achieving a balance between exploration and exploitation. For each candidate HINT on the tree, a neural network quickly computes its benefit score. We further adopt an enhanced Transformer-based [
25] neural network that captures both global query features and local hybrid physical plan features, enabling accurate execution time prediction for hybrid plans.
1.3. Contributions
The main contributions of this paper are as follows:
- 1.
We design and implement a learning-based hybrid-plan query optimization framework for row–column dual-store HTAP databases, addressing a setting that, to the best of our knowledge, has received limited attention from prior learning-based query optimizers.
- 2.
We propose a hybrid access HINT generation method combining MCTS with neural networks, which can quickly and effectively explore good full-length hybrid access HINTs, achieving a balance between exploration and exploitation.
- 3.
We adapt an existing Transformer-based plan-representation model [
26] to candidate hybrid plans, and integrate it with MCTS-based hybrid HINT generation. We do not claim the Transformer architecture itself as a new general-purpose query-plan representation model: components such as height encoding, tree-aware attention, and the global query node are inspired by prior neural query-plan representation studies; the contribution lies in adapting and integrating these techniques in the row–column dual-store HTAP setting.
- 4.
Experimental results on AlloyDB Omni, using the real-world IMDB dataset and JOB benchmark, demonstrate that the proposed method reduces execution time by 75.02% compared with the cost-based optimizer and by 62.23% compared with the state-of-the-art row-store-based learning query optimizer in the evaluated analytical-query setting.
1.4. Scope and Terminology
Throughout this paper, a hybrid plan refers to a complete physical execution plan that may combine row-store and column-store access operators. A hybrid path (or hybrid access path) refers to an access path or partial execution path within a hybrid plan, specifying for each base relation whether it is accessed from row storage or column storage. A hybrid HINT (or hybrid access HINT) is the optimizer hint, generated by our MCTS-based search process, that instructs the DBMS to follow a specific hybrid access pattern. These three terms are used consistently in this sense throughout the paper. The optimization target of this work is latency-oriented hybrid-plan selection for analytical queries; mixed OLTP/OLAP workload scheduling, transactional interference, and concurrency control are out of scope and are discussed as future work.
1.5. Problem Statement
We state the optimization problem here for clarity; the detailed search-space construction (hybrid access HINT tree) and the role of each module are introduced in
Section 2.
For a query
q over a schema
S with relations
, let
denote the set of all candidate execution plans that the underlying HTAP optimizer can generate, and let
denote the end-to-end execution latency of a plan
. The objective is to identify the plan with minimum execution latency:
Intuitively,
is exponentially large because each of the
m base relations can be accessed from either row or column storage, yielding
different hybrid access patterns. To make the problem tractable, we reformulate it through hybrid access HINTs: let
denote the plan produced by the cost-based optimizer when relations in
are forced to row storage and relations in
to column storage. The reformulated objective is then
where
ranges over all partitions of
T. The MCTS-based search (
Section 3) explores promising
partitions, and the Transformer-based predictor (
Section 4) approximates the unknown latency function
so that the best-predicted candidate plan can be selected. Equation (
2) captures only execution latency; system-level factors such as memory budget, transactional interference, data-freshness constraints, and concurrency-control overhead can be incorporated as constraints in future work.
The remainder of this paper is organized as follows.
Section 2 presents the system overview.
Section 3 describes the hybrid path and candidate plan generation module.
Section 4 details the execution time prediction model.
Section 5 presents the experimental evaluation.
Section 6 reviews related work, and
Section 7 concludes the paper.
2. System Overview
In this section, we first explain key preliminary concepts, then present the overall system architecture.
2.1. Preliminaries
2.1.1. Hybrid Access HINT
A hybrid access HINT refers to extended syntax that forces an SQL query to use access paths with different storage formats. Formally, given the current schema
S with
m relations represented by
, let
and
be two disjoint subsets of
T. For any query
q, a hybrid access HINT is denoted as
, indicating that tables in
are read from row storage and tables in
are read from column storage. For example, in AlloyDB, the HINT is expressed as:
while in TiDB it is:
2.1.2. Hybrid Access HINT Tree
Given m relations under schema S for query q, a full-length hybrid access HINT where is a partition of T has possible values. Exhaustively exploring this space is prohibitively expensive. Therefore, we construct a hybrid access HINT tree to accelerate the search.
The tree has height
. Leaf nodes are called
terminal nodes; all other nodes are
search nodes. The root node represents the empty set. Each search node can be expanded by adding a table from
q (with ordering constraints) or by stopping the search. Search nodes represent sets of tables assigned to column storage (
), while terminal nodes represent complete full-length hybrid access HINTs. The tree structure is shown in the upper-left part of
Figure 3.
2.1.3. Problem Definition
The formal problem statement was introduced in
Section 1.5; we recall it here in the notation just established and connect it to the search-space construction.
For any query
q under schema
S, let
denote the set of all possible execution plans and
denote the execution time of plan
p. The optimization objective for hybrid plan query optimization in HTAP databases is:
Since directly solving Equation (
3) is impractical, we reformulate the problem using HINTs. The optimizer with a hybrid access HINT applied is denoted as
. Our optimization objective becomes:
Intuitively, instead of searching the full plan space
, we search over
partitions on the hybrid access HINT tree
: each terminal node corresponds to one full-length HINT
that, when fed to the optimizer
, yields one candidate plan. Two complementary learned components then make this search tractable in practice: the MCTS-based hybrid HINT generator (
Section 3) chooses a small number of promising
partitions, and the Transformer-based predictor (
Section 4) approximates the unknown latency function
to rank the resulting candidate plans.
2.2. System Architecture
As shown in
Figure 3, the system consists of three key modules:
Hybrid path and candidate plan generation. For each incoming query, a hybrid access HINT tree is constructed, and Monte Carlo Tree Search is employed to search for good HINTs. The database query optimizer then generates complete candidate plans from the selected HINTs.
Execution time prediction. A Transformer-based neural network with attention mechanisms predicts the execution time of each candidate plan. The plan with the minimum predicted execution time is selected as the final plan.
Incremental model training and update. After executing the selected plan, the actual execution time is used as training feedback to incrementally improve both the HINT estimator and the Transformer model asynchronously, without blocking the online workload.
2.3. Overall Algorithm
Algorithm 1 shows the overall workflow. The system first uses MCTS with the hybrid access HINT estimator to generate candidate partitions, then selects the top-
H HINTs. For each candidate HINT, the optimizer generates a complete hybrid plan. The TreeTransformer model estimates the execution time of each candidate, and the plan with the best predicted performance is selected. During training, the selected plan is executed asynchronously, and feedback updates both models.
| Algorithm 1: Hybrid Access HINT Query Optimizer |
![Applsci 16 05296 i001 Applsci 16 05296 i001]() |
3. Hybrid Path and Candidate Plan Generation
The goal of this module is to select H good hybrid access HINTs to generate H candidate plans using the cost-based optimizer.
3.1. Monte Carlo Tree Search for HINT Exploration
Since the full hybrid access path search space is large, we apply Monte Carlo Tree Search (MCTS) on the HINT tree
. MCTS follows an exploration–exploitation strategy: it focuses on paths that lead to high estimated query performance (exploitation) while also visiting less-explored directions (exploration). The overall module architecture is shown in
Figure 4.
3.1.1. Node Utility on the HINT Tree
The node utility is computed from two factors:
Node benefit score . Each node
has a benefit score representing the expected performance of full-length hybrid access paths passing through
. For leaf nodes,
is directly computed by the hybrid access HINT estimator (
Section 3.2). For non-leaf nodes, let
denote the set of leaf nodes that have been estimated and contain
on their search path:
Intuitively, summarizes what is currently known about the sub-tree rooted at : a non-leaf node inherits the average estimated performance of all full-length hybrid HINTs that have so far been simulated through it, so promising sub-trees accumulate higher scores as more leaves under them are evaluated.
Node exploration degree . This counts the number of times node has been visited, ensuring that less-explored nodes are also considered.
Following the Upper Confidence Bound (UCB) approach [
27], the node utility is defined as:
where
is the parent of
and
controls the exploration degree.
Intuitively, the first term exploits sub-trees that have so far yielded fast hybrid plans, while the second term grows whenever a sibling is repeatedly visited and is not, so under-explored branches are eventually revisited. The hyper-parameter trades off exploitation against exploration: small collapses MCTS toward greedy selection of the current best sub-tree, while large pushes the search to cover more of the hybrid access patterns.
3.1.2. MCTS Workflow
The MCTS workflow consists of five steps:
Selection. Starting from the root, iteratively select the child with the highest utility until reaching a node with unexpanded children.
Expansion. Randomly select an unexpanded child and recursively expand to a leaf node.
Simulation. Use the hybrid access HINT estimator to evaluate the leaf node’s corresponding hybrid access path.
Update. Backpropagate the leaf node’s utility to update ancestor nodes.
Termination. Repeat until the search budget b is exhausted.
After termination, the system selects the top-H hybrid access HINTs with the highest utility for candidate plan generation.
3.2. Hybrid Access HINT Estimator
This subsection constructs a lightweight estimator to predict the performance of a query under a given hybrid access HINT. It takes as input the query features and hybrid access path and outputs the estimated performance via a recurrent neural network.
3.2.1. Query Encoding
Following prior work [
21,
28,
29], we encode query
q considering join predicates (
) and filter predicates (
). Specifically,
is an
matrix where
if tables
i and
j are joined in
q.
is a length-
m vector where
represents the selectivity of filters on table
i (default 1 if no filter exists). The query encoding
is obtained by flattening
and concatenating
.
For example, consider a query
SELECT * FROM a, b, c WHERE a.id = b.id AND b.id = c.id AND b.attr2 BETWEEN B1 AND B2 with selectivity 0.6 on table
b. Then
, and
, as illustrated in
Figure 4.
3.2.2. Hybrid Access Path Representation
Each hybrid access path is a length- vector , where elements before represent column-store tables (), is the transition point, and elements after represent row-store tables (). For example, a full column plan is represented as , while a hybrid plan with in column and in row is .
We use Long Short-Term Memory (LSTM) networks to capture the sequential features of each path. Each table’s embedding is learned from training data. The LSTM encodes the path as: . The final path representation is .
3.2.3. Node Utility Prediction
Given the query encoding
and path representation
, the estimator uses a fully connected network to predict the node benefit score:
Intuitively, encodes what the query joins and filters, while encodes how the partial hybrid path so far assigns base relations to row or column storage; the fully connected layer learns a fast, lightweight mapping from this pair to an estimated benefit score, which MCTS uses as a cheap surrogate to decide which leaves of to expand.
4. Execution Time Prediction
Although prior work [
26] has adapted Transformers to query plan representation, it does not accommodate hybrid plans. This section presents our enhanced Transformer-based neural network for predicting the execution time of each candidate hybrid plan.
4.1. Why a Transformer-Based Predictor?
Candidate hybrid plans are not flat feature vectors. Each plan is a tree of physical operators whose nodes carry heterogeneous features—operator type, row/column access decisions, predicate descriptors, estimated cardinalities, join conditions, and per-table statistics—and whose execution cost depends on long-range, cross-operator interactions (e.g., the choice of a columnar scan deep in the tree affects the cost of an upstream hash join). Lightweight models such as linear regression, random forests, or multilayer perceptrons are easier to train and faster to evaluate, but they treat plan features as fixed-length vectors and therefore struggle to model structural dependencies among plan nodes and the interaction between row-store and column-store sub-trees. Tree-RNN and Tree-CNN models capture parent–child dependencies but, as discussed in
Section 6.3, suffer from forgetting effects or limited receptive fields. We therefore adopt a Transformer-based predictor with tree-aware attention and a global query node, which (i) jointly models all plan nodes via self-attention, (ii) uses structural biases to respect the tree topology, and (iii) summarizes the entire hybrid plan through a single global representation. We do not claim this Transformer architecture itself as new: height encoding, tree-biased attention, and the global query node are inspired by prior neural query-plan representation work [
25,
26,
30]; our contribution is to adapt these techniques to candidate hybrid plans in a row–column dual-store HTAP database and to couple the predictor with MCTS-based hybrid HINT generation.
4.2. Feature Extraction and Encoding
A query execution plan node typically contains operators, tables, predicates, and join conditions:
Operators describe the operation at each plan node (e.g., Merge Join, Index Scan, Columnar Scan, Vectorized Hash Join), treated as categorical variables with approximately 30 types.
Predicates describe filter conditions as
<column, comparison operator, value> triples [
31]. The column and operator are categorical; the value is range-normalized to
.
Joins are join conditions treated as categorical variables (finite domain given the schema).
Tables are relation identifiers, also treated as categorical variables.
4.2.1. Learned Embeddings for Categorical Variables
Instead of one-hot encoding, which suffers from dimension explosion and difficulty handling new categories, we use fixed-size learned embeddings [
26] for each categorical variable. New categories can be accommodated by simply adding a randomly initialized embedding without affecting existing learned embeddings.
4.2.2. Predicate Encoding
The predicate representation
concatenates the column embedding
, operator embedding
, and normalized value
:
4.2.3. Statistical Information Encoding
Histogram encoding. Equal-height histograms from database statistics are reorganized into N uniform buckets using linear interpolation. For a predicate on a histogram with boundaries and buckets, the encoding would be .
Sampling encoding. Random tuple samples are maintained from each table. For each predicate, each sample tuple is checked for satisfaction, producing an m-bit bitmap indicating the percentage of satisfying tuples.
4.2.4. Final Node Embedding
The components—operators (
), joins (
), tables (
), predicates (
), histograms (
), and samples (
)—are concatenated and passed through a linear layer:
Intuitively, plan nodes are heterogeneous: a Hash Join carries join conditions, a Columnar Scan carries a table and predicates, and an aggregation may carry none of these. Concatenating the per-component embeddings and projecting them through a shared linear layer maps every node, regardless of operator type, into a single d-dimensional space, so the subsequent attention layers can compare and combine plan nodes without operator-specific branching logic.
For nodes without predicates or join conditions (e.g., a Hash node), zero padding is used. This produces uniform-size embeddings for all node types, forming a vector tree as input to the tree attention network.
4.3. Tree Attention Network
Building on prior Transformer work [
25,
26], we adapt the architecture to accept hybrid query plans.
Figure 5 shows the model architecture, which introduces three key components: height encoding, tree-biased attention, and a global query node.
4.3.1. Height Encoding
The height of a node is defined as the length of the longest path from the node to any leaf. All leaf nodes have height 0, and the root has the maximum height. Height information implicitly encodes parent–child dependencies, as parent nodes always have greater height than their descendants. A learned height embedding
is added to each node embedding:
4.3.2. Tree-Biased Attention
In standard self-attention (Equation (
11)), every node can attend to all other nodes. However, in query plans, information should flow primarily bottom-up. We introduce tree-biased attention to control information flow using the tree structure.
Given the standard attention mechanism:
We compute pairwise distances
between plan nodes. If node
j is reachable from node
i (i.e.,
j is a descendant of
i),
is the tree distance; otherwise,
is marked as unreachable. Each distance value is associated with a learnable scalar bias
:
This allows the model to learn larger biases for small distances (e.g., direct parent–child) and small negative values to discourage attention between distant or unreachable nodes.
Intuitively, injects the tree topology directly into self-attention: an operator can give more weight to its immediate children (which dominate its runtime) and less weight to nodes in an unrelated sub-tree (which usually have weak cost coupling). Compared with vanilla self-attention, this biases the model toward physically meaningful information flow—bottom-up cost propagation along the plan tree—without removing the ability to model long-range interactions across distant operators.
4.3.3. Global Query Node
To capture both global context (the query, which is invariant across the plan search space) and local context (the plan), we introduce a global query node
. Using the query encoding
from
Section 3.2.1, we project it to the same dimension as plan node embeddings:
The global query node is connected to all plan nodes and has learnable node and height embeddings. Its output vector serves as the representation of the entire query plan. This concept is analogous to the
[CLS] token in BERT [
30].
4.4. Query Time Prediction
The global query node output
is fed into a two-layer fully connected network for time prediction:
where
are learnable parameters, and
is the normalized execution time. The training loss is mean squared error:
Intuitively, the Sigmoid output produces a bounded latency estimate on a normalized scale, which keeps the regression stable across the wide latency range of analytical queries; mean squared error then penalizes large prediction gaps more heavily than small ones, encouraging the predictor to avoid catastrophic mis-ranking of candidate plans rather than merely matching their average latency.
4.5. Model Training and Update
The system maintains a memory pool M collecting training instances . For each training cycle, triples are sampled from M to form a batch. Both the HINT estimator and the tree attention network are trained asynchronously to avoid blocking online workloads.
The hyperparameters and training settings of the Transformer predictor used in our experiments are summarized in
Table 2. The configuration follows the conventions of prior tree-Transformer plan-representation work [
26]. To mitigate overfitting we apply dropout inside the attention and feed-forward sub-layers, use weight decay in the optimizer, and select the final model checkpoint according to its loss on a held-out validation split rather than relying on a fixed number of epochs.
Table 3 reports the size and composition of the training data used to evaluate the predictor on the JOB workload. For each query, multiple candidate hybrid plans are generated by the MCTS-based HINT search and executed on the target DBMS to collect ground-truth latencies, which are then normalized to
before training.
4.5.1. Training Convergence
During training, the validation loss generally stabilized within the 30 online training rounds used in our experiments, and the final checkpoint was selected according to the lowest validation loss to reduce overfitting; this validation-based checkpointing also serves as our early-stopping mechanism in lieu of a fixed-patience criterion.
4.5.2. Cold-Start and Generalization
The proposed method requires an initial set of executed plans to train the predictor. In cold-start scenarios—when a new schema or workload is deployed—the system can initially fall back on the native cost-based optimizer or use a small bootstrap workload to populate the memory pool M. As more (query, plan, latency) triples are collected, the predictor is incrementally updated, and the MCTS-based search benefits from increasingly accurate node-utility estimates. For unseen queries that share tables and join graphs with the training distribution, the learned table embeddings and tree-aware attention enable reasonable generalization; for queries whose structure differs substantially from the training distribution, additional online execution is required before the predictor stabilizes. A comprehensive online-learning and drift-adaptation strategy is beyond the scope of this paper and is left as future work.
5. Experimental Evaluation
The experiments in this section evaluate analytical-query latency under hybrid row–column access. They do not evaluate concurrent OLTP/OLAP execution, transactional interference, resource isolation, or freshness maintenance. The results should therefore be interpreted as evidence for latency-oriented hybrid-plan selection rather than as a complete evaluation of end-to-end HTAP workload management.
5.1. Experimental Setup
5.1.1. Environment
Our system is deployed on AlloyDB Omni 15.2.0 (Google LLC, Mountain View, CA, USA) [
32], a locally deployable version of AlloyDB [
10] compatible with PostgreSQL 15 (PostgreSQL Global Development Group) [
33]. It supports both disk-based row storage and in-memory column storage with hybrid plan query optimization. The database and optimizer system are deployed on separate servers connected via a 10 Gb/s network. Both servers use Intel Xeon Gold 6148 CPUs @ 2.40 GHz (Intel Corporation, Santa Clara, CA, USA) with 1 TB memory. The optimizer server additionally uses an NVIDIA GeForce RTX 4090 GPU (NVIDIA Corporation, Santa Clara, CA, USA). The learning components are implemented in PyTorch 1.13 (Meta Platforms, Inc., Menlo Park, CA, USA) on Python 3.9 (Python Software Foundation, Wilmington, DE, USA).
5.1.2. Dataset and Workload
We use the
IMDB (Internet Movie Database) dataset, a real-world database containing information about movies, actors, directors, and production companies (
Table 4). The
JOB (Join Order Benchmark) [
34] is an OLAP benchmark using IMDB. JOB contains 33 query structures with 2–6 variants each (e.g., 1a, 1b, 1c), totaling 113 queries. Variants share the same tables and joins but differ in filter conditions. The number of relations per query ranges from 4 to 17.
5.1.3. Baseline Methods
We compare against four baselines:
Row Plan: AlloyDB Omni with all tables forced to row storage via HINTs, equivalent to traditional PostgreSQL plans.
Column Plan: AlloyDB Omni with all tables forced to column storage via HINTs.
Cost-based Hybrid Plan: AlloyDB Omni’s default cost-based hybrid plan optimizer.
HybridQO: A learning-based row plan optimizer [
35] adapted for hybrid plan optimization. It searches join order prefixes, completes them with the cost-based optimizer, and uses a neural network to evaluate plan performance and uncertainty.
5.1.4. Evaluation Metrics
We use query latency (execution time) as the primary metric and GMRL (Geometric Mean Relative Latency) [
21] for normalized per-query performance:
where lower GMRL indicates better performance.
5.1.5. Experimental Settings
The database working memory and cache are set to the dataset size with row plan warm-up for all queries, eliminating disk I/O effects. Row and column plans are executed three times with averaged results. Learning methods (HybridQO and ours) run for 30 iterations with the last round’s results used. MCTS budget is 75 ms. Neural networks are trained every 10 queries using 128 samples from the experience pool. The timeout is set to 1.5× the maximum baseline execution time, and execution times are normalized to .
5.2. Overall Query Latency
Figure 6 shows the total latency after completing all 113 queries. Our hybrid access HINT query optimizer outperforms all baselines:
vs. Row Plan: 71.32% time saved (3.49× speedup, from 214.99 s to 61.65 s).
vs. Column Plan: 96.83% time saved (31.57× speedup, from 1946.40 s to 61.65 s).
vs. Cost-based Hybrid Plan: 75.02% time saved (4.00× speedup, from 246.84 s to 61.65 s).
vs. HybridQO: 62.23% time saved (2.65× speedup, from 163.24 s to 61.65 s).
Key findings include: (1) hybrid plans leverage column store advantages such as SIMD operations, parallelization, and reduced I/O; (2) the column plan performs worst due to expensive tuple reconstruction (column-to-row conversion); (3) the cost-based hybrid plan performs worse than even the row plan due to its inability to unify row and column costs; and (4) HybridQO performs reasonably but is limited because it was not designed for hybrid plans.
5.3. Per-Query Latency Analysis
Figure 7 presents 20 queries on a log scale. Our optimizer achieves excellent performance on nearly all queries, with only query 22c being suboptimal (the time prediction module did not correctly identify the default plan as superior). HybridQO only improves three queries (15c, 26c, 30b), with query 26c alone contributing 99% of its total improvement.
5.4. Dominance Proportions
Table 5 presents the dominance proportions. Our optimizer achieves 76.11% dominance (proportion of queries where it is the best or tied for best) and 30.97% strict dominance (proportion where it is strictly the best). This demonstrates both strong
exploitation (reusing existing good plans, 76.11%) and strong
exploration (discovering new better plans, 30.97%). In contrast, HybridQO has only 1.77% strict dominance, indicating very limited exploration capability for hybrid plans.
5.5. Stability and Complex Query Handling
Table 6 and
Figure 8a show that our optimizer has the lowest mean (545.59 ms), lowest standard deviation (1066.33), lowest median (125.16 ms), and lowest maximum (7550.71 ms), demonstrating the best stability across all methods.
Figure 8b further confirms that at the 50th, 75th, 90th, 99th, and 99.5th percentiles, our optimizer consistently achieves the lowest latency. Notably, the time savings ratio increases with percentile, indicating stronger advantages on complex queries.
5.6. End-to-End Query Time
Table 7 compares end-to-end times. Despite introducing additional search time (7.86 s) and inference time (2.80 s), our optimizer achieves a total of 75.15 s versus AlloyDB Omni’s 249.83 s—a 69.92% reduction (3.32× speedup). The plan generation time is nearly identical (2.85 s vs. 2.99 s), as HINTs reduce the optimizer’s plan exploration space. The additional overhead of 10.66 s is vastly offset by the 185.19 s execution time saving, yielding a net gain of 174.53 s.
6. Related Work
6.1. HTAP Databases with Row–Column Co-Existing Storage
The concept of HTAP was first proposed by Gartner [
36], predicting that HTAP technology would be widely adopted for real-time data analysis. In a strict sense, an HTAP database system should satisfy real-time transaction and analytical processing requirements without requiring data ETL processes. To simultaneously support both OLTP and OLAP performance, mainstream HTAP databases have adopted row–column co-existing storage [
2].
Table 8 summarizes key technologies in representative HTAP database systems across five dimensions: transaction processing, analytical processing, data synchronization, query optimization, and resource scheduling.
6.2. Hybrid Plans in Row–Column HTAP Databases
As shown in
Figure 9, the query-processing component of such an HTAP database features a hybrid operator query optimizer on the replica node. The row execution engine can receive, process, and return row-format tuples. For column operators in the physical plan, the row engine pushes them down to the column execution engine, which processes data in columnar format using SIMD instructions and vectorized execution. The processed columnar data then undergoes a column-to-row conversion before being returned to the row engine and finally to the user.
Current hybrid plan generation approaches include heuristic methods based on column defaults (e.g., Oracle 21c [
39], SQL Server [
4]) and cost function-based optimization (e.g., TiDB [
7]). The cost function approach compares costs across storage formats:
However, these cost formulas are based on independence and uniform distribution assumptions without considering multi-column correlations or data skew.
6.3. Learning-Based Query Optimization
Learning-based query optimizers have made significant advances. DQ [
19] and ReJOIN [
20] use reinforcement learning to find optimal join orders. Neo [
22] is the first end-to-end learning-based optimizer, using tree convolutional neural networks to estimate full plan latency. RTOS [
21] applies graph neural networks to train reinforcement learning agents for join order selection. Bao [
43] sits atop PostgreSQL, using HINTs to enable or disable sets of join and scan operations. Balsa [
28] uses a similar architecture to Neo with modified training. Lero [
44] formalizes the problem as a learning-to-rank task. LEON [
29] is another learning-to-rank approach that prunes possible physical plans in a dynamic programming fashion.
For query plan representation, existing approaches include: (1)
flattened methods such as AVGDL [
45], which serialize plans into sequences but fail to capture tree structure; (2)
Tree-RNNs such as Tree-LSTM [
46] used by RTOS [
21] and E2E-Cost [
31], which aggregate information bottom-up but suffer from forgetting problems [
47]; (3)
Tree-CNNs [
48] used by Neo [
22] and Bao [
43], which capture parent–child dependencies via triangular kernels but have limited receptive fields; and (4)
feature vector methods such as AIMeetsAI [
49] and LQPP [
50], which encode features using predefined rules but fail to capture structural dependencies.
However, the above representative methods do not directly target hybrid row–column execution optimization and are therefore not directly applicable to hybrid plan selection in row–column dual-store HTAP databases. The most recent work on query optimization in HTAP databases [
24] does not involve machine learning. More recent learned-optimizer studies, including learning-to-rank optimizers such as Lero [
44], robust plan-generation models such as LOGER [
51], and unified learned-optimizer services for lakehouse-style engines [
52], also focus on conventional single-engine or row-store settings and do not target hybrid row–column plan selection in row–column dual-store HTAP systems.
7. Conclusions
This paper investigated the hybrid query optimization problem in HTAP databases with row–column co-existing configurations. We designed and implemented a hybrid access HINT query optimizer that combines Monte Carlo Tree Search with neural networks to efficiently generate high-quality hybrid access HINTs and produce superior candidate plans. The system extracts and encodes both local features of hybrid plans and global features of queries, then employs an enhanced TreeTransformer to evaluate candidate plans and select the best predicted hybrid execution plan.
The system was experimentally validated on AlloyDB Omni, a recent row–column co-existing HTAP database, using the real-world IMDB dataset and the JOB benchmark. In this evaluation setting, the proposed method improves latency-oriented hybrid-plan selection for analytical queries: it reduces execution time by 75.02% compared with the cost-based optimizer and by 62.23% compared with a state-of-the-art row-store-based learning optimizer, and it also achieves the best stability and the strongest performance on complex queries among all baselines considered.
We emphasize the scope of these claims. The current study targets analytical, latency-oriented hybrid-plan selection in a single row–column dual-store HTAP architecture; it does not fully evaluate mixed OLTP/OLAP workloads, transactional interference, concurrency, or system-level resource constraints such as memory and freshness budgets. The Transformer-based representation reuses height encoding, tree-aware attention, and a global query node from prior plan-representation work, and our contribution lies in adapting and integrating these components with MCTS-based hybrid HINT generation rather than in proposing a new general-purpose plan-representation architecture. Future work will (i) extend the optimization objective to incorporate concurrency and resource constraints, (ii) evaluate the framework under realistic mixed OLTP/OLAP workloads on additional HTAP systems, (iii) investigate online-learning strategies for cold-start and workload-drift scenarios, and (iv) study the trade-off between the proposed Transformer predictor and lightweight alternatives such as gradient-boosted trees or compact MLPs.
Author Contributions
Conceptualization, X.S.; methodology, C.S.; software, C.S.; validation, L.Q.; formal analysis, T.H.; investigation, L.Q.; writing—original draft preparation, X.S.; writing—review and editing, X.S.; supervision, G.W. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The IMDB dataset and JOB benchmark used in this study are publicly available. Reproducibility materials are released at
https://github.com/HR10108/HtapQO (accessed on 20 May 2026). The repository includes the prototype implementation of the proposed optimizer, the MCTS-based hybrid HINT search procedure, the Transformer-based runtime estimator, the SQL and workload files used in the evaluation, representative configuration examples, and step-by-step instructions for building and running the prototype. The following components are intentionally
not included: the underlying DBMS runtime assets (AlloyDB Omni binaries and other non-distributable storage- and execution-engine artifacts), the generated experimental result files, and locally trained model checkpoints. Researchers can regenerate these artifacts by following the released configurations and procedures on their own AlloyDB Omni deployment.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| HTAP | Hybrid Transactional/Analytical Processing |
| OLTP | Online Transaction Processing |
| OLAP | Online Analytical Processing |
| DBMS | Database Management System |
| CBO | Cost-Based Optimizer |
| MCTS | Monte Carlo Tree Search |
| HINT | Query optimizer hint/directive |
| LSTM | Long Short-Term Memory |
| IMCS | In-Memory Column Store |
| ETL | Extract-Transform-Load |
| JOB | Join Order Benchmark |
| GMRL | Geometric Mean Relative Latency |
| UCB | Upper Confidence Bound |
|
CPU
|
Central Processing Unit
|
|
GPU
|
Graphics Processing Unit
|
|
SIMD
|
Single Instruction Multiple Data
|
|
I/O
|
Input/Output
|
|
MVCC
|
Multi-Version Concurrency Control
|
|
2PC
|
Two-Phase Commit
|
|
PAX
|
Partition Attributes Across
|
|
MLP
|
Multilayer Perceptron
|
|
RNN
|
Recurrent Neural Network
|
|
CNN
|
Convolutional Neural Network
|
|
ML
|
Machine Learning
|
References
- Özcan, F.; Tian, Y.; Tözün, P. Hybrid Transactional/Analytical Processing: A Survey. In Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017; pp. 1771–1775. [Google Scholar]
- Li, G.; Zhang, C. HTAP Databases: What is New and What is Next. In Proceedings of the 2022 International Conference on Management of Data, Philadelphia, PA, USA, 12–17 June 2022; pp. 2483–2488. [Google Scholar]
- Lahiri, T.; Chavan, S.; Colgan, M.; Das, D.; Ganesh, A.; Gleeson, M.; Hase, S.; Holloway, A.; Kamp, J.; Lee, T.-H.; et al. Oracle Database In-Memory: A Dual Format In-Memory Database. In Proceedings of the 2015 IEEE 31st International Conference on Data Engineering, Seoul, Republic of Korea, 13–17 April 2015; pp. 1253–1258. [Google Scholar]
- Larson, P.-Å.; Birka, A.; Hanson, E.N.; Huang, W.; Nowakiewicz, M.; Papadimos, V. Real-Time Analytical Processing with SQL Server. Proc. VLDB Endow. 2015, 8, 1740–1751. [Google Scholar] [CrossRef]
- Raman, V.; Attaluri, G.; Barber, R.; Chainani, N.; Kalmuk, D.; KulandaiSamy, V.; Leenstra, J.; Lightstone, S.; Liu, S.; Lohman, G.M.; et al. DB2 with BLU Acceleration: So Much More than Just a Column Store. Proc. VLDB Endow. 2013, 6, 1080–1091. [Google Scholar] [CrossRef]
- Sikka, V.; Färber, F.; Lehner, W.; Cha, S.K.; Peh, T.; Bornhövd, C. Efficient Transaction Processing in SAP HANA Database: The End of a Column Store Myth. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, Scottsdale, AZ, USA, 20–24 May 2012; pp. 731–742. [Google Scholar]
- Huang, D.; Liu, Q.; Cui, Q.; Fang, Z.; Ma, X.; Xu, F.; Shen, L.; Tang, L.; Zhou, Y.; Huang, M.; et al. TiDB: A Raft-Based HTAP Database. Proc. VLDB Endow. 2020, 13, 3072–3084. [Google Scholar] [CrossRef]
- Wang, J.; Li, T.; Song, H.; Yang, X.; Zhou, W.; Li, F.; Yan, B.; Wu, Q.; Liang, Y.; Ying, C.; et al. PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba. Proc. ACM Manag. Data 2023, 1, 1–25. [Google Scholar] [CrossRef]
- Lee, R.; Zhou, M.; Li, C.; Hu, S.; Teng, J.; Li, D.; Zhang, X. The Art of Balance: A RateupDB Experience of Building a CPU/GPU Hybrid Database Product. Proc. VLDB Endow. 2021, 14, 2999–3013. [Google Scholar] [CrossRef]
- Google Cloud. AlloyDB: A Fully Managed PostgreSQL-Compatible Database Service. Available online: https://cloud.google.com/alloydb (accessed on 1 March 2023).
- Ailamaki, A.; DeWitt, D.J.; Hill, M.D.; Skounakis, M. Weaving Relations for Cache Performance. In Proceedings of the 27th International Conference on Very Large Data Bases, Roma, Italy, 11–14 September 2001; pp. 169–180. [Google Scholar]
- Chen, J.; Ding, Y.; Liu, Y.; Li, F.; Zhang, L.; Zhang, M.; Wei, K.; Cao, L.; Zou, D.; Liu, Y.; et al. ByteHTAP: ByteDance’s HTAP System with High Data Freshness and Strong Data Consistency. Proc. VLDB Endow. 2022, 15, 3411–3424. [Google Scholar] [CrossRef]
- Li, G.; Dong, H.; Zhang, C. Cloud Databases: New Techniques, Challenges, and Opportunities. Proc. VLDB Endow. 2022, 15, 3758–3761. [Google Scholar] [CrossRef]
- Makreshanski, D.; Giceva, J.; Barthels, C.; Alonso, G. BatchDB: Efficient Isolated Execution of Hybrid OLTP+OLAP Workloads for Interactive Applications. In Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017; pp. 37–50. [Google Scholar]
- Nakamura, M.; Tabaru, T.; Ujibashi, Y.; Hashida, T.; Kawaba, M.; Harada, L. Extending PostgreSQL to Handle OLXP Workloads. In Proceedings of the Fifth International Conference on Innovative Computing Technology (INTECH 2015), Galicia, Spain, 20–22 May 2015; pp. 40–44. [Google Scholar]
- StoneDB. A Real-Time HTAP Database. Available online: https://github.com/stoneatom/stonedb (accessed on 1 March 2023).
- Zhang, C.; Li, G.; Lv, T. HyBench: A New Benchmark for HTAP Databases. Proc. VLDB Endow. 2023, 16, 4043–4056. [Google Scholar] [CrossRef]
- Nippon Telegraph and Telephone Corporation. ossc-db/pg_hint_plan: Extension Adding Support for Optimizer Hints in PostgreSQL. Available online: https://github.com/ossc-db/pg_hint_plan (accessed on 30 April 2024).
- Krishnan, S.; Yang, Z.; Goldberg, K.; Hellerstein, J.M.; Stoica, I. Learning to Optimize Join Queries with Deep Reinforcement Learning. arXiv 2018, arXiv:1808.03196. [Google Scholar]
- Marcus, R.; Papaemmanouil, O. Deep Reinforcement Learning for Join Order Enumeration. In Proceedings of the First International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, Houston, TX, USA, 10 June 2018; pp. 1–4. [Google Scholar]
- Yu, X.; Li, G.; Chai, C.; Tang, N. Reinforcement Learning with Tree-LSTM for Join Order Selection. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 1297–1308. [Google Scholar]
- Marcus, R.; Negi, P.; Mao, H.; Zhang, C.; Alizadeh, M.; Kraska, T.; Papaemmanouil, O.; Tatbul, N. Neo: A Learned Query Optimizer. Proc. VLDB Endow. 2019, 12, 1705–1718. [Google Scholar] [CrossRef]
- Lehmann, C.; Sulimov, P.; Stockinger, K. Is Your Learned Query Optimizer Behaving as You Expect? A Machine Learning Perspective. arXiv 2023, arXiv:2309.01551. [Google Scholar] [CrossRef]
- Song, H.; Zhou, W.; Li, F.; Peng, X.; Cui, H. Rethink Query Optimization in HTAP Databases. Proc. ACM Manag. Data 2023, 1, 1–27. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008. [Google Scholar]
- Zhao, Y.; Cong, G.; Shi, J.; Miao, C. QueryFormer: A Tree Transformer Model for Query Plan Representation. Proc. VLDB Endow. 2022, 15, 1658–1670. [Google Scholar] [CrossRef]
- Browne, C.B.; Powley, E.; Whitehouse, D.; Lucas, S.M.; Cowling, P.I.; Rohlfshagen, P.; Tavener, S.; Perez, D.; Samothrakis, S.; Colton, S. A Survey of Monte Carlo Tree Search Methods. IEEE Trans. Comput. Intell. AI Games 2012, 4, 1–43. [Google Scholar] [CrossRef]
- Yang, Z.; Chiang, W.-L.; Luan, S.; Mittal, G.; Luo, M.; Stoica, I. Balsa: Learning a Query Optimizer without Expert Demonstrations. In Proceedings of the 2022 International Conference on Management of Data, Philadelphia, PA, USA, 12–17 June 2022; pp. 931–944. [Google Scholar]
- Chen, X.; Chen, H.; Liang, Z.; Liu, S.; Wang, J.; Zeng, K.; Su, H.; Zheng, K. LEON: A New Framework for ML-Aided Query Optimization. Proc. VLDB Endow. 2023, 16, 2261–2273. [Google Scholar] [CrossRef]
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar]
- Sun, J.; Li, G. An End-to-End Learning-Based Cost Estimator. Proc. VLDB Endow. 2019, 13, 307–319. [Google Scholar] [CrossRef]
- AlloyDB Omni Overview. Available online: https://cloud.google.com/alloydb/docs/omni (accessed on 8 May 2024).
- PostgreSQL Global Development Group. PostgreSQL. Available online: https://www.postgresql.org/ (accessed on 19 April 2024).
- Leis, V.; Gubichev, A.; Mirchev, A.; Boncz, P.; Kemper, A.; Neumann, T. How Good Are Query Optimizers, Really? Proc. VLDB Endow. 2015, 9, 204–215. [Google Scholar] [CrossRef]
- Yu, X.; Chai, C.; Li, G.; Liu, J. Cost-Based or Learning-Based? A Hybrid Query Optimizer for Query Plan Selection. Proc. VLDB Endow. 2022, 15, 3924–3936. [Google Scholar] [CrossRef]
- Pezzini, M.; Feinberg, D.; Rayner, N.; Edjlali, R. Hybrid Transaction/Analytical Processing Will Foster Opportunities for Dramatic Business Innovation; Gartner: Stamford, CT, USA, 2014; pp. 4–20. [Google Scholar]
- Oracle. MySQL HeatWave: Real-Time Analytics for MySQL Database Service. Available online: https://www.oracle.com/mysql/heatwave/ (accessed on 1 April 2024).
- SingleStore. The Single Database for All Data-Intensive Applications. Available online: https://www.singlestore.com/ (accessed on 19 April 2024).
- Oracle. Oracle 21c: Automating Management of In-Memory Objects. Available online: https://docs.oracle.com/en/database/oracle/oracle-database/23/inmem/ (accessed on 19 April 2024).
- Appuswamy, R.; Karpathiotakis, M.; Porobic, D.; Ailamaki, A. The Case for Heterogeneous HTAP. In Proceedings of the 8th Biennial Conference on Innovative Data Systems Research (CIDR 2017), Chaminade, CA, USA, 8–11 January 2017. [Google Scholar]
- Raza, A.; Chrysogelos, P.; Anadiotis, A.C.; Ailamaki, A. Adaptive HTAP through Elastic Resource Scheduling. arXiv 2020, arXiv:2004.02380. [Google Scholar] [CrossRef]
- Sirin, U.; Dwarkadas, S.; Ailamaki, A. Performance Characterization of HTAP Workloads. In Proceedings of the 2021 IEEE 37th International Conference on Data Engineering (ICDE), Chania, Greece, 19–22 April 2021; pp. 1808–1819. [Google Scholar]
- Marcus, R.; Negi, P.; Mao, H.; Tatbul, N.; Alizadeh, M.; Kraska, T. Bao: Making Learned Query Optimization Practical. In Proceedings of the 2021 International Conference on Management of Data, Virtual Event, 20–25 June 2021; pp. 1275–1288. [Google Scholar]
- Zhu, R.; Chen, W.; Ding, B.; Chen, X.; Pfadler, A.; Wu, Z.; Zhou, J. Lero: A Learning-to-Rank Query Optimizer. Proc. VLDB Endow. 2023, 16, 1466–1479. [Google Scholar] [CrossRef]
- Yuan, H.; Li, G.; Feng, L.; Sun, J.; Han, Y. Automatic View Generation with Deep Learning and Reinforcement Learning. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 1501–1512. [Google Scholar]
- Tai, K.S.; Socher, R.; Manning, C.D. Improved Semantic Representations from Tree-Structured Long Short-Term Memory Networks. arXiv 2015, arXiv:1503.00075. [Google Scholar]
- Pascanu, R.; Mikolov, T.; Bengio, Y. On the Difficulty of Training Recurrent Neural Networks. In Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA, 16–21 June 2013; pp. 1310–1318. [Google Scholar]
- Mou, L.; Li, G.; Zhang, L.; Wang, T.; Jin, Z. Convolutional Neural Networks over Tree Structures for Programming Language Processing. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30, pp. 1287–1293. [Google Scholar]
- Ding, B.; Das, S.; Marcus, R.; Wu, W.; Chaudhuri, S.; Narasayya, V.R. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In Proceedings of the 2019 International Conference on Management of Data, Amsterdam, The Netherlands, 30 June–5 July 2019; pp. 1241–1258. [Google Scholar]
- Akdere, M.; Çetintemel, U.; Riondato, M.; Upfal, E.; Zdonik, S.B. Learning-Based Query Performance Modeling and Prediction. In Proceedings of the 2012 IEEE 28th International Conference on Data Engineering, Arlington, VA, USA, 1–5 April 2012; pp. 390–401. [Google Scholar]
- Chen, T.; Gao, J.; Chen, H.; Tu, Y. LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution Plans. Proc. VLDB Endow. 2023, 16, 1777–1789. [Google Scholar] [CrossRef]
- Gunasekaran, K.P.; Tiwari, K.; Acharya, R. Towards Query Optimizer as a Service (QOaaS) in a Unified LakeHouse Ecosystem: Can One Optimizer Serve All Engines? arXiv 2024, arXiv:2407.02129. [Google Scholar]
Figure 1.
Comparison between the traditional OLTP/OLAP separated architecture and the unified row–column co-existing HTAP architecture.
Figure 1.
Comparison between the traditional OLTP/OLAP separated architecture and the unified row–column co-existing HTAP architecture.
Figure 2.
Query plans and execution performance of a single SQL query under different HINTs. Dashed boxes group sub-plans that share the same storage-format access pattern; different colors distinguish row-store and column-store access; the asterisk (*) marks the plan selected by the default cost-based optimizer.
Figure 2.
Query plans and execution performance of a single SQL query under different HINTs. Dashed boxes group sub-plans that share the same storage-format access pattern; different colors distinguish row-store and column-store access; the asterisk (*) marks the plan selected by the default cost-based optimizer.
Figure 3.
System architecture of the hybrid plan query optimizer. Solid arrows denote the forward query/data flow during inference, while dashed arrows denote feedback signals used for online training; different module colors distinguish the search, prediction, and execution components. The ellipsis (…) indicates further candidate plans omitted for clarity, and the asterisk (*) marks the plan finally selected by our optimizer.
Figure 3.
System architecture of the hybrid plan query optimizer. Solid arrows denote the forward query/data flow during inference, while dashed arrows denote feedback signals used for online training; different module colors distinguish the search, prediction, and execution components. The ellipsis (…) indicates further candidate plans omitted for clarity, and the asterisk (*) marks the plan finally selected by our optimizer.
Figure 4.
Monte Carlo Tree Search and hybrid access HINT estimator. Dashed boxes group logically related modules; solid arrows denote forward data flow during prediction; dashed lines indicate feedback used to update node statistics on the HINT tree; different colors distinguish the search, encoding, and estimation stages. The ellipsis (…) indicates intermediate tree nodes or feature vectors omitted for clarity and does not affect the technical content.
Figure 4.
Monte Carlo Tree Search and hybrid access HINT estimator. Dashed boxes group logically related modules; solid arrows denote forward data flow during prediction; dashed lines indicate feedback used to update node statistics on the HINT tree; different colors distinguish the search, encoding, and estimation stages. The ellipsis (…) indicates intermediate tree nodes or feature vectors omitted for clarity and does not affect the technical content.
Figure 5.
Tree attention network model and execution time prediction architecture.
Figure 5.
Tree attention network model and execution time prediction architecture.
Figure 6.
Overall execution time comparison across different methods. Speedup ratios are annotated with the multiplication sign “×” (e.g., “3.49×”); the slight overlap of bar labels does not affect scientific understanding.
Figure 6.
Overall execution time comparison across different methods. Speedup ratios are annotated with the multiplication sign “×” (e.g., “3.49×”); the slight overlap of bar labels does not affect scientific understanding.
Figure 7.
Per-query execution time comparison. The slight overlap of x-axis tick labels is purely a layout artifact (queries are plotted on a log scale) and does not affect scientific understanding.
Figure 7.
Per-query execution time comparison. The slight overlap of x-axis tick labels is purely a layout artifact (queries are plotted on a log scale) and does not affect scientific understanding.
Figure 8.
Stability and tail-latency analysis across methods. Boxes show the interquartile range (IQR), the central line marks the median, and whiskers extend to 1.5 × IQR; open circles denote outliers beyond the whiskers. The slight overlap of legend or tick labels does not affect scientific understanding.
Figure 8.
Stability and tail-latency analysis across methods. Boxes show the interquartile range (IQR), the central line marks the median, and whiskers extend to 1.5 × IQR; open circles denote outliers beyond the whiskers. The slight overlap of legend or tick labels does not affect scientific understanding.
Figure 9.
Architecture of a row–column co-existing HTAP database based on disk row store and in-memory column store. Solid arrows denote query and data flow; the dashed arrow denotes the incremental refresh of the in-memory column store from the row store; different module colors distinguish the storage, execution, and refresh components.
Figure 9.
Architecture of a row–column co-existing HTAP database based on disk row store and in-memory column store. Solid arrows denote query and data flow; the dashed arrow denotes the incremental refresh of the in-memory column store from the row store; different module colors distinguish the storage, execution, and refresh components.
Table 1.
Comparison between existing query optimization approaches and the proposed method.
Table 1.
Comparison between existing query optimization approaches and the proposed method.
| Approach | Main Idea | Limitation for Row–Column Dual-Store HTAP | Difference of Our Work |
|---|
| Traditional cost-based optimization | Uses statistics and handcrafted cost models | Hard to accurately model hybrid row/column access choices | Learns execution behavior directly from candidate hybrid plans |
| Learned cardinality/cost estimation | Improves optimizer components using machine learning | Usually targets conventional plans or single-engine settings | Predicts execution time for hybrid row/column access alternatives |
| Learning-to-rank optimization | Learns relative quality of candidate plans | Often relies on plans generated by native optimizers | Combines MCTS-based HINT search with a learned predictor |
| HTAP query optimization | Optimizes analytical queries over mixed storage/workloads | Often requires system-specific rules and cost models | Focuses on hybrid plan selection in row–column dual-store architecture |
| Proposed method | Generates and evaluates hybrid row–column plans | Current work focuses on analytical queries | Provides a learning-based hybrid-plan selection framework |
Table 2.
Hyperparameters and training settings of the Transformer predictor.
Table 2.
Hyperparameters and training settings of the Transformer predictor.
| Item | Value |
|---|
| Number of Transformer layers | 8 |
| Number of attention heads | 8 |
| Hidden dimension | 128 |
| Feed-forward dimension | 512 |
| Dropout rate | 0.1 |
| Batch size | 128 |
| Optimizer | AdamW |
| Learning rate | |
| Weight decay | |
| Max training rounds | 30 (per online iteration) |
| Model selection | Lowest validation loss |
| Train/validation/test split | 70/15/15 |
| Random seed | 42 |
Table 3.
Training data statistics for the Transformer predictor.
Table 3.
Training data statistics for the Transformer predictor.
| Item | Value |
|---|
| Benchmark | IMDB/JOB |
| Number of query templates | 33 |
| Number of query instances | 113 |
| Average number of candidate hybrid plans per query | up to retained from MCTS |
| Total generated candidate plans (MCTS rollouts) | ≈1500 |
| Retained executed (query, plan, latency) triples | ≈565 (up to 5 per query) |
| Training samples | 70% (≈395 triples) |
| Validation samples | 15% (≈85 triples; used for checkpoint selection) |
| Testing samples | 15% (≈85 triples) |
| Evaluation metrics | query latency, GMRL, plan regret |
Table 4.
Workload statistics.
Table 4.
Workload statistics.
| Statistic | Value |
|---|
| Dataset | IMDB |
| Attributes | 108 |
| Relations | 21 |
| Queries | 113 |
| Dataset Size (GB) | 3.6 |
Table 5.
Dominance and strict dominance proportions of different methods.
Table 5.
Dominance and strict dominance proportions of different methods.
| Method | Dominance (%) | Strict Dominance (%) |
|---|
| Our work | 76.11 | 30.97 |
| Hybrid plan | 50.44 | 4.42 |
| HybridQO | 47.79 | 1.77 |
| Row plan | 15.04 | 15.04 |
| Column plan | 1.77 | 1.77 |
Table 6.
Descriptive statistics of execution time (ms) across five methods.
Table 6.
Descriptive statistics of execution time (ms) across five methods.
| Statistic | Row Plan | Column Plan | Hybrid Plan | HybridQO | Our Work |
|---|
| Count | 113 | 113 | 113 | 113 | 113 |
| Mean | 1902.53 | 17,224.78 | 2184.42 | 1444.61 | 545.59 |
| Std Dev | 8301.75 | 42,218.32 | 8843.31 | 3564.76 | 1066.33 |
| Min | 15.53 | 45.05 | 0.12 | 0.12 | 0.12 |
| 25th %ile | 108.32 | 680.18 | 57.36 | 56.61 | 56.61 |
| Median | 222.65 | 1434.84 | 181.77 | 181.77 | 125.16 |
| 75th %ile | 1138.32 | 5459.22 | 1117.25 | 1117.25 | 447.45 |
| Max | 86,063.15 | 160,000.00 | 87,490.30 | 26,763.19 | 7550.71 |
| GMRL | 0.1297 | 0.7692 | 0.0863 | 0.0835 | 0.0565 |
Table 7.
End-to-end time comparison between AlloyDB Omni and our optimizer (seconds).
Table 7.
End-to-end time comparison between AlloyDB Omni and our optimizer (seconds).
| AlloyDB Omni | Our Optimizer |
|---|
| Plan Gen. | Execution | Total | Search | Plan Gen. | Inference | Execution | Total |
|---|
| 2.99 | 246.84 | 249.83 | 7.86 | 2.85 | 2.80 | 61.65 | 75.15 |
Table 8.
Overview of key technologies in HTAP databases [
2].
Table 8.
Overview of key technologies in HTAP databases [
2].
| Task Type | Key Technology | Corresponding HTAP Databases |
|---|
| Transaction Processing | MVCC + Log | Oracle Dual-Format [3], SQL Server [4], DB2 BLU [5], MySQL HeatWave [37], SAP HANA [6] |
| 2PC + Raft + Log | TiDB [7] |
| Analytical Processing | In-memory delta + column scan | Oracle [3], SQL Server [4], DB2 [5], MySQL HeatWave [37], SAP HANA [6] |
| Log-based delta + column scan | TiDB [7] |
| Column scan | SingleStore [38] |
| Data Synchronization | In-memory delta merge | Oracle [3], SQL Server [4], DB2 [5], MySQL HeatWave [37], SAP HANA [6] |
| Log-based delta merge | TiDB [7] |
| Rebuild from primary key row store | SingleStore [38], Oracle [3] |
| Query Optimization | In-memory column selection | Oracle 21c [39], MySQL HeatWave [37] |
| Hybrid row/column scan | TiDB [7], SQL Server [4] |
| CPU/GPU acceleration | RateupDB [9], Caldera [40] |
| Resource Scheduling | Freshness scheduling | RDE [41] |
| Workload scheduling | SAP HANA [6], Siper [42] |
| Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |