A Data-Driven Reduced-Order Model for Rotary Kiln Temperature Field Prediction Using Autoencoder and TabPFN

Mao, Ya; Li, Yuhang; Lai, Yanhui; Fan, Fangshuo

doi:10.3390/app16042029

School of Mechanical and Electronic Engineering, Wuhan University of Technology, Luoshi Road, Wuhan 430070, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(4), 2029; https://doi.org/10.3390/app16042029

Submission received: 27 January 2026 / Revised: 10 February 2026 / Accepted: 15 February 2026 / Published: 18 February 2026

(This article belongs to the Special Issue Fuel Cell Technologies in Power Generation and Energy Recovery)

Abstract

The accurate reconstruction of the internal temperature field in rotary kilns is critical for optimizing the clinker calcination process and ensuring energy efficiency. In this study, a rapid and high-fidelity surrogate modeling framework is proposed, utilizing snapshot ensembles generated by full-order Computational Fluid Dynamics (CFD) simulations to reconstruct the temperature field of the axial center section. The framework incorporates a symmetric Autoencoder (AE) coupled with a TabPFN network as its core components. Capitalizing on the kiln’s strong axial symmetry, this reduction–regression system efficiently maps the high-dimensional nonlinear thermodynamic topology of the central section into a compact low-dimensional latent manifold via AE, while utilizing TabPFN to establish a robust mapping between operating boundary conditions and these latent features. By leveraging the In-Context Learning (ICL) mechanism for prior-data fitting, TabPFN effectively overcomes the data scarcity inherent in high-cost CFD sampling. Predictive results demonstrate that the model achieves a coefficient of determination (R²) of 0.897 for latent feature regression, outperforming traditional algorithms by 6.53%. In terms of field reconstruction on the test set, the model yields an average temperature error of 15.31 K. Notably, 93.83% of the nodal errors are confined within a narrow range of 0–50 K, and the reconstructed distributions exhibit high consistency with the CFD benchmarks. Furthermore, compared to the hours required for full-scale simulations, the inference time is reduced to 0.45 s, representing a speedup of four orders of magnitude. Consequently, the predictive system demonstrates excellent accuracy and efficiency, serving as an effective substitute for traditional models to realize online monitoring and intelligent optimization.

Keywords: rotary kiln; deep learning; numerical simulation; temperature field prediction

1. Introduction

The gas–solid multiphase coupling and unsteady combustion processes within rotary kilns are strongly nonlinear, and the stability of their thermal regimes directly determines clinker quality and NO_x emissions [1,2]. Therefore, real-time perception of the full-field temperature is a prerequisite for realizing intelligent industrial control. However, existing discrete contact measurements (e.g., thermocouples [3]) or local infrared monitoring struggle to capture the complex three-dimensional temperature topology inside the kiln. Although Computational Fluid Dynamics (CFD) can reveal fine flow field details and heat transfer mechanisms [4,5,6], the enormous computational cost of full-order simulations prevents them from meeting the timeliness requirements of real-time monitoring at industrial sites.

In response to this computational bottleneck, data-driven surrogate modeling has emerged as a promising research frontier. Recent studies have extensively utilized machine learning and deep learning algorithms to construct soft sensors for key performance indicators. For instance, Huang et al. [7] combined Long Short-Term Memory (LSTM) networks with transfer learning to predict the operating temperature under short-time sample conditions. Similarly, Xu et al. [8] employed a Residual Network (ResNet) fused with Bi-directional Gated Recurrent Units (BiGRU) to forecast the clinker exit temperature, effectively addressing time-delay issues. In terms of traditional regression control, Yin et al. [9] proposed an adaptive Moving Window Partial Least Squares (MW-PLS) method for temperature monitoring in zinc rotary kilns, while Tian et al. [10] utilized Support Vector Machines (SVM) optimized by Particle Swarm Optimization (PSO) to track the calcination zone temperature. Furthermore, regarding pollutant emissions, Feng et al. [11] developed an Echo State Network (ESN) with modular outputs to predict NO_x concentrations under multiple working conditions. However, the majority of these approaches treat the rotary kiln as a lumped parameter system, focusing solely on scalar point predictions (e.g., outlet temperature or average zone temperature). They fail to reconstruct the complex spatial topology of the full temperature field, which is critical for identifying local high-temperature hotspots, optimizing flame shapes, and preventing refractory failure.

To transcend the limitations of these scalar approaches and achieve high-resolution field reconstruction, Reduced-Order Models (ROM) have emerged as a critical research direction. By extracting the low-dimensional intrinsic modes of physical systems, ROMs seek a balance between computational accuracy and efficiency. As a classical linear method, Proper Orthogonal Decomposition (POD) performs acceptably in steady-state or weakly nonlinear systems [12,13,14]. However, its reliance on linear projection makes it difficult to characterize the nonlinear thermodynamic features associated with strong swirling flows and rapid reactions in rotary kilns. To capture sufficient energy, POD often requires an excessive number of modes, which directly limits its reconstruction accuracy at a compact latent size.

In contrast, deep learning methods based on Autoencoders (AE) offer a new paradigm for the dimensionality reduction of complex flow fields due to their nonlinear manifold learning capabilities [15,16,17]. AEs can capture latent nonlinear correlations through an encoder-decoder architecture. However, existing deep learning ROM research mostly relies on massive training data [18]. In industrial scenarios, acquiring high-fidelity CFD samples is extremely expensive, resulting in data scarcity. Traditional regression models are prone to overfitting under small-sample conditions, and hyperparameter optimization is time-consuming. This has become a key bottleneck restricting the application of deep learning methods for temperature field prediction in rotary kilns.

Addressing these challenges, this paper proposes a combined reduction-regression surrogate model integrating a symmetric Autoencoder (AE) and the TabPFN network. The main contributions are threefold: (1) a symmetric AE is constructed to compress the high-dimensional temperature field into a compact 6-dimensional latent manifold, effectively preserving complex flame topologies and gradients; (2) the TabPFN model, leveraging In-Context Learning (ICL), is introduced to bridge operating parameters and latent features, overcoming data scarcity inherent in industrial CFD without the need for hyperparameter tuning; (3) an end-to-end framework is established, which improves computational efficiency by four orders of magnitude compared to full-scale CFD, satisfying the strict requirements for real-time online monitoring.

The remainder of this paper is organized as follows: Section 2 details the physical modeling and full-order numerical framework, explicitly describing the model assumptions, governing equations solved via the Finite Volume Method (FVM), and specific sub-models for turbulence and combustion, followed by reliability validation against industrial data; Section 3 elaborates on the design and training strategy of the AE-TabPFN reduction-regression framework; Section 4 evaluates the model from three dimensions: reduction performance, regression accuracy, and end-to-end prediction effectiveness; and Section 5 summarizes the conclusions.

2. Full-Order Numerical Model of the Rotary Kiln Calcination Process

2.1. Overview of Physical Model and Burner Structure

This study investigates a ϕ4.5 m × 60 m industrial rotary kiln from a specific enterprise, as illustrated in Figure 1. Considering the influence of refractory bricks and the kiln coating in the burning zone, the effective inner diameter of the computational domain is set to 4.36 m. The rotary kiln system connects to a five-stage cyclone preheater and a calciner at the kiln tail; the pre-decomposed raw meal enters the rotary kiln through this inlet to complete the final calcination reaction. The heat source is provided by a five-channel coal burner located at the kiln head. The burner has an outlet diameter of 0.77 m and an effective insertion depth of 0.55 m into the kiln. The burner exhibits a concentric annular structure, arranged from the inside out as follows: the central air duct, inner swirl air duct, pulverized coal duct, outer swirl air duct, and axial air duct, as shown in Figure 2. The central air, inner swirl air, outer swirl air, and axial air collectively constitute the primary air system. Its core function is to enhance the dispersion of pulverized coal, control the flame shape and length, and improve combustion stability. The high-temperature secondary air from the grate cooler is induced through the kiln hood, serving as the primary oxidant for coal combustion.

2.2. Mesh Generation Strategy and Independence Verification

Given the large length-to-diameter ratio of the rotary kiln, this study utilizes Gambit software to generate a structured mesh for the computational fluid domain. Considering the strong shear turbulence and rapid chemical reactions in the near-burner region, which result in severe physical gradients, local refinement is implemented in this area to capture the core details of the flame. To eliminate the influence of grid size on the numerical simulation results, a grid independence test was conducted prior to the calculation. As shown in Figure 3, grid models with varying orders of magnitude were constructed for trial calculations, with the average temperature of the flue gas at the kiln outlet serving as the monitoring indicator. The results indicate that when the number of grid cells reaches 1.8 million, the fluctuation in the outlet temperature tends to converge. Balancing computational accuracy and efficiency, the 1.8-million-cell scheme was ultimately selected as the baseline model for the subsequent construction of the deep learning dataset. The global and local mesh topologies of the rotary kiln are shown in Figure 4 and Figure 5.

2.3. Numerical Model

2.3.1. Model Assumptions and Governing Equations

This paper employs ANSYS Fluent 2022R1 software to perform a full-scale numerical simulation of the gas–solid two-phase flow and heat and mass transfer processes within the rotary kiln. Given the complexity of the physicochemical processes inside the kiln, the following reasonable assumptions are made to balance computational accuracy and efficiency: (1) the system is in a steady-state operating condition, and the mass flow rate, composition of the clinker entering the kiln, and pulverized coal injection parameters remain constant; (2) the minor inclination of the kiln body and the transverse motion of particles are ignored; the clinker bed is treated as an isotropic porous medium, considering only its axial transport and reaction; (3) the direct shear effect of the kiln rotation on the main gas flow field is ignored, and the walls are treated as smooth boundaries. While these constraints preclude the analysis of transient phenomena, such as rapid load changes or kiln start-ups, they remain valid for the primary objective of this study: steady-state thermal monitoring. The simplified model effectively captures the dominant macroscopic heat transfer patterns required for online process control.

The numerical solution is based on the Finite Volume Method (FVM) and follows the laws of conservation of mass, momentum, and energy. To ensure numerical stability and reproducibility, the solution configuration is explicitly defined. Pressure–velocity coupling is handled using the SIMPLE algorithm to ensure robust convergence. For spatial discretization, the Standard scheme is utilized for pressure interpolation, while the Second-Order Upwind scheme is applied to the momentum, turbulent kinetic energy, and energy equations to minimize numerical diffusion.

To address the flow field characteristics of strong swirl and high curvature within the rotary kiln, the Realizable k-ε turbulence model is selected. Although the Reynolds Stress Model (RSM) is theoretically superior for strong swirling flows, it incurs prohibitive computational costs for generating large-scale datasets. Consequently, the Realizable k-ε formulation is adopted as a practical compromise, providing sufficient accuracy for engineering applications while maintaining computational efficiency. The particle phase is tracked using the Discrete Phase Model (DPM) under the Eulerian–Lagrangian framework. The particle size of the pulverized coal follows the Rosin–Rammler distribution. Two-way coupling of momentum, heat, and mass between the gas and solid phases is considered, while particle-particle collisions in the sparse particle flow are ignored. The P-1 model is adopted for radiation heat transfer.

2.3.2. Coal Combustion Reaction Kinetics

The chemical reactions for the coal combustion process are configured as follows: Gas-phase combustion adopts the Species Transport model, where the chemical reaction rate is controlled by the Finite-Rate/Eddy-Dissipation model, comprehensively considering the limitations imposed by both Arrhenius chemical kinetics and turbulent mixing on the reaction rate. The coal combustion process is decoupled into three stages: (1) Devolatilization: The Single-rate Model is adopted; (2) Gas-phase combustion: Volatile oxidation follows a five-step homogeneous reaction mechanism, with the reaction rate controlled by Kinetics/Diffusion-limited mechanisms; (3) Char combustion: A multiple surface reaction model and a three-step heterogeneous reaction are employed. Furthermore, a pollutant model is introduced to predict NO_x generation, comprehensively accounting for the formation and reburning mechanisms of Thermal NO_x, Fuel NO_x, and Prompt NO_x. The industrial analysis and elemental analysis of the coal are shown in Table 1. The particle size distribution of the pulverized coal conforms to the Rosin–Rammler distribution, with relevant parameters detailed in Table 2.
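For reference, the Rosin–Rammler distribution mentioned above is commonly written in the form below. This is the standard textbook expression rather than a formula reproduced from this study; the symbols $\bar{d}$ (mean diameter) and $n$ (spread parameter) are assumed here to correspond to the parameters that the text defers to Table 2.

```latex
Y_d = \exp\!\left[-\left(\frac{d}{\bar{d}}\right)^{n}\right]
```

Here $Y_d$ is the mass fraction of particles with a diameter greater than $d$, so a larger $n$ gives a narrower size distribution around $\bar{d}$.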

2.3.3. Clinker Calcination Coupling Model

The formation of clinker within the rotary kiln is a complex process involving phase changes and endothermic/exothermic reactions. After pre-decomposition, the raw meal enters the rotary kiln with primary components including CaO, SiO₂, Al₂O₃, and Fe₂O₃. Solid-phase reactions occur within the kiln, ultimately generating minerals such as 2CaO⋅SiO₂ (C₂S), 3CaO⋅SiO₂ (C₃S), 3CaO⋅Al₂O₃ (C₃A), and 4CaO⋅Al₂O₃⋅Fe₂O₃ (C₄AF). To accurately simulate this process, this study constructs a one-dimensional clinker bed model based on User-Defined Functions (UDFs) and integrates it as source terms to achieve two-way coupling with the CFD gas-phase flow field [19,20]. This model realizes heat and mass exchange between the gas phase and the material bed by solving the one-dimensional energy and species transport equations along the kiln axis. A schematic diagram of the internal heat exchange in the rotary kiln is shown in Figure 6.

The energy balance equation for the bed unit can be expressed as:

$$\frac{d(\dot{m}_{s} c_{p,s} T_{s})}{dx} = Q_{rad,g-b} + Q_{rad,w-b} + Q_{conv,g-b} + Q_{cond,w-b} + Q_{conv,w-b} + Q_{reac} \tag{1}$$

where $\dot{m}_{s}$ is the mass flow rate of the clinker, $c_{p,s}$ is the specific heat capacity of the clinker, and $T_{s}$ is the clinker temperature. Subscripts $g$, $b$, and $w$ represent the gas phase, clinker phase, and wall within the rotary kiln; $Q_{rad}$, $Q_{conv}$, and $Q_{cond}$ denote the heat transfer via radiation, convection, and conduction, respectively. $Q_{reac}$ represents the net heat source from chemical reactions during clinker mineral formation. The chemical composition of the clinker at the rotary kiln inlet is listed in Table 3, and the relevant chemical reaction kinetic parameters are shown in Table 4.
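To make the structure of Equation (1) concrete, the following is a minimal sketch of how such a one-dimensional bed energy balance could be marched along the kiln axis. It is not the UDF used in the study: it assumes a constant mass flow rate and specific heat, an explicit Euler step, and a hypothetical callable `q_total` that returns the sum of all right-hand-side heat terms per unit length.

```python
import numpy as np

def march_bed_temperature(x, T_in, m_dot, cp, q_total):
    """Explicit Euler integration of d(m_dot * cp * T)/dx = q_total(x, T).

    Assumes constant m_dot [kg/s] and cp [J/(kg K)], so the balance reduces
    to dT/dx = q_total / (m_dot * cp). `q_total` is a hypothetical callable
    returning the summed heat input per unit length [W/m].
    """
    x = np.asarray(x, dtype=float)
    T = np.empty_like(x)
    T[0] = T_in
    for i in range(len(x) - 1):
        dx = x[i + 1] - x[i]
        T[i + 1] = T[i] + dx * q_total(x[i], T[i]) / (m_dot * cp)
    return T
```

In a coupled simulation the heat terms would depend on the local gas and wall temperatures, so `q_total` would be re-evaluated from the CFD solution at every iteration.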

2.3.4. Boundary Conditions

The boundary conditions of the computational domain are set based on actual operating conditions. Both the primary air and secondary air inlets adopt the Velocity Inlet boundary condition. Species mole fractions are set according to air composition, with the oxygen mole fraction set to 21% and the remainder as nitrogen, consistent with the actual conditions of air-assisted combustion. The rotary kiln outlet adopts the Pressure Outlet boundary condition; to simulate the negative pressure draft in actual operations, the outlet gauge pressure is set to −200 Pa.

The rotary kiln wall is treated as a constant temperature wall, set at 1200 K. The walls of the five-channel coal burner are set as adiabatic. The particle phase outlet is defined as escape, meaning that once particles leave the computational domain, they are considered to have flowed out and do not return, which aligns with actual gas–solid flow characteristics. Other simulation parameters for the baseline rated operating condition are listed in Table 5.

2.4. Numerical Model Validation

To verify the model’s fidelity across the actual operating range, validation was conducted against on-site DCS data under three distinct loads: Low, Rated, and High. Figure 7 displays the real-time monitoring interface for the baseline condition. As summarized in Table 6, the CFD predictions for flue gas temperature, exhaust O₂, and NO_x content align closely with field measurements. The relative errors for temperature are consistently maintained within 1.6%, while species concentration deviations are controlled below 5%. Given the typical measurement uncertainty of industrial sensors, these results confirm the high reliability of the established numerical framework.

3. AE-TabPFN Reduced-Order Model for Rotary Kiln Temperature Field

Although the full-order CFD model can accurately capture the complex heat transfer details within the rotary kiln, the computational time of several hours per simulation precludes its use in real-time optimization. To address this, this section constructs a “reduction-regression” two-layer data-driven framework: first, a fully connected Autoencoder (AE) is utilized to extract the low-dimensional nonlinear manifolds of the high-dimensional temperature field; subsequently, the TabPFN model is introduced to solve the mapping problem from operating parameters to latent features under small-sample constraints. This section details the implementation process and key architectural design of this framework.

3.1. Dataset Construction and Preprocessing

High-quality datasets are the foundation for building reduced-order models. To ensure the model truthfully reflects the thermal characteristics and coupling laws of the rotary kiln in actual production, this paper constructs a rotary kiln operating condition dataset directly from the historical operation database of the enterprise’s Distributed Control System (DCS).

First, the historical data recorded by the DCS is cleaned to eliminate data related to kiln startup and shutdown, sensor faults, and abnormal condition alarms. Adopting a steady-state screening strategy, 100 sets of typical stable operating data were selected. The statistical distribution of key operating parameters for these selected cases is detailed in Table 7. As evidenced by the wide span between the minimum and maximum values in the table, the constructed dataset comprehensively covers distinct capacity loads, thereby ensuring sufficient representativeness and diversity for the subsequent model training.

Using the experimentally validated full-order CFD model from Section 2, these 100 sets of real operating parameters are applied as boundary conditions for high-fidelity numerical simulations. Although the CFD simulations resolve the full three-dimensional flow field, given the cylindrical geometry of the rotary kiln’s internal flow domain and its resulting strong axial symmetry, the temperature node data on the axial center section is extracted using ANSYS CFD-Post. This dimensional reduction strategy effectively balances the trade-off between computational cost and reconstruction accuracy while retaining the representative thermal characteristics. These sectional temperature fields are discretized and flattened into column vectors to construct the snapshot matrix $T \in \mathbb{R}^{D \times N}$, where $D = 31410$ denotes the precise number of mesh nodes on the center plane and $N = 100$ is the total number of samples.

Since the input operating variables and the output temperature fields differ greatly in physical dimensions and orders of magnitude, z-score normalization is applied to the input feature matrix and the temperature field snapshot matrix before model training. This accelerates neural network convergence and eliminates numerical singularities by rescaling the data to zero mean and unit variance:

$$x' = \frac{x - \mu}{\sigma} \tag{2}$$

where $x$ is the raw data, $\mu$ is the sample mean, $\sigma$ is the sample standard deviation, and $x'$ is the normalized data.
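The snapshot assembly and the z-score scaling of Equation (2) can be sketched as follows. This is an illustrative NumPy outline, not the authors' code: the function names are hypothetical, and the choice of per-feature statistics (one mean and standard deviation per column) is an assumption, since the text does not state whether scaling is applied per node or globally.

```python
import numpy as np

def build_snapshot_matrix(fields):
    """Flatten each field and stack the N fields as columns of a (D, N) matrix."""
    return np.stack([np.ravel(f) for f in fields], axis=1)

def zscore_fit(samples, ddof=0, eps=1e-12):
    """Per-feature mean and standard deviation; rows are samples."""
    mu = samples.mean(axis=0)
    sigma = samples.std(axis=0, ddof=ddof)
    sigma = np.where(sigma < eps, 1.0, sigma)  # guard against constant features
    return mu, sigma

def zscore_apply(samples, mu, sigma):
    """Equation (2): x' = (x - mu) / sigma."""
    return (samples - mu) / sigma

def zscore_invert(scaled, mu, sigma):
    """Map normalized values back to physical units."""
    return scaled * sigma + mu
```

Keeping `mu` and `sigma` is necessary at inference time, because the decoder output has to be mapped back to kelvin with `zscore_invert`.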

To guarantee that the training and testing sets share consistent statistical distributions and to prevent data bias, a K-means clustering-based stratified sampling strategy was adopted for dataset partitioning. Specifically, the 100 samples were first clustered into distinct groups based on the Euclidean distance of their normalized operating parameters. Subsequently, samples were randomly selected from each cluster to form the training set and the testing set according to an 8:2 ratio. This strategy ensures that the testing set covers the entire operating space, including both common and extreme boundary conditions. The detailed operating parameters for the training set are presented in Table 8, while those for the testing set are listed in Table 9.
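The clustering-based partitioning described above could be sketched as below. This is a minimal NumPy illustration under stated assumptions: the number of clusters (`k=5`), the seed, and the plain Lloyd's-algorithm K-means are all hypothetical choices, since the text does not report them. Because the test fraction is rounded within each cluster, the overall split is only approximately 8:2.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance of every sample to every center
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def stratified_split(X, k=5, test_frac=0.2, seed=0):
    """Randomly draw about test_frac of each cluster for the test set."""
    rng = np.random.default_rng(seed)
    labels = kmeans_labels(X, k, seed=seed)
    test_idx = []
    for j in range(k):
        members = rng.permutation(np.flatnonzero(labels == j))
        n_test = int(round(test_frac * len(members)))
        test_idx.extend(members[:n_test])
    test_idx = np.sort(np.array(test_idx, dtype=int))
    train_idx = np.setdiff1d(np.arange(len(X)), test_idx)
    return train_idx, test_idx
```

Here `X` is expected to hold the normalized operating parameters, one row per case, so that the Euclidean distances are not dominated by a single large-valued variable.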

3.2. Autoencoder-Based Nonlinear Dimensionality Reduction

Addressing the high dimensionality of 31,410 distinct temperature nodes and the strong spatial correlation in the rotary kiln data, this paper constructs a five-layer symmetric fully connected Autoencoder [21]. The specific structure is shown in Figure 8a, and the design of the fully connected layers is illustrated in Figure 8b. The model consists of an encoder and a decoder connected in series. First, the encoder compresses the high-dimensional temperature field input $x$ into a low-dimensional latent space through multi-layer nonlinear mapping to obtain the latent feature vector $z$. The decoder acts as the inverse process of the encoder, mapping the feature $z$ back to the original high-dimensional space to output the reconstructed temperature field $\hat{x}$. The mathematical formulation is expressed as:

$$z = \mathrm{ReLU}(W_{e} x + b_{e}) \tag{3}$$

$$\hat{x} = W_{d} z + b_{d} \tag{4}$$

where $W$ and $b$ represent the weight matrix and bias vector, respectively. Subscripts $e$ and $d$ denote the encoder and decoder. To capture complex nonlinear features while avoiding the vanishing gradient problem, the rectified linear unit $\mathrm{ReLU}(\cdot)$ is adopted as the activation function in the hidden layers. The output layer of the decoder utilizes a Linear activation to precisely reconstruct the continuous temperature values.

The network input and output dimensions correspond directly to the temperature field mesh scale of 31,410 points. As detailed in Table 10, the architecture follows a symmetric encoder-decoder design where the intermediate hidden layers exhibit a layer-by-layer decreasing distribution. The encoder progressively compresses the high-dimensional input from 31,410 down to 1024, 512, 256, and 128 nodes, finally reaching the latent dimension. To efficiently capture nonlinear flow field characteristics and prevent overfitting on the limited CFD dataset, Batch Normalization and Dropout with a rate of 0.05 are integrated into every fully connected block, with the exception of the latent and final output layers. This configuration ensures stable normalization statistics and forces the network to learn robust feature representations. Furthermore, the ReLU activation function is applied across all hidden layers to handle nonlinearity. This stepwise compression strategy minimizes information loss while aligning with standard GPU memory optimization practices. The training objective is to minimize the Mean Squared Error between the input and reconstructed temperature fields:

$$L = \frac{1}{N} \sum_{i=1}^{N} \left\| x^{(i)} - \hat{x}^{(i)} \right\|^{2} \tag{5}$$

where $L$ is the mean squared error, $N$ is the sample batch size, $x^{(i)}$ is the ground truth temperature vector, and $\hat{x}^{(i)}$ is the predicted value reconstructed by the autoencoder.

Model training relies on the PyTorch 2.9.0 deep learning framework and employs the AdamW optimizer for parameter updates. The initial learning rate is set to 1 × 10⁻³. To dynamically adjust the learning rate and avoid becoming trapped in local optima, the model incorporates the adaptive learning rate scheduler ReduceLROnPlateau. This mechanism monitors the training loss and automatically halves the learning rate if no improvement is observed for a patience of 20 epochs. The maximum number of epochs is set to 500 and the batch size to 64 to ensure stable training and good generalization.
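The architecture and training setup described in this subsection can be sketched in PyTorch as follows. This is an illustrative outline rather than the authors' implementation. Several points are assumptions: the class and function names are hypothetical, the latent layer is kept linear (Equation (3) suggests a ReLU there, but the text excludes the latent layer from the regularized blocks without specifying its activation), and `nn.MSELoss` averages over nodes as well as samples, so it differs from Equation (5) by a constant factor.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def fc_block(n_in, n_out, p_drop=0.05):
    """Hidden block: Linear -> BatchNorm -> ReLU -> Dropout."""
    return nn.Sequential(
        nn.Linear(n_in, n_out),
        nn.BatchNorm1d(n_out),
        nn.ReLU(),
        nn.Dropout(p_drop),
    )

class SymmetricAE(nn.Module):
    def __init__(self, input_dim=31410, hidden=(1024, 512, 256, 128), latent_dim=6):
        super().__init__()
        dims = (input_dim, *hidden)
        enc = [fc_block(a, b) for a, b in zip(dims[:-1], dims[1:])]
        enc.append(nn.Linear(hidden[-1], latent_dim))  # latent layer: no BN/Dropout
        rev = (latent_dim, *reversed(hidden))
        dec = [fc_block(a, b) for a, b in zip(rev[:-1], rev[1:])]
        dec.append(nn.Linear(hidden[0], input_dim))    # linear output layer
        self.encoder = nn.Sequential(*enc)
        self.decoder = nn.Sequential(*dec)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def train_autoencoder(model, fields, epochs=500, batch_size=64, lr=1e-3):
    """fields: (N, D) float tensor of normalized temperature snapshots."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    sched = torch.optim.lr_scheduler.ReduceLROnPlateau(
        opt, mode="min", factor=0.5, patience=20)
    loss_fn = nn.MSELoss()
    loader = DataLoader(TensorDataset(fields), batch_size=batch_size, shuffle=True)
    history = []
    for _ in range(epochs):
        model.train()
        total = 0.0
        for (batch,) in loader:
            opt.zero_grad()
            recon, _ = model(batch)
            loss = loss_fn(recon, batch)
            loss.backward()
            opt.step()
            total += loss.item() * len(batch)
        history.append(total / len(fields))
        sched.step(history[-1])  # halve the LR when the training loss plateaus
    return history
```

Note that `BatchNorm1d` needs more than one sample per batch in training mode, so the batch size should not leave a final batch of a single sample. The model should also be switched to `model.eval()` before encoding or decoding for inference, so that Dropout is disabled and the running normalization statistics are used.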

3.3. TabPFN Low-Dimensional Feature Regression Prediction Model

After obtaining the low-dimensional latent representation of the temperature field, the next critical step is to establish the mapping from operating parameters $x$ to latent features $z$. Given the high cost of CFD sample generation, the dataset is inherently small ($N = 100$), which poses a risk of overfitting for traditional regressors that require iterative parameter optimization. To address this, this paper introduces the TabPFN (Tabular Prior-Data Fitted Network) model [22].

Unlike conventional machine learning paradigms that store knowledge in model weights through extensive training, TabPFN leverages an In-Context Learning (ICL) mechanism. It reformulates the regression task as an approximate Bayesian inference process, utilizing a Transformer network pre-trained on massive synthetic prior data to directly approximate the Posterior Predictive Distribution (PPD).

Operationally, in this study, we employ version 2.2.1 of the TabPFNRegressor interface, where the regression task is executed as a direct set-to-vector inference process without traditional gradient-based training or hyperparameter tuning. The model processes unscaled raw inputs directly, fitting the entire dataset within its standard context window. During the prediction phase for a new operating condition $x_{test}$, the entire training set is fed into the network as a context prompt $D_{train} = \{(x_{i}, z_{i})\}_{i=1}^{80}$. The model employs a multi-head attention mechanism to aggregate information from these context examples and predicts the target value in a single forward pass.

Furthermore, since the standard TabPFN architecture is designed for scalar regression, a multi-target strategy is adopted to handle the multi-dimensional latent manifold. Six independent inference processes are executed, each predicting a specific latent dimension $z_{k}$ for indices $k$ ranging from 1 to 6 based on the same input operating parameters. This approach effectively bridges the gap between limited industrial data and high-fidelity feature regression, offering robust generalization performance superior to traditional kernel-based or tree-based methods.
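The one-regressor-per-latent-dimension strategy can be sketched as follows. This is a hedged illustration, not the authors' code: the helper names are hypothetical, and `make_tabpfn` assumes the `tabpfn` package is installed and that `TabPFNRegressor` is used with its default settings through the usual `fit`/`predict` interface. Any regressor exposing the same interface can be substituted through the `make_regressor` argument.

```python
import numpy as np

def make_tabpfn():
    """Factory for one scalar regressor; requires the `tabpfn` package."""
    from tabpfn import TabPFNRegressor  # lazy import keeps the helpers importable
    return TabPFNRegressor()

def fit_latent_regressors(X_train, Z_train, make_regressor=make_tabpfn):
    """Fit one independent regressor per latent dimension (column of Z_train)."""
    X_train = np.asarray(X_train)
    Z_train = np.asarray(Z_train)
    models = []
    for k in range(Z_train.shape[1]):
        model = make_regressor()
        model.fit(X_train, Z_train[:, k])  # for TabPFN this stores the context set
        models.append(model)
    return models

def predict_latent(models, X_new):
    """Stack the per-dimension predictions into an (n_samples, d) array."""
    X_new = np.atleast_2d(np.asarray(X_new))
    return np.column_stack([m.predict(X_new) for m in models])
```

Treating the six latent dimensions independently ignores any correlation between them, which is the price of reusing a scalar regressor without modification.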

To ensure the reproducibility of the proposed framework and to clarify the integration logic between the dimensionality reduction and regression modules, the complete end-to-end computational procedure is summarized in Algorithm 1. This algorithm explicitly details the three sequential phases: data preprocessing, offline training of the Autoencoder and TabPFN, and the online inference workflow for temperature field reconstruction.

Algorithm 1: Autoencoder-TabPFN Model Training and Inference Procedure

Input: Dataset $D = \{(x^{(i)}, T^{(i)})\}_{i=1}^{N}$, Boundary Conditions $x$, Temperature Field $T$, Hyperparameters: Latent dim $d$, Epochs $K$, Batch size $B$
Output: Reconstructed Temperature Field $T_{rec}$ for new condition $x_{new}$

// Step 1. Dimensionality Reduction (Autoencoder)
1: Initialize Encoder $E$ and Decoder $G$ parameters
2: for epoch $k = 1$ to $K$ do
3:     Sample mini-batch fields $T_{batch}$ from $D$
4:     Compute Latent $z_{batch} = E(T_{batch})$ and Reconstruct $\hat{T}_{batch} = G(z_{batch})$
5:     Update $E$, $G$ by minimizing Loss $L = \| T_{batch} - \hat{T}_{batch} \|^{2}$
6: end for
// Step 2. Latent Space Regression (TabPFN)
7: Extract latent vectors $Z = E(T_{all})$ for the entire dataset
8: Construct regression pairs: Input $X_{all}$, Target $Z$
9: Fit TabPFN model $M$ to map $x \to z$: $M \leftarrow Fit(X_{all}, Z)$
// Step 3. Online Inference
10: Function Inference($x_{new}$):
11:     Predict latent feature $\hat{z} = M(x_{new})$
12:     Reconstruct field $T_{rec} = G(\hat{z})$
13:     return $T_{rec}$
14: end Function
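Step 3 of Algorithm 1 can be written as a short function. The sketch below is illustrative only: `predict_latent_fn` and `decode_fn` are hypothetical callables standing in for the fitted regressor $M$ and the trained decoder $G$, and the final rescaling assumes the decoder was trained on z-score normalized fields as described in Section 3.1.

```python
import numpy as np

def reconstruct_field(x_new, predict_latent_fn, decode_fn, t_mean, t_std):
    """Online inference: operating condition -> latent vector -> field in kelvin.

    predict_latent_fn and decode_fn are hypothetical stand-ins for the fitted
    regressor M and the trained decoder G of Algorithm 1.
    """
    x_new = np.atleast_2d(np.asarray(x_new, dtype=float))
    z_hat = predict_latent_fn(x_new)      # step 11: z_hat = M(x_new)
    t_scaled = decode_fn(z_hat)           # step 12: T_rec = G(z_hat), normalized
    return t_scaled * t_std + t_mean      # undo the z-score scaling
```

Only this function runs online; the CFD runs, the Autoencoder training, and the construction of the regression context all happen offline.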

4. Results and Discussion

This section focuses on evaluating the performance and engineering applicability of the constructed Autoencoder-TabPFN model in predicting the rotary kiln temperature field. First, the optimal latent dimension is determined by balancing the trade-off between the feature compression ratio and reconstruction fidelity based on reconstruction errors. Subsequently, the flow field restoration capability of the Autoencoder and the feature regression accuracy of TabPFN are tested independently. Finally, the model is strictly compared with full-order CFD simulations through end-to-end full-field prediction experiments, assessing it from the three dimensions of accuracy, efficiency, and robustness.

4.1. Determination of Optimal Latent Feature Dimension

The dimension of the latent space is a critical hyperparameter governing the trade-off between feature extraction capability and computational redundancy. A dimension that is too low leads to the loss of high-frequency flow details, while an excessively high dimension introduces noise and weakens the regression robustness. Therefore, keeping the network depth fixed, we performed a sensitivity analysis on the latent dimension $d$. As illustrated in Figure 9, the reconstruction error shows a steep decline when $d$ is less than 6, indicating the rapid capture of primary topological structures. However, beyond a dimension of 6, the marginal accuracy gain diminishes significantly and presents a distinct convergence plateau. Consequently, we fix the optimal dimension at 6 to efficiently encode the complex nonlinear thermodynamic information.
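The plateau criterion described above can be expressed as a simple rule: pick the smallest latent dimension after which the relative error reduction falls below a tolerance. The sketch below is one possible formalization, not the procedure used in the study; the function name and the 5% tolerance are hypothetical, and the reconstruction errors must come from actual sensitivity runs.

```python
def pick_latent_dim(errors, rel_gain_tol=0.05):
    """Return the smallest d after which the relative error drop is below tol.

    errors: dict mapping latent dimension -> reconstruction error.
    """
    dims = sorted(errors)
    for d, d_next in zip(dims[:-1], dims[1:]):
        gain = (errors[d] - errors[d_next]) / errors[d]
        if gain < rel_gain_tol:
            return d
    return dims[-1]
```

A rule of this kind makes the choice of $d$ reproducible, although the tolerance itself remains a judgment call.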

To further justify the architectural selection of the proposed symmetric Autoencoder, we conducted an ablation study comparing it against two representative dimensionality reduction methods: POD and CNN Autoencoder. For the comparative convolutional model, we implemented a deep architecture consisting of five layers with a one-dimensional kernel size of 5 to adapt to the feature sequence. All models were configured with a fixed latent dimension of 6 and trained on the identical dataset to ensure a fair evaluation.

As summarized in Figure 10, POD yields the highest reconstruction error, with an MSE of 0.186. This indicates that linear decomposition methods do not adequately capture the highly nonlinear features of the combustion flow field. Despite its optimized deep architecture, the convolutional Autoencoder also performed sub-optimally in this study, with an MSE of 0.154. This performance gap is attributed to the limited sample size of 100 and the specific topological nature of the temperature nodes, where the inductive bias of convolution layers may lead to overfitting or inefficient feature extraction compared to direct mapping. In contrast, the proposed Autoencoder achieves the lowest reconstruction MSE of 0.144, suggesting that the fully connected architecture is the most robust and efficient of the three choices for compressing the high-dimensional temperature field in this sparse-data scenario.
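For readers unfamiliar with the POD baseline, a rank-6 POD reconstruction can be sketched with a truncated singular value decomposition. This is an illustrative sketch under stated assumptions: the snapshot matrix below is synthetic random data with 500 nodes, the function name `pod_reconstruct` is hypothetical, and the error is evaluated in-sample, which may differ from the evaluation protocol used for Figure 10.

```python
import numpy as np

def pod_reconstruct(snapshots, rank):
    """Project mean-centred snapshots onto the leading `rank` POD modes."""
    mean = snapshots.mean(axis=0, keepdims=True)
    centred = snapshots - mean
    # Rows are snapshots, so the rows of vt are the spatial POD modes.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    modes = vt[:rank]               # (rank, n_nodes)
    coeffs = centred @ modes.T      # (n_snapshots, rank) latent coordinates
    return coeffs @ modes + mean

rng = np.random.default_rng(0)
# Synthetic stand-in: 100 snapshots x 500 nodes (the paper's field has 31410 nodes).
snapshots = rng.normal(size=(100, 500))
recon = pod_reconstruct(snapshots, rank=6)
mse = float(np.mean((snapshots - recon) ** 2))
print(recon.shape, mse)
```

Because POD is a linear projection, its six coefficients play the same role as the six latent features of the Autoencoder, which is what makes the comparison at equal latent dimension meaningful.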

4.2. Validation of Autoencoder Reconstruction Performance

To verify the feature preservation capability and flow field restoration accuracy of the Autoencoder at the optimal latent dimension ($d = 6$), a representative operating condition (Case 81) was selected from the test set for reconstruction testing. Figure 11 compares contour plots of the original temperature field calculated by CFD, the reconstructed temperature field output by the AE decoder, and the corresponding absolute error field.

It can be seen in Figure 11a,b that the temperature field reconstructed by the Autoencoder exhibits high consistency with the original CFD temperature field. Specifically, in the high-temperature core region of the flame, the Autoencoder accurately reproduces the shape, length, and spatial position of the high-temperature flame formed by pulverized coal combustion. It also accurately captures the temperature of the clinker calcination process, indicating that the reconstructed temperature field maintains high fidelity. Meanwhile, the error field in Figure 11c shows that the Autoencoder inevitably loses some high-frequency details. Specifically, the reconstruction error is relatively larger near the pulverized coal nozzle of the burner and in the strong turbulent shear region at the flame tail. This is primarily attributed to the steep physical gradients in these two regions.

4.3. Comparative Analysis of Regression Algorithms

To evaluate the performance of the TabPFN model in the latent feature regression task, this paper compares it with three machine learning regression algorithms: XGBoost [23], Least Squares Support Vector Machine (LSSVM) [24], and the standard Multi-Layer Perceptron (MLP) [25]. To ensure fairness in the comparison, hyperparameter optimization was performed for all baseline models using Grid Search, whereas TabPFN directly employed its default settings, leveraging its pre-trained prior-data fitting characteristics. Table 11 lists the average performance metrics of each model on the test set. The results show that TabPFN achieved the best performance in two key indicators: the Coefficient of Determination (R²) and Mean Squared Error (MSE). Specifically, the R² of TabPFN reached 0.897, a relative improvement of 6.53% over the second-best performing LSSVM (R² = 0.842). This is primarily attributed to the fact that traditional models often struggle to capture subtle nonlinear distribution laws when dealing with continuous latent features compressed from high-dimensional manifolds, whereas TabPFN can utilize its In-Context Learning (ICL) mechanism to fit the complex mapping between operating parameters and latent features more precisely.
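The metrics in Table 11 and the quoted 6.53% figure follow from standard definitions. The sketch below uses the usual formulas for MSE, MAE, RMSE, and R², and then reproduces the relative R² gain from the Table 11 values. How the paper averages the metrics over the six latent dimensions is not stated here, so the helper `regression_metrics` (a hypothetical name) simply pools all values it is given.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Standard regression metrics, pooled over all entries."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    mse = float(np.mean(residual ** 2))
    mae = float(np.mean(np.abs(residual)))
    ss_res = float(np.sum(residual ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {"MSE": mse, "MAE": mae, "RMSE": mse ** 0.5,
            "R2": 1.0 - ss_res / ss_tot}

# Relative R2 gain of TabPFN over LSSVM, using the Table 11 values.
r2_tabpfn, r2_lssvm = 0.897, 0.842
rel_gain_pct = 100.0 * (r2_tabpfn - r2_lssvm) / r2_lssvm
print(round(rel_gain_pct, 2))  # 6.53
```

Note that the 6.53% is a relative gain; the absolute difference in R² between the two models is 0.055.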

The multi-dimensional performance radar chart shown in Figure 12 covers core evaluation metrics including Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and 1 − R². It can be clearly observed from the figure that the area enclosed by the metric lines of the TabPFN model is the smallest, and it lies in the innermost layer across all dimensions. This implies that it not only possesses the highest prediction accuracy but is also significantly superior to baseline models like MLP and LSSVM in terms of error control robustness. Furthermore, Figure 13 displays the comparison between the regression predicted values of TabPFN for the 6 latent dimensions in the test set and the true values. It can be seen that the prediction curve of TabPFN exhibits high consistency with the Ground Truth latent features. It not only fits closely in stable intervals but also precisely captures the sharp fluctuations and extreme points in the feature curves. This indicates that the model has successfully established a precise mapping between operating boundary conditions and low-dimensional manifold features, ensuring that the subsequent decoder can reconstruct a high-fidelity rotary kiln temperature field based on high-quality latent features.

4.4. Assessment of Overall Model Prediction Performance and Computational Efficiency

Based on the trained Autoencoder (AE) and TabPFN regression network, this paper constructs an end-to-end rapid prediction model (AE-TabPFN) that maps operating boundary conditions to the full-field temperature distribution. To test the generalization capability of this model under unseen operating conditions, the boundary conditions of Case 81 from the test set were selected for validation. This choice ensures consistency with the previous AE reconstruction analysis, allowing for a direct assessment of the end-to-end prediction performance under the same operational context.
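The data flow of the end-to-end model can be sketched as follows: operating conditions go through one regressor per latent dimension, and the predicted latent vector goes through the decoder. This is only a structural sketch under loud assumptions: the regressors and decoder below are random linear stand-ins rather than trained TabPFN or AE models, the per-dimension regressor layout and the normalization constants `t_mean` and `t_std` are hypothetical, and `predict_field` is an illustrative name.

```python
import numpy as np

def predict_field(x, latent_regressors, decode, t_mean=0.0, t_std=1.0):
    """Map one operating condition to a full temperature field."""
    x = np.asarray(x, dtype=float).reshape(1, -1)
    # One regressor per latent dimension; each returns a length-1 array.
    z = np.column_stack([reg(x) for reg in latent_regressors])  # (1, d)
    field_normalised = np.asarray(decode(z))                     # (1, n_nodes)
    # Undo the (assumed) standardisation applied before training.
    return (field_normalised * t_std + t_mean).ravel()

# Hypothetical stand-ins so the sketch runs without trained models.
rng = np.random.default_rng(0)
n_inputs, n_latent, n_nodes = 7, 6, 31410   # sizes from Tables 9 and 10
weights = rng.normal(size=(n_latent, n_inputs))
regressors = [lambda x, w=w: x @ w for w in weights]
decoder_matrix = rng.normal(size=(n_latent, n_nodes))
decode = lambda z: z @ decoder_matrix

case_81 = [32.1, 30.9, 14.1, 27.0, 2.2, 6.8, 2.5]  # Table 9, Case 81
field = predict_field(case_81, regressors, decode, t_mean=1200.0, t_std=300.0)
print(field.shape)
```

In a real setting the stand-ins would be replaced by the fitted regressors and the trained decoder; only the shape bookkeeping shown here carries over.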

Figure 14 displays the contour plots of the full-order CFD ground truth, the AE-TabPFN model prediction, and the absolute error distribution between them. In terms of macroscopic flow field morphology (Figure 14a,b), the overall model exhibits high reconstruction fidelity. The predicted temperature field accurately reproduces the complex nonlinear thermodynamic structure within the rotary kiln. It not only clearly depicts the geometric shape, length, and spatial position of the high-temperature flame core zone at the burner nozzle but also captures the temperature gradient that gradually attenuates along the axial direction toward the kiln tail. Figure 14c shows the absolute error of the predicted temperature field under this operating condition. The prediction deviation is low across the vast majority of the computational domain. Relatively significant local errors are mainly concentrated in the near-field of the burner nozzle and the flame tail region. This is due to the intense gas–solid turbulent mixing and rapid chemical reactions in these regions, which produce steep physical field gradients and lead to a slight loss of high-frequency information during the Autoencoder’s compression. However, for the high-temperature part of the burning zone, which determines clinker calcination quality, the agreement between the model’s predicted values and the true values remains high, supporting its reliability in engineering applications.

Figure 15 summarizes the distribution of the reconstruction error averaged over the entire test set. For the axial section of the rotary kiln, 93.83% of the average nodal errors fall within the 0–50 K range, which supports the reliability of this reduced-order regression surrogate model.
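A statistic of this kind can be computed by averaging the absolute error node by node over the test cases and then counting the nodes inside the band. The sketch below assumes that reading of Figure 15; the arrays are synthetic placeholders, and `fraction_within` is a hypothetical helper name.

```python
import numpy as np

def fraction_within(abs_error, lower=0.0, upper=50.0):
    """Fraction of entries whose absolute error lies in [lower, upper]."""
    abs_error = np.asarray(abs_error, dtype=float)
    inside = (abs_error >= lower) & (abs_error <= upper)
    return float(inside.mean())

rng = np.random.default_rng(0)
# Synthetic placeholders: 20 test cases x 1000 nodes, temperatures in K.
t_cfd = 1200.0 + 300.0 * rng.normal(size=(20, 1000))
t_pred = t_cfd + 20.0 * rng.normal(size=(20, 1000))

# Node-wise absolute error averaged over the test cases.
mean_nodal_error = np.abs(t_pred - t_cfd).mean(axis=0)
share = fraction_within(mean_nodal_error)
print(share)
```

Averaging over cases before thresholding smooths out case-specific outliers, so the same data would generally give a lower share if the threshold were applied to each case separately.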

To further quantify the model’s generalization performance, Table 12 details the statistical error metrics for the test set, with the last row giving the mean over all 20 cases. The results demonstrate that the model achieves robust global accuracy: averaged over the test set, the mean error is 11.72 K and the median error is 4.98 K. Although the per-case maximum error averages 284.15 K (and reaches 336.58 K for Case 94), these extreme errors are physically confined to the highly turbulent region near the burner. Therefore, the prediction accuracy for the critical calcination region remains high, supporting the model’s applicability in industrial monitoring.
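The summary row of Table 12 can be checked directly from the per-case values. The short script below transcribes the Mean and Maximum columns for Cases 81 to 100 and recomputes their averages; it is offered as a consistency check, with variable names chosen for illustration.

```python
import numpy as np

# "Mean (K)" column of Table 12, Cases 81-100.
mean_err = np.array([10.43, 11.02, 9.87, 12.34, 10.95, 13.21, 9.54, 11.68,
                     12.89, 10.12, 11.47, 13.76, 12.15, 15.92, 11.24, 10.61,
                     12.47, 9.96, 11.83, 13.08])
# "Maximum (K)" column of Table 12, Cases 81-100.
max_err = np.array([279.11, 265.74, 241.63, 298.45, 272.18, 310.64, 228.37,
                    287.53, 305.29, 251.84, 289.76, 322.41, 296.88, 336.58,
                    278.66, 263.17, 301.92, 246.39, 291.07, 315.44])

avg_of_means = float(mean_err.mean())
avg_of_maxima = float(max_err.mean())
worst_case = 81 + int(max_err.argmax())
print(avg_of_means, avg_of_maxima, worst_case, max_err.max())
```

The recomputed averages agree with the reported row (11.72 K and 284.15 K) to within 0.01 K, and the largest single-case maximum, 336.58 K, occurs in Case 94.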

In addition, computational efficiency is a key indicator of the model’s practicality. All computations were performed on the same hardware (Intel Core i9-10900K CPU @ 3.70 GHz). Table 13 details the computational cost breakdown, explicitly distinguishing between the offline development phase and the online application phase. The offline phase incurs a substantial one-time cost of approximately 400 h for dataset generation (4 h × 100 cases) and 284.9 s for model training. However, this investment is justified by the performance in the online phase: while a standard CFD simulation typically requires 4 h (14,400 s) per case, inference with the proposed surrogate takes only 0.45 s, a speedup of roughly 3.2 × 10⁴, i.e., four orders of magnitude. This efficiency, achieved without compromising engineering accuracy, overcomes the real-time bottleneck of traditional numerical simulations and offers promising technical support for the online monitoring and intelligent optimization of the rotary kiln production process.
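The speedup claim and the amortization of the offline cost follow from the Table 13 entries by simple arithmetic. The sketch below works this out; the break-even count is an illustrative quantity that is not reported in the paper, and it assumes every online query would otherwise require one full 4 h CFD run.

```python
import math

cfd_seconds = 4 * 3600        # one full-order CFD case (Table 13)
inference_seconds = 0.45      # AE-TabPFN inference (Table 13)

speedup = cfd_seconds / inference_seconds
orders_of_magnitude = math.log10(speedup)

# One-time offline cost: dataset generation plus model training.
offline_seconds = 400 * 3600 + 284.9
# Smallest number of online queries for which the surrogate's total cost
# (offline + inference) is below running CFD for every query.
break_even = math.ceil(offline_seconds / (cfd_seconds - inference_seconds))
print(round(speedup), round(orders_of_magnitude, 2), break_even)
```

The ratio is about 32,000, or roughly 4.5 on a log10 scale, consistent with the stated four orders of magnitude. Under the stated assumption, the offline investment is recovered after about 101 surrogate queries.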

5. Conclusions

This paper proposes a reduced-order surrogate model fusing CFD numerical simulation, a fully connected Autoencoder (AE), and the TabPFN network. The high-fidelity full-order CFD model is utilized to reconstruct industrial DCS operating conditions, generating a dataset of temperature field snapshots. The AE is employed to extract the low-dimensional nonlinear manifolds of the high-dimensional flow field. TabPFN is selected as the regression algorithm to establish the mapping between operating parameters and latent features under small-sample constraints, thereby achieving rapid prediction of the full-field temperature in the rotary kiln. The results indicate that this model accurately characterizes the complex combustion and heat transfer mechanisms within the rotary kiln, confirming its feasibility as a substitute for expensive numerical simulations.

(1) The proposed AE reduced-order model excellently characterizes the topological features of the complex temperature field in the rotary kiln, particularly the geometric morphology of the high-temperature flame core and the axial gradient. Compared with traditional linear methods, it achieves efficient feature compression and high-fidelity flow field reconstruction while retaining only 6 latent dimensions, significantly reducing the dimensionality and complexity of the problem.

(2) TabPFN effectively constructs the nonlinear mapping between input operating conditions and AE latent features. The study finds that in typical scenarios of industrial CFD data scarcity, TabPFN, leveraging its prior-data fitting mechanism, exhibits generalization accuracy and robustness superior to traditional algorithms such as LSSVM and XGBoost, without the need for hyperparameter fine-tuning.

(3) This method provides an effective paradigm for the deep integration of deep learning and numerical simulation. The CFD model provides the necessary high-fidelity physical field data support for the learning algorithm, while the surrogate model extracts thermodynamic evolution laws from it, thereby directly solving specific engineering problems in a data-driven manner.

(4) Traditional CFD simulations struggle to meet the real-time requirements of online monitoring and optimization control for rotary kilns due to the enormous computational burden of iterative calculations. This method improves computational efficiency by four orders of magnitude, providing a foundation with both high accuracy and high efficiency for real-time monitoring and intelligent control in the process industry.

Limitations and Future Directions:

Admittedly, the current framework relies on steady-state assumptions, neglecting secondary flows induced by kiln rotation and local wall fluctuations. Consequently, the model is strictly limited to the optimization of stable production phases and is unsuitable for transient processes like start-up. Future work will focus on two directions: first, employing active learning strategies to autonomously select informative samples, thereby minimizing the high computational burden of offline CFD data generation; second, embedding thermodynamic constraints directly into the loss function to develop a physics-informed architecture, ensuring physical consistency and improving generalization capability across unobserved regimes.

Author Contributions

Conceptualization, Y.M. and Y.L. (Yuhang Li); methodology, Y.L. (Yuhang Li) and Y.M.; software, Y.L. (Yuhang Li); validation, Y.L. (Yuhang Li), Y.L. (Yanhui Lai) and Y.M.; formal analysis, Y.L. (Yuhang Li); investigation, Y.L. (Yuhang Li) and F.F.; resources, Y.M.; data curation, Y.L. (Yuhang Li), Y.L. (Yanhui Lai) and F.F.; writing—original draft preparation, Y.L. (Yuhang Li); writing—review and editing, Y.M.; visualization, Y.L. (Yuhang Li); supervision, Y.M.; project administration, Y.M.; funding acquisition, Y.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Saidur, R.; Hossain, M.S.; Islam, M.R.; Fayaz, H.; Mohammed, H.A. A Review on Kiln System Modeling. Renew. Sustain. Energy Rev. 2011, 15, 2487–2500.
2. Atmaca, A.; Yumrutas, R. Analysis of the Parameters Affecting Energy Consumption of a Rotary Kiln in Cement Industry. Appl. Therm. Eng. 2014, 66, 435–444.
3. Liu, X.Y.; Specht, E. Temperature Distribution within the Moving Bed of Rotary Kilns: Measurement and Analysis. Chem. Eng. Process. Process Intensif. 2010, 49, 147–150.
4. Liu, H.; Yin, H.; Zhang, M.; Xie, M.; Xi, X. Numerical Simulation of Particle Motion and Heat Transfer in a Rotary Kiln. Powder Technol. 2016, 287, 239–247.
5. Wang, M.; Liao, B.; Liu, Y.; Wang, S.; Qing, S.; Zhang, A. Numerical Simulation of Oxy-Coal Combustion in a Rotary Cement Kiln. Appl. Therm. Eng. 2016, 103, 491–500.
6. Jian, Q.; Gu, H.; Wang, K.; Wang, S.; Zhan, M.; Wang, J.; Ji, L.; Chi, Z.; Zhang, G. Numerical Study of Particle Behaviours and Heat Transfer in a Complex Rotary Kiln. Particuology 2024, 92, 81–94.
7. Huang, G.; Ma, Y.; Gao, X.; Zhao, A.; Liu, P. Rotary Kiln Temperature Prediction Based on Transfer Learning and LSTM-AE with Short Time Samples. Case Stud. Therm. Eng. 2025, 76, 7477.
8. Xu, X.; Yang, H.; Xu, K.; Yin, S.; Wang, Z.; Zhu, C.; Song, C. Cement Rotary Kiln Temperature Prediction Based on Time-Delay Calculation and Residual Network and Bidirectional Novel Gated Recurrent Unit Multi-Model Fusion. Measurement 2023, 218, 113123.
9. Yin, Y.; Liu, Y.; Liang, X.; Luo, W.; Yang, C.; Gui, W. Adaptive Data-Driven Soft Sensor for Monitoring and Prediction of Temperature inside Zinc Rotary Kiln. IEEE Sens. J. 2025, 25, 15276–15294.
10. Tian, Z.; Li, S.; Wang, Y.; Wang, X. SVM Predictive Control for Calcination Zone Temperature in Lime Rotary Kiln with Improved PSO Algorithm. Trans. Inst. Meas. Control 2018, 40, 3134–3146.
11. Feng, J.; Li, F.; Shen, X.; Xue, S.; Cheng, S. NOx Concentration Prediction for Cement Rotary Kiln under Multiworking Conditions Based on an Echo State Network with Modular Output. IEEE Trans. Instrum. Meas. 2025, 74, 3593308.
12. Lan, H.; Tang, W.; Gong, J.; Zhang, Z.; Xu, X. Fast Prediction of Temperature Distributions in Oil Natural Air Natural Transformers Using Proper Orthogonal Decomposition Reduced-Order Data-Driven Modelling. High Volt. 2024, 9, 1246–1259.
13. Premaratne, P.; Tian, W.; Hu, H. A Proper-Orthogonal-Decomposition (POD) Study of the Wake Characteristics behind a Wind Turbine Model. Energies 2022, 15, 3596.
14. Sui, X.; Djidjeli, K.; Sun, Z.; Xing, J.T. Reduced Order Modeling (ROM) Based Method for the Two-Dimensional Water Exit Problem Using Snapshot Proper Orthogonal Decomposition (POD) and CFD Simulations. Appl. Ocean Res. 2025, 161, 104697.
15. Banihashemi, F.; Weber, M.; Lang, W. Model Order Reduction of Building Energy Simulation Models Using a Convolutional Neural Network Autoencoder. Build. Environ. 2022, 207, 108498.
16. Romor, F.; Stabile, G.; Rozza, G. Non-Linear Manifold Reduced-Order Models with Convolutional Autoencoders and Reduced over-Collocation Method. J. Sci. Comput. 2023, 94, 74.
17. Liu, M.; Grana, D.; de Figueiredo, L.P. Uncertainty Quantification in Stochastic Inversion with Dimensionality Reduction Using Variational Autoencoder. Geophysics 2022, 87, M43–M58.
18. Chen, X.W.; Lin, X. Big Data Deep Learning: Challenges and Perspectives. IEEE Access 2014, 2, 514–525.
19. Pieper, C.; Liedmann, B.; Wirtz, S.; Scherer, V.; Bodendiek, N.; Schaefer, S. Interaction of the Combustion of Refuse Derived Fuel with the Clinker Bed in Rotary Cement Kilns: A Numerical Study. Fuel 2020, 266, 117048.
20. Bisulandu, B.J.R.M.; Huchet, F. Rotary Kiln Process: An Overview of Physical Mechanisms, Models and Applications. Appl. Therm. Eng. 2023, 221, 119637.
21. Zhang, C.; Geng, Y.; Han, Z.; Liu, Y.; Fu, H.; Hu, Q. Autoencoder in Autoencoder Networks. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 2263–2275.
22. Hollmann, N.; Mueller, S.; Purucker, L.; Krishnakumar, A.; Koerfer, M.; Bin Hoo, S.; Schirrmeister, R.T.; Hutter, F. Accurate Predictions on Small Data with a Tabular Foundation Model. Nature 2025, 637, 319–326.
23. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: San Francisco, CA, USA, 2016; pp. 785–794.
24. Wang, B.; Yu, A.; Wang, H.; Liu, J. Modeling and Optimization of an Enhanced Soft Sensor for the Fermentation Process of Pichia Pastoris. Sensors 2024, 24, 3017.
25. Zhao, Z.; Xu, S.; Kang, B.H.; Kabir, M.M.J.; Liu, Y.; Wasinger, R. Investigation and Improvement of Multi-Layer Perceptron Neural Networks for Credit Scoring. Expert Syst. Appl. 2015, 42, 3508–3516.

Figure 1. Rotary Kiln Physical Diagram.

Figure 2. Structure of the five-channel burner.

Figure 3. Grid independence test.

Figure 4. Computational fluid domain of the rotary kiln.

Figure 5. Mesh topology of the rotary kiln.

Figure 6. Internal heat transfer mechanisms.

Figure 7. Real-time DCS monitoring interface.

Figure 8. Network architecture: (a) Five-layer symmetric Autoencoder; (b) Fully connected layer structure.

Figure 9. Reconstruction error vs. latent dimension.

Figure 10. Reconstruction performance of different models.

Figure 11. Reconstruction comparison (Case 81): (a) CFD ground truth; (b) AE reconstruction; (c) Absolute error.

Figure 12. Performance radar chart of regression models.

Figure 13. TabPFN predictions of latent features vs. ground truth.

Figure 14. End-to-end prediction (Case 81): (a) CFD ground truth; (b) AE-TabPFN prediction; (c) Absolute error.

Figure 15. Distribution of average reconstruction errors on the test set.

Table 1. Proximate and ultimate analysis of coal.

Ultimate Analysis/%					Proximate Analysis/%				Q_net,ad/(kJ/kg)
C	H	O	N	S	M	V	FC	A	Q_net,ad/(kJ/kg)
77.18	4.88	16.18	1.6	0.16	5.22	31.3	56.06	7.42	2.95 × 10⁴

Table 2. Rosin–Rammler parameters for pulverized coal.

Parameter	Value
Min. Diameter	5 × 10⁻⁵ m
Max. Diameter	1.8 × 10⁻⁴ m
Mean Diameter	1 × 10⁻⁴ m
Spread Parameter	1.4

Table 3. Chemical composition of raw meal.

Components	Content (wt%)
CaO	69.32
SiO₂	22.18
Al₂O₃	4.25
Fe₂O₃	4.25

Table 4. Kinetic parameters of chemical reactions.

No.	Reaction	$A$ (1/s)	$E$ (kJ/kmol)	$\Delta H$ (kJ/kg)
1	$2\,\mathrm{CaO} + \mathrm{SiO_2} \to 2\,\mathrm{CaO \cdot SiO_2}$	$0.5 \times 10^{-4}$	$2.30 \times 10^{5}$	−732.00
2	$2\,\mathrm{CaO \cdot SiO_2} + \mathrm{CaO} \to 3\,\mathrm{CaO \cdot SiO_2}$	$2.1 \times 10^{7}$	$4.50 \times 10^{5}$	+59.00
3	$3\,\mathrm{CaO} + \mathrm{Al_2O_3} \to 3\,\mathrm{CaO \cdot Al_2O_3}$	$8.0 \times 10^{-4}$	$3.10 \times 10^{5}$	−33.36
4	$4\,\mathrm{CaO} + \mathrm{Al_2O_3} + \mathrm{Fe_2O_3} \to 4\,\mathrm{CaO \cdot Al_2O_3 \cdot Fe_2O_3}$	$7.0 \times 10^{-10}$	$3.15 \times 10^{5}$	−71.27

Table 5. Boundary conditions for numerical simulation.

Boundary Type	Value	Temperature/K
Coal flow	33 m/s	320
Coal feed	2.6 kg/s	-
Axial flow	31 m/s	340
Internal swirling flow	15 m/s	340
External swirling flow	25 m/s	340
Central flow	2 m/s	340
Secondary flow	7 m/s	1173

Table 6. Validation of simulation results against on-site data.

Condition	Indicator	True Value	Simulated Value	Relative Error
Low Load	Flue Gas Temperature/K	1318.5	1303.23	1.16%
	Exhaust O₂ Content/%	2.80	2.68	4.29%
	Exhaust NO_x Content/mg·m⁻³	730	705	3.42%
Rated Load	Flue Gas Temperature/K	1344.15	1328.3	1.18%
	Exhaust O₂ Content/%	2.42	2.3	4.96%
	Exhaust NO_x Content/mg·m⁻³	619	598.7	3.28%
High Load	Flue Gas Temperature/K	1368.0	1346.7	1.55%
	Exhaust O₂ Content/%	1.60	1.53	4.38%
	Exhaust NO_x Content/mg·m⁻³	541	520	3.88%

Table 7. Range of key operating parameters.

Boundary Type	Max Value	Min Value
Coal flow	34.6 m/s	31.5 m/s
Coal feed	2.7 kg/s	2.5 kg/s
Axial flow	32.5 m/s	29.5 m/s
Internal swirling flow	16.5 m/s	13.5 m/s
External swirling flow	27.5 m/s	22.5 m/s
Central flow	2.2 m/s	1.8 m/s
Secondary flow	7.5 m/s	6.3 m/s

Table 8. Operating conditions of the training set.

Case No.	Coal Air Vel (m/s)	Axial Air Vel (m/s)	Inner Swirl Vel (m/s)	Outer Swirl Vel (m/s)	Center Vel (m/s)	Sec. Air Vel (m/s)	Coal Feed Rate (kg/s)
1	32.2	31.5	16.0	26.8	2.0	6.5	2.6
2	34.2	31.9	16.0	24.0	1.9	6.7	2.5
3	31.4	30.2	15.9	24.6	1.9	7.1	2.6
⁝	⁝	⁝	⁝	⁝	⁝	⁝	⁝
80	32.6	29.5	14.5	26.9	1.8	6.7	2.6

Table 9. Operating conditions of the test set.

Case No.	Coal Air Vel (m/s)	Axial Air Vel (m/s)	Inner Swirl Vel (m/s)	Outer Swirl Vel (m/s)	Center Vel (m/s)	Sec. Air Vel (m/s)	Coal Feed Rate (kg/s)
81	32.1	30.9	14.1	27.0	2.2	6.8	2.5
82	34.1	30.8	15.6	22.8	2.0	7.4	2.6
83	31.7	29.7	15.2	22.8	1.8	7.5	2.5
⁝	⁝	⁝	⁝	⁝	⁝	⁝	⁝
100	32.6	30.7	15.0	25.5	2.1	7.6	2.5

Table 10. Detailed configuration of Autoencoder layers.

Module	Layer Block	Input Dim	Output Dim	Activation	Regularization
Encoder	FC Block1	31410	1024	ReLU	BN + Dropout (0.05)
	FC Block2	1024	512	ReLU	BN + Dropout (0.05)
	FC Block3	512	256	ReLU	BN + Dropout (0.05)
	FC Block4	256	128	ReLU	BN + Dropout (0.05)
	FC Block5	128	Latent Dim	Linear	-
Decoder	FC Block6	Latent Dim	128	ReLU	BN + Dropout (0.05)
	FC Block7	128	256	ReLU	BN + Dropout (0.05)
	FC Block8	256	512	ReLU	BN + Dropout (0.05)
	FC Block9	512	1024	ReLU	BN + Dropout (0.05)
	FC Block10	1024	31410	Linear	-
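To give a sense of the model size implied by Table 10, the fully connected weights and biases can be counted directly from the layer dimensions. The sketch below assumes a latent dimension of 6 (the value selected in Section 4.1) and ignores the batch normalization parameters; `fc_parameter_count` is a hypothetical helper name.

```python
latent_dim = 6  # value selected in Section 4.1
encoder_dims = [31410, 1024, 512, 256, 128, latent_dim]
decoder_dims = encoder_dims[::-1]  # symmetric decoder, as in Table 10

def fc_parameter_count(dims):
    """Weights plus biases of a chain of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(dims[:-1], dims[1:]))

encoder_params = fc_parameter_count(encoder_dims)
decoder_params = fc_parameter_count(decoder_dims)
total_params = encoder_params + decoder_params
compression_ratio = encoder_dims[0] / latent_dim
print(total_params, compression_ratio)
```

Almost all of the parameters sit in the first encoder layer and the last decoder layer, which connect the 31410-node field to the 1024-unit hidden layer. Each field is compressed by a factor of 5235 at the bottleneck.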

Table 11. Performance comparison: TabPFN vs. baseline algorithms.

Model	MSE	MAE	RMSE	R²
LSSVM	0.207	0.367	0.447	0.842
MLP	0.282	0.432	0.532	0.831
XGBoost	0.268	0.411	0.522	0.784
TabPFN	0.142	0.281	0.322	0.897

Table 12. Statistical error metrics of the test set.

Case	Mean (K)	Median (K)	Maximum (K)	Average Error (%)
Case 81	10.43	4.41	279.11	1.23
Case 82	11.02	4.86	265.74	1.18
Case 83	9.87	4.02	241.63	1.15
Case 84	12.34	5.12	298.45	1.31
Case 85	10.95	4.58	272.18	1.22
Case 86	13.21	5.76	310.64	1.38
Case 87	9.54	3.91	228.37	1.12
Case 88	11.68	4.97	287.53	1.27
Case 89	12.89	5.34	305.29	1.35
Case 90	10.12	4.19	251.84	1.19
Case 91	11.47	4.88	289.76	1.26
Case 92	13.76	6.02	322.41	1.44
Case 93	12.15	5.09	296.88	1.30
Case 94	15.92	7.41	336.58	1.98
Case 95	11.24	4.73	278.66	1.24
Case 96	10.61	4.35	263.17	1.20
Case 97	12.47	5.21	301.92	1.33
Case 98	9.96	4.08	246.39	1.16
Case 99	11.83	4.99	291.07	1.28
Case 100	13.08	5.68	315.44	1.39
Overall mean	11.72	4.98	284.15	1.30

Table 13. Computational time: Full-order CFD vs. AE-TabPFN.

Phase	Task	Time Cost
Offline	Dataset Generation (CFD)	400 h (4 h × 100)
	Model Training (AE + TabPFN)	284.9 s
Online	Model Inference (AE + TabPFN)	0.45 s

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Mao, Y.; Li, Y.; Lai, Y.; Fan, F. A Data-Driven Reduced-Order Model for Rotary Kiln Temperature Field Prediction Using Autoencoder and TabPFN. Appl. Sci. 2026, 16, 2029. https://doi.org/10.3390/app16042029

