Mixer Temperature Prediction via Decomposed Transformer and xLSTM with Physical Constraints

Wang, Chenchen; Hu, Chunyan; Li, Wei; Xu, Hanling; Miao, Keqiang; Sun, Jiaxian

doi:10.3390/sym17091441


by Chenchen Wang ^1,2,3, Chunyan Hu ^1,2,*, Wei Li ^1,2, Hanling Xu ^1,2, Keqiang Miao ^1,2 and Jiaxian Sun ^1,2

¹ Institute of Engineering Thermophysics, Chinese Academy of Sciences, Beijing 100190, China

² National Key Laboratory of Science and Technology on Advanced Light-Duty Gas-Turbine, Beijing 100190, China

³ School of Aeronautics and Astronautics, University of Chinese Academy of Sciences (UCAS), Beijing 100049, China

* Author to whom correspondence should be addressed.

Symmetry 2025, 17(9), 1441; https://doi.org/10.3390/sym17091441

Submission received: 30 July 2025 / Revised: 15 August 2025 / Accepted: 16 August 2025 / Published: 3 September 2025

(This article belongs to the Section Engineering and Materials)


Abstract

During high-altitude simulation tests, the accurate reproduction of environmental conditions directly affects the ability to reliably evaluate the engine’s performance under simulated high-altitude conditions. Traditional physical models, though interpretable, often fall short in handling nonlinear dynamics and time-delayed effects caused by disturbances such as abrupt flow rate fluctuations. To address this problem, this work proposes a novel hybrid modeling framework that integrates physical principles with deep learning architectures. The proposed approach incorporates three key innovations. First, a physics-guided residual learning scheme is introduced, where the theoretical outlet temperature derived from energy conservation laws and assumptions of symmetric inlet mixing serves as a prior, and a data-driven model corrects residual deviations. Second, a multiscale feature extraction module is constructed by combining a Transformer with the discrete wavelet transform (DWT), capturing both long-term regularities and short-term fluctuations that often exhibit quasi-symmetric patterns in structured systems. Third, a perturbation-aware memory structure is designed, fusing a Transformer branch for long-term dependency modeling with an extended LSTM (xLSTM) branch for short-term dynamic sensitivity. The experimental results on real-world test datasets demonstrate that the proposed hybrid model significantly outperforms traditional physical models. Specifically, it achieves an MAPE of 0.0156 and an R² of 0.9867, indicating high predictive accuracy. The model thus preserves physical interpretability while improving accuracy, making it a promising solution for intelligent control in industrial mixing systems.

Keywords:

temperature prediction; physics-guided learning; Transformer-xLSTM; perturbation-aware learning

1. Introduction

Gas mixers are widely used in various industrial applications, serving as a critical component in systems where precise blending of different gas streams is required. From chemical reactors [1] and HVAC systems [2] to medical ventilators and fuel reformers, gas mixers ensure that downstream processes receive gases with desired thermodynamic properties. In the aerospace industry, the flight environment simulation intake system is the core subsystem of the altitude ground test facility, which is used to simulate the intake environmental conditions of aero-engines operating at cruising altitudes [3,4]. Within the flight environment simulation intake system, the mixer is responsible for blending high-pressure cold air, hot air, and ambient air to create the desired temperature and pressure conditions for aero-engine inlet testing. The thermal condition at the mixer outlet not only determines the accuracy of simulated flight conditions but also influences key parameters such as engine thrust, fuel–air ratio, and thermal loads on engine components [5]. As such, ensuring accurate prediction and control of the mixer outlet temperature is of paramount importance for reliable engine testing and evaluation.

However, achieving precise thermal simulation in such mixers remains a technical challenge. The nonideal behavior of the airflow, energy losses, uneven mixing, and slow mixing under complex real-world conditions cause substantial discrepancies between the actual outlet temperature and the theoretical temperature [6,7]. From an analysis of existing high-altitude engine test data, the mixing rate in the mixer for different operating points ranges from 0.3 to 0.85 °C/min. Traditional modeling approaches—often based on steady-state energy conservation and enthalpy difference methods—assume ideal mixing and ignore spatial non-uniformity, temporal delay, and thermal inertia introduced by the mixer’s internal geometry and unsteady operating conditions. These simplifications are inadequate for reflecting transient dynamics caused by rapid changes in valve adjustments or flow disturbances commonly encountered during test campaigns. Moreover, the thermal behavior of the system is further complicated by several factors, including large-volume flow within the mixer cavity, flow separation before it enters the mixer, and imperfect interactions between the mixing jets. These phenomena introduce nonlinearities and spatial–temporal variations that make predictions based solely on first-principles models prone to significant errors and instability.

Recent studies have demonstrated the effectiveness of hybrid modeling and Physics-Informed Machine Learning (PIML) in environmental and industrial time-series prediction. For example, Bagheri et al. [8] developed a hybrid framework for modeling soil water content that integrated deep learning models (e.g., LSTM, random forest) with a simplified hydrological model through residual learning and physics-informed neural networks, achieving improved accuracy while preserving physical interpretability. In another study, Naeini and Snaiki [9] proposed a physics-informed approach for simulating time-series wave runup by combining the computational efficiency of the Surfbeat (XBSB) mode with the accuracy of the non-hydrostatic (XBNH) mode of the XBeach model, using a conditional generative adversarial network to incorporate physics-based knowledge into the learning process.

In order to improve the prediction accuracy of mixer outlet temperature, this work proposes a novel hybrid modeling framework integrating domain physical knowledge with time-series learning techniques. The major contributions of this work are summarized as follows:

The theoretical outlet temperature, calculated based on energy conservation, is embedded as a physical prior in the modeling process. Instead of directly predicting temperature, the model focuses on learning the residual error between measured and theoretical values. This residual learning framework improves generalization capability, enhances model stability under complex working conditions, and ensures the predictions remain physically interpretable.
To accurately model both the long-term delayed effects and the short-term disturbances in the outlet temperature sequence, a dual-path structure is designed. The differential Transformer captures flow-induced abrupt changes by processing first-order differences of input sequences, effectively enhancing sensitivity to perturbations. The xLSTM branch simultaneously captures local nonlinear dynamics and slow-varying trends, achieving fine-grained multi-scale temporal representation.

The remainder of this paper is organized as follows: Section 2 reviews existing research on thermal modeling of mixers and temperature prediction methods. Section 3 models the prediction problem and analyzes the available data. Section 4 presents the architecture and principles of the proposed hybrid prediction model. Section 5 discusses experimental results based on measured data. Finally, Section 6 concludes the work and outlines future research directions.

2. Related Works

2.1. Mixer Modeling of Engine Test Systems

Extensive studies have been conducted on the temperature regulation process of high-altitude simulation systems. Zhu et al. [10] simplified the mixer as a volume model without considering airflow mixing and heat transfer, while Zhu et al. [4] further proposed a multi-volume model incorporating these factors for improved accuracy. Wang et al. [11] optimized mixer structures to address non-uniform airflow distribution at the outlet. Zhu et al. [12] analyzed dynamic heat transfer during high-altitude heating tests, considering local heat loss. Montgomery et al. [13,14] explored how metal mass, cavity volume, and airflow states influence intake temperature change in turbine engine test cells. To address the problem of substantial temperature delays in the flight environment simulation system for altitude ground test facilities, Zhu [15] investigated the effects of flow ratios and structural parameters on cold and hot airflow mixing characteristics. Most existing studies still focus on the influence of heat transfer processes, while lacking in-depth consideration of the coupling relationships among internal system variables.

2.2. Temperature Prediction Method

In the modeling process, various disturbance factors cannot be corrected within the original mechanistic model. Hybrid modeling methods that combine mechanistic models with data-driven methods address this problem [16,17]. Ning et al. [18] demonstrated improved spacecraft control system modeling using mechanism-data fusion. Peng [19] addressed nonlinear and strong hysteresis issues in the heating process of vacuum sintering furnaces by combining mechanistic analysis with data-driven methods. Yu [20] developed an outlet temperature prediction model for a decomposition furnace using elastic networks and LSTM, achieving precise control. Kula et al. [21] compared machine learning methods for forecasting the temperature in the warm corridor of a data center. Zhang et al. [22] integrated mechanism models and stacking frameworks for accurate parameter prediction of a condenser system. Yan et al. [23] proposed a hybrid ALSTM model for predicting solar collector outlet temperatures. However, the current methods still fall short of fully meeting the requirements for high-accuracy and robust outlet temperature prediction under complex operating conditions.

3. Problem Modeling and Data Analysis

The mixing of two air streams with different temperatures, pressures, and flow rates is a highly coupled and dynamically evolving thermodynamic process. To more effectively characterize the input–output behavior and dynamic features of this process, this section approaches the problem from a system modeling perspective. Based on the original data, it conducts a comprehensive analysis of the temporal variation characteristics of the mixer outlet temperature. Furthermore, it critically evaluates the strengths and limitations of different modeling paradigms, thereby laying a theoretical foundation for the selection and design of predictive models in subsequent stages.

3.1. System Description and Data Acquisition

This study focuses on the intake temperature regulation subsystem within the altitude ground test facility. The original data used in this study are derived from high-altitude test data of a certain aero-engine, with a sampling period of 0.2 s. A total of 19 characteristic variables were recorded, covering the pressure, temperature, and mass flow rate of two intake streams (ambient air and cooled air), secondary flow, and the mixer outlet. The variables are listed in Table 1. Due to proprietary constraints, the original dataset cannot be publicly released. To facilitate reproducibility, we provide statistical characteristics of the data: input parameters (e.g., $W_{c}$, $W_{n}$, $P_{n1}$) exhibit rapid temporal variations, while the target output parameter ($T_{mix}$) shows a delayed response and slow-varying behavior.

The system schematic diagram is shown in Figure 1, with all measurement points clearly labeled. The air supply system delivers two streams of air with distinct temperatures and pressures, which are mixed within the mixer. The flow rate and pressure of each stream are precisely regulated via control valves to ensure that the mixed airflow meets the target temperature and pressure requirements before being supplied to the chamber for aero-engine tests. The aero-engine under test is a twin-spool turbofan engine with a maximum rotational speed of 37,000 rpm. The secondary flow is used for component cooling and does not exceed 5% of the mainstream flow. Structurally, the mixing unit exhibits geometrical and functional symmetry, with two symmetric inlets and one outlet, which facilitates balanced energy and mass exchange under ideal conditions.

Consider the confluence of two airstreams with different temperatures, pressures, and flow rates within the mixer. Under the assumption of a steady-state mixing process, the mass conservation principle can be expressed as follows [24]:

m_{1} + m_{2} = m_{mix}

(1)

The energy conservation equation can be expressed as:

m_{1} C_{p1} T_{1} + m_{2} C_{p2} T_{2} = m_{mix} C_{pmix} T_{cal}

(2)

Combining Equations (1) and (2), the mixed temperature is obtained:

T_{cal} = \frac{m_{1} C_{p1} T_{1} + m_{2} C_{p2} T_{2}}{(m_{1} + m_{2}) \cdot C_{pmix}}

(3)

where $m_{1}$, $m_{2}$, and $m_{mix}$ represent the mass flow rates of the first, second, and mixed airflows, respectively, in $\mathrm{kg/s}$; $C_{p1}$, $C_{p2}$, and $C_{pmix}$ are the specific heat capacities of the first, second, and mixed airflows, respectively, in $\mathrm{J/(kg \cdot K)}$; and $T_{1}$, $T_{2}$, and $T_{cal}$ are their corresponding temperatures in $\mathrm{K}$.

Assuming that the specific heat capacity of air remains constant during the mixing process, Equation (3) can be simplified as:

T_{cal} = \frac{m_{1} T_{1} + m_{2} T_{2}}{m_{1} + m_{2}}

(4)
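As a minimal sketch of Equation (4), the function below computes the mass-flow-weighted mean temperature of two streams. The function name, argument names, and the numerical values are illustrative assumptions and are not taken from the test data.

```python
def ideal_mix_temperature(m1, t1, m2, t2):
    """Ideal outlet temperature of two mixed streams, Equation (4).

    Mass flows are in kg/s and temperatures in K. A constant specific
    heat shared by both streams is assumed, as in the text.
    """
    total = m1 + m2
    if total <= 0:
        raise ValueError("total mass flow must be positive")
    return (m1 * t1 + m2 * t2) / total


# Illustrative values only: 3 kg/s at 300 K mixed with 1 kg/s at 260 K.
t_cal = ideal_mix_temperature(m1=3.0, t1=300.0, m2=1.0, t2=260.0)
print(t_cal)
```

The result always lies between the two inlet temperatures, weighted toward the stream with the larger mass flow.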

To eliminate the influence of the secondary flow, the mass flow rate of the ambient air and the cooled air within the secondary flow are calculated using Equations (1) and (4):

W_{nsec} = \frac{(T_{sec} - T_{c2}) \cdot W_{sec}}{T_{n2} - T_{c2}}

(5)

W_{csec} = W_{sec} - W_{nsec}

(6)

where $W_{nsec}$ represents the mass flow rate of ambient air in the secondary flow, and $W_{csec}$ represents the mass flow rate of cooled air in the secondary flow. The meanings of other symbols are as shown in Figure 1. The effective mass flow rates of the two incoming air streams at the mixer are determined by subtracting the corresponding secondary flow contributions.
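The secondary-flow correction in Equations (5) and (6) can be sketched as follows. The function and argument names are hypothetical, the numbers are illustrative, and the subtraction step simply follows the sentence above rather than any published implementation.

```python
def split_secondary_flow(w_sec, t_sec, t_n2, t_c2):
    """Return (w_nsec, w_csec) following Equations (5) and (6)."""
    if t_n2 == t_c2:
        raise ValueError("stream temperatures must differ to resolve the split")
    w_nsec = (t_sec - t_c2) * w_sec / (t_n2 - t_c2)  # ambient-air share
    w_csec = w_sec - w_nsec                          # cooled-air share
    return w_nsec, w_csec


def effective_flows(w_n, w_c, w_nsec, w_csec):
    """Subtract the secondary-flow share from each main stream."""
    return w_n - w_nsec, w_c - w_csec


# Illustrative values only (kg/s and K).
w_nsec, w_csec = split_secondary_flow(w_sec=0.2, t_sec=280.0, t_n2=300.0, t_c2=260.0)
w_n_eff, w_c_eff = effective_flows(w_n=3.0, w_c=1.0, w_nsec=w_nsec, w_csec=w_csec)
print(w_nsec, w_csec, w_n_eff, w_c_eff)
```

When the secondary-flow temperature sits midway between the two stream temperatures, as in this example, the split is even.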

While Equations (1)–(4) theoretically define the ideal outlet temperature based on symmetric energy exchange between two streams, practical systems deviate from symmetry due to factors such as thermal inertia, volumetric lag, and imperfect mixing. Therefore, a deviation term $\epsilon$ is introduced to relate the measured outlet temperature to the ideal value:

T_{mix} = \frac{m_{1} T_{1} + m_{2} T_{2}}{m_{1} + m_{2}} + \epsilon

(7)

3.2. Statistical Characteristics Analysis of Mixer Outlet Temperature

To analyze the dynamic behavior of the mixing temperature, this section performs the following statistical analyses on $T_{mix}$:

3.2.1. Stationarity Analysis

The Augmented Dickey–Fuller (ADF) [25] test is employed to evaluate the stationarity of the outlet temperature sequence $T_{mix}$. This test determines whether a time series contains unit roots, which are indicative of non-stationarity. In contrast, a stationary series does not contain unit roots. The ADF test is conducted under the following hypotheses:

Null hypothesis ( $H_{0}$ ): the series contains unit roots (i.e., it is non-stationary);
Alternative hypothesis ( $H_{1}$ ): the series does not contain unit roots (i.e., it is stationary).

The ADF test statistic is based on the following regression equation:

\Delta y_{t} = \alpha + \beta t + \gamma y_{t-1} + \sum_{i=1}^{p} \delta_{i} \Delta y_{t-i} + \varepsilon_{t}

(8)

where $\Delta y_{t}$ denotes the first difference of $y_{t}$, $t$ is the deterministic time trend, $p$ is the lag order, and $\varepsilon_{t}$ is white noise.

The reported t-value corresponds to the statistic for testing $H_{0}$: $\gamma = 0$, calculated as:

t_{r} = \frac{\hat{\gamma}}{\mathrm{SE}(\hat{\gamma})}

(9)

where $\hat{\gamma}$ is the estimated coefficient of $y_{t-1}$ and $\mathrm{SE}(\hat{\gamma})$ is its standard error.

The p-value represents the probability of observing a test statistic at least as extreme as $t_{r}$ under the null hypothesis, based on the non-standard empirical distribution of the ADF statistic. A lower p-value indicates stronger evidence against $H_{0}$.

The decision-making procedure is as follows:

Check the p-value: If the p-value is less than 0.05, the null hypothesis can be rejected, suggesting that the series is stationary. If the p-value is greater than or equal to 0.05, the result is inconclusive, and further examination is necessary.
Evaluate the test statistic (t-value): If the t-value is smaller than the 10% critical value, the null hypothesis can be rejected at the 90% confidence level, indicating stationarity. Otherwise, the series is considered non-stationary.

The test statistic is compared against critical values at the 10%, 5%, and 1% significance levels. If the statistic is less than the critical value at a given level, the null hypothesis can be rejected with 90%, 95%, or 99% confidence, respectively, indicating stationarity.

As shown in Table 2, the p-value exceeds the commonly used significance thresholds, and the t-value is higher than the 10% critical value. Therefore, the null hypothesis cannot be rejected, and the original temperature series exhibits evident non-stationary behavior.
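To make the regression in Equation (8) and the statistic in Equation (9) concrete, the sketch below estimates the coefficient of the lagged level by ordinary least squares with NumPy and returns its t-ratio. It is an illustration under stated assumptions: the function name, the default lag order `p=1`, and the synthetic series are hypothetical, and the p-value and critical values are not computed because they require the tabulated Dickey–Fuller distribution.

```python
import numpy as np


def adf_t_statistic(y, p=1):
    """t-ratio of gamma in Equation (8), estimated by ordinary least squares."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[i] = y[i + 1] - y[i]
    n = len(dy) - p                       # usable rows after dropping p lags
    target = dy[p:]                       # Delta y_t
    cols = [np.ones(n),                   # intercept alpha
            np.arange(n, dtype=float),    # time trend beta * t
            y[p:-1]]                      # lagged level gamma * y_{t-1}
    for i in range(1, p + 1):             # lagged differences delta_i
        cols.append(dy[p - i:len(dy) - i])
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[2] / np.sqrt(cov[2, 2])   # Equation (9)


# Synthetic series for illustration: a random walk and a stationary AR(1).
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=500))
ar1 = np.zeros(500)
for i in range(1, 500):
    ar1[i] = 0.5 * ar1[i - 1] + rng.normal()

t_random_walk = adf_t_statistic(random_walk)
t_ar1 = adf_t_statistic(ar1)
print(t_random_walk, t_ar1)
```

A strongly negative statistic, as for the stationary series, is evidence against the unit-root hypothesis. A statistic above the critical values, as reported for the outlet temperature, leaves the null hypothesis unrejected.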

3.2.2. Autocorrelation Characteristics Analysis

The Autocorrelation Function (ACF) describes the similarity between a time series and its own lagged values. It is used to evaluate the correlation between the series at different lags and to reflect periodicity or dependency within the series. Given a time series $\{y_{t}\}$, the ACF at lag $k$ is calculated as follows:

\rho_{k} = \frac{\sum_{t=1}^{N-k} (y_{t} - \bar{y}) (y_{t+k} - \bar{y})}{\sum_{t=1}^{N} (y_{t} - \bar{y})^{2}}

(10)

where $N$ is the total length of the series, $y_{t}$ is the value of the series at time $t$, $\bar{y} = \frac{1}{N} \sum_{t=1}^{N} y_{t}$ is the sample mean, and $k$ is the lag order.
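Equation (10) translates directly into code. The sketch below uses plain Python, and the helper names and the toy series are illustrative assumptions.

```python
def first_difference(y):
    """Return the first-order differences y[t] - y[t - 1]."""
    return [b - a for a, b in zip(y, y[1:])]


def acf(y, k):
    """Sample autocorrelation at lag k, Equation (10)."""
    n = len(y)
    if not 0 <= k < n:
        raise ValueError("lag must satisfy 0 <= k < len(y)")
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum((y[t] - mean) * (y[t + k] - mean) for t in range(n - k))
    return num / denom


# Toy alternating series: strongly negative lag-1 autocorrelation.
series = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(acf(series, 0), acf(series, 1))
```

Applying `acf` to `first_difference(series)` mirrors the differencing step used before the correlation plots are drawn.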

The Partial Autocorrelation Function (PACF) measures the direct correlation between a time series $\{y_{t}\}$ and its $k$-lagged version $\{y_{t-k}\}$, after removing the linear dependence on all intermediate lags $\{y_{t-1}, y_{t-2}, \dots, y_{t-(k-1)}\}$. The PACF at lag $k$ is computed as:

f_{k} = \begin{cases} \mathrm{Cor}(y_{1}, y_{2}) = \rho_{1}, & \text{if } k = 1 \\ \mathrm{Cor}(y_{k} - y_{k}^{k-1}, y_{0} - y_{0}^{k-1}), & \text{if } k \geq 2 \end{cases}

(11)

where $y_{k}^{k-1}$ and $y_{0}^{k-1}$ denote the linear predictions of $y_{k}$ and $y_{0}$ from the intermediate values $y_{1}, \dots, y_{k-1}$.

After differencing the outlet temperature $T_{mix}$, its ACF and PACF were computed, and the resulting correlation plots are shown in Figure 2 and Figure 3. The blue lines indicate the approximate 95% confidence intervals under the null hypothesis of white noise. Spikes exceeding these bounds suggest statistically significant correlations at the corresponding lags. The red vertical line at lag zero serves as a reference axis, distinguishing negative from positive lags. Analysis of the ACF and PACF reveals that the PACF at lag 1 deviates significantly from zero, indicating a strong linear correlation between the differenced temperature and its previous time step. Lags 2 to 5 show slight but noticeable significance, suggesting a certain degree of short-term negative feedback in the sequence. Beyond lag 5, the partial autocorrelations fall within the confidence bounds, indicating weak long-lag effects. Therefore, the differenced $T_{mix}$ series can be regarded as a short-memory process, mainly relying on the values from the previous five time steps.

3.2.3. Lag Response Analysis

To investigate the dynamic influence of different air source flow rates on the mixer outlet temperature $T_{mix}$, this study calculates the Cross-Correlation Function (CCF) of the cold air flow $W_{c}$ and of the ambient air flow $W_{n}$ with $T_{mix}$, with the maximum lag order set to 50. The results are shown in Figure 4 and Figure 5.

The analysis reveals a significant negative correlation between $W_{c}$ and $T_{mix}$, indicating that an increase in $W_{c}$ leads to a reduction in $T_{mix}$, with a clear time-delay effect. In contrast, the correlation between $W_{n}$ and $T_{mix}$ is relatively weak, suggesting that $W_{n}$ has limited direct influence on outlet temperature variations. These observed lag characteristics reflect the inherent system inertia and energy accumulation effects, highlighting the necessity of incorporating temporal window mechanisms or recurrent structures into the prediction model to accurately capture the underlying dynamic behavior.
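A lag analysis of this kind can be sketched with a sample cross-correlation evaluated over a range of lags. The code below is an assumption-laden illustration: the function names are hypothetical, and the synthetic "flow" and "temperature" series (a sign-flipped copy delayed by three samples) stand in for the proprietary measurements.

```python
import random


def cross_correlation(x, y, lag):
    """Sample correlation between x[t] and y[t + lag] for lag >= 0."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    scale = (sum((v - mean_x) ** 2 for v in x)
             * sum((v - mean_y) ** 2 for v in y)) ** 0.5
    num = sum((x[t] - mean_x) * (y[t + lag] - mean_y) for t in range(n - lag))
    return num / scale


def dominant_lag(x, y, max_lag=50):
    """Lag in [0, max_lag] with the largest absolute cross-correlation."""
    return max(range(max_lag + 1), key=lambda k: abs(cross_correlation(x, y, k)))


# Synthetic stand-in: the output mirrors the input with opposite sign, 3 samples later.
random.seed(0)
flow = [random.gauss(0.0, 1.0) for _ in range(300)]
temp = [0.0] * 3 + [-v for v in flow[:-3]]

lag = dominant_lag(flow, temp, max_lag=10)
print(lag, cross_correlation(flow, temp, lag))
```

The sign of the correlation at the dominant lag indicates the direction of the influence, and the lag itself gives an estimate of the delay in samples.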

3.3. Physics Model with Inertia Process

The physical model derived from the energy conservation equation can be used for a rough estimation of the outlet temperature under ideal conditions. However, this theoretical formulation assumes perfect mixing and homogeneous thermal properties, which cannot fully represent the real system’s response latency and dynamic disturbances. To introduce these temporal characteristics, the system is modeled as a first-order inertia process:

\tau \cdot \frac{d T_{cal\_delay}(t)}{dt} + T_{cal\_delay}(t) = T_{cal}(t)

(12)

where $\tau$ is the time constant of the system, reflecting the degree of delay, $T_{cal}(t)$ is the instantaneous outlet temperature computed from the energy conservation model, and $T_{cal\_delay}(t)$ is the delayed (inertia-affected) outlet temperature.

By discretizing the differential equation using the forward Euler method, the model can be rewritten as:

T_{cal\_delay}(t + \Delta t) = T_{cal\_delay}(t) + \frac{\Delta t}{\tau} [T_{cal}(t) - T_{cal\_delay}(t)]

(13)

where $\Delta t$ is the sampling interval.
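The recursion in Equation (13) can be sketched as a short loop. The function name, the choice to initialize the delayed temperature at the first ideal value, and the toy step input are assumptions made for illustration; the default `dt=0.2` matches the sampling period stated earlier.

```python
def first_order_lag(t_cal, tau, dt=0.2):
    """Apply the forward-Euler first-order inertia model of Equation (13).

    t_cal is the sequence of ideal temperatures, tau the time constant in
    seconds, and dt the sampling interval in seconds. The delayed series is
    initialized at t_cal[0], which is an assumption of this sketch.
    """
    if tau <= 0 or dt <= 0:
        raise ValueError("tau and dt must be positive")
    if not t_cal:
        return []
    alpha = dt / tau
    delayed = [t_cal[0]]
    for target in t_cal[:-1]:
        prev = delayed[-1]
        delayed.append(prev + alpha * (target - prev))
    return delayed


# Toy step input: the delayed output approaches the new level gradually.
response = first_order_lag([0.0, 1.0, 1.0, 1.0], tau=1.0, dt=0.2)
print(response)
```

Forward Euler is only stable here when the ratio of the sampling interval to the time constant is at most 2, and the response is free of oscillation when it is at most 1, so the time constant should be checked against the sampling interval.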

Figure 6 compares the actual outlet temperature, the theoretical prediction from the ideal physical model, and the output of the inertia-based model. The results demonstrate that the ideal model $T_{cal}$ responds rapidly to input perturbations but shows large deviations from the actual temperature trend. In contrast, introducing the inertial term $T_{cal\_delay}$ effectively smooths the model response and enhances its ability to capture gradual trends. However, the inertia-based model still exhibits limitations in tracking rapid fluctuations or complex nonlinear disturbances.

4. Methodology

To achieve high-precision and interpretable modeling of the outlet temperature $T_{mix}$ in dynamic air-mixing systems, this study proposes a hybrid modeling framework. The method integrates physics-informed priors, multiscale feature extraction, and a perturbation-aware dual-path memory structure to capture both the global trends and local disturbances in time-series data. This section elaborates on each component of the framework.

4.1. Physics-Guided Hybrid Modeling

The outlet temperature of an air mixer is fundamentally determined by the mixing enthalpy balance between two inlet airflows. The theoretical outlet temperature $T_{cal}$ can be derived via energy conservation principles, as introduced in Section 3.1 (Equation (4)). However, in real-world applications, disturbances such as actuator delay, heat loss, turbulence, and sensor noise result in significant deviations between the theoretical value and the actual measured $T_{mix}$.

To reconcile physical interpretability with data-driven flexibility, we adopt a gray-box modeling strategy [26]. The final predicted temperature is expressed as:

T_{mix}(t) \approx T_{cal}(t) + f_{\theta}(x_{t}) + \varepsilon

(14)

where $x_{t}$ denotes the historical input sequence (e.g., previous flow rates and temperatures), $f_{\theta}(\cdot)$ is a neural network parameterized by $\theta$, trained to learn the residual between $T_{mix}$ and $T_{cal}$, and $\varepsilon$ is the residual noise term.

In training, a physics-regularized loss function is adopted to enhance consistency with domain knowledge:

L_{total} \approx L_{mse}(T_{pred}, T_{mix}) + \lambda \cdot L_{phys}(T_{pred}, T_{cal})

(15)

where $L_{phys}$ enforces consistency between the predicted output and the physical model, and $\lambda$ is a hyperparameter that controls the trade-off between data fitting and physical regularization.

To exploit the trend-consistency advantage of $T_{cal}$ while mitigating the risk of performance degradation due to over-constraining, we propose a trend-aware weak physical constraint in the loss formulation. The physical loss is defined as:

L_{phys}(T_{pred}, T_{cal}) = \frac{1}{N} \sum_{t=1}^{N} \left(\frac{T_{pred}(t) - T_{cal}(t)}{T_{cal}(t) + \varepsilon}\right)^{2} \cdot \mathrm{mask}(t)

(16)

where $\varepsilon$ is a small constant (set to $10^{-6}$) to prevent division by zero, and $\mathrm{mask}(t)$ is a dynamic weighting term designed to weaken the physical constraint when the theoretical temperature is deemed unreliable.

The mask term is given by:

\mathrm{mask}(t) = \exp\left(-\frac{|T_{cal}(t) - T_{cal}(t-1)|}{k}\right)

(17)

where $k$ is a decay-rate hyperparameter. This formulation draws inspiration from adaptive weighting strategies in physics-informed neural networks [27], in which the balance between data loss and physical residual minimization is dynamically tuned to handle noisy or partially inaccurate physical models. In this paper, $\mathrm{mask}(t)$ decays exponentially when large fluctuations occur in $T_{cal}$, signaling potential unreliability in the theoretical model under unstable operating conditions. This mechanism allows the hybrid model to fully leverage the theoretical trend information in stable regimes while relaxing the constraint when the physical model is less trustworthy.
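The loss terms in Equations (15)–(17) can be sketched in plain Python as follows. This is an illustration rather than the training code: the function names are hypothetical, the first mask value is set to 1 because no previous sample exists (an assumption), and a real training loop would express the same arithmetic with tensors so that gradients can flow.

```python
import math


def trend_mask(t_cal, k):
    """mask(t) of Equation (17); mask at the first sample is assumed to be 1."""
    mask = [1.0]
    for prev, cur in zip(t_cal, t_cal[1:]):
        mask.append(math.exp(-abs(cur - prev) / k))
    return mask


def physics_loss(t_pred, t_cal, k, eps=1e-6):
    """Masked relative squared deviation from the physical model, Equation (16)."""
    mask = trend_mask(t_cal, k)
    terms = [((p - c) / (c + eps)) ** 2 * m
             for p, c, m in zip(t_pred, t_cal, mask)]
    return sum(terms) / len(terms)


def total_loss(t_pred, t_mix, t_cal, lam, k):
    """Data-fitting MSE plus weighted physics term, Equation (15)."""
    mse = sum((p - m) ** 2 for p, m in zip(t_pred, t_mix)) / len(t_pred)
    return mse + lam * physics_loss(t_pred, t_cal, k)


# Illustrative values in K: the jump at the last sample lowers its mask weight.
t_cal_demo = [300.0, 300.0, 305.0]
mask_demo = trend_mask(t_cal_demo, k=5.0)
print(mask_demo)
```

A small `k` makes the mask fall off quickly, so the physical constraint is relaxed even for modest jumps in the ideal temperature, while a large `k` keeps the constraint nearly uniform.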

4.2. Discrete Wavelet Transform (DWT)

Given that the outlet temperature reflects both global trends (e.g., flow regulation) and local disturbances (e.g., sharp flow changes), a traditional LSTM or single-layer convolutional model is insufficient to capture both kinds of features [28]. Traditional time-domain or frequency-domain analysis methods likewise struggle to capture such complex temporal features. To address this limitation, the DWT is introduced into the modeling process.

DWT is a powerful tool for time–frequency analysis, which decomposes a signal into different frequency bands with preserved temporal localization. Unlike the Fourier transform, which only provides frequency information, wavelet transform retains both time and scale (frequency) details, making it particularly suitable for analyzing non-stationary signals such as temperature and flow rate fluctuations.

Raw temperature and flow signals are decomposed into low- and high-frequency components using wavelet transform, allowing the model to separately capture global trends and local fluctuations:

x(t) = A(t) + D_{1}(t) + D_{2}(t) + \dots

(18)

where $A(t)$ represents the smooth trend (low-frequency), and $D_{i}(t)$ are high-frequency perturbations. These components are then concatenated across channels to enhance robustness under dynamic conditions. This multi-resolution input significantly improves the model’s adaptability to non-stationary disturbances.
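The additive form of Equation (18) can be illustrated with a hand-written Haar multiresolution split. The text does not state which wavelet or how many levels are used, so the Haar basis, the two-level depth, and all names below are assumptions chosen only to keep the example self-contained.

```python
def block_means(x, size):
    """Replace each consecutive block of `size` samples by its mean."""
    out = []
    for start in range(0, len(x), size):
        block = x[start:start + size]
        out += [sum(block) / len(block)] * len(block)
    return out


def haar_multiresolution(x, levels):
    """Return (A, [D_1, ..., D_levels]) such that x = A + D_1 + ... element-wise.

    A is the Haar approximation (block means over 2**levels samples) and
    D_j is the difference between successive approximations.
    """
    if levels < 1 or len(x) % (2 ** levels):
        raise ValueError("length must be a multiple of 2**levels")
    details = []
    current = list(x)
    for j in range(1, levels + 1):
        smoother = block_means(x, 2 ** j)
        details.append([c - s for c, s in zip(current, smoother)])
        current = smoother
    return current, details


# Toy signal: the components sum back to the original samples.
signal = [4.0, 2.0, 5.0, 7.0]
trend, details = haar_multiresolution(signal, levels=2)
rebuilt = [a + d1 + d2 for a, d1, d2 in zip(trend, details[0], details[1])]
print(trend, details, rebuilt)
```

Stacking `trend` and each entry of `details` as separate channels gives the kind of multi-resolution input described above.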

4.3. Perturbation-Aware Memory Structure

The temperature system typically exhibits slow-varying or lagging behavior, meaning that the effects of an upstream disturbance (e.g., a sudden change in airflow rate) may not immediately manifest at the outlet temperature. Instead, such perturbations propagate through the physical system over time, leading to a causal delay between input changes and output response. Capturing this lagged dependency is essential for improving multi-step prediction accuracy and system interpretability.

This work proposes a perturbation-aware memory structure that leverages dual-path temporal decomposition, integrating a decomposed Transformer path for long-term dependency modeling and an Extended Long Short-Term Memory (xLSTM) path for local short-term dynamics. Notably, the Transformer pathway processes the first-order differences of the input features, allowing the model to focus on temporal variations indicative of system perturbations. This hybrid design enhances both predictive accuracy and interpretability under dynamic input conditions.

4.3.1. Transformer for Long-Term Dependency Modeling

Compared to traditional Recurrent Neural Networks (RNNs), the Transformer model does not rely on sequential order. Instead, it captures dependencies among elements in the sequence through a self-attention mechanism. The core components of the Transformer model are the multi-head self-attention mechanism and a position-wise feedforward neural network. Through the multi-head attention mechanism, the model can automatically identify which historical inputs (e.g., sudden flow fluctuations) are more relevant to the current output. The attention weights provide interpretability regarding the causal origin of disturbances.

The self-attention mechanism takes as input a query matrix $Q$, a key matrix $K$, and a value matrix $V$, and computes the output as [29]:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_{k}}}\right) V

(19)

where $d_{k}$ is the dimensionality of the key vectors, $1 / \sqrt{d_{k}}$ is a scaling factor used to prevent the dot-product from becoming excessively large, and $\mathrm{softmax}$ denotes the normalized exponential function.

The multi-head self-attention mechanism computes multiple self-attention operations in parallel, allowing the model to capture dependencies across different subspaces of the sequence. The output is obtained by concatenating the results from each attention head:

\begin{cases} \mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_{1}, \dots, \mathrm{head}_{h}) W^{o} \\ \mathrm{head}_{i} = \mathrm{Attention}(Q W_{i}^{Q}, K W_{i}^{K}, V W_{i}^{V}) \end{cases}

(20)

where $W_{i}^{Q}$, $W_{i}^{K}$, $W_{i}^{V}$, and $W^{o}$ are learnable projection matrices for the query, key, value, and output, respectively.
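Equations (19) and (20) can be sketched with NumPy as below. The shapes, the random projection matrices, and the function names are illustrative assumptions, and the sketch omits masking, positional encoding, and training, so it shows only the forward computation of self-attention with the same sequence used as query, key, and value.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def attention(q, k, v):
    """Scaled dot-product attention, Equation (19). Returns (output, weights)."""
    d_k = k.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d_k))
    return weights @ v, weights


def multi_head(x, w_q, w_k, w_v, w_o):
    """Multi-head self-attention, Equation (20).

    x has shape (T, d_model); w_q, w_k, w_v are lists with one
    (d_model, d_head) matrix per head; w_o has shape (h * d_head, d_model).
    """
    heads = [attention(x @ wq, x @ wk, x @ wv)[0]
             for wq, wk, wv in zip(w_q, w_k, w_v)]
    return np.concatenate(heads, axis=-1) @ w_o


# Hypothetical sizes: 6 time steps, model width 8, 2 heads of width 4.
rng = np.random.default_rng(0)
T, d_model, h, d_head = 6, 8, 2, 4
x = rng.normal(size=(T, d_model))
w_q = [rng.normal(size=(d_model, d_head)) for _ in range(h)]
w_k = [rng.normal(size=(d_model, d_head)) for _ in range(h)]
w_v = [rng.normal(size=(d_model, d_head)) for _ in range(h)]
w_o = rng.normal(size=(h * d_head, d_model))

out = multi_head(x, w_q, w_k, w_v, w_o)
_, weights = attention(x @ w_q[0], x @ w_k[0], x @ w_v[0])
print(out.shape, weights.shape)
```

Each row of `weights` sums to one and shows how strongly a given time step attends to every other step, which is the quantity an attention heatmap visualizes.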

The multi-head self-attention mechanism enables the model to identify which historical inputs (e.g., sudden flow changes) are most relevant to the current output. The attention weights provide interpretable insights into the causal origin of perturbations.

Figure 7 shows the average attention heatmap across all Transformer layers. The model predominantly attends to input time steps 8 and 36 when predicting the mixed temperature, indicating that these time points are the most informative for the output. This demonstrates that the model can identify and focus on the critical features, supporting its interpretability.

4.3.2. xLSTM for Short-Term Disturbance Sensitivity

To complement the long-term modeling capability of the Transformer, we introduce an xLSTM module that enhances sensitivity to short-term fluctuations and local nonlinear dynamics.

Compared to conventional LSTM, xLSTM introduces several key enhancements, including residual connections, layer-wise memory fusion, and gate-modulated memory updates. These modifications allow xLSTM to better capture fine-grained transient dynamics in time series while maintaining numerical stability during multistep predictions. Consequently, xLSTM provides superior capability in modeling complex nonlinear temporal dependencies and achieves higher predictive accuracy than standard LSTM models. The architecture of xLSTM is shown in Figure 8 [30], and its key submodules, sLSTM and mLSTM, are illustrated in Figure 9 [30].

4.3.3. Hybrid Integration and Perturbation-Aware Loss

The outputs from the Transformer and xLSTM branches are fused using a learnable gating or attention fusion layer. This enables the model to dynamically blend long-term memory with recent context.

To further prioritize disturbance periods, we design a perturbation-weighted loss that emphasizes prediction accuracy during abnormal intervals:

L_{p\text{-}aware} = \sum_{t} \omega(t) \cdot {\left\| T_{pred}(t) - T_{mix}(t) \right\|}^{2}

(21)

where the weight $\omega(t)$ is derived from the temporal gradient (e.g., the first-order derivative of flow rate or temperature). Larger weights are assigned during high-variance or critical control phases.
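The following sketch illustrates one way the weighted loss of Equation (21) could be evaluated. The text does not give the exact form of the weight, so the rule used here, ω(t) = 1 + α·|ΔW(t)| / max|ΔW|, the parameter `alpha`, and both function names are assumptions made purely for illustration.

```python
def perturbation_weights(flow, alpha=4.0):
    """Hypothetical weights: omega(t) = 1 + alpha * |dW(t)| / max|dW|.

    The first-order difference of the flow-rate series stands in for the
    temporal gradient; the first sample reuses the first difference.
    Requires at least two samples.
    """
    diffs = [abs(b - a) for a, b in zip(flow, flow[1:])]
    diffs = [diffs[0]] + diffs
    peak = max(diffs) or 1.0  # fall back to 1.0 when the signal is flat
    return [1.0 + alpha * d / peak for d in diffs]

def p_aware_loss(t_pred, t_mix, weights):
    """Sum over t of omega(t) * (T_pred(t) - T_mix(t))^2."""
    return sum(w * (p - y) ** 2 for w, p, y in zip(weights, t_pred, t_mix))

# Toy series with one abrupt flow change at index 3 (illustrative values).
flow = [10.0, 10.0, 10.0, 14.0, 14.0]
t_mix = [300.0, 300.0, 300.0, 302.0, 303.0]
t_pred = [300.0, 300.0, 300.0, 301.0, 303.0]

weights = perturbation_weights(flow)
loss = p_aware_loss(t_pred, t_mix, weights)
```

In this toy case the only prediction error coincides with the flow step, so it is counted five times as heavily as it would be under a uniform weight.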

The final hybrid model integrates all the above modules, as shown in Figure 10.

To improve clarity and readability, a functional diagram of the fusion layer is provided, illustrating how the outputs from the Transformer and xLSTM branches are combined. Figure 11 demonstrates that the fusion layer first aligns the feature dimensions of both branches and then integrates them using a weighted combination, allowing the model to leverage both the temporal dependencies captured by the xLSTM and the long-range contextual information captured by the Transformer. This design enhances the overall predictive capability by effectively merging complementary information from the two branches.
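The two-stage fusion described above (dimension alignment followed by a weighted combination) can be sketched as follows. This is only one plausible reading: the scalar sigmoid gate, the parameter dictionary, and the names `linear` and `gated_fusion` are assumptions, since the text does not specify the exact fusion equations.

```python
import math

def linear(vec, weight, bias):
    """Affine map: `weight` is a list of output rows, `bias` one value per row."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]

def gated_fusion(h_trans, h_xlstm, params):
    """Project both branch outputs to a shared size, then blend them.

    A scalar gate g in (0, 1) is computed from the concatenated projections;
    the result is g * transformer + (1 - g) * xLSTM.
    """
    a = linear(h_trans, params["w_t"], params["b_t"])  # align Transformer features
    b = linear(h_xlstm, params["w_x"], params["b_x"])  # align xLSTM features
    logit = sum(w * v for w, v in zip(params["w_g"], a + b)) + params["b_g"]
    g = 1.0 / (1.0 + math.exp(-logit))
    fused = [g * ai + (1.0 - g) * bi for ai, bi in zip(a, b)]
    return fused, g

# Illustrative parameters: a 3-dim Transformer output and a 2-dim xLSTM output
# are both mapped to 2 dimensions; a zero gate vector gives an even blend.
params = {
    "w_t": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], "b_t": [0.0, 0.0],
    "w_x": [[1.0, 0.0], [0.0, 1.0]], "b_x": [0.0, 0.0],
    "w_g": [0.0, 0.0, 0.0, 0.0], "b_g": 0.0,
}
fused, gate = gated_fusion([2.0, 4.0, 9.0], [0.0, 2.0], params)
```

In a trained model the gate parameters would be learned, so the blend could shift toward the xLSTM branch during disturbances and toward the Transformer branch in steady operation.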

5. Results and Discussion

5.1. Experimental Setup

5.1.1. Dataset and Operating Environment

To validate the proposed hybrid modeling framework, we conduct a series of experiments on data recorded during a high-altitude test of a specific aero-engine model at a test site in Qingdao. The raw measurements are preprocessed into a time series of 60,000 samples with a sampling interval of 0.2 s. All model training and simulation experiments were carried out on a laptop running Windows 11 with an Intel i7-12700H processor, 16 GB of RAM, Python 3.9, and PyTorch 2.6.0+cu126.

5.1.2. Evaluation Metrics

The prediction evaluation metrics are the mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and coefficient of determination ($R^{2}$), calculated as in Equations (22)–(25), respectively.

\mathrm{MAE} = \frac{1}{N} \sum_{i = 1}^{N} \left| y_{i} - {\hat{y}}_{i} \right|

(22)

\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}}

(23)

\mathrm{MAPE} = \frac{1}{N} \sum_{i = 1}^{N} \left| \frac{y_{i} - {\hat{y}}_{i}}{y_{i}} \right| \times 100\%

(24)

R^{2} = 1 - \frac{\sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{N} {(y_{i} - \frac{1}{N} \sum_{j = 1}^{N} y_{j})}^{2}}

(25)

where $y_{i}$ and ${\hat{y}}_{i}$ represent the actual and predicted values at time $i$, respectively, and $N$ is the number of samples. MAE and RMSE measure the absolute prediction error, whereas MAPE reflects the magnitude of the error relative to the actual values. Lower values of MAE, RMSE, and MAPE indicate higher prediction accuracy. R² measures the goodness of fit of the model, with values closer to 1 indicating stronger predictive performance and better alignment with the observed data.
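For reference, the four metrics can be computed as in the short sketch below. It uses the standard definition of R² (one minus the residual sum of squares over the total sum of squares) and returns MAPE as a fraction rather than a percentage; the function name and the sample values are illustrative only.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (as a fraction) and R^2 for two equal-length series."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Relative error per sample; multiply the result by 100 to get a percentage.
    mape = sum(abs(e / y) for e, y in zip(errors, y_true)) / n
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}

# Illustrative values, not taken from the test dataset.
metrics = regression_metrics(
    [100.0, 200.0, 300.0, 400.0],
    [110.0, 190.0, 310.0, 390.0],
)
```

MAPE is undefined when any actual value is zero, which is not a concern for temperatures expressed in kelvin but matters for signals that cross zero.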

5.2. Experimental Results and Analysis

5.2.1. Performance Evaluation of the Prediction Model

As shown in Figure 12 and Table 3, the physics-only model performs worst on every metric (RMSE = 5.2445, R² = 0.3471), which is expected given its simplified assumptions. Data-driven models such as xLSTM, BiLSTM, and Transformer improve the prediction accuracy substantially, with R² values above 0.94. Among them, xLSTM is the strongest baseline, with RMSE = 0.7721 and R² = 0.9856.

The proposed Hybrid Model outperforms all baselines, achieving the lowest error values and the highest coefficient of determination, with the best results highlighted in bold. The detailed hyper-parameter configuration of all models is provided in Appendix A.

These results clearly demonstrate that the integration of multi-scale features and physical constraints within the Hybrid Model framework effectively improves outlet temperature prediction performance, providing a reliable solution for practical engineering applications.

5.2.2. Wavelet Ablation Study

To verify the applicability of DWT to temperature sequences, this section reports a wavelet ablation study that compares the prediction performance of the hybrid model with and without DWT and at different decomposition levels.

Figure 13 and Table 4 show how DWT and the choice of decomposition level affect the prediction performance of the proposed Hybrid Model. Introducing DWT at a suitable level markedly improves accuracy, and the decomposition level is decisive for the final performance: level 3 reduces the RMSE from 2.8025 (without DWT) to 0.7414, whereas level 2 increases it to 3.6093.

Among the tested configurations, the Hybrid Model with level 3 decomposition achieves the best overall results, with the lowest RMSE, MAE, and MAPE, as well as the highest $R^{2}$. This highlights that level 3 decomposition effectively balances the extraction of global trends and local fluctuations, providing optimal multi-scale feature representation for the model.
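To make the notion of a decomposition level concrete, the sketch below performs a multi-level DWT in plain Python. It uses the Haar wavelet purely as an assumption for illustration, since this section does not name the wavelet family; the function names and the sample series are likewise hypothetical.

```python
import math

def haar_step(signal):
    """One Haar DWT level: returns (approximation, detail) at half length."""
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]   # low-frequency trend
    detail = [(a - b) / s for a, b in pairs]   # high-frequency fluctuation
    return approx, detail

def haar_dwt(signal, level):
    """Multi-level decomposition, ordered [cA_level, cD_level, ..., cD_1]."""
    details = []
    approx = list(signal)
    for _ in range(level):
        approx, detail = haar_step(approx)
        details.insert(0, detail)  # coarsest detail band goes first
    return [approx] + details

# Illustrative temperature-like series of length 8, decomposed to level 3.
series = [300.0, 302.0, 301.0, 303.0, 310.0, 312.0, 311.0, 313.0]
coeffs = haar_dwt(series, level=3)
lengths = [len(c) for c in coeffs]
energy_in = sum(v * v for v in series)
energy_out = sum(v * v for band in coeffs for v in band)
```

Each additional level halves the length of the approximation band, so a deeper decomposition isolates a smoother trend but leaves fewer coefficients to describe it. This trade-off is one possible reason why an intermediate level can perform best.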

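The Hybrid Model without DWT (RMSE = 2.8025, $R^{2}$ = 0.8102) performs far worse than the level 3 configuration and slightly worse than level 4, but it outperforms the level 2 configuration (RMSE = 3.6093, $R^{2}$ = 0.6852). Wavelet-based multi-scale features therefore help capture the complex, non-stationary characteristics of the outlet temperature sequence only when the decomposition level is chosen appropriately.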

5.2.3. Ablation Study on Physical Constraint

To assess the contribution of the physical constraint module, a comparative experiment was conducted between the full hybrid model and a variant without the physics-informed loss term. As shown in Table 5, including the physical constraint improves every evaluation metric: RMSE falls from 1.8806 to 0.7414, MAE from 1.7655 to 0.5651, and MAPE from 0.0416 to 0.0156, while $R^{2}$ rises from 0.9145 to 0.9867.

These results indicate that the physics-informed regularization not only reduces prediction error but also enhances model generalization and consistency with physical laws. The constraint acts as a soft prior that guides the model toward more physically plausible predictions, especially under conditions with limited or noisy data. This validates the effectiveness of integrating domain knowledge into data-driven architectures.
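A soft physical constraint of this kind can be sketched as a data term plus a penalty on the deviation from a theoretical mixing temperature. The snippet assumes adiabatic mixing of two streams with equal, constant specific heat, and a penalty weight `lam`; both assumptions, as well as the function names and numbers, are illustrative and do not reproduce the authors' exact loss.

```python
def theoretical_outlet_temperature(w_n, t_n, w_c, t_c):
    """Energy balance for two mixed streams (assumes equal, constant specific heat)."""
    return (w_n * t_n + w_c * t_c) / (w_n + w_c)

def hybrid_loss(t_pred, t_meas, t_theory, lam=0.1):
    """MSE against measurements plus lam times the mean squared physics residual."""
    n = len(t_pred)
    mse = sum((p - y) ** 2 for p, y in zip(t_pred, t_meas)) / n
    l_phys = sum((p - q) ** 2 for p, q in zip(t_pred, t_theory)) / n
    return mse + lam * l_phys

# Illustrative operating point: 3 kg/s of ambient air at 300 K mixed with
# 1 kg/s of cooled air at 260 K.
t_theory = theoretical_outlet_temperature(3.0, 300.0, 1.0, 260.0)
loss = hybrid_loss([291.0], [292.0], [t_theory], lam=0.5)
```

Setting `lam` to zero recovers a purely data-driven loss, which corresponds to the variant without the physical constraint in the ablation.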

6. Conclusions

In this study, a hybrid mixer outlet temperature prediction method incorporating physical constraints and multiscale feature decomposition was proposed. The method effectively combines mechanistic modeling and advanced deep learning techniques to enhance prediction accuracy and robustness under complex conditions. Specifically, a theoretical outlet temperature was calculated based on mass and energy conservation principles, serving as a physical constraint to improve the model’s consistency with real-world thermodynamic behavior. Furthermore, DWT was employed to decompose raw time series into low- and high-frequency components, allowing the model to capture both global trends and local fluctuations.

A hybrid deep learning framework based on a difference-enhanced Transformer and xLSTM architecture was constructed to extract multi-scale temporal dependencies. Extensive experiments, including ablation studies and comparisons with baseline models, demonstrate that the proposed method significantly outperforms traditional machine learning and standard deep learning models in terms of prediction accuracy, generalization, and physical interpretability.

This research provides a practical and reliable approach for temperature prediction in a flight environment simulation intake system. Future work will focus on extending the method to multi-source airflow mixing scenarios and exploring adaptive physical parameter identification to further enhance model generalization capabilities.

Although the proposed model demonstrates excellent predictive accuracy on the current dataset, the data was collected from a single aero-engine and a specific experimental setup. The model’s performance under different engine types, operating conditions, or geometrical configurations may vary. Future work will investigate transfer learning and domain adaptation strategies to enhance generalizability across diverse engine architectures.

Author Contributions

Conceptualization, C.W. and C.H.; Data curation, H.X. and J.S.; Formal analysis, K.M.; Funding acquisition, C.H.; Investigation, H.X.; Methodology, C.W., C.H. and W.L.; Project administration, K.M.; Resources, W.L.; Software, C.W. and H.X.; Supervision, J.S.; Validation, C.W., C.H. and W.L.; Visualization, H.X.; Writing—original draft, C.W. and C.H.; Writing—review and editing, W.L. and J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Engine Thermophysical Test Apparatus, Grant No. E21I010201.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to confidentiality requirements.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Model hyper-parameter configuration.

Model	Hyper-Parameter
xLSTM	Layers = 2; Hidden_dim = 64; Optimizer = ‘adam’; Learning rate = 1 × 10⁻⁴; Batch size = 64; Epochs = 300; Loss = MSE
Transformer	d_model = 64; nhead = 4; num_layers = 1; dropout = 0.3; Optimizer = ‘adam’; Learning rate = 1 × 10⁻⁴; Batch size = 64; Epochs = 300; Loss = MSE
MLP	Hidden layers = 3; Units per layer = [128, 64, 32]; Activation = ‘relu’; dropout = 0.3; Optimizer = ‘adam’; Learning rate = 1 × 10⁻⁴; Batch size = 64; Epochs = 300; Loss = MSE
BiLSTM	Layers = 2; Hidden_dim = 64; Optimizer = ‘adam’; Learning rate = 1 × 10⁻⁴; Batch size = 64; Epochs = 300; Loss = MSE
TCN	Hidden layers = 3; Units per layer = [128, 64, 32]; Activation = ‘relu’; dropout = 0.3; Optimizer = ‘adam’; Learning rate = 1 × 10⁻⁴; Batch size = 64; Epochs = 300; Loss = MSE
Hybrid Model	DiffTransformer: d_model = 64; nhead = 4; num_layers = 1; dropout = 0.3; xLSTM: Layers = 2; hidden = 64; dropout = 0.3; Optimizer = ‘adam’; Learning rate = 1 × 10⁻⁴; Batch size = 64; Epochs = 300; Loss = Hybrid(MSE + L_phys);

References

1. Avril, A.; Hornung, C.H.; Urban, A.; Fraser, D.; Horne, M.; Veder, J.-P.; Tsanaktsidis, J.; Rodopoulos, T.; Henry, C.; Gunasegaram, D.R. Continuous flow hydrogenations using novel catalytic static mixers inside a tubular reactor. React. Chem. Eng. 2017, 2, 180–188.
2. Park, H.; Bach, C.K. A literature review of air mixing devices for psychrometric performance measurement applications (ASHRAE RP-1733). Sci. Technol. Built Environ. 2020, 26, 778–789.
3. Pei, X.; Zhang, S.; Dan, Z.; Zhu, M.; Qian, Q.; Wang, X. Study on digital modeling and simulation of altitude test facility flight environment simulation system. J. Propuls. Technol. 2019, 40, 1144–1152.
4. Zhu, M.; Wang, X.; Pei, X.; Zhang, S.; Dan, Z.; Miao, K.; Liu, J.; Jiang, Z. Multi-volume fluid-solid heat transfer modeling for flight environment simulation system. J. Propuls. Technol. 2020, 41, 2848–2859.
5. Wu, X. Research on Flow Characteristics and Structural Improvement of Engine Transition Test Mixer. Master’s Thesis, Southwest University of Science and Technology, Mianyang, China, 2024.
6. Huang, J.; Ming, P.; Sun, W. Dynamic modal analysis of fluid sweeping rods bundle. J. Eng. Thermophys. 2023, 44, 2279–2284.
7. Zhang, P.; Li, Y.; Cheng, R. Control of Corner Separation for a Linear Compressor Cascade via Bionic Slanting Riblets at the Endwall. J. Therm. Sci. 2024, 34, 129–144.
8. Bagheri, A.; Patrignani, A.; Ghanbarian, B.; Pourkargar, D.B. A hybrid time series and physics-informed machine learning framework to predict soil water content. Eng. Appl. Artif. Intell. 2025, 144, 110105.
9. Naeini, S.S.; Snaiki, R. A physics-informed machine learning model for time-dependent wave runup prediction. Ocean Eng. 2024, 295, 116986.
10. Zhu, M.; Wang, X. An Integral Type Synthesis Method for Temperature and Pressure Control of Flight Environment Simulation Volume. In Proceedings of the ASME Turbo Expo 2017: Turbomachinery Technical Conference and Exposition, Charlotte, NC, USA, 26–30 June 2017.
11. Wang, J.; Li, J.; Gao, Z.; He, X.; Zou, S.; Wan, J. Numerical simulation of internal flow field and structure improvement of hot air mixer. Chin. J. Process Eng. 2020, 20, 148–157.
12. Zhu, J.; Dong, W.; Wu, F.; Tian, X. Calculation and analysis of heat transfer process in pipes of altitude test facility. Gas Turbine Exp. Res. 2011, 24, 10–14+24.
13. Montgomery, P.; Burdette, R.; Klepper, J.; Milhoan, A. Evolution of a Turbine Engine Test Facility to Meet the Test Needs of Future Aircraft Systems. In Proceedings of the American Society of Mechanical Engineers (ASME) Turbo Expo 2002 v.1: Aircraft Engine Coal, Biomass, and Alternative Fuels Combustion and Fuels Education Electric Power Vehicular and Small Turbomachines, Amsterdam, The Netherlands, 1 December 2002; pp. 119–128.
14. Montgomery, P.A.; Burdette, R.; Wilhite, L.; Salita, S. Modernization of a Turbine Engine Test Facility Utilizing a Real-Time Facility Model and Simulation. In Proceedings of the ASME TURBO EXPO 2001: Power for Land, Sea, & Air, New Orleans, LA, USA, 4–7 June 2001; pp. 4474–4481.
15. Zhu, D. Structural Optimization Design of Cold and Hot Airflow Mixing Chamber. Master’s Thesis, Huazhong University of Science and Technology, Wuhan, China, 2022.
16. Hajirahimi, Z.; Khashei, M. Hybrid structures in time series modeling and forecasting: A review. Eng. Appl. Artif. Intell. 2019, 86, 83–106.
17. Wang, L.; Wang, Z.; Qu, H.; Liu, S. Optimal Forecast Combination Based on Neural Networks for Time Series Forecasting. Appl. Soft Comput. 2018, 66, 1–17.
18. Ning, Z.; Liu, X.; Wang, S. A digital twin modeling approach for aerospace control systems with mechanism and data fusion. Aerosp. Control Appl. 2022, 48, 1–7.
19. Peng, L. Soft Measurement of Temperature in Vacuum Sintering Furnace Based on Mechanism and Data-Driven Hybrid Modeling. Master’s Thesis, Guangdong University of Technology, Guangzhou, China, 2020.
20. Yu, G. Research on Neural Network Predictive Control of Decomposing Furnace Outlet Temperature. Master’s Thesis, Hefei University of Technology, Hefei, China, 2021.
21. Kula, A.; Dąbrowski, D.; Blachnik, M.; Sajkowski, M.; Smalcerz, A.; Kamiński, Z. Modelling the Temperature of a Data Centre Cooling System Using Machine Learning Methods. Energies 2025, 18, 2581.
22. Zhang, Y.; Tian, Q.; Bai, Y. Prediction of Outlet Temperature of Circulating Water Based on Hybrid Model and Stacking Framework. Comput. Simul. 2023, 40, 172–177.
23. Yan, L.; Lei, D.; Li, X.; Xu, L.; Dong, J.; Wang, Z. Outlet Temperature Prediction of Parabolic Trough Solar Field Based on Hybrid Neural Network. Acta Energiae Solaris Sin. 2023, 44, 265–273.
24. Wang, J.; Tang, W.; Lv, T.; Ding, N. Simulation theory and methods of pipeline networks. Liaoning Chem. Ind. 2013, 42, 1476–1478.
25. Elliott, G.; Rothenberg, T.J.; Stock, J.H. Efficient Tests for an Autoregressive Unit Root. Econometrica 1996, 64, 813–836.
26. Jagtap, A.D.; Kawaguchi, K.; Karniadakis, G.E. Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. J. Comput. Phys. 2020, 404, 109136.
27. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707.
28. Chen, H.; Ren, B. A multivariate time series forecasting model based on time-frequency feature fusion. J. Huazhong Univ. Sci. Technol. 2025, 1–13.
29. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2023, arXiv:1706.03762.
30. Beck, M.; Pöppel, K.; Spanring, M.; Auer, A.; Prudnikova, O.; Kopp, M.; Klambauer, G.; Brandstetter, J.; Hochreiter, S. xLSTM: Extended Long Short-Term Memory. arXiv 2024, arXiv:2405.04517.

Figure 1. Air piping system schematic diagram of the altitude ground test facilities.

Figure 2. ACF of the differenced outlet temperature $T_{mix}$.

Figure 3. PACF of the differenced outlet temperature $T_{mix}$.

Figure 4. CCF of $W_{c}$ with $T_{mix}$.

Figure 5. CCF of $W_{n}$ with $T_{mix}$.

Figure 6. Outlet temperature comparison: physical model vs. inertia-enhanced model.

Figure 7. The average attention heatmap across all transformer layers.

Figure 8. The architecture of xLSTM. Adapted from [30].

Figure 9. Schematic of the sLSTM and mLSTM architectures: (a) sLSTM block; (b) mLSTM block. Adapted from [30].

Figure 10. Hybrid modeling framework.

Figure 11. Fusion layer for integrating Transformer and xLSTM outputs.

Figure 12. Comparison of prediction performance across different models.

Figure 13. Prediction results of the wavelet ablation study.

Table 1. Measurement parameters.

Flow/Location	Mass Flow Rate	Pressure	Temperature
Ambient Air	$W_{n}$	$P_{n1}$, $P_{n2}$, $P_{n3}$	$T_{n1}$, $T_{n2}$, $T_{n3}$
Cooled Air	$W_{c}$	$P_{c1}$, $P_{c2}$, $P_{c3}$	$T_{c1}$, $T_{c2}$, $T_{c3}$
Mixer Outlet	-	$P_{mix}$	$T_{mix}$
Secondary Flow	$W_{sec}$	$P_{sec}$	$T_{sec}$

Table 2. ADF test results.

Test Statistic (t-Value)	p-Value	Critical Value (1%)	Critical Value (5%)	Critical Value (10%)
−2.31979	0.16566	−3.43046	−2.86159	−2.56679

Table 3. Comparative evaluation of the proposed model and baseline models.

Model	RMSE	MAE	MAPE	$R^{2}$
physics-only model	5.2445	4.6958	0.1109	0.3471
xLSTM	0.7721	0.7040	0.0159	0.9856
Transformer	1.5738	1.5046	0.0343	0.9401
MLP	2.0920	1.9455	0.0433	0.8942
hybrid grey-box model with Kalman filtering	2.8381	2.7645	0.0669	0.8053
BiLSTM	1.1802	1.0961	0.0248	0.9663
TCN	1.5031	1.4683	0.0344	0.9454
Hybrid Model	0.7414	0.5651	0.0156	0.9867

Table 4. Comparative evaluation of the wavelet ablation study.

Model	RMSE	MAE	MAPE	$R^{2}$
Hybrid Model (level 2)	3.6093	3.4126	0.0766	0.6852
Hybrid Model (level 3)	0.7414	0.5651	0.0156	0.9867
Hybrid Model (level 4)	2.5644	2.4258	0.0552	0.8411
Hybrid Model Without DWT	2.8025	2.6708	0.0603	0.8102

Table 5. Ablation study on the physical constraint module.

Model	RMSE	MAE	MAPE	$R^{2}$
Hybrid Model	0.7414	0.5651	0.0156	0.9867
Hybrid Model Without Physical Constraint	1.8806	1.7655	0.0416	0.9145

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
