A Correlation Analysis-Based Hierarchical Identification Strategy for Hammerstein Models

Dong, Qi; Jiang, Haolong; Liu, Qinyao; Gao, Yuan

doi:10.3390/a19060472

¹ School of Advanced Technology, Xi’an Jiaotong Liverpool University, Suzhou 215123, China

² Department of Computer Science, University of Liverpool, Liverpool L69 3BX, UK

³ Electrical and Electronic Engineering, School of Engineering, University of Leicester, Leicester LE1 7RH, UK

* Authors to whom correspondence should be addressed.

Algorithms 2026, 19(6), 472; https://doi.org/10.3390/a19060472 (registering DOI)

Submission received: 14 May 2026 / Revised: 5 June 2026 / Accepted: 8 June 2026 / Published: 10 June 2026

(This article belongs to the Special Issue Computational Modeling and Intelligent Simulation of Next-Generation Energy Systems)

Abstract

Reliable mathematical models are essential for high-performance analysis and optimization of complex power and energy systems. However, inherent nonlinearities pose significant challenges to accurate model identification. The Hammerstein model, a typical block-oriented nonlinear system, consists of a static nonlinear block followed by a linear dynamic block. This paper investigates data-driven modeling of the Hammerstein model and proposes a hierarchical identification strategy that integrates correlation analysis with the Levenberg–Marquardt algorithm. Unlike traditional methods, the hierarchical strategy decouples the linear and nonlinear modules, which avoids parameter coupling and reduces computational complexity. Simulations on a solid oxide fuel cell system and a real-world wind power system confirm the effectiveness and feasibility of the proposed method. The results demonstrate that the hierarchical identification strategy achieves accurate parameter estimation with satisfactory convergence performance.

Keywords: Hammerstein model; hierarchical identification; correlation analysis; Levenberg–Marquardt algorithm

1. Introduction

Complex dynamic systems are widely found in industrial processes [1,2], mechanical systems [3,4], chemical production [5,6], energy systems [7,8], and many other fields. These systems typically exhibit strong nonlinear characteristics, which pose significant challenges to system analysis, state estimation, and optimal control. Mathematical modeling provides a viable means to understand the dynamic behavior of nonlinear systems under varying operating conditions. It also enables the prediction of system responses, the optimization of performance, and the design of effective control strategies. To accurately estimate the state of charge of lithium-ion batteries, Chen et al. established a non-commensurate fractional-order observer model, which effectively describes the charging and discharging characteristics of the battery [9]. To address colored noise in industrial processes, Xu et al. proposed a linear filter-based identification framework that transforms colored noise into white noise, thereby improving parameter estimation accuracy for nonlinear feedback systems [10].

Among the numerous mathematical modeling approaches, data-driven modeling strategies have garnered increasing attention due to their flexibility and adaptability [11,12]. Compared with mechanistic modeling, data-driven methods are easier to apply to complex systems. As a cost-effective technique, data-driven modeling allows engineers to obtain control-oriented system models more simply [13]. In real-world industrial processes, nonlinear systems are often highly complex and difficult to model based solely on physical principles, yet they generate abundant data through sensors and smart meters. This makes data-driven modeling well suited to capturing their behavior. Based on charge–discharge data, Zhang et al. employed data-driven machine learning and deep learning algorithms to establish a state of health prediction model for lithium-ion batteries, achieving high-precision estimation of the battery degradation process [14]. For real-time intelligent control of wastewater treatment plants, data-driven modeling methods are employed to establish prediction models for key substrates, significantly improving computational efficiency and reducing energy consumption, thereby enhancing the environmental sustainability of wastewater treatment systems [15].

Among the nonlinear structures commonly used in data-driven modeling, the Hammerstein model is one of the most representative classes of block-oriented nonlinear systems, consisting of a static nonlinear module followed by a dynamic linear module in cascade [16,17]. This structure aligns closely with the characteristics of many dynamic systems, where nonlinear steady-state behavior and linearizable dynamics often coexist. Li et al. applied a kernel-based Hammerstein subspace model to energy efficiency prediction in chemical processes, achieving high-precision energy efficiency estimation [18]. Nevertheless, the inherent cascade structure of the Hammerstein model introduces difficulties for its identification. The intermediate signal between the nonlinear and linear blocks is unmeasurable, creating a coupled estimation problem. As a result, in many traditional identification methods, the parameter estimates are not unique unless appropriate normalization constraints are imposed [19,20]. To solve this problem, researchers usually adopt the over-parameterization method. However, this method has higher computational complexity and may cause redundant parameters [21].

To address these identification difficulties, this paper proposes a correlation analysis-based hierarchical identification strategy that combines the least squares algorithm with the Levenberg–Marquardt (LM) algorithm. Correlation analysis is a technique that estimates a system’s impulse response by computing the cross-correlation function between its input and output. It is robust to noise that is uncorrelated with the input, requires no prior knowledge, and relies exclusively on input–output data, making it an effective tool for decoupling the parameters of a Hammerstein model [22,23]. The least squares algorithm offers a simple and efficient approach to obtaining unbiased estimates of the linear module parameters. The LM algorithm provides an effective compromise between the gradient descent method and Newton’s method. Compared with the gradient descent method, which only utilizes first-order derivative information and converges slowly in ill-conditioned problems, the LM algorithm achieves significantly faster convergence by approximating second-order information, making it particularly suitable for solving complex nonlinear problems [24]. Compared with Newton’s method, which requires the computation of the explicit Hessian matrix and fails when the Hessian is singular, the LM algorithm ensures matrix invertibility by introducing a damping term, thereby significantly improving numerical stability and robustness while maintaining fast convergence [25]. The proposed hierarchical identification strategy first employs correlation analysis to decouple the Hammerstein model and uses the least squares algorithm to estimate the linear module parameters directly from input–output data. Subsequently, the nonlinear module estimation is reformulated as an optimization problem and solved using the LM algorithm with the previously obtained linear estimates, leading to a complete identification of the Hammerstein model.

This paper focuses on the identification of Hammerstein models using a hierarchical identification strategy. The main contributions of this paper are as follows:

A correlation-analysis-based decoupling strategy is developed to separate the linear and nonlinear modules, enabling the linear module parameters to be estimated independently.
The LM algorithm is introduced to estimate the nonlinear parameters, which balances convergence speed and numerical stability through its damping mechanism.
Quantitative comparisons with the conventional over-parameterization method demonstrate that the proposed hierarchical strategy achieves superior estimation accuracy and a much lower mean squared error, while effectively avoiding parameter redundancy. The effectiveness of the proposed method is further validated on a real-world wind power system using publicly available data, confirming its practical applicability.

The remainder of this paper is organized as follows. Section 2 formulates the Hammerstein model. The hierarchical identification strategy is derived in Section 3, which combines the correlation analysis-based least squares algorithm for the linear module and the LM algorithm for the nonlinear module. Section 4 presents the simulation results for both the solid oxide fuel cell (SOFC) system and the wind power system. Finally, the conclusions are given in Section 5.

2. Problem Formulation

This paper focuses on the identification challenge of the Hammerstein model with colored noise. The structure diagram of the Hammerstein model is shown in Figure 1.

The input and output relationship of the model can be described as follows:

\begin{matrix} g (t) & : = & N (u (t)), \end{matrix}

(1)

\begin{matrix} h (t) & : = & \frac{ν (z)}{ζ (z)} g (t), \end{matrix}

(2)

\begin{matrix} q (t) & : = & \frac{1}{ζ (z)} p (t), \end{matrix}

(3)

\begin{matrix} y (t) & = & h (t) + q (t), \end{matrix}

(4)

where $u (t)$ and $y (t)$ are the input and output signals of the model, $g (t)$ is the output of the nonlinear module, $h (t)$ is the noise-free output, $p (t)$ is zero-mean white noise and $q (t)$ is colored noise. $N (\cdot)$ describes the static characteristic of the nonlinear module. In this paper, a polynomial is selected, namely $N (u (t)) = μ_{1} u (t) + μ_{2} u {(t)}^{2} + \dots + μ_{n_{μ}} u {(t)}^{n_{μ}}$, where $μ_{i}$ are the parameters of the nonlinear module. $ζ (z) = 1 + ζ_{1} z^{- 1} + ζ_{2} z^{- 2} + \dots + ζ_{n_{ζ}} z^{- n_{ζ}}$ and $ν (z) = ν_{1} z^{- 1} + ν_{2} z^{- 2} + \dots + ν_{n_{ν}} z^{- n_{ν}}$ are polynomials in the backward-shift operator $z^{- 1}$, defined by $z^{- 1} g (t) = g (t - 1)$, and $ζ_{j}$ and $ν_{k}$ are the parameters of the linear module.

For a given tolerance

ε

, the objective of the proposed identification method is to obtain a Hammerstein model for which the following cost function is acceptably small. The cost function and its associated constraints are given as follows:

\begin{matrix} E ({\hat{μ}}_{1}, \dots, {\hat{μ}}_{n_{μ}}, {\hat{ζ}}_{1}, \dots, {\hat{ζ}}_{n_{ζ}}, {\hat{ν}}_{1}, \dots, {\hat{ν}}_{n_{ν}}) = \frac{1}{2 T} \sum_{t = 1}^{T} {(y (t) - \hat{y} (t))}^{2} \leq ε, \\ \text{s.t.} \quad \hat{g} (t) = {\hat{μ}}_{1} u (t) + {\hat{μ}}_{2} u {(t)}^{2} + \dots + {\hat{μ}}_{n_{μ}} u {(t)}^{n_{μ}}, \\ \hat{h} (t) = \frac{\hat{ν} (z)}{\hat{ζ} (z)} \hat{g} (t), \\ \hat{q} (t) = \frac{1}{\hat{ζ} (z)} \hat{p} (t), \\ \hat{y} (t) = \hat{h} (t) + \hat{q} (t), \end{matrix}

where

{\hat{μ}}_{i}

,

{\hat{ζ}}_{j}

and

{\hat{ν}}_{k}

are the estimated values of the parameters,

\hat{y} (t)

is the estimated output,

\hat{g} (t)

and

\hat{h} (t)

are the estimated internal variables,

\hat{q} (t)

is the estimated colored noise, and T is the length of the data.
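As a minimal sketch of Equations (1)–(4) and of the cost function above, the model can be simulated through the difference equation obtained by multiplying Equation (4) by $ζ (z)$. The function names `simulate_hammerstein` and `cost`, the zero initial conditions, and the zero-based time index are assumptions made only for this illustration.

```python
import random

def simulate_hammerstein(u, mu, zeta, nu, noise_std=0.0, seed=0):
    """Simulate y(t) for Eqs. (1)-(4), assuming zero initial conditions.

    mu   : [mu_1, ..., mu_n]      polynomial coefficients of N(.)
    zeta : [zeta_1, ..., zeta_n]  denominator coefficients (leading 1 implied)
    nu   : [nu_1, ..., nu_n]      numerator coefficients (first tap at z^-1)
    """
    rng = random.Random(seed)
    # Eq. (1): static polynomial nonlinearity g(t) = sum_i mu_i * u(t)^i
    g = [sum(m * ut ** (i + 1) for i, m in enumerate(mu)) for ut in u]
    y = []
    for t in range(len(u)):
        # white noise sample p(t); zero when noise_std == 0
        acc = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        # zeta(z) y(t) = nu(z) g(t) + p(t), solved for y(t)
        for j, zj in enumerate(zeta, start=1):
            if t - j >= 0:
                acc -= zj * y[t - j]
        for k, nk in enumerate(nu, start=1):
            if t - k >= 0:
                acc += nk * g[t - k]
        y.append(acc)
    return y

def cost(y, y_hat):
    """Cost E = (1 / (2T)) * sum_t (y(t) - y_hat(t))^2."""
    T = len(y)
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / (2 * T)

# Impulse input through a first-order linear block with a linear N(.)
y = simulate_hammerstein([1, 0, 0, 0], mu=[1.0], zeta=[0.5], nu=[2.0])
print(y)           # [0.0, 2.0, -1.0, 0.5]
print(cost(y, y))  # 0.0
```

With `noise_std > 0` the same routine produces the colored-noise output, because the white noise $p (t)$ enters the recursion and is therefore filtered by $1 / ζ (z)$.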

3. The Hierarchical Identification Strategy for the Hammerstein Model

For the Hammerstein model described in Equations (1)–(4), this section employs separable input signals and proposes a hierarchical identification strategy. Within this strategy, the correlation analysis is first employed to decouple the linear and nonlinear parts. Then, the linear parameters

ζ_{j}

and

ν_{k}

are estimated using the least squares method, and the nonlinear parameters

μ_{i}

are subsequently obtained via the LM algorithm.

3.1. Linear Parameter Estimation via Least Squares Based on Correlation Analysis

In this subsection, separable input signals

u_{1} (t)

and the corresponding outputs

y_{1} (t)

are employed to estimate the parameters of the linear module. The separable signals refer to stationary zero-mean input signals that are mutually independent and satisfy the conditional expectation property

E [a (t - τ) | a (t)] = Z (τ) a (t)

, where

Z (τ) = \frac{R_{a} (τ)}{R_{a} (0)}

. By leveraging correlation analysis, the input–output data are used to decouple the linear and nonlinear components, enabling the parameters of the linear module to be estimated independently without interference from the nonlinear module.

Based on stochastic process theory, the correlation functions of these signals can be obtained. The cross-correlation function

R_{y_{1} u_{1}} (τ)

is defined as a measure of covariance between the output

y_{1} (t)

and the delayed input

u_{1} (t - τ)

, reflecting the impulse response characteristics of the linear system. The autocorrelation function

R_{u_{1}} (τ)

describes the statistical property of the input signal itself.

R_{y_{1} u_{1}} (τ)

and

R_{u_{1}} (τ)

can be estimated using Equations (5) and (6), respectively, where L denotes the data length of

u_{1} (t)

and

y_{1} (t)

, which should be sufficiently large.

\begin{matrix} R_{y_{1} u_{1}} (τ) & = & \frac{1}{L} \sum_{t = 1}^{L} y_{1} (t) u_{1} (t - τ), \end{matrix}

(5)

\begin{matrix} R_{u_{1}} (τ) & = & \frac{1}{L} \sum_{t = 1}^{L} u_{1} (t) u_{1} (t - τ) . \end{matrix}

(6)
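The sample estimates in Equations (5) and (6) can be sketched as follows. The helper names `cross_corr` and `auto_corr` are hypothetical, the time index is zero-based, and samples that fall outside the record are treated as zero, which is an assumption the equations leave open.

```python
def cross_corr(y, u, tau):
    """Estimate R_yu(tau) = (1/L) * sum_t y(t) * u(t - tau), as in Eq. (5)."""
    L = len(u)
    # keep only the terms whose index t - tau lies inside the record
    start, stop = max(tau, 0), min(L, L + tau)
    return sum(y[t] * u[t - tau] for t in range(start, stop)) / L

def auto_corr(u, tau):
    """Estimate R_u(tau), as in Eq. (6)."""
    return cross_corr(u, u, tau)

u = [1, 2, 3, 4]
y = [0, 1, 2, 3]             # y(t) = u(t - 1): a pure one-step delay
print(cross_corr(y, u, 1))   # 3.5
print(auto_corr(u, 0))       # 7.5
```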

From Equations (1)–(4), the output

y_{1} (t)

can be derived as

\begin{matrix} y_{1} (t) & = & h_{1} (t) + q_{1} (t) \\ & = & \frac{ν (z)}{ζ (z)} g_{1} (t) + \frac{1}{ζ (z)} p_{1} (t) \\ & = & - \sum_{j = 1}^{n_{ζ}} ζ_{j} y_{1} (t - j) + \sum_{k = 1}^{n_{ν}} ν_{k} g_{1} (t - k) + p_{1} (t) . \end{matrix}

(7)

Multiplying both sides of Equation (7) by

u_{1} (t - τ)

:

\begin{matrix} y_{1} (t) u_{1} (t - τ) & = & - \sum_{j = 1}^{n_{ζ}} ζ_{j} y_{1} (t - j) u_{1} (t - τ) + \sum_{k = 1}^{n_{ν}} ν_{k} g_{1} (t - k) u_{1} (t - τ) + p_{1} (t) u_{1} (t - τ) . \end{matrix}

Then, take the mathematical expectation on both sides:

\begin{matrix} E [y_{1} (t) u_{1} (t - τ)] \\ = & - \sum_{j = 1}^{n_{ζ}} ζ_{j} E [y_{1} (t - j) u_{1} (t - τ)] + \sum_{k = 1}^{n_{ν}} ν_{k} E [g_{1} (t - k) u_{1} (t - τ)] + E [p_{1} (t) u_{1} (t - τ)] . \end{matrix}

Using the definition of the correlation function

R_{a b} (τ) = E [a (t) b (t - τ)]

, rewrite the above equation in terms of correlation functions:

\begin{matrix} R_{y_{1} u_{1}} (τ) & = & - \sum_{j = 1}^{n_{ζ}} ζ_{j} R_{y_{1} u_{1}} (τ - j) + \sum_{k = 1}^{n_{ν}} ν_{k} R_{g_{1} u_{1}} (τ - k) + R_{p_{1} u_{1}} (τ) . \end{matrix}

Since the input

u_{1} (t)

is statistically independent of the noise

p_{1} (t)

, it follows that

R_{p_{1} u_{1}} (τ) = 0

. Substituting this into the above equation yields

\begin{matrix} R_{y_{1} u_{1}} (τ) & = & - \sum_{j = 1}^{n_{ζ}} ζ_{j} R_{y_{1} u_{1}} (τ - j) + \sum_{k = 1}^{n_{ν}} ν_{k} R_{g_{1} u_{1}} (τ - k) . \end{matrix}

(8)

It can be seen that Equation (8) contains the correlation function

R_{g_{1} u_{1}} (τ - k)

of the unmeasurable intermediate variable

g_{1} (t)

, which causes the identification challenge of the Hammerstein model. To overcome this, we use the correlation analysis theory to solve the relationship between the intermediate variable

g_{1} (t)

and the input

u_{1} (t)

. Then, Theorem 1 can be established as follows:

Theorem 1.

For the Hammerstein model with colored noises described in Equations (1)–(4), if the input signals are independent separable signals satisfying

E (u_{1} (t)) = 0

, then there exists a constant λ such that the following equation holds:

\begin{matrix} R_{g_{1} u_{1}} (τ) & = & λ R_{u_{1}} (τ), \end{matrix}

(9)

where

R (\cdot)

denotes the correlation function and

λ = \frac{E [g_{1} (t) u_{1} (t)]}{E [u_{1} (t) u_{1} (t)]}

is a constant.

Proof.

According to the definition of the correlation function and the law of total expectation, the cross-correlation function

R_{g_{1} u_{1}} (τ)

can be expressed as

\begin{matrix} R_{g_{1} u_{1}} (τ) & = & E [g_{1} (t) u_{1} (t - τ)] \\ & = & E [E [g_{1} (t) u_{1} (t - τ) | u_{1} (t)]] . \end{matrix}

(10)

Since

g_{1} (t) = N (u_{1} (t))

is a function of

u_{1} (t)

, Equation (10) can be expressed as

\begin{matrix} R_{g_{1} u_{1}} (τ) & = & E [E [N (u_{1} (t)) u_{1} (t - τ) | u_{1} (t)]] \\ & = & E [N (u_{1} (t)) E [u_{1} (t - τ) | u_{1} (t)]] \\ & = & E [g_{1} (t) E [u_{1} (t - τ) | u_{1} (t)]] . \end{matrix}

For separable input signals

u_{1} (t)

satisfying

E [u_{1} (t - τ) | u_{1} (t)] = Z (τ) u_{1} (t)

, where

Z (τ) = \frac{R_{u_{1}} (τ)}{R_{u_{1}} (0)}

, this yields

\begin{matrix} R_{g_{1} u_{1}} (τ) & = & E [g_{1} (t) Z (τ) u_{1} (t)] \\ & = & Z (τ) E [g_{1} (t) u_{1} (t)] . \end{matrix}

(11)

Similarly, the autocorrelation function of

u_{1} (t)

satisfies

\begin{matrix} R_{u_{1}} (τ) & = & Z (τ) E [u_{1} (t) u_{1} (t)] . \end{matrix}

(12)

Therefore, combining Equations (11) and (12) gives

\begin{matrix} \frac{R_{g_{1} u_{1}} (τ)}{R_{u_{1}} (τ)} & = & \frac{Z (τ) E [g_{1} (t) u_{1} (t)]}{Z (τ) E [u_{1} (t) u_{1} (t)]} \\ & = & \frac{E [g_{1} (t) u_{1} (t)]}{E [u_{1} (t) u_{1} (t)]} . \end{matrix}

(13)

Define

λ = \frac{E [g_{1} (t) u_{1} (t)]}{E [u_{1} (t) u_{1} (t)]}

. Then,

R_{g_{1} u_{1}} (τ) = λ R_{u_{1}} (τ)

. This completes the proof of Theorem 1. □

Theorem 1 relates the unmeasurable intermediate variable

g_{1} (t)

to the input

u_{1} (t)

via the constant

λ

[26]. This relationship serves as the foundation for decoupling the linear and nonlinear blocks.
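Theorem 1 can be checked numerically on a separable input with non-trivial autocorrelation. The sketch below assumes a zero-mean Gaussian first-order autoregressive input and a cubic polynomial for $N (\cdot)$. Both are illustrative choices that do not come from the examples in this paper, and `corr` is a hypothetical helper. The sample ratio $R_{g_{1} u_{1}} (τ) / R_{u_{1}} (τ)$ should then stay close to the same constant $λ$ for every lag, up to sampling error.

```python
import random

def corr(a, b, tau):
    """Sample estimate of R_ab(tau) = E[a(t) b(t - tau)] for tau >= 0."""
    L = len(a)
    return sum(a[t] * b[t - tau] for t in range(tau, L)) / L

rng = random.Random(1)   # fixed seed, used only for reproducibility
L = 50_000

# Zero-mean Gaussian AR(1) input: Gaussian processes are separable signals
u, prev = [], 0.0
for _ in range(L):
    prev = 0.5 * prev + rng.gauss(0.0, 1.0)
    u.append(prev)

# Static polynomial nonlinearity g(t) = N(u(t)) with assumed coefficients
g = [x + 0.5 * x ** 2 + 0.2 * x ** 3 for x in u]

lam = corr(g, u, 0) / corr(u, u, 0)                 # lambda = E[g u] / E[u u]
ratio_1 = corr(g, u, 1) / corr(u, u, 1)             # should be close to lam
ratio_2 = corr(g, u, 2) / corr(u, u, 2)             # should be close to lam
print(lam, ratio_1, ratio_2)
```

For this input the theoretical value is $λ = 1 + 0.6 σ_{u}^{2} = 1.8$ with $σ_{u}^{2} = 4 / 3$, since the even-order term of the polynomial does not contribute to the cross-correlation of a zero-mean Gaussian signal.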

Substituting Equation (9) into Equation (8) yields

\begin{matrix} R_{y_{1} u_{1}} (τ) & = & - \sum_{j = 1}^{n_{ζ}} ζ_{j} R_{y_{1} u_{1}} (τ - j) + \sum_{k = 1}^{n_{ν}} {\bar{ν}}_{k} R_{u_{1}} (τ - k), \end{matrix}

(14)

where

{\bar{ν}}_{k} = λ ν_{k}

.

Let

τ = 1, 2, \dots, M (M \geq n_{ζ} + n_{ν})

. Define the estimated parameter vector, the cross-correlation function vector and the correlation function matrix:

\begin{matrix} {\hat{ϑ}}_{l} & : = & [{\hat{ζ}}_{1}, {\hat{ζ}}_{2}, \dots, {\hat{ζ}}_{n_{ζ}}, {\hat{\bar{ν}}}_{1}, {\hat{\bar{ν}}}_{2}, \dots, {\hat{\bar{ν}}}_{n_{ν}}] \in R^{(n_{ζ} + n_{ν})}, \\ R & = & [R_{y_{1} u_{1}} (1), R_{y_{1} u_{1}} (2), \dots, R_{y_{1} u_{1}} (M)] \in R^{M}, \\ φ_{l} & = & [\begin{matrix} - R_{y_{1} u_{1}} (0) & - R_{y_{1} u_{1}} (1) & - R_{y_{1} u_{1}} (2) & \dots & - R_{y_{1} u_{1}} (M - 1) \\ 0 & - R_{y_{1} u_{1}} (0) & - R_{y_{1} u_{1}} (1) & \dots & - R_{y_{1} u_{1}} (M - 2) \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & 0 & \dots & - R_{y_{1} u_{1}} (M - n_{ζ}) \\ R_{u_{1}} (0) & R_{u_{1}} (1) & R_{u_{1}} (2) & \dots & R_{u_{1}} (M - 1) \\ 0 & R_{u_{1}} (0) & R_{u_{1}} (1) & \dots & R_{u_{1}} (M - 2) \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & 0 & \dots & R_{u_{1}} (M - n_{ν}) \end{matrix}] \in R^{(n_{ζ} + n_{ν}) \times M} . \end{matrix}

Based on the above definitions, Equation (14) can be explicitly written as

\begin{matrix} R & = & {\hat{ϑ}}_{l} φ_{l} . \end{matrix}

(15)

For Equation (15), we construct the following least squares cost function:

\begin{matrix} J ({\hat{ϑ}}_{l}) & = & ∥ R - {\hat{ϑ}}_{l} φ_{l} ∥^{2} . \end{matrix}

Setting the derivative of the cost function with respect to

{\hat{ϑ}}_{l}

to zero yields the analytical solution:

\begin{matrix} {\hat{ϑ}}_{l} & = & R φ_{l}^{T} {(φ_{l} φ_{l}^{T})}^{- 1} . \end{matrix}

(16)

The steps of the correlation analysis-based least squares algorithm are as follows:

Collect the input and output data $u_{1} (t)$ and $y_{1} (t)$ .
Obtain the cross-correlation function $R_{y_{1} u_{1}} (τ)$ and the autocorrelation function $R_{u_{1}} (τ)$ using Equations (5) and (6).
Construct the cross-correlation function vector $R$ and the correlation function matrix $φ_{l}$ .
Estimate the parameters of the linear module $ϑ_{l}$ by Equation (16).
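The steps above can be sketched as follows, given correlation functions that have already been computed. The names `solve` and `estimate_linear` are hypothetical, the correlation functions are passed as dictionaries from lag to value, and entries with a negative lag are set to zero as in the matrix $φ_{l}$. The demo uses exact correlations of an assumed second-order system under unit-variance white input instead of sampled data, so the estimate reproduces the assumed values.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def estimate_linear(R_yu, R_u, n_zeta, n_nu, M):
    """Least squares solution of Eq. (14) stacked for tau = 1..M (Eq. (16))."""
    def lag(R, s):
        return R.get(s, 0.0) if s >= 0 else 0.0   # negative lags set to zero
    rows, rhs = [], []
    for tau in range(1, M + 1):
        # one column of phi_l, stored here as a row of the regression matrix
        rows.append([-lag(R_yu, tau - j) for j in range(1, n_zeta + 1)]
                    + [lag(R_u, tau - k) for k in range(1, n_nu + 1)])
        rhs.append(lag(R_yu, tau))
    n = n_zeta + n_nu
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    Atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(n)]
    return solve(AtA, Atb)   # [zeta_1..zeta_n, nu_bar_1..nu_bar_n]

# Assumed system for the demo; white unit-variance input gives R_u(0) = 1
zeta, nu_bar, M = [0.5, 0.06], [1.0, 0.8], 6
R_u, R_yu = {0: 1.0}, {0: 0.0}
for tau in range(1, M + 1):   # exact R_yu generated from Eq. (14)
    val = sum(-z * R_yu.get(tau - j, 0.0) for j, z in enumerate(zeta, 1))
    val += sum(n * R_u.get(tau - k, 0.0) for k, n in enumerate(nu_bar, 1))
    R_yu[tau] = val
theta = estimate_linear(R_yu, R_u, 2, 2, M)
print(theta)
```

Note that the numerator estimates are the scaled coefficients ${\bar{ν}}_{k} = λ ν_{k}$ of Equation (14), not $ν_{k}$ itself.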

Remark 1.

Compared to the over-parameterization and the key term separation method, the correlation analysis can decouple the Hammerstein model by directly exploiting the statistical properties of the input and output signals. It avoids the issue of redundant parameters and the risk of overfitting.

3.2. Nonlinear Parameter Estimation via Levenberg–Marquardt Algorithm

Based on the preceding correlation analysis and least squares algorithm, the parameters

ζ_{j}

and

ν_{k}

have been identified. In this subsection, a distinct set of input–output data

u_{2} (t)

and

y_{2} (t)

is employed to estimate the nonlinear module parameters

μ_{i}

via the LM algorithm.

From Equations (1)–(4), the output signal can be rewritten as

\begin{matrix} y_{2} (t) & = & h_{2} (t) + q_{2} (t) \\ & = & \frac{ν (z)}{ζ (z)} g_{2} (t) + \frac{1}{ζ (z)} p_{2} (t) \\ & = & - \sum_{j = 1}^{n_{ζ}} ζ_{j} y_{2} (t - j) + \sum_{k = 1}^{n_{ν}} \sum_{i = 1}^{n_{μ}} ν_{k} μ_{i} u_{2} {(t - k)}^{i} + p_{2} (t) \\ & = & φ_{n}^{T} (t) ϑ_{n} + p_{2} (t), \end{matrix}

(17)

where

φ_{n} (t) \in R^{n_{0}}

is the information vector,

ϑ_{n} \in R^{n_{0}}

is the parameter vector and

n_{0} : = n_{ζ} + n_{ν} n_{μ}

.

Express the model in Equation (17) for $t = 1, 2, \dots, T$ in matrix form, where $T$ denotes the data length of $u_{2} (t)$ and $y_{2} (t)$ used for nonlinear module estimation:

\begin{matrix} Y & = & Φ ϑ_{n} + P, \end{matrix}

where $Y = {[y_{2} (1), y_{2} (2), \dots, y_{2} (T)]}^{T}$ is the output vector, $Φ = {[φ_{n} (1), φ_{n} (2), \dots, φ_{n} (T)]}^{T}$ is the regression matrix and $P = {[p_{2} (1), p_{2} (2), \dots, p_{2} (T)]}^{T}$ is the noise vector.

The goal of parameter estimation is to minimize the following least squares objective function:

\begin{matrix} J (ϑ_{n}) = \frac{1}{2} {∥ Y - Φ ϑ_{n} ∥}^{2} . \end{matrix}

Define the residual vector as

\begin{matrix} r (ϑ_{n}) = Y - Φ ϑ_{n} . \end{matrix}

The Jacobian matrix is defined as follows:

\begin{matrix} J = \frac{\partial r (ϑ_{n})}{\partial {ϑ_{n}}^{T}} = - Φ . \end{matrix}

The gradient of the objective function is

\begin{matrix} \mathrm{grad} = J^{T} r (ϑ_{n}) = - Φ^{T} (Y - Φ ϑ_{n}) . \end{matrix}

The norm of the gradient $∥ \mathrm{grad} ∥$ reflects how close the current parameters are to the optimal solution and can serve as a criterion for terminating the iteration.

Then we can get the LM algorithm as follows:

\begin{matrix} {\hat{ϑ}}_{n}^{(k + 1)} & = & {\hat{ϑ}}_{n}^{(k)} - {(J^{T} J + λ I)}^{- 1} J^{T} r ({\hat{ϑ}}_{n}^{(k)}), \end{matrix}

(18)

\begin{matrix} r ({\hat{ϑ}}_{n}^{(k)}) & = & Y - Φ {\hat{ϑ}}_{n}^{(k)}, \end{matrix}

(19)

\begin{matrix} J & = & - Φ, \end{matrix}

(20)

\begin{matrix} φ_{n} (t) & = & [- y_{2} (t - 1), - y_{2} (t - 2), \dots, - y_{2} (t - n_{ζ}), u_{2} (t - 1), u_{2} {(t - 1)}^{2}, \dots, u_{2} {(t - 1)}^{n_{μ}}, \\ & & u_{2} (t - 2), u_{2} {(t - 2)}^{2}, \dots, u_{2} {(t - 2)}^{n_{μ}}, \dots, u_{2} (t - n_{ν}), u_{2} {(t - n_{ν})}^{2}, \dots, u_{2} {(t - n_{ν})}^{n_{μ}}]^{T} \in R^{n_{0}}, \end{matrix}

(21)

\begin{matrix} Φ & = & {[φ_{n} (1), φ_{n} (2), \dots, φ_{n} (T)]}^{T} \in R^{T \times n_{0}}, \end{matrix}

(22)

\begin{matrix} Y & = & {[y_{2} (1), y_{2} (2), \dots, y_{2} (T)]}^{T} \in R^{T}, \end{matrix}

(23)

\begin{matrix} {\hat{ϑ}}_{n} & = & [{\hat{ζ}}_{1}, {\hat{ζ}}_{2}, \dots, {\hat{ζ}}_{n_{ζ}}, {\hat{ν}}_{1} {\hat{μ}}_{1}, {\hat{ν}}_{1} {\hat{μ}}_{2}, \dots, {\hat{ν}}_{1} {\hat{μ}}_{n_{μ}}, {\hat{ν}}_{2} {\hat{μ}}_{1}, {\hat{ν}}_{2} {\hat{μ}}_{2}, \dots, {\hat{ν}}_{2} {\hat{μ}}_{n_{μ}}, \dots, {\hat{ν}}_{n_{ν}} {\hat{μ}}_{1}, \\ {\hat{ν}}_{n_{ν}} {\hat{μ}}_{2}, \dots, {\hat{ν}}_{n_{ν}} {\hat{μ}}_{n_{μ}}]^{T} \in R^{n_{0}}, \end{matrix}

(24)

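where $λ > 0$ is the damping parameter (distinct from the constant $λ$ in Theorem 1). When $λ = 0$, the update reduces to the Gauss–Newton method, which coincides with Newton’s method here because the residual is linear in $ϑ_{n}$. When $λ$ is large enough, the algorithm approximates the gradient descent method with a small step size.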

The steps for parameter estimation of the Hammerstein model in (1)–(4) using the LM algorithm in (18)–(24) are as follows:

1. Initialization: let $k = 1$ . Set $n_{0} = n_{ζ} + n_{ν} n_{μ}$ , $p_{0} = 1 \times e^{10^{5}}$ , ${\hat{ϑ}}_{n}^{(0)} = \frac{1}{p_{0}} I_{n_{0}}$ , $K_{max} = 50$ and $ε = 10^{- 6}$ .
2. Collect the input data $u_{2} (t)$ and the output data $y_{2} (t)$ for $t = 1, 2, \dots, T$ , and construct the information vector $φ_{n} (t)$ using Equation (21). Then construct the information matrix $Φ$ and the output vector $Y$ by Equations (22) and (23).
3. Compute the residual vector $r ({\hat{ϑ}}_{n}^{(k)})$ at the k-th iteration and the Jacobian matrix $J$ using Equations (19) and (20).
4. Update the parameter estimate from ${\hat{ϑ}}_{n}^{(k)}$ to ${\hat{ϑ}}_{n}^{(k + 1)}$ by Equation (18).
5. Increase k by 1 and go back to Step 3 until $k > K_{max}$ or $∥ \mathrm{grad}^{(k)} ∥ < ε$ .

The pseudo-code of the proposed algorithm is shown in Algorithm 1.

By substituting the estimated parameters

{\hat{ζ}}_{j}

and

{\hat{ν}}_{k}

of the linear block, derived in the preceding subsection, into the parameter vector

{\hat{ϑ}}_{n}

, the parameters

{\hat{μ}}_{i}

of the nonlinear block are obtained via decoupling.
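The text does not spell out how ${\hat{μ}}_{i}$ is extracted from the product terms in ${\hat{ϑ}}_{n}$. One possible reading, sketched here purely as an assumption, is a least squares fit over $k$ of the products ${\hat{ν}}_{k} {\hat{μ}}_{i}$ against the known ${\hat{ν}}_{k}$. The function name `recover_mu` is hypothetical, and the ordering of the product terms follows Equation (24).

```python
def recover_mu(theta_n, nu_hat, n_zeta, n_mu):
    """Recover mu_i from the products nu_k * mu_i stored in theta_n.

    theta_n is ordered as in Eq. (24):
    [zeta_1..zeta_n, nu_1*mu_1..nu_1*mu_n, nu_2*mu_1.., ...].
    For each i, mu_i minimizes sum_k (theta_{k,i} - nu_k * mu_i)^2.
    """
    prods = theta_n[n_zeta:]                 # drop the zeta entries
    denom = sum(v * v for v in nu_hat)
    mu = []
    for i in range(n_mu):
        num = sum(v * prods[k * n_mu + i] for k, v in enumerate(nu_hat))
        mu.append(num / denom)
    return mu

# Hypothetical values: nu = [1.0, 0.5], mu = [2.0, -1.0], zeta_1 = 0.4
theta_n = [0.4, 2.0, -1.0, 1.0, -0.5]
print(recover_mu(theta_n, nu_hat=[1.0, 0.5], n_zeta=1, n_mu=2))  # [2.0, -1.0]
```

If the linear stage supplies the scaled coefficients ${\bar{ν}}_{k} = λ ν_{k}$ of Equation (14) in place of $ν_{k}$, this fit returns $μ_{i} / λ$, so a normalization convention is needed to fix the scale.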

Algorithm 1 The LM algorithm.

Require: $y_{2} (t)$ , $u_{2} (t)$ for $t = 1, 2, \dots, T$ ; $n_{ζ}$ , $n_{ν}$ , $n_{μ}$ ; $p_{0} = 1 \times e^{10^{5}}$ ; $K_{max} = 50$ ; $ε = 10^{- 6}$ .
Ensure: estimated parameter vector ${\hat{ϑ}}_{n}$ .

Objective function: $J (ϑ_{n}) = \frac{1}{2} {∥ Y - Φ ϑ_{n} ∥}^{2}$
Initialize: $n_{0} = n_{ζ} + n_{ν} n_{μ}$ , ${\hat{ϑ}}_{n}^{(0)} = \frac{1}{p_{0}} I_{n_{0}}$ , $k = 1$ .
while $k \leq K_{max}$ do
    Construct the information matrix and the output vector: $Φ$ and $Y$
    Compute the residual vector: $r ({\hat{ϑ}}_{n}^{(k)}) = Y - Φ {\hat{ϑ}}_{n}^{(k)}$
    Compute the Jacobian matrix: $J = - Φ$
    Compute gradient: $\mathrm{grad}^{(k)} = - Φ^{T} (Y - Φ {\hat{ϑ}}_{n}^{(k)})$
    if $∥ \mathrm{grad}^{(k)} ∥ < ε$ then
        break
    end if
    Update parameters: ${\hat{ϑ}}_{n}^{(k + 1)} = {\hat{ϑ}}_{n}^{(k)} - {(J^{T} J + λ I)}^{- 1} J^{T} r ({\hat{ϑ}}_{n}^{(k)})$
    $k = k + 1$
end while
return ${\hat{ϑ}}_{n}$
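A minimal Python sketch of the fixed-damping iteration in Equations (18)–(20) is given below. It is an illustration under stated assumptions, not the authors' implementation: the names `solve` and `lm_fixed_damping` are hypothetical, the iteration starts from zeros, and the demo data come from an assumed noise-free first-order system with $n_{ζ} = 1$, $n_{ν} = 1$ and $n_{μ} = 2$.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def lm_fixed_damping(Phi, Y, lam=0.1, k_max=50, eps=1e-6):
    """Iterate Eq. (18) with J = -Phi and a fixed damping parameter lam."""
    n = len(Phi[0])
    theta = [0.0] * n                      # assumed starting point
    # J^T J + lam * I = Phi^T Phi + lam * I is constant over the iterations
    H = [[sum(r[i] * r[j] for r in Phi) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    for k in range(1, k_max + 1):
        res = [y - sum(p * th for p, th in zip(row, theta))
               for row, y in zip(Phi, Y)]                  # r = Y - Phi theta
        grad = [-sum(row[i] * r for row, r in zip(Phi, res))
                for i in range(n)]                         # grad = J^T r
        if sum(gi * gi for gi in grad) ** 0.5 < eps:
            break
        step = solve(H, grad)                              # (J^T J + lam I)^-1 J^T r
        theta = [th - s for th, s in zip(theta, step)]
    return theta, k

# Demo data: y(t) = -0.4 y(t-1) + 1.0 * (1.0 u(t-1) + 0.5 u(t-1)^2)
rng = random.Random(0)
T = 300
u = [rng.uniform(0.0, 3.0) for _ in range(T)]
y = [0.0] * T
for t in range(1, T):
    y[t] = -0.4 * y[t - 1] + 1.0 * u[t - 1] + 0.5 * u[t - 1] ** 2
Phi = [[-y[t - 1], u[t - 1], u[t - 1] ** 2] for t in range(1, T)]  # Eq. (21)
theta, iters = lm_fixed_damping(Phi, y[1:], lam=0.1, k_max=200)
print(theta, iters)   # theta should approach [0.4, 1.0, 0.5]
```

Because the residual is linear in the parameters, the matrix $J^{T} J + λ I$ is built once outside the loop. A nonlinear residual would require recomputing $J$ at every iteration.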

Remark 2.

In this paper, the damping parameter λ in the LM algorithm is treated as a fixed constant. This choice is made for simplicity and to clearly demonstrate the effectiveness of the proposed hierarchical identification strategy. The effect of different fixed λ values on convergence performance is investigated in Section 4.1. An adaptive λ update strategy will be incorporated in future work to further improve the algorithm’s efficiency, without affecting the core contribution of this paper.

4. Example

In this section, two simulation examples are presented to demonstrate the effectiveness of the proposed algorithm. The first example is based on an SOFC, and the second example is based on a practical wind power system.

4.1. Example 1: The SOFC Model

SOFC is an advanced electrochemical energy conversion device that can directly convert the chemical energy of fuels into electrical energy, offering significant advantages such as high efficiency, low emissions and strong fuel flexibility. These features position SOFC as a key enabling technology for environmental sustainability in the power generation sector. It has a wide range of applications, including small-scale residential combined heat and power systems to provide electricity and hot water for households, or as an auxiliary or main power source for vehicles, ships and drones [27,28]. As shown in Figure 2, hydrogen is used as the fuel for the SOFC in this example, and the Hammerstein model is employed for modeling the energy conversion process. The nonlinear module describes the mapping from the hydrogen flow rate

u (t)

to the current density

g (t)

, and the linear module describes the dynamic characteristics from the current density

g (t)

to the output power

y (t)

.

In this example, two distinct input–output datasets are used. The first signal,

u_{1} \sim U (- \sqrt{3}, \sqrt{3})

, is a uniformly distributed random sequence with zero mean and a variance of one, consisting of 3000 data points to ensure persistent excitation. The second signal,

u_{2} \sim U (0, 3)

, comprising 500 data points, is intended to excite the nonlinear characteristics. The noise variance is set to

σ^{2} = 0.5

. The input signal

u_{1} (t) \sim U (- \sqrt{3}, \sqrt{3})

satisfies the assumptions of Theorem 1. It has zero mean by construction, i.e.,

E [u_{1} (t)] = 0

. Moreover, an independent and identically distributed zero-mean sequence is separable: for $τ \neq 0$, independence gives $E [u_{1} (t - τ) | u_{1} (t)] = 0 = Z (τ) u_{1} (t)$ because $R_{u_{1}} (τ) = 0$, and the case $τ = 0$ holds trivially. Hence, the assumptions of Theorem 1 are satisfied in this simulation.
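The zero-mean, unit-variance property of $u_{1} \sim U (- \sqrt{3}, \sqrt{3})$ follows from the variance formula $(b - a)^{2} / 12$ of a uniform distribution on $[a, b]$. The short check below draws 3000 points, as in this example. The seed and variable names are arbitrary choices for illustration.

```python
import math
import random

rng = random.Random(0)                 # fixed seed, only for reproducibility
a = math.sqrt(3.0)
u1 = [rng.uniform(-a, a) for _ in range(3000)]

mean = sum(u1) / len(u1)
var = sum((x - mean) ** 2 for x in u1) / len(u1)
theoretical_var = (2 * a) ** 2 / 12    # (b - a)^2 / 12 with b = -a = sqrt(3)

print(mean, var, theoretical_var)      # sample mean near 0, variances near 1
```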

First, the parameters

ζ_{j}

and

ν_{k}

of the linear module are estimated using the correlation analysis-based least squares algorithm with the first input–output dataset

u_{1} (t)

and

y_{1} (t)

. The parameter estimation results, along with the normalized estimation error

∥ {\hat{ϑ}}_{l} (t) - ϑ_{l} ∥ / ∥ ϑ_{l} ∥

, are presented in Table 1. The evolution of this error over time t is illustrated in Figure 3.
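The normalized estimation error used in Table 1 can be computed as in the following sketch. The function name `normalized_error` and the numbers in the usage line are hypothetical and are not taken from the tables.

```python
import math

def normalized_error(theta_hat, theta_true):
    """Return ||theta_hat - theta_true|| / ||theta_true|| (Euclidean norms)."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(theta_hat, theta_true)))
    return diff / math.sqrt(sum(b ** 2 for b in theta_true))

print(normalized_error([3.0, 4.5], [3.0, 4.0]))  # 0.1
```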

Based on the second input–output dataset

u_{2} (t)

and

y_{2} (t)

, the parameters

μ_{i}

of the nonlinear module are then estimated using the LM algorithm, incorporating the previously estimated parameters of the linear module. Set the maximum number of iterations to 500 and the convergence tolerance

ε = 10^{- 6}

.

To analyze the effect of

λ

on the convergence performance of the LM algorithm, Table 2 presents the number of iterations required for convergence and the mean squared error (MSE) achieved under different values of

λ

.

To illustrate the impact of

λ

on fitting performance, two representative values are selected from Table 2 for comparison in Figure 4:

λ = 0.1

and

λ = 85

. The former leads to fast convergence and high fitting accuracy. However, the latter fails to converge within the maximum number of iterations. This is because when

λ

is excessively large, the

λ I

term dominates the Hessian approximation

(J^{T} J + λ I)

, making the LM update approximate gradient descent with a very small step size. Consequently, the algorithm progresses extremely slowly toward the optimum, demonstrating that an overly large damping parameter deteriorates the fitting performance. Figure 4 presents a comparison between the actual nonlinear characteristics and the polynomial fitting results obtained using the estimated parameters. The blue line represents the actual nonlinear module. The yellow line denotes the polynomial fitting result based on the LM algorithm when

λ = 0.1

. The purple line is the fitting result of the LM algorithm when

λ = 85

.
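The slowdown for large $λ$ can be made concrete. For a residual that is linear in the parameters, the fixed-damping update of Equation (18) multiplies the error along an eigen-direction of $Φ^{T} Φ$ with eigenvalue $s$ by the factor $λ / (s + λ)$ at every iteration. The sketch below counts the iterations needed along a single direction. The value $s = 1$, the tolerance, and the function name `iterations_to_converge` are illustrative assumptions and are not taken from this example.

```python
def iterations_to_converge(s, lam, tol=1e-3, e0=1.0, k_max=500):
    """Count iterations until the scalar error e0 * (lam / (s + lam))^k < tol."""
    factor = lam / (s + lam)      # per-iteration contraction of the error
    err, k = e0, 0
    while err >= tol and k < k_max:
        err *= factor
        k += 1
    return k                      # equals k_max when the tolerance is not met

for lam in (0.1, 1.0, 10.0, 85.0):
    print(lam, iterations_to_converge(s=1.0, lam=lam))
```

In this toy setting the count grows with $λ$, and for $λ = 85$ the iteration cap of 500 is reached before the tolerance is met, which mirrors the qualitative behavior reported for large damping.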

To quantitatively evaluate the performance of the proposed hierarchical identification strategy, this paper compares it with the conventional over-parameterization method combined with least squares, which is one of the most widely used approaches for Hammerstein model identification. The over-parameterization method rewrites the Hammerstein model as a linear regression model by treating the product terms

{\bar{ν}}_{k, i} = ν_{k} μ_{i}

as independent parameters. These parameters are estimated directly using the least squares algorithm. To obtain the individual parameters

ν_{k}

and

μ_{i}

for a fair comparison, the normalization constraint

ν_{1} = 1

is adopted. All comparisons are performed under the same simulation conditions. Table 3 summarizes the estimated linear module parameters obtained by the proposed algorithm strategy and the over-parameterization method, together with the true values. Figure 5 shows the fitted nonlinear curves of both methods, where the blue line represents the true nonlinear characteristics, the red dashed line represents the proposed algorithm strategy, and the green dotted line represents the over-parameterization method. Table 4 presents the MSE for both methods.

Based on the simulation results summarized in Tables 1–4 and Figures 3–5, the following conclusions are obtained.

Table 1 and Figure 4 show that the correlation analysis-based least squares algorithm demonstrates high accuracy and fast convergence in estimating the parameters of the linear module. As time t increases, the identification error gradually decreases and ultimately falls below 5%.
Figure 5 and Table 2 show that the LM algorithm can effectively estimate the parameters of the nonlinear module and achieve good fitting performance. As the damping parameter $λ$ increases, the number of iterations required for convergence gradually grows. For all converging cases, the algorithm achieves exactly the same MSE of 0.033196, indicating that the optimization problem is convex and the global optimum is unique. However, when $λ$ exceeds a certain threshold, e.g., $λ = 85$ , the algorithm fails to converge within the maximum number of iterations and yields a higher MSE of 0.036153, demonstrating that an excessively large damping parameter leads to deteriorated fitting performance.
Compared with the over-parameterization method, the proposed algorithm strategy achieves significantly higher estimation accuracy. As shown in Table 3, the linear parameter estimates obtained by the proposed algorithm are much closer to the true values. Moreover, Table 4 and Figure 5 demonstrate that the proposed algorithm achieves a much lower MSE of 0.0317, while the over-parameterization method gives an MSE of 0.1486. The proposed algorithm also achieves a better fit to the true nonlinear characteristics.

4.2. Example 2: The Wind Power System

Wind power systems exhibit complex physical relationships, strong nonlinearity, and inherent randomness, which pose significant challenges for system modeling and power prediction. The wind turbine converts wind energy into mechanical energy through the rotation of blades driven by wind force and then into electrical energy via a generator. The relationship between wind speed and output power is highly nonlinear due to factors such as wind speed variation, yaw adjustment lag, and turbine dynamics, while the dynamic response from the current density to the power output can be approximated as linear under certain operating conditions. Therefore, the wind power system can be appropriately described by a Hammerstein model, where the static nonlinear module captures the nonlinear mapping from wind speed to the current density, and the dynamic linear module characterizes the linear dynamics from the current density to the output power.
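
The block structure described above can be made concrete with a minimal simulation sketch. It assumes a polynomial static block and a difference equation of the form $y(t) = -\sum_{j} ζ_{j} y(t-j) + \sum_{k} ν_{k} x(t-k)$; this sign convention, the function name `simulate_hammerstein`, and the coefficient values are assumptions for illustration only, not the identified wind power model.

```python
def simulate_hammerstein(u, mu, zeta, nu):
    """Static polynomial block followed by a linear difference equation.

    x(t) = sum_i mu[i] * u(t)**(i+1)                      (static nonlinear block)
    y(t) = -sum_j zeta[j]*y(t-1-j) + sum_k nu[k]*x(t-1-k) (dynamic linear block)
    Samples before t = 0 are treated as zero.
    """
    x = [sum(m * ut ** (i + 1) for i, m in enumerate(mu)) for ut in u]
    y = [0.0] * len(u)
    for t in range(len(u)):
        ar = sum(z * y[t - 1 - j] for j, z in enumerate(zeta) if t - 1 - j >= 0)
        ma = sum(n * x[t - 1 - k] for k, n in enumerate(nu) if t - 1 - k >= 0)
        y[t] = -ar + ma
    return x, y

# Impulse input with hypothetical coefficients: x(0) = 1.0 + 0.5 = 1.5,
# and the linear block is y(t) = 0.5*y(t-1) + 2*x(t-1).
x, y = simulate_hammerstein([1.0, 0.0, 0.0, 0.0], mu=[1.0, 0.5], zeta=[-0.5], nu=[2.0])
print(x)
print(y)
```

The intermediate signal `x` is not measured in practice, which is why the two blocks have to be separated by the identification procedure rather than fitted one after the other on observed data.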

To validate the proposed identification algorithm on a real-world system, we use publicly available wind farm data from Turkey (https://www.kaggle.com/winternguyen/wind-power-curve-modeling/data (accessed on 1 March 2021)). This dataset contains wind speed and power measurements collected every 10 min throughout each month. The data are categorized into two seasons: the breeze season and the gale season. Before parameter estimation, data preprocessing is performed [29]. After that, the gale season data are reduced to 3214 data points, and the breeze season data are reduced to 1351 data points. In this study, the breeze season data and the gale season data are used as two separate input–output datasets. The gale season data, characterized by stronger fluctuations and richer dynamic characteristics, are used to estimate the parameters of the linear module and are denoted as ($u_{1}(t), y_{1}(t)$). The breeze season data, which capture more stable and representative nonlinear relationships, are used to estimate the parameters of the nonlinear module and are denoted as ($u_{2}(t), y_{2}(t)$). This strategy ensures that the distinct characteristics of the wind power system are fully exploited for accurate decoupled identification.

First, the linear module parameters are estimated using the correlation analysis-based least squares method with the gale season data ($u_{1}(t), y_{1}(t)$). The linear module orders are set as $n_{s} = 2$ and $n_{r} = 3$. The poles of the estimated linear system are 0.7680 and 0.1318, both with magnitudes less than one, confirming the stability of the identified linear module. Next, the nonlinear module parameters are estimated using the Levenberg–Marquardt algorithm with the breeze season data ($u_{2}(t), y_{2}(t)$). The damping parameter is set as $λ = 0.1$, and the maximum number of iterations is 500.
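
The pole-based stability check mentioned above can be sketched as follows. The estimated denominator coefficients are not listed in this section, so the example rebuilds a monic second-order polynomial from the two reported poles purely for illustration; the helper name `is_schur_stable` is hypothetical.

```python
import numpy as np

def is_schur_stable(den):
    """Return True if all roots of the polynomial (highest power first) lie inside the unit circle."""
    return bool(np.all(np.abs(np.roots(den)) < 1.0))

# Monic denominator z^2 + a1*z + a2 rebuilt from the reported poles (illustration only).
reported_poles = [0.7680, 0.1318]
den = np.poly(reported_poles)        # coefficients [1, -(p1 + p2), p1 * p2]
poles = np.sort(np.roots(den).real)

print("denominator:", den)
print("poles:", poles, "stable:", is_schur_stable(den))

# Counter-example with a root at z = 2, which lies outside the unit circle.
print("unstable example:", is_schur_stable([1.0, -2.5, 1.0]))
```

For a discrete-time linear module, every pole having magnitude below one is the standard condition for a bounded response to bounded inputs, which is why the check is applied to the denominator roots.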

Using the estimated linear and nonlinear parameters, the predicted output $\hat{y}_{2}(t)$ is computed. Figure 6 presents a comparison between the true output $y_{2}(t)$ and the estimated output $\hat{y}_{2}(t)$, where the blue line represents the true output and the yellow dots represent the estimated outputs. The proposed algorithm accurately tracks the actual power output throughout the entire time horizon, demonstrating the effectiveness of the identification strategy.

To further assess the estimation accuracy, Figure 7 presents the scatter plot of estimated versus true values, where the blue line represents the true output (the diagonal on which the estimated value equals the true value) and the yellow dots represent the estimated outputs. Most points lie close to the diagonal line, indicating a strong linear relationship between the predicted and actual outputs, with a correlation coefficient of 0.9676.

Figure 8 shows the autocorrelation function (ACF) of the residuals, where the blue bars represent the sample autocorrelations at different lags, the red dashed lines indicate the 95% confidence bounds, and the black solid line is the zero reference. The residuals exhibit no significant autocorrelation beyond the confidence bounds, supporting the whiteness of the residual sequence. The Ljung–Box test yields a p-value of 0.2435, which is greater than the significance level of 0.05, so the null hypothesis of uncorrelated residuals is not rejected and the residuals are consistent with white noise.
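
The residual whiteness check can be sketched in a few lines. The example computes the sample ACF and the Ljung–Box statistic $Q = n(n+2)\sum_{k=1}^{h} \hat{ρ}_{k}^{2}/(n-k)$ and compares it with a chi-square critical value. The lag count $h = 10$, the absence of a degrees-of-freedom correction, and the synthetic sequences are assumptions, since the test settings are not stated in this section.

```python
import random

def sample_acf(e, max_lag):
    """Sample autocorrelations of e at lags 1..max_lag."""
    n = len(e)
    mean = sum(e) / n
    d = [v - mean for v in e]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def ljung_box_q(e, max_lag):
    """Ljung-Box statistic over lags 1..max_lag."""
    n = len(e)
    rho = sample_acf(e, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))

CHI2_95_DF10 = 18.307   # 95% quantile of the chi-square distribution with 10 d.o.f.

gen = random.Random(0)
white = [gen.gauss(0.0, 1.0) for _ in range(500)]   # synthetic uncorrelated sequence
ar1 = [0.0]
for w in white[1:]:
    ar1.append(0.9 * ar1[-1] + w)                   # strongly autocorrelated sequence

q_white = ljung_box_q(white, 10)
q_ar1 = ljung_box_q(ar1, 10)
print(f"white: Q = {q_white:.2f}, AR(1): Q = {q_ar1:.2f}, critical value = {CHI2_95_DF10}")
```

A value of $Q$ below the critical value (equivalently, a p-value above 0.05) means that whiteness is not rejected at the 5% level; it does not prove that the residuals are white. The 95% bounds drawn in an ACF plot are commonly taken as $\pm 1.96/\sqrt{n}$.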

Figure 9 presents the Taylor diagram, which provides a visual summary of the model performance. The Taylor diagram uses polar coordinates, where the azimuthal angle represents the correlation coefficient and the radial distance represents the standard deviation ratio. In the diagram, the red circle represents the proposed algorithm, the black dashed line is the reference unit circle, and the black dotted line indicates the 1.5× reference radius. The point representing the predicted output lies very close to the reference point on the unit circle, with a correlation coefficient of 0.9676 and a standard deviation ratio of 0.9678, further confirming the high accuracy of the proposed identification method.
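
The position of the model point in the Taylor diagram can be reproduced from the two reported numbers. The sketch assumes the usual convention that the azimuthal angle equals $\arccos(R)$ and uses the law-of-cosines relation between the standard deviation ratio, the correlation coefficient, and the normalized centered RMS difference; the variable names are illustrative.

```python
import math

R = 0.9676            # correlation coefficient reported in the text
sigma_ratio = 0.9678  # standard deviation ratio reported in the text

# Azimuthal angle of the model point (assumed convention: angle = arccos(R)).
theta_deg = math.degrees(math.acos(R))

# Distance from the model point to the reference point (radius 1, angle 0),
# which equals the centered RMS difference normalized by the reference std.
crmsd = math.sqrt(sigma_ratio**2 + 1.0 - 2.0 * sigma_ratio * R)

print(f"angle = {theta_deg:.2f} deg, radius = {sigma_ratio}, normalized CRMSD = {crmsd:.4f}")
```

Under these assumptions, the distance between the model point and the reference point summarizes both the amplitude mismatch and the pattern mismatch in a single number.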

The numerical error metrics are summarized in Table 5. The coefficient of determination ($R^{2}$) measures the proportion of the variance in the output that is explained by the model, indicating the overall goodness of fit. The correlation coefficient (R) quantifies the linear relationship between the predicted and true outputs, with values close to one indicating strong agreement. The normalized root mean square error (NRMSE) is calculated with respect to the data range and provides a scale-independent measure of prediction accuracy, where lower values indicate better performance. The Ljung–Box test is applied to the residuals to check for whiteness; a p-value greater than 0.05 means that the hypothesis of uncorrelated residuals is not rejected, which suggests that the model adequately captures the system dynamics.
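
The three fit metrics can be computed as in the following minimal sketch. It uses the standard definitions $R^{2} = 1 - \mathrm{SSE}/\mathrm{SST}$, the Pearson correlation coefficient, and the RMSE divided by the range of the true output; the function name `fit_metrics` and the five-sample data are hypothetical.

```python
import math

def fit_metrics(y_true, y_pred):
    """Return (R^2, Pearson R, range-normalized RMSE) for two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))      # residual sum of squares
    sst = sum((t - mean_t) ** 2 for t in y_true)                 # total sum of squares
    spp = sum((p - mean_p) ** 2 for p in y_pred)
    stp = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    r2 = 1.0 - sse / sst
    r = stp / math.sqrt(sst * spp)
    nrmse = math.sqrt(sse / n) / (max(y_true) - min(y_true))     # normalized by data range
    return r2, r, nrmse

r2, r, nrmse = fit_metrics([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.8, 5.0])
print(f"R^2 = {r2:.4f}, R = {r:.4f}, NRMSE = {100 * nrmse:.2f}%")
```

Because the NRMSE is built on a root-mean-square error, it weights large deviations more heavily than a mean absolute error would.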

Based on the simulation results of the wind power system, the following conclusions can be drawn.

The correlation analysis-based least squares method successfully estimates the linear module parameters. The poles of the identified linear system are 0.7680 and 0.1318, both with magnitudes less than one, confirming that the linear module is stable. This stability is essential for reliable long-term prediction and control of the wind power system.
The Levenberg–Marquardt algorithm converges within 18 iterations with a damping parameter of $λ = 0.1$ , demonstrating fast convergence properties. This makes the LM algorithm well-suited for the parameter estimation of nonlinear systems.
The proposed hierarchical identification strategy achieves high prediction accuracy. The coefficient of determination ( $R^{2} = 0.9363$ ) indicates that the model explains over 93% of the variance in the output, and the correlation coefficient ( $R = 0.9676$ ) confirms a strong linear relationship between the predicted and true outputs. These results demonstrate that the proposed method effectively captures both the nonlinear and dynamic characteristics of the wind power system.
The Ljung–Box test yields a p-value of 0.2435, which is greater than the significance level of 0.05. The hypothesis of uncorrelated residuals is therefore not rejected, which suggests that the model adequately captures the system dynamics without leaving systematic information in the residuals.
The NRMSE with respect to the data range is 9.11%, meaning that the root-mean-square prediction error is less than one-tenth of the full output range. This relatively low error, combined with the high $R^{2}$ value, further confirms the effectiveness and practical applicability of the proposed method for real-world wind power system identification.

5. Conclusions

This paper has presented a hierarchical identification strategy for Hammerstein models by integrating correlation analysis with the LM algorithm. By exploiting the cross-correlation and autocorrelation properties of separable signals, the proposed strategy successfully decouples the linear and nonlinear modules, allowing them to be estimated independently. The linear module parameters are efficiently obtained using a correlation analysis-based least squares method, while the nonlinear module parameters, represented by polynomial basis functions, are accurately estimated using the LM algorithm.

The effectiveness of the proposed method has been validated through two examples. The first example, a simulation based on an SOFC system, demonstrates that the correlation analysis-based least squares method achieves high accuracy in estimating the linear module parameters, with the identification error falling below 5%. The LM algorithm effectively estimates the nonlinear module parameters, and the damping parameter plays a critical role in convergence performance. Comparative results show that the proposed hierarchical strategy significantly outperforms the conventional over-parameterization method, achieving a much lower mean squared error and a better fit to the true nonlinear characteristics.

The second example applies the proposed method to a real-world wind power system using publicly available wind farm data. The identified linear system is stable, with poles located within the unit circle. The LM algorithm converges within a small number of iterations with an appropriate damping parameter, demonstrating fast convergence properties. The proposed method achieves high prediction accuracy, with a high coefficient of determination and a strong correlation between the predicted and true outputs. The Ljung–Box test confirms that the residuals are white noise, indicating that the model adequately captures the system dynamics. The normalized RMSE with respect to the data range remains at a low level, demonstrating the practical applicability of the proposed method.

These results confirm that the proposed hierarchical identification strategy is effective, accurate, and practical for identifying Hammerstein systems in both simulated and real-world applications. Future work will focus on extending the method to more complex nonlinear systems and exploring its application to other industrial processes.

Author Contributions

Conceptualization, Q.D.; methodology, Q.D.; software, Q.D. and H.J.; validation, Q.L.; formal analysis, Q.D.; investigation, Q.L.; resources, Q.L.; writing—original draft preparation, Q.D. and Q.L.; writing—review and editing, Q.L. and Y.G.; visualization, Q.D.; supervision, Q.L. and Y.G.; project administration, Q.L.; funding acquisition, Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Basic Research Program of Jiangsu (China, Grant No. BK20230229).

Data Availability Statement

All data generated or analyzed during this study are included in this article.

Conflicts of Interest

No potential conflicts of interest are reported by the authors.

Abbreviations

The following abbreviations are used in this manuscript:

LM	Levenberg–Marquardt
SOFC	Solid Oxide Fuel Cell
MSE	Mean Squared Error
ACF	Autocorrelation Function
NRMSE	Normalized Root Mean Square Error

References

1. Wang, D.; Dang, G. Fuzzy Recurrent Stochastic Configuration Networks for Industrial Data Analytics. IEEE Trans. Fuzzy Syst. 2025, 33, 1178–1191.
2. Li, K.; Qiao, J.; Wang, D. Fuzzy Stochastic Configuration Networks for Nonlinear System Modeling. IEEE Trans. Fuzzy Syst. 2024, 32, 948–957.
3. Cheng, Y.; Yan, J.K.; Zhang, F.; Li, M.D.; Zhou, N.; Shi, C.J.; Jin, B.; Zhang, W.H. Surrogate Modeling of Pantograph-Catenary System Interactions. Mech. Syst. Signal Process. 2025, 224, 112134.
4. Zhan, Q.; Zhao, Y.; Ouakad, H. A Theoretical and Experimental Study on the Vibration Control of a Coupled Beam System Using a Series-Coupled Dual Oscillator System Model. Mech. Syst. Signal Process. 2026, 242, 113608.
5. Li, S.; Khan, M.I.; Alzahrani, F.; Eldin, S.M. Heat and Mass Transport Analysis in Radiative Time Dependent Flow in the Presence of Ohmic Heating and Chemical Reaction, Viscous Dissipation: An Entropy Modeling. Case Stud. Therm. Eng. 2023, 42, 102722.
6. Li, L.; Li, Q.; Ni, Y.; Wang, C.; Tan, Y.; Tan, D. Critical Penetrating Vibration Evolution Behaviors of the Gas-Liquid Coupled Vortex Flow. Energy 2024, 292, 130236.
7. Tan, J.; Zuo, L.; Lavidas, G.; Metrikine, A. Extending the statistical linearization method to multi-variate non-differentiable nonlinearities in floating renewable energy devices. Renew. Energy 2026, 256, 123964.
8. Lee, C.C.; Yan, J. Will artificial intelligence make energy cleaner? Evidence of nonlinearity. Appl. Energy 2024, 363, 123081.
9. Chen, L.; Guo, W.; Lopes, A.M.; Wu, R.; Li, P.; Yin, L. State-of-charge estimation for lithium-ion batteries based on incommensurate fractional-order observer. Commun. Nonlinear Sci. Numer. Simul. 2023, 118, 107059.
10. Xu, L.; Xu, H.; Wei, C.; Ding, F.; Zhu, Q. The Filtering-Based Recursive Least Squares Identification and Convergence Analysis for Nonlinear Feedback Control Systems with Coloured Noises. Int. J. Syst. Sci. 2024, 55, 3461–3484.
11. Chen, Z.; Renda, F.; Gall, A.L.; Mocellin, L.; Bernabei, M.; Dangel, T.; Ciuti, G.; Cianchetti, M.; Stefanini, C. Data-driven methods applied to soft robot modeling and control: A review. IEEE Trans. Autom. Sci. Eng. 2025, 22, 2241–2256.
12. Liu, X.; Liu, X.; Dai, W. Robust SCN for data-driven modeling based on heavy-tailed noise distribution. IEEE Trans. Instrum. Meas. 2025, 74, 5013713.
13. Bishnu, S.K.; Alnouri, S.Y.; Al-Mohannadi, D.M. Computational applications using data driven modeling in process systems: A review. Digit. Chem. Eng. 2023, 8, 100111.
14. Zhang, M.; Yang, D.; Du, J.; Sun, H.; Li, L.; Wang, L.; Wang, K. A review of SOH prediction of Li-ion batteries based on data-driven algorithms. Energies 2023, 16, 3167.
15. Dai, W.; Pang, J.W.; Ding, J.; Wang, J.H.; Xu, C.; Zhang, L.Y.; Ren, N.Q.; Yang, S.S. Integrated real-time intelligent control for wastewater treatment plants: Data-driven modeling for enhanced prediction and regulatory strategies. Water Res. 2025, 274, 123099.
16. Yu, F.; He, D.; Jia, R.; Mao, Z. Structure identification of time delay polynomial Hammerstein models. Automatica 2025, 179, 112386.
17. Li, L.; Wang, F.; Zhang, J.; Liu, X. Hammerstein system identification using robust estimator based on quantized observation. In Proceedings of the 2023 IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS), Xiangtan, China, 12–14 May 2023; pp. 95–99.
18. Li, Z.; Zhu, L.; Chung, C.-W.; Chen, J. Recursive identification of kernel-based Hammerstein model for online energy efficiency estimation of large-scale chemical plants. Energy 2024, 309, 132946.
19. Zhai, S.M.; Yu, P.; Wang, D.Q. PCA based Hammerstein model for fuel cell degradation prediction by the hierarchical estimation technique. J. Electrochem. Soc. 2025, 172, 104511.
20. Wang, D.Q. Key-term separation based hierarchical gradient approach for NN based Hammerstein battery model. Appl. Math. Lett. 2024, 157, 109207.
21. Ding, F.; Xu, L.; Zhang, X.; Ma, H. Hierarchical Gradient- and Least-Squares-Based Iterative Estimation Algorithms for Input-Nonlinear Output-Error Systems from Measurement Information by Using the Over-Parameterization. Int. J. Robust Nonlinear Control 2024, 34, 1120–1147.
22. Jing, S. Time-delay Hammerstein system identification using modified cross-correlation method and variable stacking length multi-error algorithm. Math. Comput. Simul. 2023, 207, 288–300.
23. Lan, X.; Tu, Q.; Han, J.; Zhang, J.; Tang, Y.; He, Y.; Hua, C.; Liu, W. Cross-correlation matrix-based three-dimensional de-reverberation beamforming for improving acoustic beamforming maps in a reverberant environment. Measurement 2026, 270, 120798.
24. Fu, D.; Chen, T.Q.; Jia, R.; Sharan, V. Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression. Adv. Neural Inf. Process. Syst. 2024, 37, 98675–98716.
25. Fischer, A.; Izmailov, A.F.; Solodov, M.V. The Levenberg–Marquardt Method: An Overview of Modern Convergence Theories and More. Comput. Optim. Appl. 2024, 89, 33–67.
26. Li, F.; Sun, X.; Cao, Q. Parameter learning of multi-input multi-output Hammerstein system with measurement noises utilizing combined signals. Int. J. Adapt. Control Signal Process. 2025, 39, 1416–1433.
27. Du, J.; Chen, J.; Li, J. Auto-balanced multi-MPC control of a SOFC system based on included angle. Renew. Energy 2025, 249, 123205.
28. Frenkel, W.; Kersten, J.; Husmann, R.; Aschemann, H. Design of robust PID controllers for SOFC stacks. In Proceedings of the 2022 IEEE Conference on Control Technology and Applications (CCTA), Trieste, Italy, 23–25 August 2022; pp. 510–515.
29. Li, F.; Zhang, M.; Yu, Y.; Li, S. Deep Belief Network-Based Hammerstein Nonlinear System for Wind Power Prediction. IEEE Trans. Instrum. Meas. 2024, 73, 6505912.

Figure 1. The structure diagram of the Hammerstein model.

Figure 2. Schematic Diagram of SOFC.

Figure 3. Parameter estimation errors of the linear module.

Figure 4. The fitting result of the LM algorithm under different $λ$.

Figure 5. Comparison of nonlinear module fitting between the proposed algorithm strategy and the over-parameterization method.

Figure 6. Comparison between true output and estimated output.

Figure 7. Scatter plot of estimated versus true values with diagonal reference line.

Figure 8. ACF of the residuals.

Figure 9. Taylor diagram for the wind power system identification.

Table 1. Parameter estimation results of the linear module.

t	$ζ_{1}$	$ζ_{2}$	$ν_{1}$	$ν_{2}$	$ν_{3}$	$δ$ (%)
100	1.02587	0.81050	0.52231	0.15579	0.14440	15.85200
200	1.03940	0.94314	0.46649	0.21879	0.20075	13.20445
500	0.95387	0.88798	0.40070	0.13650	0.17892	5.23579
1000	0.93358	0.91735	0.39469	0.10660	0.21269	4.36271
2000	0.91654	0.91387	0.37391	0.09349	0.20617	3.58833
3000	0.93223	0.90890	0.39673	0.11251	0.21454	4.68793
True values	0.95000	0.91000	0.35000	0.09000	0.18000	0.00000

Table 2. Relationship between the damping parameter $λ$ and the number of iterations.

The Damping Parameter $λ$	The Number of Iterations	Convergence Status	MSE
0.01	6	Yes	0.033196
0.1	10	Yes	0.033196
1	24	Yes	0.033196
10	130	Yes	0.033196
85	NC	No	0.036153

Note: NC indicates that the algorithm did not converge within the maximum number of iterations.

Table 3. Comparison of linear module parameter estimates between the proposed algorithm strategy and the over-parameterization method.

Method	$ζ_{1}$	$ζ_{2}$	$ν_{1}$	$ν_{2}$	$ν_{3}$
Proposed algorithm strategy	0.9322	0.9089	0.3967	0.1125	0.2145
Over-parameterization method	0.9166	0.8704	1.0000	0.6513	0.0421
True value	0.9500	0.9100	0.3500	0.0900	0.1800

Table 4. Total MSE of the proposed algorithm strategy and the over-parameterization method.

Method	MSE
Proposed algorithm strategy	0.031674
Over-parameterization method	0.148578

Table 5. Error metrics for the wind power system identification.

Metric	Value
$R^{2}$	0.9363
R	0.9676
NRMSE (range)	9.11%
Ljung–Box p-value	0.2435

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Dong, Q.; Jiang, H.; Liu, Q.; Gao, Y. A Correlation Analysis-Based Hierarchical Identification Strategy for Hammerstein Models. Algorithms 2026, 19, 472. https://doi.org/10.3390/a19060472

