1. Introduction
Deep neural networks (DNNs) have become the standard approach to complex real-world problems in domains such as image processing, speech recognition, and natural language processing, achieving near state-of-the-art performance [1, 2] owing to their high capability to learn from large datasets. However, increasing the size of datasets to improve accuracy increases the complexity of DNN models, resulting in higher computational, memory, and energy requirements. To address these issues, lightweight network architectures such as extreme learning machines (ELMs) [3] and echo state networks (ESNs) [4] have gained considerable attention as alternatives to DNNs. ELMs are data-driven learning algorithms for single-hidden-layer feedforward neural networks (NNs) with random hidden neurons, while ESNs are recurrent NN models within the reservoir computing framework. The lightweight structure and high training speed of these networks [5] make them suitable for resource-constrained environments. Consequently, they have been successfully applied to nonlinear function approximation and temporal data processing in many applications [6, 7].
In control engineering, NNs are popular and powerful tools for nonlinear signal processing, system identification, and control owing to their nonlinear mapping, learning, and generalisation capabilities. Although DNNs have been applied to control problems in nonlinear dynamic systems, they occasionally encounter challenges due to the aforementioned computational, memory, and energy constraints. Given their success in nonlinear function approximation, ELMs are a promising alternative for handling the dynamic relationships in data related to the control of nonlinear systems. In particular, multi-layer ELMs (MLELMs) [8], which comprise stacked hidden layers tuned via autoencoders, propagate a hierarchical representation of the external inputs to the last layer and thereby combine the expressiveness of deep architectures with the learning efficiency of ELMs. The widespread adoption of MLELMs has demonstrated their effectiveness in various real-world applications [9].
This study investigated the potential of MLELMs for control system applications. As a practical example, an MLELM-based controller was designed for an autonomous unmanned vehicle, specifically an unmanned surface vehicle (USV). The control objectives were accurate reference trajectory tracking and attitude control. Although numerous control techniques for USVs, including feedback, adaptive, nonlinear, and intelligent control, have been reported in the literature [10, 11], we employed a control architecture comprising two parallel loops. In this architecture, position control is implemented in the first loop to generate the desired translational forces based on the reference trajectory, while attitude control is executed in the second loop to produce the desired torque so that the USV’s orientation aligns with the reference attitude computed by a line-of-sight guidance law. The control functions for each loop were approximated by separate MLELMs. The performance of the MLELM-based controller was then evaluated through computational experiments.
2. Multi-Layer Extreme Learning Machine
Currently, three approaches are available for creating a multi-layered ELM structure: random mapping, kernel correntropy strategy, and conditional probability strategy. This study adopts the random mapping approach to design an MLELM.
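The forward calculation of the MLELM between the $(l-1)$-th and $l$-th layers ($l = 1, 2, \ldots, L$) can be expressed as follows:

$$x^{(l)}(t) = f\left(W^{(l)} x^{(l-1)}(t)\right) \qquad (1)$$

where $x^{(l)}(t) \in \mathbb{R}^{n_l}$ is the state vector in the $l$-th layer at time $t$, $n_l$ indicates the number of neuron units in the $l$-th layer, $W^{(l)} \in \mathbb{R}^{n_l \times n_{l-1}}$ represents the connection weight matrix between the $(l-1)$-th and $l$-th layers and $f(\cdot)$ is a component-wise activation function of a neuron. When an external input with $m$ components is provided to the MLELM by $\xi(t) \in \mathbb{R}^{m}$, it is assigned to the vector $x^{(0)}(t)$, so that $n_0 = m$. After random initialisation, an autoencoder is applied to tune the connection weight matrix. Subsequently, ridge regression is applied to minimise the objective function, defined as $\left\| X^{(l-1)} - W^{(l)\top} X^{(l)} \right\|^2 + \lambda_a \left\| W^{(l)} \right\|^2$, which yields Equation (2).

$$W^{(l)} = \left(X^{(l)} X^{(l)\top} + \lambda_a I\right)^{-1} X^{(l)} X^{(l-1)\top} \qquad (2)$$

where $X^{(l)} = \left[x^{(l)}(1)\; x^{(l)}(2)\; \cdots\; x^{(l)}(N)\right]$ collects the layer states obtained from Equation (1) with the randomly initialised weights, $N$ indicates the data length, $I$ is an identity matrix and $\lambda_a$ is a regularisation parameter.

The output of the MLELM $y(t)$ can be expressed as follows:

$$y(t) = W^{o} x^{(L)}(t) \qquad (3)$$

where $W^{o}$ represents the connection weight matrix between the $L$-th and output layers. The matrix $W^{o}$ is trained offline using a precollected dataset of the desired input–output sets to minimise the objective function $\left\| Y_d - W^{o} X^{(L)} \right\|^2 + \lambda_o \left\| W^{o} \right\|^2$ via ridge regression as follows:

$$W^{o} = Y_d X^{(L)\top} \left(X^{(L)} X^{(L)\top} + \lambda_o I\right)^{-1} \qquad (4)$$

where $Y_d = \left[y_d(1)\; y_d(2)\; \cdots\; y_d(N)\right]$, $y_d(t)$ is the desired output vector and $\lambda_o$ is a regularisation parameter.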
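The training procedure described in this section can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study: the `tanh` activation, the uniform random initialisation in [-1, 1], the default regularisation values, the column-wise data layout and the function names (`ridge`, `train_mlelm`, `mlelm_forward`) are all assumptions made for illustration. The autoencoder step follows the common ELM autoencoder recipe, in which each layer's weights are replaced by the transpose of the ridge solution that reconstructs the layer input from the randomly projected features.

```python
import numpy as np


def ridge(H, T, lam):
    """Return B minimising ||T - B H||^2 + lam ||B||^2 (samples are columns)."""
    n = H.shape[0]
    # Closed form B = T H^T (H H^T + lam I)^-1, evaluated with a linear solve.
    return np.linalg.solve(H @ H.T + lam * np.eye(n), H @ T.T).T


def train_mlelm(U, Yd, hidden_sizes, lam_ae=1e-3, lam_out=1e-3, seed=0):
    """Tune hidden layers with an ELM autoencoder, then fit the output weights.

    U  : (n_inputs, N) external inputs, one sample per column
    Yd : (n_outputs, N) desired outputs
    """
    rng = np.random.default_rng(seed)
    X = U
    weights = []
    for n_l in hidden_sizes:
        # Random initialisation of the layer weights (assumed uniform range).
        W_rand = rng.uniform(-1.0, 1.0, size=(n_l, X.shape[0]))
        H = np.tanh(W_rand @ X)
        # Autoencoder: reconstruct the layer input X from the random features H.
        beta = ridge(H, X, lam_ae)
        W = beta.T                      # tuned connection weight matrix
        X = np.tanh(W @ X)              # propagate the data to the next layer
        weights.append(W)
    W_out = ridge(X, Yd, lam_out)       # output weights by ridge regression
    return weights, W_out


def mlelm_forward(weights, W_out, u):
    """Forward pass for one input vector or a batch of column vectors."""
    x = u
    for W in weights:
        x = np.tanh(W @ x)
    return W_out @ x
```

Using `np.linalg.solve` rather than an explicit matrix inverse is a numerical design choice of this sketch; both evaluate the same closed-form ridge solution.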
3. Computational Experiments for Tracking Control of USV
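Considering a six-degree-of-freedom mathematical model of a USV as the control target, the equations of motion can be given as follows [12]:

$$M\dot{\nu} + C(\nu)\nu + D(\nu)\nu + g(\eta) = \tau \qquad (5)$$

where $\nu = [u, v, w, p, q, r]^{\top}$ comprises the linear (surge, sway, heave) and angular (roll, pitch, yaw) velocities in the body frame, $\eta = [x, y, z, \phi, \theta, \psi]^{\top}$ comprises the position $(x, y, z)$ and orientation $(\phi, \theta, \psi)$ in the inertial frame, $M$ is the inertia matrix including added mass, $C(\nu)$ is the Coriolis and centripetal matrix, $D(\nu)$ is the hydrodynamic damping matrix comprising linear and quadratic terms, $g(\eta)$ represents the restoring forces owing to gravity and buoyancy and $\tau$ is the vector of forces generated by the thrusters. The relationship between the velocities in the two frames is given by

$$\dot{\eta} = \begin{bmatrix} R & 0 \\ 0 & T \end{bmatrix} \nu \qquad (6)$$

where $R$ is the rotation matrix from the body to the inertial frame and $T$ is the transfer matrix, defined as follows:

$$R = \begin{bmatrix} c_\psi c_\theta & -s_\psi c_\phi + c_\psi s_\theta s_\phi & s_\psi s_\phi + c_\psi c_\phi s_\theta \\ s_\psi c_\theta & c_\psi c_\phi + s_\phi s_\theta s_\psi & -c_\psi s_\phi + s_\theta s_\psi c_\phi \\ -s_\theta & c_\theta s_\phi & c_\theta c_\phi \end{bmatrix}, \quad T = \begin{bmatrix} 1 & s_\phi t_\theta & c_\phi t_\theta \\ 0 & c_\phi & -s_\phi \\ 0 & s_\phi / c_\theta & c_\phi / c_\theta \end{bmatrix} \qquad (7)$$

with $s_\alpha = \sin\alpha$, $c_\alpha = \cos\alpha$ and $t_\alpha = \tan\alpha$.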
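As an illustration of the kinematic relationship between the two frames, the following sketch builds the rotation and transfer matrices in the standard Euler-angle (roll, pitch, yaw) form used in marine vehicle modelling and maps body-frame velocities to inertial-frame rates. The function names and the ordering of the state vectors (position first, then roll, pitch, yaw) are assumptions of this sketch rather than details taken from this study.

```python
import numpy as np


def rotation_matrix(phi, theta, psi):
    """Body-to-inertial rotation for roll phi, pitch theta, yaw psi (rad)."""
    c, s = np.cos, np.sin
    return np.array([
        [c(psi) * c(theta),
         -s(psi) * c(phi) + c(psi) * s(theta) * s(phi),
         s(psi) * s(phi) + c(psi) * c(phi) * s(theta)],
        [s(psi) * c(theta),
         c(psi) * c(phi) + s(phi) * s(theta) * s(psi),
         -c(psi) * s(phi) + s(theta) * s(psi) * c(phi)],
        [-s(theta), c(theta) * s(phi), c(theta) * c(phi)],
    ])


def transfer_matrix(phi, theta):
    """Maps body angular rates to Euler-angle rates (singular at theta = +-pi/2)."""
    c, s, t = np.cos, np.sin, np.tan
    return np.array([
        [1.0, s(phi) * t(theta), c(phi) * t(theta)],
        [0.0, c(phi), -s(phi)],
        [0.0, s(phi) / c(theta), c(phi) / c(theta)],
    ])


def eta_dot(eta, nu):
    """Inertial-frame rates from the pose eta (6,) and body velocities nu (6,)."""
    phi, theta, psi = eta[3:6]
    J = np.zeros((6, 6))
    J[:3, :3] = rotation_matrix(phi, theta, psi)
    J[3:, 3:] = transfer_matrix(phi, theta)
    return J @ nu
```

For example, with a yaw angle of 90 degrees and zero roll and pitch, a pure surge velocity in the body frame maps to motion along the inertial y axis.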
The heave, roll, and pitch of surface vehicles can often be neglected or passively controlled because these vehicles usually operate with small roll and pitch angles. Consequently, the USV is assumed to have linearised static stability in heave, roll, and pitch about the steady-state equilibrium of gravity and buoyancy. The USV employs two stern thrusters and one bow thruster to control its position and attitude. The thrust vector is related to the force vector through an allocation matrix whose entries depend on the offset distances from the body centre to the stern and bow thrusters. Herein, reversible thrust with upper and lower limits is assumed.
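The mapping between thruster outputs and the generalised surge force, sway force and yaw torque can be sketched as below. The geometry is an assumption made for illustration only: two stern thrusters aligned with the surge axis at lateral offsets of plus and minus `l_s`, and one bow thruster acting along the sway axis at a longitudinal offset `l_b`. The function names, the sign convention and the thrust limits `t_min` and `t_max` are hypothetical and not taken from this study.

```python
import numpy as np


def allocation_matrix(l_s, l_b):
    """Assumed map from thrusts [stern 1, stern 2, bow] to [surge, sway, yaw]."""
    return np.array([
        [1.0, 1.0, 0.0],    # both stern thrusters push along surge
        [0.0, 0.0, 1.0],    # the bow thruster pushes along sway
        [l_s, -l_s, l_b],   # yaw torque from the lever arms
    ])


def allocate_thrust(tau_des, l_s, l_b, t_min, t_max):
    """Solve for thruster commands, then clip to the reversible thrust limits."""
    B = allocation_matrix(l_s, l_b)
    thrust = np.linalg.solve(B, np.asarray(tau_des, dtype=float))
    thrust = np.clip(thrust, t_min, t_max)
    # The achieved forces differ from the request once any thruster saturates.
    return thrust, B @ thrust
```

When a command saturates, the achieved force vector returned by `allocate_thrust` no longer equals the request, which is one way that thrust saturation during large control errors shows up in simulation.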
A two-loop control architecture is adopted for USV trajectory tracking. The first loop performs position control and generates the desired surge and sway forces according to the reference trajectory. The second loop performs attitude control and generates the desired yaw torque so that the vehicle orientation aligns with the desired yaw angle given by the line-of-sight guidance law. Separate MLELMs are used for the position and attitude controllers. The MLELM constituting the position controller, hereafter referred to as MLELM1, takes an external input based on the $x$- and $y$-direction position errors in the inertial frame, which are defined as the difference between the reference position and the USV’s position. The outputs of MLELM1 are assigned as the desired surge and sway forces. The other MLELM, which constitutes the attitude controller and is hereafter referred to as MLELM2, takes an external input based on the attitude error in the inertial frame, which is defined as the difference between the reference angle and the USV’s orientation. The output of MLELM2 is assigned as the desired yaw torque.
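The error signals that feed the two loops can be sketched as follows. This is only one simple form of a line-of-sight law, assumed here for illustration: the reference yaw angle points from the current position towards the reference position, and the attitude error is wrapped to the interval from -pi to pi. The guidance law actually used in this study and the full composition of the MLELM input vectors are not reproduced, and the function names are hypothetical.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def tracking_errors(ref_xy, pos_xy, yaw):
    """Return the x error, y error and wrapped yaw error in the inertial frame."""
    e_x = ref_xy[0] - pos_xy[0]
    e_y = ref_xy[1] - pos_xy[1]
    # Assumed line-of-sight reference: heading towards the reference position.
    yaw_ref = math.atan2(e_y, e_x)
    e_yaw = wrap_angle(yaw_ref - yaw)
    return e_x, e_y, e_yaw
```

Wrapping the yaw error avoids a spurious full-turn command when the reference and actual angles lie on opposite sides of the plus or minus pi boundary.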
Both MLELMs are pre-trained offline using datasets collected through point-to-point (PTP) control of the USV, wherein proportional–integral–derivative controllers with empirically tuned gains are employed for both position and attitude control. In addition, the MLELMs undergo online training to compensate for errors during the control process. The MLELMs are implemented as discrete-time models with a fixed sampling interval, and online training is conducted at each sampling instant to minimise a separate cost function for each of MLELM1 and MLELM2.
In the numerical experiments, the mass of the USV was set to 30 kg, and the moments of inertia about the three axes of the body frame were set to 1.2, 1.3, and 1.5. A disturbance force vector drawn from a uniform distribution over a bounded range was added to Equation (5). In the controller, MLELM1 comprised seven input and two output units, while MLELM2 comprised four input units and one output unit. The number of hidden layers was identical for both networks, and the number of neurons per hidden layer in MLELM2 was set to half that of MLELM1 in view of the numbers of inputs and outputs of the two networks. To account for the neuron threshold, a constant value of one was added as an extra component of the external input of each network.

The training datasets for the MLELMs were collected by conducting PTP control experiments within a bounded operational area in the $x$-$y$ plane. This area was divided into a grid by splitting the $x$ and $y$ axes into six equal segments each, and a PTP experiment was conducted from the origin to every other grid vertex. This process covered 48 target positions, with the samples of each trial collected over a fixed duration of 5 s. In the control experiments, the sampling interval for the MLELMs was 50 ms. The two regularisation parameters used for offline training of both networks were set to an identical value, and the learning factors for online training of both networks were likewise set to a common value.

Tracking control experiments were conducted under the following two conditions. In the first condition, the reference trajectory comprised a polyline and a squared sine wave. In the second condition, the reference trajectory was a figure-eight-shaped curve defined using the Bernoulli lemniscate. Each condition used its own initial position and orientation of the USV, and the terminal time of the simulation was set to 20 s in both conditions.
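The experimental setup can be made concrete with a short sketch of the PTP target grid and a figure-eight reference. Several details are assumptions made for illustration: the operational area is taken to be a square centred on the origin with a hypothetical half-width `half_width`, the lemniscate scale `a` is hypothetical, and the curve is assumed to be traversed exactly once over the simulation time `t_end`. Under the centred-grid assumption, six segments per axis give 7 × 7 = 49 vertices, and excluding the origin leaves the 48 target positions reported here.

```python
import numpy as np


def ptp_targets(half_width, segments=6):
    """Grid vertices of a square area centred on the origin, origin excluded."""
    ticks = np.linspace(-half_width, half_width, segments + 1)
    return [
        (float(x), float(y))
        for x in ticks
        for y in ticks
        if not (np.isclose(x, 0.0) and np.isclose(y, 0.0))
    ]


def lemniscate_reference(t, t_end, a):
    """Point on a Bernoulli lemniscate, traversed once over [0, t_end]."""
    s = 2.0 * np.pi * t / t_end
    d = 1.0 + np.sin(s) ** 2
    return a * np.cos(s) / d, a * np.sin(s) * np.cos(s) / d
```

The parametrisation used in `lemniscate_reference` satisfies the implicit equation (x² + y²)² = a²(x² − y²), which is the defining relation of the Bernoulli lemniscate up to the choice of scale.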
Figure 1 shows the simulation results for tracking control using the MLELM-based controller under the first condition. The controller configuration used three hidden layers, with 200 and 100 neurons per layer for MLELM1 and MLELM2, respectively. Despite attitude errors at the point where the reference trajectory changed, small position errors, and thrust saturation during large control errors, the controller ensured that the USV followed the reference trajectory. This result confirms the feasibility of using MLELMs for USV tracking control.
Figure 2 shows an evaluation of the effect of hidden layer configurations on controller performance using the mean squared error (MSE) of the position and attitude of the USV. Although increasing the number of neurons enhances control performance, the number of layers does not affect control performance when sufficient neurons are provided. This indicates that when using MLELMs, deep architectures are not necessarily useful for accomplishing this task.
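A performance measure of this kind can be computed as in the sketch below. The exact definition of the MSE used in this study is not stated, so the choices here are assumptions: the position MSE is the mean squared Euclidean distance in the horizontal plane, the attitude MSE uses the yaw error wrapped to the interval from -pi to pi, and each row of the input arrays holds one sample of x, y and yaw. The function name is hypothetical.

```python
import numpy as np


def tracking_mse(reference, actual):
    """Return (position MSE, yaw MSE) for arrays of shape (N, 3): x, y, yaw."""
    reference = np.asarray(reference, dtype=float)
    actual = np.asarray(actual, dtype=float)
    pos_err = reference[:, :2] - actual[:, :2]
    yaw_err = reference[:, 2] - actual[:, 2]
    # Wrap the yaw error so that angles near +pi and -pi count as close.
    yaw_err = np.arctan2(np.sin(yaw_err), np.cos(yaw_err))
    pos_mse = float(np.mean(np.sum(pos_err ** 2, axis=1)))
    yaw_mse = float(np.mean(yaw_err ** 2))
    return pos_mse, yaw_mse
```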
Figure 3 and Figure 4 present the simulation results for tracking control under the second condition and the relationship between the hidden layer configurations and control performance, respectively. The MLELMs in Figure 3 used a network topology identical to that of the first condition (Figure 1). Although the trajectory’s continuously changing curvature induced attitude errors and resulting position error fluctuations, the MLELM-based controller completed the tracking task. This result demonstrates the adaptability of the MLELM-based controller. The effect of the hidden layer configurations on control performance exhibits a trend similar to that observed under the first condition (Figure 4).