1. Introduction
The development of unmanned aerial vehicles (UAVs), specifically quadrotors, has experienced an exponential rise in the last decade owing to their remarkable versatility in a wide range of applications such as surveillance, logistics, search and rescue operations and environmental and scientific studies [
1,
2,
3]. This rapid rise can be attributed to the advantages that quadrotors offer: vertical takeoff and landing, hovering stability, simplicity of design and maneuverability for fast direction changes and accurate positioning. The control of quadrotor systems is a fundamentally difficult task that is characterized by its natural underactuation (four control inputs for six degrees of freedom), strong coupling effects between the rotational axes, strong non-linearities in aerodynamics, fast response demands and operational constraints such as motor saturation and energy constraints [
4].
The linear quadratic regulator (LQR) is the traditional optimal control solution for stabilizing complex linear systems like quadrotor attitude dynamics, ensuring guaranteed closed-loop stability, robustness margins and efficient real-time implementations for embedded flight control systems [
5]. Nevertheless, the effectiveness of LQR control is highly sensitive to the choice of weighting matrices
Q and
R, which determine the essential trade-off between trajectory tracking accuracy and energy efficiency. The traditional manual tuning process, which relies on trial-and-error experimentation, lacks systematic methods and often results in suboptimal solutions, especially when multiple conflicting objectives need to be satisfied simultaneously [
6]. A similar multi-criteria perspective has also been adopted in UAV controller design, where Pareto-based optimization has been used to balance multiple step-response objectives simultaneously in quadrotor tuning [
7]. This inherent shortcoming has triggered extensive research efforts in the application of meta-heuristic optimization techniques for automating and improving the parameter selection process, which marked a paradigm shift in controller design methods.
Recent studies have clearly shown the effectiveness of bio-inspired algorithms in optimizing LQR parameters by searching high-dimensional parameter spaces. Comparative studies have also shown that several population-based optimization methods, including GA, DE, PSO and GWO, can be effectively used to determine the LQR weighting matrices of quadrotor trajectory-tracking controllers in a systematic manner [
8]. Particle swarm optimization (PSO), inspired by the collective behavior of bird flocks, has been shown to be very effective in optimizing LQR weighting matrices using population-based search methods that do not require gradient computation [
5]. In addition, genetic-algorithm-based optimization has also been employed for LQR weighting-matrix selection in quadrotor control problems, further demonstrating the usefulness of meta-heuristic search in reducing the reliance on manual tuning [
9]. Comparative analyses of PSO, the flower pollination algorithm (FPA) and ant colony optimization (ACO) on the three-DOF hover platform show that FPA-based optimization outperforms others in terms of response speed with the least overshoot and hence is most useful in situations that demand fast transient response [
10]. Grey wolf optimization (GWO) has recently emerged as a successful meta-heuristic, inspired by the complex social behavior and hunting strategies of grey wolf packs, which organizes the search population into a wolf pack hierarchy where alpha, beta and delta wolves correspond to solutions of increasing quality [
11].
Comparative studies directly comparing PSO and GWO on the three-DOF hover system show clear differences in performance trade-offs: GWO-optimized LQR controllers show better disturbance rejection performance with less oscillation during external disturbances, while PSO-optimized controllers show better recovery speed with smoother voltage command trajectories. A hybrid algorithm incorporating simulated annealing (SA) with grey wolf optimization (hSA-GWO) has been proposed and tested on three-DOF systems, capitalizing on the fast convergence properties of SA and the excellent exploration abilities of GWO [
12]. More recent studies have further shown that hybrid meta-heuristic structures can improve convergence quality and robustness in LQR-based quadrotor control by combining the exploration and exploitation capabilities of different search mechanisms [
13]. Recent studies have also focused on improving the internal search mechanisms of MOGWO itself through enhanced population initialization, non-linear convergence control and archive update strategies in order to achieve better convergence accuracy and solution diversity in multi-objective engineering optimization problems [
14]. The modified monarch butterfly optimization (M2BO) algorithm with distribution functions has been specifically proposed and experimentally tested on three-DOF hover systems, showing better control performance than traditional MBO algorithms and previous stochastic optimization methods like SMDO and DSO [
15].
Sliding mode control (SMC) has been identified as a highly preferred method owing to its established robustness properties against parameter uncertainties and unmodeled dynamics, with robust implementations using combinatorial reaching laws that guarantee finite-time convergence and effectively mitigate the chattering effect using high-slope saturation functions in place of traditional sign functions [
16]. The amplified linear quadratic regulator (ALQR) method integrates linear control topology with non-linear sliding mode control dynamics, which has shown better robustness properties against noise and unmodeled dynamics when implemented on the three-DOF system [
17]. Simulink response optimization has been employed systematically to identify LQR weight matrices that minimize rise time, settling time and overshoot simultaneously, which enables automatic parameter tuning in conventional MATLAB settings [
18]. In the case of systems with state delays and actuator faults, enhanced guaranteed cost control with quantum adaptive control ensures robust stability properties while preserving tracking performance.
Hamilton–Jacobi–Bellman (HJB) equation-based optimal control represents a sophisticated approach to multi-rotor attitude control that explicitly handles system non-linearities, with multiple design methods based on the HJB equation including linear control with stability guarantees for non-linear systems and non-linear suboptimal control techniques that have been developed and experimentally evaluated on the three-DOF Hover [
19]. The experimental evaluation reveals important practical insights: linear control methods often outperform theoretically more sophisticated non-linear strategies in real-world implementation, suggesting that the practical trade-offs between theoretical optimality and implementation complexity must be carefully considered in control system design. Artificial neural networks (ANNs) have been employed to approximate complex system non-linearities, simplifying the design of both LQR and SMC controllers for highly non-linear systems [
20]. Sophisticated non-linear control approaches for twin rotor MIMO systems (TRMSs) representing two-DOF helicopter configurations have been developed, including dual boundary conditional integral backstepping control (DBCIBC) that provides an innovative modification of integral backstepping techniques to ensure efficient asymptotic output regulation without degrading transient response [
21]. This approach decouples the helicopter-like system into vertical and horizontal subsystems with cross couplings considered as uncertainties, enabling robust output regulation in the presence of system uncertainties and external disturbances.
Stochastic optimization algorithms have been employed to fine-tune feedback gains for multi-rotor systems in stochastic environments. Ates et al. employed stochastic multi-parameter divergence optimization (SMDO) and discrete stochastic optimization (DSO) to improve the stability of micro-platforms and showed the effectiveness of their approach over traditional LQR methods using probabilistic convergence and robustness guarantees, which are highly beneficial for systems that must function in uncertain environments. Fuzzy logic controllers with meta-heuristic optimization of parameters have been shown to be effective in model uncertainty while ensuring stability margins [
22].
Besides control design, a real-world quadrotor system must also address fault detection, diagnosis and fault tolerance. Methods of fault diagnosis based on generalized Hamiltonian system models have been proposed for robotic systems and directly applied to two-DOF helicopter robots for the effective detection and isolation of particular failure modes by decomposing the original system into subsystems that are sensitive to particular faults [
23]. Passive fault-tolerant controllers based on robust control methods have been proposed and tested on two-DOF robotic helicopters by using linear matrix inequalities to modify proportional-derivative control methods and adding non-linear terms to enhance the overall control accuracy [
24].
Recent developments in machine learning and adaptive control have brought about new ideas for designing quadrotor controllers. Learning-based robust trajectory tracking control of two-DOF helicopter systems using gradient descent-based learning control laws aims to minimize cost functions associated with desired closed-loop error dynamics of non-linear systems, where the learning ability of the controller makes it more suitable for dealing with system uncertainties and unknown disturbances during online flight operations [
25]. Reinforcement learning (RL)-based approaches, such as deep deterministic policy gradient (DDPG) algorithms, have been successfully utilized for online fine-tuning of proportional-derivative controller parameters, thereby filling the large gap between simulation-trained controllers and real-world quadrotor control applications [
26].
State estimation plays an ever more important role with increasing system complexity, as in real-world applications, not all states of the system can be measured directly by the sensors on board, which has a profound effect on control system performance and closed-loop stability. Observer designs of both full and reduced order have been successfully employed on two-DOF helicopter models to stabilize the system and achieve real-time state estimation [
27]. For two-DOF systems with strong non-linearities, neural network-based observers have been designed that do not need any prior knowledge of the system dynamics, employing two-layer neural networks to model the strong non-linearities of complex systems and state estimation methods become even more important for three-DOF systems, where the higher dimensionality makes it increasingly impractical to measure all states directly [
28].
The three-DOF hover platform is not only used for research purposes but is also a valuable educational platform for teaching advanced control concepts to students and researchers. Educational platforms for non-linear control systems focus on the improvement of student motivation and learning through specific curricula that include comprehensive experimental approaches and the utilization of advanced platforms such as the Quanser hover allows students and researchers to gain conceptual knowledge while also providing hands-on experience with advanced control implementations [
29]. Laboratory manuals for MATLAB/Simulink users include detailed information on how to derive state-space models of open-loop systems, as well as the design and implementation of LQR-based state-feedback controllers and the simulation of systems to ensure stabilization within realistic constraints [
30].
Although several meta-heuristic approaches have previously been reported for tuning LQR weights on these kind of systems, the purpose of this study is not to claim a universally superior controller. Instead, the novelty of this work lies in examining the applicability of a multi-objective grey wolf variant for systematic LQR weight selection under practical implementation constraints. In particular, a multi-objective grey wolf optimizer (MOGWO) is used to treat roll, pitch and yaw tracking errors simultaneously as a three-objective problem. Moreover, the tuning and evaluation are carried out within the same closed-loop MATLAB/Simulink environment so that the reported designs reflect feasible implementation conditions.
Accordingly, the contributions of this paper are as follows: (i) a reproducible MOGWO-LQR tuning workflow for the coupled three-DOF hover attitude model; (ii) four independent multi-objective runs based on standard error indices, each defined concurrently for roll, pitch and yaw; and (iii) a Pareto-set and knee-point-based selection of a single implementable design that is subsequently assessed through step and scenario tracking simulations.
The overall workflow of the proposed methodology is summarized in the flowchart presented in
Figure 1.
2. Three-DOF Hover System Description and Modeling
The three-DOF hover system has emerged as the standard experimental testbed for validating flight control algorithms in a safe laboratory environment, enabling systematic investigation of control performance under realistic constraints, as shown in
Figure 2. This electromechanical system accurately models the rotational dynamics characteristic of quadrotor and tandem helicopter aircraft by controlling three critical angular degrees of freedom which are the pitch (
), roll (
) and yaw (
) angles through independent motor-driven propeller mechanisms [
30,
31]. The three-DOF platform represents a significant advancement over earlier two-DOF helicopter systems by enabling comprehensive investigation of three-axis attitude dynamics with their complex interdependencies [
32,
33].
The schematic diagram of the three-DOF hover system is shown in
Figure 3. Throughout this study, the following modeling conventions are adopted to ensure consistent sign definitions: the system is horizontal when
and
. Yaw angle increases,
, when the body rotates in the CCW direction. Pitch angle increases,
, when the body rotates the CCW direction. Roll angle increases,
, when the body rotates the CCW direction.
When a positive voltage is applied to any motor, a positive thrust force is generated and the corresponding propeller assembly rises. The thrust forces produced by the front, back, right and left motors are denoted by , , and , respectively and the applied motor voltages are , , and . Pitch motion is primarily governed by the differential thrust between the front and back motors, while roll motion is driven mainly by the differential thrust between the right and left motors. Accordingly, the pitch angle increases when and the roll angle increases when .
The rotational dynamics around each axis are expressed using the general form given in Equation (
1).
where
is the angular displacement,
L is the distance from the pivot to each motor along the corresponding axis,
J is the equivalent moment of inertia about the axis and
denotes the differential thrust force.
Using the schematic diagram of the pitch axis in
Figure 4, the pitch and similarly roll equations of motion are given in Equations (
2) and (
3), respectively.
where
is the thrust force constant,
is the front motor voltage,
is the back motor voltage,
is the right motor voltage,
is the left motor voltage,
is the pitch angle,
is the roll angle,
is the moment of inertia about the pitch axis and
is the moment of inertia about the roll axis.
The yaw motion is caused by the imbalance of reaction torques in Equation (
4) generated by the two clockwise and two counter-clockwise rotors shown in
Figure 5. Assuming a linear mapping between motor voltage and propeller torque,
, the yaw axis equation of motion can be written in terms of applied motor voltages as in Equation (
5).
The resulting linear model is formulated in the standard state-space form
where the state, input and output vectors are defined in Equations (
8) and (
10); the corresponding matrices
are given in Equations (
11) and (
12).
The mathematical model considered in this study, together with the associated physical parameters, is based on the Quanser three-DOF hover user documentation [
30]. Therefore, the governing equations presented in this section are adopted from this reference and are provided to describe the system model used for controller design and simulation. The physical parameters employed in the model are summarized in
Table 1.
3. LQR Controller Design
The design of a full-state feedback controller for regulating the attitude of the three-DOF hover system is presented in this section. Using the linear state-space model derived in the previous section, an LQR-based gain matrix is computed to track the desired yaw, pitch and roll angles while limiting the required actuator effort.
The motor command vector is chosen as the four motor voltages,
where
x is the state vector defined in (
8) and
is the state-feedback gain. The reference (setpoint) vector is defined as
which specifies the desired yaw, pitch and roll angles with zero desired angular rates. A constant bias voltage is added to each motor,
to keep the propellers spinning and to prevent command voltages from dropping below the cutoff region, thereby improving responsiveness around the hover operating point. In implementation, the non-negativity constraint in (
13) is enforced elementwise and the actuator voltage is additionally limited to the available supply range in Simulink.
Due to the low resistance of the motors, frequent switching between positive and negative voltages may lead to excessive current and can damage the power amplifier. Therefore, only non-negative voltages are applied. This constraint is also consistent with practical VTOL/helicopter systems, where propellers do not reverse direction during normal operation.
The LQR gain is computed based on the standard feedback law [
35]
where the weighting matrices are parameterized as diagonal matrices:
Although the three-DOF hover dynamics are coupled,
Q and
R are parameterized as diagonal matrices in (
17) for a practical and reproducible tuning setup. This choice reduces the number of decision variables and preserves interpretability by assigning independent penalties to each state and each motor-voltage input. It also simplifies enforcing the required properties
and
through non-negative diagonal entries, which is convenient for meta-heuristic search. Importantly, using diagonal weights does not imply a decoupled controller: since coupling is already embedded in the plant matrices
, the resulting LQR gain
K is generally dense and can still coordinate multiple states and inputs.
Allowing off-diagonal terms could further influence the obtained controller by explicitly penalizing state cross-products and input correlations. Nonzero terms would weight coupled state combinations and may change how the controller trades off cross-axis interactions, while nonzero terms could penalize correlated motor-command combinations. Such extensions would increase the search dimension and reduce interpretability and their impact would depend on the selected structure and bounds.
Here,
Q penalizes the state deviations (attitude angles and rates), whereas
R penalizes the control effort in terms of motor voltages. Using
from (
11), the gain
K is obtained by minimizing the quadratic performance index [
35]
4. MOGWO-Based Optimization of LQR Parameters
In this study, the multi-objective grey wolf optimizer (MOGWO) is used to optimize the diagonal entries of the LQR weighting matrices in (
17), following a parameterization similar to that in [
36]. In practice, the diagonal entries are tuned within the predefined bounds given in
Table 2 and for each candidate set of
, the gain matrix
K is computed via the LQR formulation in (
18). The grey wolf optimizer (GWO) was introduced by Mirjalili et al. [
37], inspired by the leadership hierarchy and hunting behavior of grey wolves, where the best three candidate solutions (
,
,
) guide the remaining wolves (
). MOGWO extends GWO to multi-objective problems by employing an external archive to store non-dominated solutions and an adaptive grid mechanism to preserve diversity in the objective space, as shown in
Figure 6 [
38].
Four separate multi-objective optimization studies were performed. In the first study, the objective vector was defined using ITAE for the roll, pitch and yaw channels:
The same three-channel multi-objective formulation was then repeated in three additional independent optimization runs using IAE, ITSE and ISE, respectively. The performance indices for each attitude channel
are defined as
where
denotes the tracking error of the
i-th attitude channel and
T is the simulation horizon. Each run yields a set of
Q,
R; a single implementable solution can then be selected from the Pareto set. The overall optimization control workflow is illustrated in
Figure 7.
The encircling behavior is modeled as
where
t is the iteration index,
is the position of a search agent and
represents the prey (leader) position. The coefficient vectors are
with
decreasing linearly from 2 to 0 and
.
The hunting mechanism updates each agent with respect to the three leaders
,
and
, as shown in Equations (28)–(34):
As highlighted in
Figure 6 of the original MOGWO study, the magnitude of
controls the exploration–exploitation behavior: when
, the agents tend to explore (diverge) and when
, they tend to exploit (converge).
Table 2 reports the baseline values adopted from the system documentation [
30] together with the minimum and maximum bounds used for the decision variables in the optimization, namely the diagonal entries of the LQR weighting matrices
Q and
R.
Randomly generate a set of candidate solutions (wolves), each encoding the selected entries of
within the bounds given in
Table 2.
For every candidate, construct Q and R, compute K via LQR, simulate the closed loop and compute the three objective values.
Identify non-dominated solutions and store them in an external archive. If a new candidate is dominated by an archive member, it is discarded; if it dominates archive members, those are removed; otherwise, it is added to the archive.
When the archive reaches its maximum capacity, apply the grid and segmentation mechanism in the objective space and remove a solution from the most crowded region, then keep/insert solutions to favor less crowded regions.
Choose , and from the archive using a diversity-aware leader-selection strategy biased toward less crowded regions.
Update each wolf position using Equations (28)–(34), then update , and .
Repeat objective evaluation, archive update, diversity control, leader selection and position update until the stopping criterion is met; finally, return the archive as the approximated Pareto set.
To select a single implementable design from the final non-dominated Pareto archive, this study employs the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) [
39,
40]. In each optimization run, the external archive is limited to
non-dominated solutions and TOPSIS is applied to rank these candidates based on their relative closeness to an ideal best and an ideal worst solution defined in the objective space. The highest-ranked candidate is taken as the compromise solution and used in subsequent simulations.
In this study, all objectives are minimization-type and correspond to the roll, pitch and yaw tracking indices for the selected performance measure. The same TOPSIS-based selection procedure was applied consistently for all four optimization runs, yielding one representative design from the final Pareto archive for each case.
All simulations were performed in a MATLAB/Simulink closed-loop hover model using the
ode1 (Euler) solver with a fixed step size of 0.001 s. For the unit-step response analyses, the simulation horizon was set to 4 s and a unit step of amplitude 1 was applied separately to each attitude channel, while the remaining references were kept at zero. For the multi-axis validation scenario, the simulation horizon was 20 s, with reference commands of 4° for roll, 4° for pitch and 5° for yaw. All states were initialized to zero. Each motor voltage was independently constrained within the range of 0–24 V, with negative control commands clipped to 0 V and values above 24 V clipped to 24 V. No explicit fixed random seed was imposed and the stochastic search process followed MATLAB’s current random-number-generator state. To facilitate reproducibility, the MOGWO implementation followed the standard archive-grid mechanism and leader selection strategy described in the original MOGWO study [
38].
Table 3 lists the optimizer parameters used to evaluate each candidate
through closed-loop MATLAB/Simulink runs.
Following the above procedure, four independent MOGWO runs are completed for the ITAE-, IAE-, ITSE- and ISE-based objective formulations. The resulting optimized diagonal entries of the LQR weighting matrices
Q and
R are reported in
Table 4. These values are subsequently used in the closed-loop simulations.
5. Results and Discussion
The results presented in this section summarize the simulation outcomes of the proposed MOGWO-LQR tuning framework and discuss its applicability to LQR weight selection for a coupled three-axis hover-type attitude system under practical actuator constraints. The discussion is structured to first present the step-response behavior of the optimized controllers then report the corresponding error index values and finally compare the obtained responses with baseline settings and representative meta-heuristic approaches reported in the literature.
Figure 8 presents the step responses obtained with the LQR gains computed from the MOGWO-optimized
sets, where the roll, pitch and yaw channels are evaluated under the same voltage saturation and non-negativity constraints used during tuning. These plots provide a direct view of the transient behavior produced by the selected compromise solutions for each objective formulation.
Overall, the optimized designs yield stable tracking responses with channel-dependent trade-offs. While some solutions emphasize faster convergence in a given axis, others exhibit smoother transients with reduced oscillation. This behavior is expected because the knee-point selection balances three coupled attitude channels rather than prioritizing a single axis.
To quantify the trends observed in the step responses, the corresponding error-based performance indices computed over the simulation horizon are summarized in
Table 5. The reported values represent the objective function outcomes associated with the step responses in
Figure 8.
The results confirm that changing the objective formulation leads to different compromise solutions across roll, pitch and yaw. Improvements in one channel or index do not necessarily translate into uniform reductions in the remaining indices, which is consistent with the coupled nature of the plant and the multi-objective selection of .
For a broader perspective,
Figure 9 compares the step responses of the proposed MOGWO-tuned LQR designs, with representative results reported in the literature using GA, PSO, SA and GWO, as well as the baseline LQR settings provided in the system documentation. This comparison is intended to contextualize the obtained responses rather than to claim a universally superior method, since the compared studies may differ in tuning settings and evaluation conditions.
The comparison indicates that the MOGWO-based tuning can produce step responses that are competitive in terms of transient behavior across the three axes. At the same time, differences among methods highlight that the final response characteristics depend strongly on how the objective functions, constraints and selection criteria are defined.
To complement the visual comparison in
Figure 9,
Table 6 reports standard step-response metrics for the considered controllers. The table enables a more direct assessment of transient characteristics such as settling time and overshoot across roll, pitch and yaw.
The numerical metrics support the qualitative observations from
Figure 9. In general, the proposed tuning framework can reduce selected transient measures in some axes while maintaining feasible responses.
Finally,
Figure 10 illustrates the system response under a representative multi-axis scenario adopted from the system documentation, comparing the proposed approach with the baseline and literature-based controllers [
12,
30].
As observed in
Figure 10, axis coupling leads to cross-axis interactions that cannot be fully eliminated by independent tuning of single-axis behavior. The proposed MOGWO-based tuning provides a practical way to search for
choices that maintain acceptable multi-axis behavior in the presence of these interactions. This figure is particularly useful for highlighting coupling effects: when one axis is commanded, the induced motion may act as a disturbance on the other axes.
The motor-voltage profiles corresponding to the step-response evaluations are shown in
Figure 11. The four objective formulations lead to slightly different transient voltage demands, especially during the initial response where the controller reacts to the reference change. Despite these differences, all cases exhibit similar steady-state voltage levels and remain within the enforced actuator constraints, indicating that the selected MOGWO-based designs achieve the desired tracking behavior with comparable voltage usage across the front, back, right and left motors. A quantitative comparison of the motor-voltage signals is also provided in
Table 7, where the minimum, peak, mean and RMS voltage values confirm that the overall actuator effort remains at a comparable level for all four controllers.
In addition to time-domain responses, it is useful to visualize how the multi-objective search distributes candidate solutions in the objective space.
Figure 12 shows the three-dimensional Pareto distributions obtained from the four independent optimization runs, where each run considers the roll, pitch and yaw indices simultaneously under the same simulation settings and actuator constraints. The red points denote the non-dominated solutions retained in the archive and the blue marker indicates the selected compromise (knee-point) solution used for subsequent simulations.
As seen in
Figure 12, each objective formulation leads to a distinct spread of non-dominated solutions, reflecting different trade-offs among roll, pitch and yaw tracking errors. The presence of a well-populated set of alternatives suggests that the optimizer can produce multiple feasible
candidates rather than converging to a single narrow region. The knee-point selection provides a practical way to choose one implementable design from the archive by balancing the three axes, which is particularly relevant for this coupled system where improving one channel may influence the others.
6. Conclusions
This study investigates the applicability of a multi-objective meta-heuristic approach for systematic tuning of LQR weighting matrices in a coupled three-DOF hover attitude control problem. The diagonal entries of the Q and R matrices are treated as decision variables within predefined bounds and a multi-objective grey wolf optimizer is integrated with closed-loop MATLAB/Simulink simulations to generate feasible candidate solutions. Rather than relying on manual trial-and-error, the proposed workflow produced non-dominated Pareto sets and enabled the selection of a single implementable design via a compromise knee-point selection.
The results show that the multi-objective formulation provides a structured way to balance roll, pitch and yaw tracking behavior simultaneously, which is particularly relevant for this system due to axis coupling. Across the independent runs based on standard error indices, the optimizer yielded different trade-off distributions in the objective space and corresponding LQR weight patterns, indicating that the selected objective definition can influence the resulting compromise design. When compared with baseline settings and representative single objective meta-heuristic designs reported in the literature, the MOGWO-tuned solutions exhibited competitive step and scenario-tracking behavior while remaining consistent with the implemented non-negativity and saturation constraints. These outcomes suggest that multi-objective grey wolf optimization can be used as a practical tuning tool for LQR weights in hover-type attitude systems, without claiming universal superiority over alternative methods.
Future work will therefore consider broader operating conditions and additional performance criteria, such as explicit penalties on control activity, saturation frequency and robustness to parameter variations. Moreover, other multi-objective optimizers and selection strategies can be explored and compared within the same framework. Finally, extending the approach to alternative control structures such as gain-scheduled LQR, control, MPC, or non-linear robust controllers and validating the tuned designs through real-time experiments on the three-DOF hover setup could be directions to further examine practicality and generalization.