Article

Transfer-Learning-Enhanced Regression Generative Adversarial Networks for Optimal eVTOL Takeoff Trajectory Prediction

1
Department of Aerospace Engineering, University of Illinois Urbana-Champaign, Champaign, IL 61801, USA
2
Department of Mechanical and Aerospace Engineering, Missouri University of Science and Technology, Rolla, MO 65409, USA
*
Author to whom correspondence should be addressed.
Electronics 2024, 13(10), 1911; https://doi.org/10.3390/electronics13101911
Submission received: 19 March 2024 / Revised: 17 April 2024 / Accepted: 11 May 2024 / Published: 13 May 2024
(This article belongs to the Special Issue Unmanned Aircraft Systems with Autonomous Navigation, Volume II)

Abstract

Electric vertical takeoff and landing (eVTOL) aircraft represent a crucial aviation technology to transform future transportation systems. The unique characteristics of eVTOL aircraft include reduced noise, low pollutant emission, efficient operating cost, and flexible maneuverability, which in the meantime pose critical challenges to advanced power retention techniques. Thus, optimal takeoff trajectory design is essential due to immense power demands during eVTOL takeoffs. Conventional design optimizations, however, adopt high-fidelity simulation models in an iterative manner resulting in a computationally intensive mechanism. In this work, we implement a surrogate-enabled inverse mapping optimization architecture, i.e., directly predicting optimal designs from design requirements (including flight conditions and design constraints). A trained inverse mapping surrogate performs real-time optimal eVTOL takeoff trajectory predictions with no need for running optimizations; however, one training sample requires one design optimization in this inverse mapping setup. The excessive training cost of inverse mapping and the characteristics of optimal eVTOL takeoff trajectories necessitate the development of the regression generative adversarial network (regGAN) surrogate. We propose to further enhance regGAN predictive performance through the transfer learning (TL) technique, creating a scheme termed regGAN-TL. In particular, the proposed regGAN-TL scheme leverages the generative adversarial network (GAN) architecture consisting of a generator network and a discriminator network, with a combined loss of the mean squared error (MSE) and binary cross-entropy (BC) losses, for regression tasks. In this work, the generator employs design requirements as input and produces optimal takeoff trajectory profiles, while the discriminator differentiates the generated profiles and real optimal profiles in the training set. 
The combined loss supports generator training in two respects: the MSE loss minimizes the differences between generated profiles and their training counterparts, while the BC loss drives the generated profiles to share analogous patterns with the training set. We demonstrated the utility of regGAN-TL on optimal takeoff trajectory designs for the Airbus A³ Vahana and compared its performance against representative surrogates, including the multi-output Gaussian process, the conditional GAN, and the vanilla regGAN. Results showed that regGAN-TL reached the 99.5% generalization accuracy threshold with only 200 training samples while the best reference surrogate required 400 samples. The 50% reduction in training expense and reduced standard deviations of generalization accuracy achieved by regGAN-TL confirmed its outstanding predictive performance and broad engineering application potential.

1. Introduction

Electric vertical takeoff and landing (eVTOL) aircraft represent a disruptive technology in the field of aviation. The implementation of electric propulsion systems and vertical takeoff and landing (VTOL) makes eVTOL aircraft an essential component in future transportation. Compared with traditional propulsion systems, electric propulsion not only reduces carbon emission and operation noise during flight tasks but also lowers expenses due to less maintenance and no jet fuel consumption. These characteristics make eVTOL aircraft more environment-friendly and cost-effective. Furthermore, the VTOL capability reduces the need for traditional infrastructures, such as long runways, which lowers construction expenses and potentially prevents the over-exploitation of natural resources. In addition, paired with the aforementioned features, the flexible maneuverability and advanced flight control system position eVTOL aircraft as ideal candidates for package delivery, passenger-carrying air taxis, etc. However, one of the bottlenecks prohibiting eVTOL aircraft from broad applications and long-distance tasks is the considerable electrical energy consumption.
Increasingly mature technologies in the autonomous control, battery, and material fields have triggered numerous eVTOL studies [1,2], especially in their implementation as urban transportation. Several emerging concepts have given rise to distinct eVTOL aircraft classifications, including lift+cruise (e.g., Airbus CityAirbus NextGen [3] and Aurora Flight Sciences eVTOL [4]), multicopter (e.g., EHang 216 [5]), tilt-wing (e.g., Airbus A³ Vahana [6,7] and Dufour Aerospace VTOL Technology Demonstrator [8]), and tilt-rotor (e.g., Joby S4 [9]). Tilt-wing eVTOL aircraft typically possess rotational wings with propellers built on the leading edges that, as a whole, rotate upwards for VTOL and forwards for conventional cruising flights. This unique feature combines the agility of helicopters during takeoff and landing with the aerodynamic efficiency of traditional fixed-wing airplanes during cruising. Thus, the tilt-wing eVTOL tends to have better overall energy efficiency, which could expand flight ranges. In the meantime, investigating alternative energy-efficient avenues besides hardware configurations is greatly needed.
The transition phase attracts considerable attention by virtue of the high energy cost in this phase [10]. Consequently, optimizing the transition phase plays a crucial role in power retention for tilt-wing eVTOL aircraft. Kubo and Suzuki [11] proposed an innovative design for a tail-sitter VTOL-based unmanned aerial vehicle (UAV) configuration. By adding slats and flaps to the UAV, they achieved an optimal transition from hover to forward flight without obvious altitude change, which saved the energy consumption to reach the required altitudes. Li et al. [12] emphasized the optimal transition trajectory with a minimal energy cost, which allowed the tail-sitter UAV to complete both forward and backward transitions with less altitude change and operational time than using traditional linear transition methods. From the perspective of the critical constraint on the operational time limited by batteries, Pradeep and Wei [13] proposed a formulation of an optimal control problem taking energy consumption as a performance index for multi-rotor eVTOL aircraft. This formulation with numerical simulations enabled eVTOL vehicles to obtain the most energy-efficient trajectory while fulfilling the required arrival time. These results demonstrated the possibility of efficient energy consumption and maintaining safe operations for future eVTOL aircraft. However, there are still relatively few works in the literature concentrating on the takeoff trajectory optimization for tilt-wing eVTOL aircraft considering real-world situations. Specifically, an optimal eVTOL takeoff trajectory must take into consideration the aircraft performance, the payload, and passenger comfort, together with energy consumption.
Among the few studies on optimal takeoff trajectory for eVTOL aircraft, we can identify Chauhan and Martins [14], who optimized takeoff-to-cruise trajectories for the Airbus A³ Vahana [6] model with the energy consumption as the objective function. They considered passenger comfort via a maximum acceleration constraint and flight performance via a stall constraint on the wing angle of attack. They concluded that optimal takeoff trajectories typically occurred under conditions of wing stall or operations close to the stall angle of attack. Panish and Bacic [15] highlighted the significance of active flow control (AFC) in optimizing the transition trajectory for tilt-wing configurations. Their approach achieved minimum energy consumption at zero altitude variation and the leading-edge AFC effectively prevented wing aerodynamic stall by introducing a constant-altitude constraint during the cruise-to-hover maneuver and improved performance during hover-to-cruise transitions. Conventional design optimizations (including optimal takeoff trajectory designs), however, adopt physics-based simulation models, which prohibits real-time decision making [11,12].
Surrogate models represent a viable avenue for rapid decision making, in lieu of computationally intensive simulation models [16,17,18]. In contrast with conventional simulation-based optimizations, surrogate-based designs demonstrate various advantages, including computational efficiency [19,20], sensitivity analysis [21], uncertainty analysis [22,23], and reduced-space exploitation of high-dimensional design spaces [24]. The Gaussian process (GP) [25] is one of the most commonly used traditional surrogate models due to its capability and flexibility for modeling and predicting unknown functions based on observed data [26]. The concept of multi-output GP (MOGP) has emerged to handle problems with multiple outputs [27,28] by extending the concept of GP. In particular, by jointly modeling the outputs through estimating the covariance matrix (also known as correlation function), MOGP leverages the shared information among the outputs which reduces the overall modeling complexity. For more mathematical and surrogate modeling details, please refer to Section 2.3.
Deep neural networks (DNNs), as the core of deep learning, have substantially advanced state-of-the-art research in the surrogate modeling and design optimization communities. Thelen et al. [29] were able to address the high-dimensional aerodynamic and aeroelastic design cases with design space dimensions ranging from 17 to 321 through two-fold configurations. They first enabled accurate and efficient gradient-based optimization methods using analytically computed design variable derivatives and then further alleviated the computational burden by a multi-fidelity modeling algorithm. Tao and Sun [30] developed a multi-fidelity surrogate-based optimization framework using deep belief networks. Their results showed that the multi-fidelity surrogate achieved remarkable optimization efficiency and effectiveness according to airfoil and wing design under uncertainty results. Renganathan et al. [31] incorporated the predictive power of DNNs and confidence interval of GP for surrogate modeling in surrogate-based aerodynamic design optimization. The proposed surrogate enabled relatively high-dimensional Bayesian optimization on aerodynamic design and outperformed adjoint-based optimization in terms of efficiency.
An alternative branch for addressing the “curse of dimensionality”, which prohibits large-scale practical applications, is dimensionality reduction. O’Leary-Roseberry et al. [32] developed adaptive residual networks on reduced-dimensional space, exploiting principal component analysis to predict optimal designs directly based on design requirements. This work successfully demonstrated the outstanding performance of reduced-space surrogates over full-space surrogates on aerodynamic wing design cases. Goodfellow et al. [33] developed a state-of-the-art generative model, generative adversarial networks (GANs). GANs consist of a generator network and a discriminator network, which compete with each other. Specifically, the generator aims to generate similar shapes to existing training data patterns while the discriminator aims to differentiate between generated and existing shapes. Therefore, by automatically filtering out unrealistic shapes and generating only realistic designs, the use of GANs enabled implicit dimensionality reduction when fed with realistic airfoil or wing design shapes [34]. Du et al. [35] developed a B-spline-based generative adversarial network (BSplineGAN) for airfoil intelligent parameterization with the UIUC airfoil database as training data. The design space can be reduced automatically by BSplineGAN while sustaining sufficient shape variations, which was verified by fitting optimizations towards arbitrary UIUC airfoils. They constructed DNN surrogates on the reduced space exploiting BSplineGAN and realized rapid aerodynamic designs.
In addition to implicit dimensionality reduction, GAN variants also handle regression tasks. Du and Martins [36] developed a new multi-fidelity surrogate modeling architecture using a super-resolution GAN (SRGAN) for predicting airfoil pressure distributions based on low-fidelity counterparts. Specifically, high-fidelity pressure distributions were predicted by the SRGAN generator while the discriminator sought to distinguish between predicted and high-fidelity pressure distributions. Thus, training the generator minimized the difference between the super-resolution predictions and corresponding high-fidelity data and maximized the similarity between the predictions and the high-fidelity training set. Their results showed that SRGAN outperformed low-fidelity simulations and a direct DNN by accurately capturing the locations and magnitudes of strong shocks. Mirza and Osindero [37] incorporated conditioning information together with random inputs into the generative process for regression tasks, creating the conditional GAN (cGAN). cGAN aimed to generate shapes that are not only similar to existing data but also adhered to the specified conditions, which served as labels in regression tasks. Aggarwal et al. [38] successfully demonstrated the capability of a cGAN model for regression problems by conducting experiments on an F-16 ailerons dataset. cGAN was able to predict the aileron control input, which was described by 40 continuous inputs, including climb rates, pitches, and curvatures of the flight trajectory. However, although SRGAN and cGAN demonstrated the potential for regression tasks, neither model directly realized the mapping from original input space (such as design requirements) to output space (such as optimal designs).
In addition, it is important to implement multi-disciplinary design optimization (MDO) and leverage uncertainties and errors in the design process and operation for eVTOL aircraft. For system-level eVTOL designs, Ha et al. [39] assembled a large-scale MDO to minimize the required power. By merging thrust margin as a model uncertainty, significant improvements in robustness were made with a relatively minimal cost in efficiency. Chinthoju et al. [40] implemented a distributed MDO method, Analytical Target Cascading (ATC), which was capable of considering complex interactions between different disciplines and subsystems effectively. This feature further improved the design of the eVTOL aircraft. ATC showed the potential of utilizing customized optimization algorithms for different subproblems, instead of using a general optimization solver. Rostami et al. [41] presented an implementation of a robust and efficient possibility-based design optimization method for the MDO of an eVTOL tilt-wing aircraft in the design phase. To achieve an optimal design, the uncertainties raised from early design phases due to low-fidelity calculation were evaluated and utilized to adjust the final design.
Moreover, as claimed in a document published by the Federal Aviation Administration [42], eVTOL is one of the major aircraft innovations capable of facilitating more frequent travel between desired locations, such as metropolitan areas. Urban air mobility (UAM) corridors are necessary for cooperatively managed operations. However, due to continuous growth, UAM corridors may result in exceeding initial design capacities. Moreover, when also including environmental factors, for instance, weather, obstacles, and other constraint information, UAM operators must have the ability to respond instantaneously to manage real-time operations. Therefore, on-board optimal trajectory prediction becomes crucial for eVTOL aircraft, as it provides the aircraft with efficient operations and real-time decision-making.
In this work, we propose a transfer learning (TL)-enhanced regression GAN (regGAN), termed regGAN-TL, surrogate. In particular, we implement a regGAN surrogate [43] to predict optimal takeoff trajectory designs directly from design requirements [44]. On the one hand, the regGAN surrogate adopts the GAN architecture except that the generator reads design requirements as input and predicts optimal takeoff trajectory designs. Training the generator involves a combined loss function on a contextual loss to minimize the difference between surrogate predictions and corresponding training observations and an adversarial loss for matching the predicted optimal trajectory shapes toward training data patterns. Specifically, the mean squared error (MSE) represents the contextual loss while the binary cross-entropy (BC) constitutes the adversarial loss. On the other hand, TL is a machine learning technique leveraging pre-trained models on different but related tasks [45]. Representative TL works include the multi-fidelity convolutional neural network surrogate model with TL that mapped the relation between shape parameters and aerodynamic performance [46]. Results showed that the TL-based surrogate noticeably reduced computational costs with satisfactory predictive performance compared with corresponding reference surrogate models. In this work, we first train a regGAN with zero weight on the adversarial loss, which is in effect a direct DNN, then implement TL on the trained regGAN while keeping the weight on the contextual loss as one and increasing the weight on the adversarial loss. Please refer to Section 3 for more details. 
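The combined generator loss described in this paragraph can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `adv_weight` parameter, the sample numbers, and the choice of labelling generated profiles as "real" (label 1) in the adversarial term are all hypothetical.

```python
import math

def mse(pred, target):
    """Contextual loss: mean squared error between two equal-length profiles."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for a single discriminator output in (0, 1)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))

def generator_loss(pred, target, d_prob_on_pred, adv_weight):
    """Unit weight on the MSE term, adv_weight on the adversarial BC term."""
    # Assumption: the generator is rewarded when the discriminator outputs 1 ("real").
    return mse(pred, target) + adv_weight * bce(d_prob_on_pred, 1.0)

# Stage 1 (adv_weight = 0) reduces to plain regression; the TL stage raises adv_weight.
stage1 = generator_loss([0.2, 0.4], [0.2, 0.5], d_prob_on_pred=0.5, adv_weight=0.0)
stage2 = generator_loss([0.2, 0.4], [0.2, 0.5], d_prob_on_pred=0.5, adv_weight=0.1)
```

With zero adversarial weight the loss is the MSE alone, which mirrors the statement that the first training stage is in effect a direct DNN.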
We summarize the contribution of this paper as follows: (1) We implement the new inverse mapping concept for predicting optimal eVTOL takeoff trajectory designs; (2) We propose and develop the novel regGAN-TL surrogate; (3) We realize real-time surrogate-based optimal takeoff trajectory designs for eVTOL aircraft to enrich research studies in this field.
To introduce our work, Figure 1 illustrates the prediction of optimal eVTOL takeoff trajectory profiles for the design variables. The regGAN-TL surrogate predictions are intended to match the reference profiles generated by simulation-based optimal designs. An illustration of the Airbus A³ Vahana is also provided to aid understanding of the system.
The rest of the paper is organized as follows. We elaborate on the MOGP and GAN variant setups for surrogate modeling in Section 2, which also includes the optimization framework, simulation models, and verification metrics. In Section 3, we demonstrate the use of the proposed regGAN-TL on eVTOL optimal takeoff trajectory predictions and compare results with reference surrogates. Section 4 ends this paper with conclusions and future work.

2. Methodology

This section introduces an open-source optimization toolbox, Dymos [47] within the OpenMDAO framework (https://github.com/OpenMDAO/OpenMDAO, accessed on 1 August 2022) [48], optimal control problems, and simulation models. Furthermore, the MOGP, DNN and GAN series models within TensorFlow [49] version 2 for surrogate modeling as well as surrogate verification metrics are detailed.

2.1. Dymos and Optimal Control

An open-source framework for multidisciplinary design, analysis, and optimization (MDAO), OpenMDAO [48], enables researchers to efficiently analyze and optimize complex engineering systems. This framework is fundamentally designed for gradient-based optimization, which is based on efficient coupled derivative computation and problem sparsity exploitation [48]. Its component-based architecture eases multidisciplinary model integration, facilitating seamless collaboration and reusability. OpenMDAO is particularly effective at managing data flow, coupling interdisciplinary subsystems, and solving equation systems, and it provides efficient automatic differentiation for gradient-based optimization and sensitivity analysis. Built upon the OpenMDAO framework, Dymos [47] is an advanced software toolkit for optimizing complex aerospace systems. Dymos offers additional specialized tools and algorithms for dynamic system analysis and optimization and excels in solving problems that involve the evolution of systems over time. This feature is beneficial for trajectory optimization, where several objectives and constraints need to be taken into account. In addition, Dymos possesses an intuitive interface and visualization tools for basic problem formulation and manipulation while supporting integration with external models and software. Thus, Dymos represents one of the state-of-the-art MDAO toolkits, allowing for efficient optimal control of complicated aerospace systems.
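Optimal control aims to find a control of a dynamic system such that an objective function is optimized while the constraints are satisfied. The evolution of a system state is typically governed by an ordinary differential equation (ODE) or a differential algebraic equation. Dymos characterizes all dynamics as ODEs and formulates a general optimal control problem as follows [47]:
$$
\begin{aligned}
\text{Minimize} \quad & J = f_{\text{obj}}(\mathbf{x}, t, \mathbf{u}, \mathbf{d}) \\
\text{subject to} \quad & \dot{\mathbf{x}} = f_{\text{ode}}(\mathbf{x}, t, \mathbf{u}, \mathbf{d}), \\
& t_{lb} \le t \le t_{ub}, \\
& \mathbf{x}_{lb} \le \mathbf{x} \le \mathbf{x}_{ub}, \\
& \mathbf{u}_{lb} \le \mathbf{u} \le \mathbf{u}_{ub}, \\
& \mathbf{d}_{lb} \le \mathbf{d} \le \mathbf{d}_{ub}, \\
& \mathbf{g}_{0,lb} \le \mathbf{g}_0(\mathbf{x}_0, t_0, \mathbf{u}_0, \mathbf{d}_0) \le \mathbf{g}_{0,ub}, \\
& \mathbf{g}_{f,lb} \le \mathbf{g}_f(\mathbf{x}_f, t_f, \mathbf{u}_f, \mathbf{d}_f) \le \mathbf{g}_{f,ub}, \\
& \mathbf{p}_{lb} \le \mathbf{p}(\mathbf{x}, t, \mathbf{u}, \mathbf{d}) \le \mathbf{p}_{ub},
\end{aligned}
\tag{1}
$$
where $J$ is a scalar objective function computed by $f_{\text{obj}}$, $\mathbf{x}$ is a state variable vector, $t$ is a time variable, $\mathbf{u}$ is a dynamic control vector, $\mathbf{d}$ is a fixed design parameter vector, $\dot{\mathbf{x}}$ is a temporal derivative vector of state variables governed by the ODE $f_{\text{ode}}$, $\mathbf{g}_0$ is an initial condition (at $t_0$) constraint vector, $\mathbf{g}_f$ is a final condition (at $t_f$) constraint vector, $\mathbf{p}$ is a path constraint vector, subscript $lb$ represents lower bounds, and subscript $ub$ represents upper bounds.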
Solving optimal control problems involves a transcription process, i.e., discretizing the continuous problem into a form that a nonlinear optimizer can handle. Dymos supports two types of transcriptions, explicit shooting and implicit collocation. Explicit shooting (such as the Euler method) is a numerical integration technique propagating a current state ($\mathbf{x}_i$) to the next state ($\mathbf{x}_{i+1}$) subject to given controls [50,51],
$$
\mathbf{x}_{i+1} = \mathbf{x}_i + (t_{i+1} - t_i) \cdot f_{\text{ode}}(\mathbf{x}_i, t_i, \mathbf{u}_i, \mathbf{d}),
$$
which is also known as the rectangular rule. Shooting methods are explicit since $\mathbf{x}_{i+1}$ is not used for computing a discrete value of $f_{\text{ode}}$. In contrast, implicit collocation methods separate control and state trajectories into one or multiple segments over the time range and solve the whole trajectories together. Collocation methods assume that the states ($\mathbf{x}$), controls ($\mathbf{u}$), and state rates ($\dot{\mathbf{x}}$) are continuous within each segment [47]. This feature guarantees no instantaneous changes in $\mathbf{x}$, $\mathbf{u}$, and $\dot{\mathbf{x}}$ and enables polynomial curve representations within each segment. Collocation methods place discrete state and control points in time within each segment and adopt high-order Gauss–Lobatto rules [52] to reconstruct polynomial curves. The polynomial curves approximate the entire state ($\tilde{\mathbf{x}}$) and control ($\tilde{\mathbf{u}}$) trajectories via interpolation and estimate state rates ($\dot{\tilde{\mathbf{x}}}$) via first-order derivatives of the curves. In the meantime, Dymos computes $\dot{\mathbf{x}}$ using $\tilde{\mathbf{x}}$ and $\tilde{\mathbf{u}}$ via $f_{\text{ode}}$ and calculates the difference with $\dot{\tilde{\mathbf{x}}}$ at collocation locations (center points of each time segment in this work). A nonlinear solver or optimizer is then in charge of minimizing the differences. We use the Gauss–Lobatto collocation method in this work due to its higher computational efficiency compared to explicit shooting [47].
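The explicit shooting update above can be illustrated with a short Python sketch. It is a toy under stated assumptions rather than the Dymos implementation: the helper `euler_shoot`, the first-order-lag test dynamics `lag_ode`, and the grid size are hypothetical choices made for this example.

```python
import math

def euler_shoot(f_ode, x0, t_grid, u_grid, d):
    """Propagate x[i+1] = x[i] + (t[i+1] - t[i]) * f_ode(x[i], t[i], u[i], d)."""
    xs = [x0]
    for i in range(len(t_grid) - 1):
        h = t_grid[i + 1] - t_grid[i]
        xs.append(xs[i] + h * f_ode(xs[i], t_grid[i], u_grid[i], d))
    return xs

# Hypothetical test dynamics: a first-order lag x' = d * (u - x).
# For constant u = 1 and x(0) = 0 the exact solution is x(t) = 1 - exp(-d * t).
def lag_ode(x, t, u, d):
    return d * (u - x)

n = 1000
t_grid = [i / n for i in range(n + 1)]
u_grid = [1.0] * (n + 1)
xs = euler_shoot(lag_ode, 0.0, t_grid, u_grid, d=2.0)
error = abs(xs[-1] - (1.0 - math.exp(-2.0)))  # small, and it shrinks as n grows
```

Each new state depends only on already-known quantities, which is what makes the scheme explicit.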
Gauss–Lobatto rules generalize the trapezoidal rule and Simpson’s rule to higher orders [52]. Specifically, the trapezoidal rule adopts a linear function to approximate $f_{\text{ode}}$ and Simpson’s rule adopts a quadratic function, while Gauss–Lobatto enables higher-order approximation functions. Note that $\mathbf{u}$ is of the same polynomial order as $f_{\text{ode}}$ while $\mathbf{x}$ is one order higher due to the integral operation. As suggested by Dymos, we use the quadratic Gauss–Lobatto rule, which is essentially Simpson’s integration rule, within each segment to represent $\mathbf{u}$ and, thus, Hermite cubic polynomials for $\mathbf{x}$. We compute $\mathbf{x}_c$ at $t_c = (t_i + t_{i+1})/2$ by interpolating the Hermite polynomials,
$$
\mathbf{x}_c = \frac{1}{2}\left(\tilde{\mathbf{x}}_i + \tilde{\mathbf{x}}_{i+1}\right) + \frac{t_{i+1} - t_i}{8}\left[ f_{\text{ode}}\left(\tilde{\mathbf{x}}_i, t_i, \tilde{\mathbf{u}}_i, \mathbf{d}\right) - f_{\text{ode}}\left(\tilde{\mathbf{x}}_{i+1}, t_{i+1}, \tilde{\mathbf{u}}_{i+1}, \mathbf{d}\right) \right].
$$
Then we can compute discrete values of the governing ODEs at $t_c$ by plugging in $\tilde{\mathbf{x}}_c$ and $\tilde{\mathbf{u}}_c$. We approximate $\dot{\tilde{\mathbf{x}}}_c$ by taking the first-order temporal derivatives of the Hermite cubic polynomials and equate $\dot{\tilde{\mathbf{x}}}_c$ with $f_{\text{ode}}$ at $t_c$ to achieve
$$
\mathbf{g}_{HS} = \tilde{\mathbf{x}}_i - \tilde{\mathbf{x}}_{i+1} + \frac{t_{i+1} - t_i}{6}\left[ f_{\text{ode}}\left(\tilde{\mathbf{x}}_i, t_i, \tilde{\mathbf{u}}_i, \mathbf{d}\right) + 4 \cdot f_{\text{ode}}\left(\tilde{\mathbf{x}}_c, t_c, \tilde{\mathbf{u}}_c, \mathbf{d}\right) + f_{\text{ode}}\left(\tilde{\mathbf{x}}_{i+1}, t_{i+1}, \tilde{\mathbf{u}}_{i+1}, \mathbf{d}\right) \right] = 0,
$$
which are also termed Hermite–Simpson system constraints [52]. Dymos adds the Hermite–Simpson constraints to Equation (1), and we use IPOPT (Interior Point OPTimizer) [53] within pyoptsparse [54] to solve the optimal control problem. We use 10 segments over the whole time duration, which results in 21 design variables (quadratic polynomial points) for parameterizing each control input. We enforce the continuity of $\mathbf{x}$ and $\mathbf{u}$ within Dymos by sharing the corresponding values at segment boundaries.
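To make the Hermite–Simpson constraint concrete, the sketch below evaluates the midpoint state and the defect for a single scalar segment. It is a minimal illustration under stated assumptions, not Dymos code: the function `hermite_simpson_defect` and the test ODE `cubic_ode` are hypothetical, and the node count assumes three quadratic nodes per segment with shared boundary nodes, which is consistent with the 21 points quoted above.

```python
def hermite_simpson_defect(f_ode, x_i, x_ip1, t_i, t_ip1, u_i, u_c, u_ip1, d):
    """Return the Hermite-Simpson defect g_HS for one segment (scalar state)."""
    h = t_ip1 - t_i
    f_i = f_ode(x_i, t_i, u_i, d)
    f_ip1 = f_ode(x_ip1, t_ip1, u_ip1, d)
    # Midpoint state from the Hermite cubic interpolant.
    t_c = 0.5 * (t_i + t_ip1)
    x_c = 0.5 * (x_i + x_ip1) + h / 8.0 * (f_i - f_ip1)
    f_c = f_ode(x_c, t_c, u_c, d)
    # Simpson quadrature of the state rate must match the state change.
    return x_i - x_ip1 + h / 6.0 * (f_i + 4.0 * f_c + f_ip1)

# Hypothetical ODE whose exact solution x(t) = t**3 is a cubic,
# so the defect vanishes when the end states lie on that solution.
def cubic_ode(x, t, u, d):
    return x + 3.0 * t**2 - t**3

defect_exact = hermite_simpson_defect(cubic_ode, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, None)
defect_wrong = hermite_simpson_defect(cubic_ode, 0.0, 1.2, 0.0, 1.0, 0.0, 0.0, 0.0, None)

# Control nodes: two segment ends plus one midpoint, boundaries shared.
num_segments = 10
nodes_per_control = 2 * num_segments + 1
```

A nonzero defect, as in `defect_wrong`, is what the optimizer has to remove by adjusting the discrete states and controls.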

2.2. Simulation Models

In this work, the simulation models include aerodynamics, propulsion, propeller–wing interaction, and dynamics to govern eVTOL aircraft attitudes.

2.2.1. Aerodynamics

The aerodynamics models assume that there is no flow interaction between wings. The forward and rear wings have the same reference area and the rotations of both wings are set to be the same, which leads to the same angle of attack, lift, and drag by the two wings. Separated-flow conditions are considered during the vertical to horizontal transition phase. Tangler and Ostowari [55] developed a model for wing aerodynamics to predict the lift and drag beyond the linear-lift region. The post-stall lift coefficient is expressed as
$$
C_L = A_1 \sin 2\alpha + A_2 \frac{\cos^2 \alpha}{\sin \alpha},
$$
where
$$
A_1 = \frac{C_1}{2},
$$
$$
A_2 = \left( C_{L_s} - C_1 \sin \alpha_s \cos \alpha_s \right) \frac{\sin \alpha_s}{\cos^2 \alpha_s},
$$
and
$$
C_1 = 1.1 + 0.018\, AR,
$$
where $\alpha$ is the wing angle of attack, $\alpha_s$ is the angle of attack at stall, $C_{L_s}$ is the lift coefficient at stall, and $AR$ is the wing aspect ratio.
At wing angles of attack between 27.5 and 90 degrees, the drag coefficient is given by
$$
C_D = B_1 \sin \alpha + B_2 \cos \alpha,
$$
where
$$
B_1 = C_{D_{\max}},
$$
$$
B_2 = \frac{C_{D_s} - C_{D_{\max}} \sin \alpha_s}{\cos \alpha_s},
$$
and
$$
C_{D_{\max}} = \frac{1.0 + 0.065\, AR}{0.9 + t/c},
$$
where $C_{D_s}$ is the drag coefficient at stall, and $t/c$ is the airfoil thickness-to-chord ratio. For the post-stall drag coefficient at a wing angle of attack below 27.5 degrees, the equation is
$$
C_D = 0.008 + 1.107\, \alpha^2 + 1.792\, \alpha^4,
$$
where $\alpha$ is in radians.
The NACA 0012 symmetric airfoil is used as the cross-section due to the configuration of interest. Additionally, the well-known finite wing corrections from lifting-line theory for unswept wings in incompressible flow are utilized to correct the lift-curve slope prior to stalling [14] as
$$
\alpha_{\text{wing}} = \frac{\alpha_{\text{airfoil}}}{1 + \alpha_{\text{airfoil}} / (\pi \cdot AR \cdot e)},
$$
where $\alpha_{\text{wing}}$ is the finite-wing lift-curve slope, $\alpha_{\text{airfoil}}$ is the airfoil lift-curve slope, and $e$ is the span efficiency factor. The pre-stall lift curve is assumed to be linear and the wing stall angle of attack is 15 degrees.
Before stall, induced drag from lifting-line theory is added to the parasite drag to obtain the total drag of the wing,
$$
C_{D_i} = \frac{C_L^2}{\pi \cdot AR \cdot e},
$$
where $C_{D_i}$ is the induced drag coefficient, $C_L$ is the wing’s lift coefficient, and $AR = 8$ for each wing of the configuration.
For the fixed landing gear and the fuselage, an additional drag term based on an assumed drag area is included and taken to be independent of the free-stream angle of attack. The induced drag for the configuration is given by
$$
D_{\text{induced}} = 2 \cdot \frac{1.4\, L_{\text{wing}}^2}{\pi q b^2 \cdot 0.95} = 2 \cdot \frac{L_{\text{wing}}^2}{\pi q b^2 \cdot 0.68},
$$
where $L_{\text{wing}}$ is lift per wing, $q$ is the free-stream dynamic pressure, and $b$ is the wing span.
The fuselage is assumed not to contribute any additional lift and the wings are located forward and aft of the center of gravity, so that, without any extra tuning, the moments of wings are automatically balanced. Please refer to Chauhan and Martins [14] for more details.
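The wing aerodynamic relations of this subsection can be collected into a few small Python functions. This is a hedged sketch rather than the authors' code: the function names are hypothetical, and the stall lift and drag coefficients, the airfoil lift-curve slope of $2\pi$ per radian, and the span efficiency factor used in the example are illustrative assumptions that the text does not specify. The aspect ratio of 8, the 15-degree stall angle, and the 27.5-degree switch come from the text; the thickness-to-chord ratio of 0.12 follows from the NACA 0012 designation.

```python
import math

def post_stall_lift(alpha, alpha_s, cl_s, aspect_ratio):
    """Post-stall lift coefficient C_L (angles in radians)."""
    c1 = 1.1 + 0.018 * aspect_ratio
    a1 = c1 / 2.0
    a2 = ((cl_s - c1 * math.sin(alpha_s) * math.cos(alpha_s))
          * math.sin(alpha_s) / math.cos(alpha_s) ** 2)
    return a1 * math.sin(2.0 * alpha) + a2 * math.cos(alpha) ** 2 / math.sin(alpha)

def cd_max(aspect_ratio, thickness_ratio):
    """Maximum drag coefficient C_Dmax."""
    return (1.0 + 0.065 * aspect_ratio) / (0.9 + thickness_ratio)

def post_stall_drag(alpha, alpha_s, cd_s, aspect_ratio, thickness_ratio):
    """Drag coefficient: quartic fit below 27.5 deg, B1/B2 model up to 90 deg."""
    if alpha < math.radians(27.5):
        return 0.008 + 1.107 * alpha**2 + 1.792 * alpha**4
    b1 = cd_max(aspect_ratio, thickness_ratio)
    b2 = (cd_s - b1 * math.sin(alpha_s)) / math.cos(alpha_s)
    return b1 * math.sin(alpha) + b2 * math.cos(alpha)

def finite_wing_slope(airfoil_slope, aspect_ratio, span_eff):
    """Lifting-line correction of the lift-curve slope."""
    return airfoil_slope / (1.0 + airfoil_slope / (math.pi * aspect_ratio * span_eff))

def induced_drag_coeff(cl, aspect_ratio, span_eff):
    """Pre-stall induced drag coefficient C_Di."""
    return cl**2 / (math.pi * aspect_ratio * span_eff)

AR, TC = 8.0, 0.12                  # aspect ratio from the text; t/c of NACA 0012
ALPHA_S = math.radians(15.0)        # stall angle from the text
CL_S, CD_S, E = 1.1, 0.03, 0.9      # hypothetical values for illustration only

cl_at_stall = post_stall_lift(ALPHA_S, ALPHA_S, CL_S, AR)
cd_at_90 = post_stall_drag(math.pi / 2.0, ALPHA_S, CD_S, AR, TC)
slope_3d = finite_wing_slope(2.0 * math.pi, AR, E)
```

By construction the post-stall lift curve passes through the stall point, and the drag model reaches $C_{D_{\max}}$ at 90 degrees, which gives two quick sanity checks on the implementation.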

2.2.2. Propulsion

Momentum theory [56] relates the thrust from propellers to the supplied power,
$$
P_{\text{disk}} = T V_{\perp} + \kappa T \left( -\frac{V_{\perp}}{2} + \sqrt{\frac{V_{\perp}^2}{4} + \frac{T}{2 \rho A_{\text{disk}}}} \right),
$$
where $P_{\text{disk}}$ is the power supplied to the propeller disk excluding profile power, $T$ is the thrust, $V_{\perp}$ is the free-stream velocity component normal to the propeller disk, $\rho$ is the air density, $A_{\text{disk}}$ is the disk area of the propeller, and $\kappa$ is the correction factor utilized to incorporate induced power losses associated with non-uniform flow, tip effects, and other simplifications made in momentum theory ($\kappa = 1$ for ideal power). Power is chosen as a design variable in the optimization problems. Moreover, the Newton–Raphson method [57] is used to solve the nonlinear equation for thrust with power as an input.
McCormick [58] and Leishman [59] both applied blade-element theory to a rotor operating in nonaxial forward flight and estimated the profile power coefficient as
$$
C_{P_p} = \frac{\sigma C_{d_{0p}}}{8} \left( 1 + 4.6 \mu^2 \right),
$$
where $C_{P_p}$ is defined as $P_p / (\rho A_{\text{disk}} R^3 \Omega^3)$, $P_p$ is the profile power, $R$ is the radius of the propeller, $\Omega$ is the angular speed, $\sigma$ is the solidity, $C_{d_{0p}}$ is a representative constant profile drag coefficient, and $\mu$ is defined as
$$
\mu = \frac{V_{\parallel}}{\Omega R},
$$
where $V_{\parallel}$ is the free-stream velocity component parallel to the disk. For our cases, we assume that $\Omega = 181$ rad/s for $R = 0.75$ m, $\sigma = 0.13$, and $C_{d_{0p}} = 0.012$.
With electric power as input, a factor $k_{\text{elec}}$ is used to account for mechanical and electrical losses related to electric systems, batteries, and motors. Then, $P_{\text{disk}}$ is given by
$$
P_{\text{disk}} = k_{\text{elec}} P_{\text{elec}} - P_p,
$$
where $P_{\text{elec}}$ is the power from the batteries. The limit of maximum available electrical power is set to 311 kW. We take the loss factor $k_{\text{elec}}$, ranging from 0.7 to 0.9, as one of our design requirement inputs for surrogate modeling.
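The power-to-thrust chain described above (electrical power, loss factor, profile power, then a Newton–Raphson solve of the momentum-theory relation) can be sketched as follows. This is an illustrative example, not the authors' solver: the function names, the sea-level air density, the induced-power factor of 1.2, the per-disk electrical power of 30 kW, and the hover-based initial guess are assumptions introduced here.

```python
import math

RHO = 1.225                      # assumed sea-level air density, kg/m^3
RADIUS, OMEGA = 0.75, 181.0      # propeller radius (m) and speed (rad/s) from the text
SOLIDITY, CD0P = 0.13, 0.012     # solidity and profile drag coefficient from the text
A_DISK = math.pi * RADIUS**2

def profile_power(v_par):
    """Profile power P_p from the blade-element coefficient C_Pp."""
    mu = v_par / (OMEGA * RADIUS)
    c_pp = SOLIDITY * CD0P / 8.0 * (1.0 + 4.6 * mu**2)
    return c_pp * RHO * A_DISK * RADIUS**3 * OMEGA**3

def disk_power(thrust, v_perp, kappa):
    """Momentum-theory disk power for a given thrust."""
    root = math.sqrt(v_perp**2 / 4.0 + thrust / (2.0 * RHO * A_DISK))
    return thrust * v_perp + kappa * thrust * (-v_perp / 2.0 + root)

def thrust_from_power(p_disk, v_perp, kappa, tol=1e-10, max_iter=50):
    """Newton-Raphson solve of disk_power(T) = p_disk for T (requires p_disk > 0)."""
    # Initial guess: closed-form hover solution (v_perp = 0).
    thrust = (p_disk * math.sqrt(2.0 * RHO * A_DISK) / kappa) ** (2.0 / 3.0)
    for _ in range(max_iter):
        root = math.sqrt(v_perp**2 / 4.0 + thrust / (2.0 * RHO * A_DISK))
        residual = disk_power(thrust, v_perp, kappa) - p_disk
        slope = (v_perp + kappa * (-v_perp / 2.0 + root)
                 + kappa * thrust / (4.0 * RHO * A_DISK * root))
        step = residual / slope
        thrust -= step
        if abs(step) < tol * max(1.0, abs(thrust)):
            break
    return thrust

# Hypothetical operating point: 30 kW of electrical power per disk, k_elec = 0.9.
p_disk = 0.9 * 30e3 - profile_power(0.0)
t_hover = thrust_from_power(p_disk, 0.0, kappa=1.2)
t_climb = thrust_from_power(p_disk, 10.0, kappa=1.2)
```

For the same disk power, the thrust available with a 10 m/s normal inflow is lower than in hover, which is the expected momentum-theory trend.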
Furthermore, when the free-stream flow is not normal to the propeller disks, the normal force $N$ is estimated as
$$N = \frac{4.25\,\sigma_e \sin(\beta + 8^\circ)\, f\, q_\perp A_{\text{disk}}}{1 + 2\sigma_e}\, \tan\alpha_{\text{in}},$$
where $q_\perp$ is the dynamic pressure based on the free-stream velocity component normal to the propeller disk, $\beta$ is the blade pitch angle at $0.75R$ and is assumed to change linearly from 10 degrees at a flight speed of 0 m/s to 35 degrees at the cruise speed of 67 m/s, $A_{\text{disk}}$ is the propeller disk area, and $\alpha_{\text{in}}$ is the incidence angle. The remaining terms are defined as follows. The effective solidity $\sigma_e$ is given by
$$\sigma_e = \frac{2 B c_b}{3 \pi R},$$
where $B$ is the number of blades per propeller, $c_b$ is the average chord length of the blades, and $R$ is the propeller radius. We compute the thrust factor as
$$f = 1 + \frac{\sqrt{1 + T_c} - 1}{2} + \frac{T_c}{4(2 + T_c)},$$
where $T_c$ is a thrust coefficient defined as
$$T_c = \frac{T}{q_\perp A_{\text{disk}}},$$
and $T$ is the thrust.
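The normal-force estimate above chains several small formulas, which can be sketched as follows. All function and argument names are illustrative assumptions; the blade count and chord length are left as inputs because the text does not state them here.

```python
import math


def effective_solidity(n_blades, chord, radius):
    """sigma_e = 2 B c_b / (3 pi R)."""
    return 2.0 * n_blades * chord / (3.0 * math.pi * radius)


def thrust_factor(t_c):
    """f = 1 + (sqrt(1 + T_c) - 1) / 2 + T_c / (4 (2 + T_c))."""
    return 1.0 + (math.sqrt(1.0 + t_c) - 1.0) / 2.0 + t_c / (4.0 * (2.0 + t_c))


def blade_pitch_deg(speed, cruise_speed=67.0):
    """Blade pitch at 0.75R: linear from 10 deg at 0 m/s to 35 deg at cruise."""
    return 10.0 + 25.0 * speed / cruise_speed


def normal_force(thrust, q_perp, a_disk, sigma_e, beta_deg, alpha_in_rad):
    """Normal force N; q_perp must be positive for T_c to be defined."""
    t_c = thrust / (q_perp * a_disk)
    f = thrust_factor(t_c)
    scale = (4.25 * sigma_e * math.sin(math.radians(beta_deg + 8.0))
             * f * q_perp * a_disk) / (1.0 + 2.0 * sigma_e)
    return scale * math.tan(alpha_in_rad)
```

With a zero incidence angle the estimate vanishes, which matches the statement that the normal force only appears when the flow is not normal to the disk.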

2.2.3. Propeller–Wing Interaction

To model the interaction between a wing and the flow induced by the propellers, momentum theory is used. The induced speed at the disk, $v_i$, is calculated as
$$v_i = -\frac{V_\perp}{2} + \sqrt{\frac{V_\perp^2}{4} + \frac{T}{2 \rho A_{\text{disk}}}},$$
where $V_\perp$ is the free-stream velocity component normal to the propeller disk.
According to momentum theory, the increase in the chord-wise component of the free-stream velocity due to the propeller–wing interaction is quantified by an empirical induced factor $k_{\text{in}}$. The effective $k_{\text{in}}$ for the wing is anticipated to be close to but slightly less than 1, especially at low-speed conditions for the aircraft. Moreover, if $k_{\text{in}}$ is too low, the propeller–wing interaction would not match realistic scenarios; hence, we vary $k_{\text{in}} \in [0.3, 1.0]$ in this paper.
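The induced-speed relation from momentum theory can be checked numerically with a few lines. This is a minimal sketch; the function name and argument order are assumptions made for illustration.

```python
import math


def induced_speed(thrust, v_perp, rho, a_disk):
    """Induced speed at the disk from momentum theory.

    v_perp is the free-stream velocity component normal to the disk.
    """
    return -v_perp / 2.0 + math.sqrt(v_perp ** 2 / 4.0 + thrust / (2.0 * rho * a_disk))
```

In hover (`v_perp = 0`) the expression reduces to the familiar $\sqrt{T / (2 \rho A_{\text{disk}})}$, and the induced speed decreases as the normal free-stream component grows.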

2.2.4. Dynamics

The aircraft trajectory simulation uses a two-degrees-of-freedom (DOF) representation [60]. Figure 2 displays the forces and the angles on the aircraft. The horizontal and vertical components of the aircraft velocity are solved as functions of time, considering the control variables, namely, the electrical power and wing-tilt angle. The horizontal and vertical components of velocity rate (i.e., acceleration) at each time step are given as
$$\dot{v}_x = \frac{T \sin\theta - D_{\text{fuse}} \sin(\theta + \alpha_\infty) - D_{\text{wings}} \sin(\theta + \alpha_{\text{EFS}}) - L_{\text{wings}} \cos(\theta + \alpha_{\text{EFS}}) - N \cos\theta}{m},$$
and
$$\dot{v}_y = \frac{T \cos\theta - D_{\text{fuse}} \cos(\theta + \alpha_\infty) - D_{\text{wings}} \cos(\theta + \alpha_{\text{EFS}}) + L_{\text{wings}} \sin(\theta + \alpha_{\text{EFS}}) + N \sin\theta - m g}{m},$$
where $\theta$ is the wing angle relative to the vertical, $\alpha_\infty$ is the free-stream angle of attack, $\alpha_{\text{EFS}}$ is the effective free-stream angle of attack experienced by the wings due to propeller influence, $m$ is the mass of the aircraft, $T$ is the total thrust, $D_{\text{fuse}}$ is the drag of the fuselage, $D_{\text{wings}}$ is the total drag of the two wings, $L_{\text{wings}}$ is the total lift of the two wings, $N$ is the total normal force generated by the propellers, and $g$ is the gravitational acceleration. We use implicit collocation methods to solve for the states as described in Section 2.1.
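To make the force balance concrete, the sketch below evaluates the two acceleration components for a given set of forces and angles. It only evaluates the right-hand sides; it does not reproduce the implicit collocation scheme used for the time integration. The function name and argument names are illustrative assumptions.

```python
import math


def accelerations(thrust, d_fuse, d_wings, l_wings, normal,
                  theta, alpha_inf, alpha_efs, mass, g=9.81):
    """Return (v_x_dot, v_y_dot) of the 2-DOF model; angles in radians.

    theta is the wing angle measured from the vertical.
    """
    ax = (thrust * math.sin(theta)
          - d_fuse * math.sin(theta + alpha_inf)
          - d_wings * math.sin(theta + alpha_efs)
          - l_wings * math.cos(theta + alpha_efs)
          - normal * math.cos(theta)) / mass
    ay = (thrust * math.cos(theta)
          - d_fuse * math.cos(theta + alpha_inf)
          - d_wings * math.cos(theta + alpha_efs)
          + l_wings * math.sin(theta + alpha_efs)
          + normal * math.sin(theta)
          - mass * g) / mass
    return ax, ay
```

A quick sanity check is hover: with the wings vertical (`theta = 0`), thrust equal to the weight, and no aerodynamic forces, both components are zero.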

2.3. Multi-Output Gaussian Processes

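A GP assumes that a model response is a realization of a Gaussian process indexed by the input parameters $\mathbf{x}$ [61],
$$y_{GP}(\mathbf{x}) = \boldsymbol{\beta}^{\top} \mathbf{f}(\mathbf{x}) + \sigma^2\, \mathcal{GP},$$
where $\boldsymbol{\beta}$ is an unknown coefficient vector to be determined, $\mathbf{f}$ is a user-selected basis function (typically a polynomial function), $\sigma^2$ is the variance of the GP, and $\mathcal{GP}$ is a zero-mean, unit-variance, stationary GP represented by a correlation function. Specifically, when applied for predictions, a GP assumes that the quantity to be predicted, $\hat{y}$, at a query point $\mathbf{x}$ and the training observations $\mathbf{y}$ follow a multi-variate normal (MVN) distribution defined by [62]
$$\begin{bmatrix} \hat{y}(\mathbf{x}) \\ \mathbf{y} \end{bmatrix} \sim \mathrm{MVN}\left( \begin{bmatrix} \mathbf{f}(\mathbf{x})^{\top} \cdot \boldsymbol{\beta} \\ \mathbf{F} \cdot \boldsymbol{\beta} \end{bmatrix},\ \sigma^2 \begin{bmatrix} 1 & \mathbf{r}^{\top}(\mathbf{x}) \\ \mathbf{r}(\mathbf{x}) & \mathbf{R} \end{bmatrix} \right),$$
where $\mathbf{F}$ is a basis function matrix with $F_{ij} = f_j(\mathbf{x}_i)$, $i = 1, \ldots, N$; $j = 0, \ldots, P$, $N$ is the number of training samples, $P$ is the number of polynomial terms in $\mathbf{f}$, $\mathbf{r}(\mathbf{x})$ is a cross-correlation vector between $\mathbf{x}$ and each of the input vectors in the training data with $r_i = C(\mathbf{x}, \mathbf{x}_i; \boldsymbol{\theta})$, $i = 1, \ldots, N$, $\mathbf{R}$ is a correlation matrix between the training inputs with $R_{ij} = C(\mathbf{x}_i, \mathbf{x}_j; \boldsymbol{\theta})$, $i, j = 1, \ldots, N$, and $C$ is a user-selected correlation function.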
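Training a GP amounts to estimating the unknown parameters $\boldsymbol{\theta}$ with an optimization algorithm; in this work, we use maximum-likelihood estimation (MLE). MLE determines $\boldsymbol{\theta}$ by maximizing the likelihood of the observations $\mathbf{y}$, since each observation is a GP realization,
$$L(\boldsymbol{\theta}; \mathbf{y}) = \frac{(\mathrm{Det}(\mathbf{R}))^{-1/2}}{(2 \pi \sigma^2)^{N/2}} \exp\left[ -\frac{1}{2 \sigma^2} (\mathbf{y} - \mathbf{F}\boldsymbol{\beta})^{\top} \mathbf{R}^{-1} (\mathbf{y} - \mathbf{F}\boldsymbol{\beta}) \right],$$
where $\mathrm{Det}(\cdot)$ is the determinant of a matrix, $\exp$ is the exponential operation, and $\boldsymbol{\beta}$ and $\sigma^2$ are functions of $\boldsymbol{\theta}$ and can be approximated as
$$\hat{\boldsymbol{\beta}} = (\mathbf{F}^{\top} \mathbf{R}^{-1} \mathbf{F})^{-1} \mathbf{F}^{\top} \mathbf{R}^{-1} \mathbf{y},$$
$$\hat{\sigma}^2 = \frac{1}{N} (\mathbf{y} - \mathbf{F}\boldsymbol{\beta})^{\top} \mathbf{R}^{-1} (\mathbf{y} - \mathbf{F}\boldsymbol{\beta}).$$
Thus, determining $\boldsymbol{\theta}$ results in the following form
$$\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} \left[ -\log L(\boldsymbol{\theta}; \mathbf{y}) \right] = \arg\min_{\boldsymbol{\theta}} \frac{1}{2} \left[ \log(\mathrm{Det}(\mathbf{R})) + N \cdot \log(2 \pi \sigma^2) + N \right].$$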
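In the prediction phase, a GP provides at a query point not only the prediction but also a variance, which enables confidence intervals [62],
$$\hat{y}(\mathbf{x}) = \mathbf{f}(\mathbf{x})^{\top} \hat{\boldsymbol{\beta}} + \mathbf{r}(\mathbf{x})^{\top} \mathbf{R}^{-1} \left( \mathbf{y} - \mathbf{F} \hat{\boldsymbol{\beta}} \right),$$
$$\sigma_{\hat{y}}^2(\mathbf{x}) = \sigma^2 \left( 1 - \mathbf{r}^{\top}(\mathbf{x}) \mathbf{R}^{-1} \mathbf{r}(\mathbf{x}) + \mathbf{u}^{\top}(\mathbf{x}) (\mathbf{F}^{\top} \mathbf{R}^{-1} \mathbf{F})^{-1} \mathbf{u}(\mathbf{x}) \right),$$
where
$$\mathbf{u}(\mathbf{x}) = \mathbf{F}^{\top} \mathbf{R}^{-1} \mathbf{r}(\mathbf{x}) - \mathbf{f}(\mathbf{x}).$$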
In this work, we apply a constant basis function $\mathbf{f}(\mathbf{x}) = 0$ and a linear basis function $\mathbf{f}(\mathbf{x}) = \mathbf{x}$. The correlation function $C(\mathbf{x}, \mathbf{x}')$ captures the relationship between the function values at different input points $\mathbf{x}$ and $\mathbf{x}'$. We implement the Matérn 3/2 kernel and the squared exponential (SE) kernel. The Matérn covariance between two points ($\mathbf{x}$ and $\mathbf{x}'$) is
$$C_\nu(\tau) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \sqrt{2\nu}\, \frac{\tau}{\theta} \right)^{\nu} K_\nu\left( \sqrt{2\nu}\, \frac{\tau}{\theta} \right),$$
where $\Gamma$ is the gamma function, $K_\nu$ is the modified Bessel function of the second kind, $\theta$ is the unknown kernel length scale, $\nu$ is a smoothness parameter ($\nu = 3/2$, representing a once-differentiable function, in this work), and $\tau = \lVert \mathbf{x} - \mathbf{x}' \rVert$ is the Euclidean distance. The Matérn kernel is stationary since the covariance depends only on the distance between points. We also test the SE kernel, which is an example of a radial basis function,
$$C_{SE} = \exp\left[ -\frac{1}{2} \left( \frac{\lVert \mathbf{x} - \mathbf{x}' \rVert}{\theta} \right)^2 \right].$$
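The prediction equations and the two correlation functions can be put together in a compact NumPy sketch. It is illustrative only: the length scale is fixed rather than estimated by MLE, a basis $\mathbf{f}(\mathbf{x}) = [1, \mathbf{x}]$ with an intercept is assumed, a small nugget is added to the correlation matrix for numerical stability, and the Matérn 3/2 kernel is written in its closed form $(1 + \sqrt{3}\tau/\theta)\exp(-\sqrt{3}\tau/\theta)$. None of the function names come from the paper or from MOGPTK.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)


def matern32(a, b, theta):
    """Closed form of the Matern correlation for nu = 3/2."""
    s = np.sqrt(3.0) * pairwise_dist(a, b) / theta
    return (1.0 + s) * np.exp(-s)


def squared_exponential(a, b, theta):
    return np.exp(-0.5 * (pairwise_dist(a, b) / theta) ** 2)


def basis(x):
    # Assumed basis f(x) = [1, x]: a linear trend with an intercept term.
    return np.hstack([np.ones((x.shape[0], 1)), x])


def gp_predict(x_train, y_train, x_query, corr=squared_exponential,
               theta=0.3, nugget=1e-10):
    """Return the GP mean and variance at x_query for a fixed length scale."""
    n = x_train.shape[0]
    F = basis(x_train)
    R = corr(x_train, x_train, theta) + nugget * np.eye(n)
    A = F.T @ np.linalg.solve(R, F)                        # F^T R^-1 F
    beta = np.linalg.solve(A, F.T @ np.linalg.solve(R, y_train))
    resid = y_train - F @ beta
    Ri_resid = np.linalg.solve(R, resid)
    sigma2 = resid @ Ri_resid / n                          # estimated variance
    r = corr(x_train, x_query, theta)                      # shape (n, m)
    Ri_r = np.linalg.solve(R, r)
    f_q = basis(x_query)
    mean = f_q @ beta + r.T @ Ri_resid
    u = F.T @ Ri_r - f_q.T
    var = sigma2 * (1.0 - np.sum(r * Ri_r, axis=0)
                    + np.sum(u * np.linalg.solve(A, u), axis=0))
    return mean, var
```

Evaluated at the training inputs, the mean reproduces the observations and the variance collapses to nearly zero, while far from the data the variance grows, which is the behavior that enables the confidence intervals mentioned above.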
MOGP extends GP to handle multiple outputs simultaneously, which enables dependent modeling when the outputs are correlated. MOGP introduces additional complexity because it models correlations between multiple outputs, but it remains efficient and more capable than training individual GPs for each output. Thus, a key feature of MOGP is to harness applicable information across outputs to provide more accurate predictions than separately modeling correlated outputs [28]. Mathematically, MOGP extends Equation (29) to a multi-output prediction ($\hat{\mathbf{y}} \in \mathbb{R}^{T \times 1}$) at a query point $\mathbf{x}$,
$$\begin{bmatrix} \hat{\mathbf{y}}(\mathbf{x}) \\ \mathbf{y}_M \end{bmatrix} \sim \mathrm{MVN}\left( \begin{bmatrix} \mathbf{F}_{\hat{y}}^{\top} \cdot \boldsymbol{\beta} \\ \mathbf{F}_M \cdot \boldsymbol{\beta} \end{bmatrix},\ \sigma_M^2 \begin{bmatrix} \mathbf{R} & \mathbf{R}_M^{\top} \\ \mathbf{R}_M & \mathbf{R}_{MM} \end{bmatrix} \right),$$
where $\mathbf{y}_M \in \mathbb{R}^{NT \times 1}$ is a training response vector containing multiple outputs, $N$ is the number of training samples, and $T$ is the dimension of the multiple outputs at one query point; $\mathbf{F}_{\hat{y}} \in \mathbb{R}^{P \times T}$ has $T$ columns of $\mathbf{f}$ basis functions as defined above, $\mathbf{F}_M \in \mathbb{R}^{NT \times P}$ has blocks of $\mathbf{F}_{\hat{y}, i}$ for $i = 1, \ldots, N$, and $P$ is the number of polynomial terms, which also denotes the number of components in $\boldsymbol{\beta}$; $\mathbf{R} \in \mathbb{R}^{T \times T}$ is the correlation matrix between the model responses at $\mathbf{x}$,
$$\mathbf{R} = \begin{bmatrix} r_{11}(\mathbf{x}, \mathbf{x}) & \cdots & r_{1T}(\mathbf{x}, \mathbf{x}) \\ \vdots & \ddots & \vdots \\ r_{T1}(\mathbf{x}, \mathbf{x}) & \cdots & r_{TT}(\mathbf{x}, \mathbf{x}) \end{bmatrix},$$
where $r_{t_1, t_2}$ corresponds to the correlation between outputs $y_{t_1}(\mathbf{x})$ and $y_{t_2}(\mathbf{x})$; $\mathbf{R}_M \in \mathbb{R}^{NT \times T}$ has blocks of $R_{t_1, t_2}(\mathbf{X}, \mathbf{x}) = r_{t_1, t_2}(\mathbf{x}_i, \mathbf{x})$ for $i = 1, \ldots, N$ and $t_1, t_2 = 1, \ldots, T$; and similarly, $\mathbf{R}_{MM} \in \mathbb{R}^{NT \times NT}$ has blocks of $R_{t_1, t_2}(\mathbf{X}, \mathbf{X}) = r_{t_1, t_2}(\mathbf{x}_i, \mathbf{x}_j)$ for $i, j = 1, \ldots, N$ and $t_1, t_2 = 1, \ldots, T$. MOGP makes predictions following the same principle as a single-output GP (Equations (31)–(36)) but using the generalized multi-output variables [28].
We use the MOGPTK [27] MOGP toolkit to establish the MOGP models. MOGPTK is an open-source Python package that provides a natural way to train and use MOGP. MOGPTK is built upon GPflow [63], an extensive GP framework with a wide variety of implemented kernels, likelihoods, and training strategies. The main components of MOGPTK include MOGP modeling, data handling, parameter initialization, and parameter interpretation (please refer to de Wolff et al. [27] for more details).

2.4. Deep Neural Networks

2.4.1. DNN Model Setup

DNNs [64] are composed of multiple layers of interconnected nodes, also known as neurons, to read input data and make predictions. In each layer, a DNN transforms the data from its previous layer using a combination of linear operations and activation functions. Neurons in the same layer share a weight matrix ($\mathbf{W}$) and a bias vector ($\mathbf{b}$) to be tuned. The operation within each layer follows $\mathbf{o} = \sigma(\mathbf{W}\mathbf{x} + \mathbf{b})$, where $\mathbf{x}$ is a data vector from the previous layer, $\sigma(\cdot)$ represents an activation function used for injecting nonlinearity, and $\mathbf{o}$ is an output vector. During model training, $\mathbf{W}$ and $\mathbf{b}$ are tuned to optimize DNN performance. In this work, we use two activation functions: (1) the rectified linear unit (relu) [65] activation function, $\sigma_{\text{relu}}(x) = \max(0, x)$, which replaces all negative input values with zeros while passing positive input values unchanged, and (2) the sigmoid [66] activation function, $\sigma_{\text{sigmoid}}(x) = 1 / (1 + e^{-x})$, which maps input values to a range between zero and one, making it suitable for problems where the output is normalized within this range.
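The layer operation $\mathbf{o} = \sigma(\mathbf{W}\mathbf{x} + \mathbf{b})$ with the two activation functions can be written out in a few lines of plain Python. This is a didactic sketch with hypothetical function names, not the TensorFlow layers used in the study.

```python
import math


def relu(values):
    """Replace negative entries with zero and pass positive entries unchanged."""
    return [max(0.0, v) for v in values]


def sigmoid(values):
    """Map each entry to the open interval (0, 1)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def dense(weights, bias, x, activation):
    """One fully connected layer: activation(W x + b)."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b
         for row, b in zip(weights, bias)]
    return activation(z)
```

Stacking several `dense` calls, with `relu` in the hidden layers and `sigmoid` at the output, gives a forward pass whose outputs lie in (0, 1), consistent with responses that have been normalized to that range.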
To complete model training, a loss function is used to measure the difference between predictions and training observations, which is then minimized. We implement two commonly used loss functions in this work. (1) Mean square error (MSE) [67] loss, which measures the average squared difference,
$$L_{\text{MSE}} = \frac{1}{N} \sum_{i=1}^{N} \left\lVert \mathbf{y}_i - \hat{\mathbf{y}}_i \right\rVert_2^2,$$
where $N$ is the number of training samples, $\mathbf{y}_i$ and $\hat{\mathbf{y}}_i$ are the true values and predictions of the $i$th training sample, respectively, and $\lVert \cdot \rVert_2$ is the $L_2$ norm operation. (2) Binary cross-entropy (BC) [68] loss, which measures the dissimilarity between predictions and actual binary labels,
$$L_{\text{BC}} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \cdot \log \hat{y}_i + (1 - y_i) \cdot \log(1 - \hat{y}_i) \right],$$
where $y_i$ is the true binary label (0 or 1) for the data point, and $\hat{y}_i$ is the predicted probability (between 0 and 1) generated by the model. The unknown parameters in the DNN are tuned through the gradient-based Adam optimizer (see the following section) enabled by backpropagation within Tensorflow [49].
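Both loss functions are short enough to spell out directly. The sketch below follows the two formulas literally; the clipping constant `eps` is an assumption added to keep the logarithm finite, and the function names are illustrative rather than the TensorFlow ones.

```python
import math


def mse_loss(y_true, y_pred):
    """Mean over samples of the squared L2 norm of the prediction error.

    y_true and y_pred are lists of equally long vectors, one per sample.
    """
    total = sum(sum((a - b) ** 2 for a, b in zip(yt, yp))
                for yt, yp in zip(y_true, y_pred))
    return total / len(y_true)


def bce_loss(labels, probs, eps=1e-7):
    """Binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total / len(labels)
```

Note that library implementations may additionally average over the output dimension in the MSE, which rescales the value but not the location of the minimum.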

2.4.2. Adam Optimizer

As an efficient stochastic optimization algorithm, the Adam optimizer operates with moderate memory usage and first-order gradients [69]. Adam has proved to be powerful and effective by combining the principles of gradient descent with the momentum [70] and root-mean-square propagation (RMSP) [71] algorithms.
The momentum algorithm [70] enhances gradient descent by incorporating a weighted average of the gradients. The update is expressed as
$$w_{i+1} = w_i - \alpha\, m_i,$$
where
$$m_i = \beta\, m_{i-1} + (1 - \beta) \frac{\partial L}{\partial w_i},$$
where the subscripts $i-1$, $i$, and $i+1$ are the indices of the previous, current, and next optimization steps, respectively, $m_0 = 0$ initially, $w$ is an unknown to be determined, $\alpha$ is a user-defined learning rate, $\partial L / \partial w_i$ is the derivative of the loss function with respect to the unknown at the current optimization step, and $\beta$ is a constant moving average parameter.
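RMSP [71] is an adaptive learning-rate algorithm that employs an exponential moving average of the squared gradients and is expressed as
$$w_{i+1} = w_i - \frac{\alpha}{(v_i + \epsilon)^{1/2}} \frac{\partial L}{\partial w_i},$$
where
$$v_i = \beta\, v_{i-1} + (1 - \beta) \left( \frac{\partial L}{\partial w_i} \right)^2.$$
A small positive constant ($\epsilon = 10^{-7}$) is used to avoid division by zero. The remaining variables are the same as in Equation (44).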
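The Adam optimizer integrates the momentum and RMSP algorithms in the following formulation:
$$w_{i+1} = w_i - \hat{m}_i \frac{\alpha}{\sqrt{\hat{v}_i} + \epsilon},$$
where
$$\hat{m}_i = \frac{m_i}{1 - \beta_1^{\,i}}, \qquad \hat{v}_i = \frac{v_i}{1 - \beta_2^{\,i}},$$
where $m_i$ and $v_i$ follow the updating processes described above with $\beta$ replaced by $\beta_1$ and $\beta_2$, respectively, and the superscript $i$ on $\beta_1$ and $\beta_2$ denotes a power. We use the Adam optimizer within Tensorflow and set $\beta_1 = 0.9$ and $\beta_2 = 0.999$ for all neural network training in this study.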
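The Adam update can be traced on a one-dimensional problem with a short loop. The sketch below assumes the standard form of the algorithm (first moment from the gradient, second moment from the squared gradient, bias correction with $\beta^i$); the learning rate, step count, and function name are arbitrary choices for illustration.

```python
import math


def adam_minimize(grad, w0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-7, steps=500):
    """Minimize a scalar function given its gradient, using Adam updates."""
    w, m, v = w0, 0.0, 0.0
    for i in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g        # momentum-style average
        v = beta2 * v + (1.0 - beta2) * g * g    # RMSP-style average
        m_hat = m / (1.0 - beta1 ** i)           # bias correction
        v_hat = v / (1.0 - beta2 ** i)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w
```

For example, calling `adam_minimize(lambda w: 2.0 * (w - 3.0), 0.0)` minimizes $(w - 3)^2$ and returns a value close to 3.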

2.5. Generative Adversarial Network Models

GANs are a state-of-the-art generative modeling strategy with an innovative DNN architecture. In this section, we introduce GANs along with their variations, namely, cGAN and the regGAN-TL.

2.5.1. Generative Adversarial Networks

As a type of generative model, GANs consist of two neural networks, a generator and a discriminator (Figure 3) [33]. Initially, the generator produces random outputs from random input variables; as it receives feedback from the discriminator, however, it learns to generate more realistic samples over the course of training. The discriminator, in turn, is trained on both the samples produced by the generator and an existing training dataset. It distinguishes between existing and generated samples and provides feedback to the generator by assigning a probability to each sample, indicating how likely the sample is to come from the existing data. During the training process, the generator and the discriminator compete with each other in an adversarial manner, which encourages the improvement of both models. Once trained, the generator produces samples so similar to the existing data that the discriminator cannot differentiate between them.
Training a GAN model mathematically follows a minimax setup [33]
$$\min_G \max_D L_{\text{GAN}}(D, G) = \mathbb{E}_{\mathbf{x} \sim P_{\text{data}}} \left[ \log D(\mathbf{x}) \right] + \mathbb{E}_{\mathbf{z} \sim P_z} \left[ \log\left( 1 - D(G(\mathbf{z})) \right) \right],$$
where $\mathbf{x}$ is sampled from the existing data distribution $P_{\text{data}}$, $\mathbf{z}$ is sampled from the random variable distributions $P_z$, and $G$ and $D$ are the generator and discriminator, respectively. Hence, a trained GAN model is able to generate realistic designs with ample shape variability within the prior noise variable distributions.
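The minimax objective is an expectation over discriminator outputs, so a sample estimate only needs the probabilities the discriminator assigns to real and generated samples. The function below is a minimal sketch with a hypothetical name.

```python
import math


def gan_value(d_real, d_fake):
    """Sample estimate of E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator probabilities on existing data.
    d_fake: discriminator probabilities on generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term
```

When the discriminator can no longer tell the two sources apart and outputs 0.5 everywhere, the value equals $-2\log 2$; a discriminator that separates them well drives the value toward zero.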

2.5.2. Conditional Generative Adversarial Networks

As an extension to the vanilla GAN, cGAN [37] integrates additional conditioning information into the generative process (Figure 4). In cGAN, both the generator and the discriminator receive additional conditioning information $\mathbf{d}$, and the generator also receives random noise samples as input. The condition $\mathbf{d}$ guides the generator to produce samples that correspond with the specified conditions. Depending on the implementation, $\mathbf{d}$ can take various forms, for example, class labels or text descriptions. The generator combines the prior input $\mathbf{z}$ and $\mathbf{d}$ in a joint hidden representation, exploiting the flexibility of the adversarial training framework in how this representation is composed. In addition, the discriminator takes $\mathbf{d}$ and $\mathbf{y}$ pairs as well as the samples generated by the generator to carry out the adversarial competition with the generator. Similar to the GAN loss function (Equation (49)), the cGAN loss function is given by a minimax problem involving the condition $\mathbf{d}$,
$$\min_G \max_D L_{\text{cGAN}}(D, G) = \mathbb{E}_{\mathbf{y} \sim P_{\text{data}}(\mathbf{y})} \left[ \log D(\mathbf{y} \mid \mathbf{d}) \right] + \mathbb{E}_{\mathbf{z} \sim P_z(\mathbf{z})} \left[ \log\left( 1 - D(G(\mathbf{z} \mid \mathbf{d})) \right) \right].$$
Furthermore, cGAN is capable of dealing with regression tasks by encoding the regression labels (i.e., model observations) as additional conditioning information (Figure 4). Similar to the original GAN, the cGAN generator takes random variables, combined with arbitrary model observations as an additional group of inputs. The purpose is to create data associated with the model observations. To distinguish between generated and real data, the discriminator takes both generated samples $\mathbf{g}$ and real data pairs ($\mathbf{d}$ and $\mathbf{y}$) as inputs. Through training, the cGAN generator learns to map the random noise variables and model observations to corresponding data samples, assisting in regression tasks by generating samples consistent with the given target values. This technique provides a means to handle problems with complex mappings and to incorporate the adversarial feature of GAN into the regression process. In our work, $\mathbf{y}$ are the predicted eVTOL optimal takeoff trajectory control points, $\mathbf{d}$ includes the design requirements, and $\mathbf{z}$ has only one element, to which we assign the Uniform(0, 1) distribution. For regression tasks, the average of the predictions over a fixed set of $\mathbf{d}$ and 100 randomly sampled $\mathbf{z}$ is taken as the predicted model response [37], which is the optimal takeoff trajectory in this work. The network architecture settings for the generator and the discriminator within cGAN will be discussed in Section 2.5.3.
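The averaging step used to turn a stochastic cGAN generator into a point predictor can be sketched as follows. The `toy_generator` is a hypothetical stand-in for a trained network, and all names are assumptions made for illustration.

```python
import random


def cgan_predict(generator, d, n_draws=100, seed=0):
    """Average generator outputs over n_draws noise samples z ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    draws = [generator(rng.random(), d) for _ in range(n_draws)]
    n_out = len(draws[0])
    return [sum(sample[k] for sample in draws) / n_draws for k in range(n_out)]


def toy_generator(z, d):
    """Hypothetical generator: the first output depends weakly on the noise z."""
    return [d[0] + 0.1 * (z - 0.5), 2.0 * d[1]]
```

Averaging suppresses the noise-induced scatter in the first output while leaving the deterministic second output unchanged, which is the reason a fixed set of design requirements is paired with many noise draws.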

2.5.3. Regression Generative Adversarial Networks with Transfer Learning

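The regGAN model has a similar structure to cGAN (Figure 4) but only takes the design requirements ($\mathbf{d}$) as the input to the generator (Figure 5). Moreover, regGAN incorporates a combined loss function of a contextual loss via $L_{\text{MSE}}$ (Equation (41)) and an adversarial loss via $L_{\text{BC}}$ (Equation (42)) for model training. Thus, regGAN couples the predictive feature of surrogate models and the generative feature of GAN models. The combined loss function is as follows:
$$\begin{aligned} \min_G \max_D L_{\text{regGAN}}(D, G) &= w_{\text{MSE}} \cdot \min_G L_{\text{MSE}}(G) + w_{\text{BC}} \cdot \min_G \max_D L_{\text{BC}}(D, G) \\ &= w_{\text{MSE}} \cdot \min_G \frac{1}{N} \sum_{i=1}^{N} \left\lVert \mathbf{y}_i - G(\mathbf{d}_i) \right\rVert_2^2 \\ &\quad + w_{\text{BC}} \cdot \min_G \max_D \left\{ \mathbb{E}_{\mathbf{y} \sim P_{\text{data}}} \left[ \log D(\mathbf{y}) \right] + \mathbb{E}_{\mathbf{d} \sim P_d} \left[ \log\left( 1 - D(G(\mathbf{d})) \right) \right] \right\}, \end{aligned}$$
where $w_{\text{MSE}}$ and $w_{\text{BC}}$ are constant weights on $L_{\text{MSE}}$ and $L_{\text{BC}}$, respectively.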
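To show how the two weighted terms combine, the following sketch evaluates the objective for given generator predictions and discriminator probabilities. It computes the value only; the alternating generator and discriminator updates of an actual training loop are not shown, and the function name and default weights are illustrative assumptions.

```python
import math


def reggan_objective(y_true, y_gen, d_real, d_fake, w_mse=1.0, w_bc=0.001):
    """Weighted sum of the contextual (MSE) and adversarial terms.

    y_true, y_gen: lists of vectors (observations and generator outputs G(d)).
    d_real, d_fake: discriminator probabilities on observations and on G(d).
    The generator seeks to lower this value; the discriminator seeks to
    raise the adversarial part.
    """
    n = len(y_true)
    mse = sum(sum((a - b) ** 2 for a, b in zip(yt, yg))
              for yt, yg in zip(y_true, y_gen)) / n
    adv = (sum(math.log(p) for p in d_real) / len(d_real)
           + sum(math.log(1.0 - p) for p in d_fake) / len(d_fake))
    return w_mse * mse + w_bc * adv
```

Setting `w_bc = 0` removes the adversarial term, so the objective reduces to the plain MSE of a feed-forward surrogate.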
Similar to conventional surrogates, $L_{\text{MSE}}$ steers the predictions of regGAN to match the ground truth observations. Meanwhile, $L_{\text{BC}}$ drives the predicted results to share similar patterns with the observations because of the competitive training through the GAN architecture. In this work, the regGAN generator reads the design requirements ($\mathbf{d}$) and predicts the corresponding optimal takeoff trajectories ($\mathbf{y}$). The characteristics of the eVTOL takeoff trajectory prediction make it necessary to develop and introduce the regGAN surrogate. First, regGAN adopts the predictive ability of DNNs through the generator. Second, the optimal takeoff trajectory profiles typically follow realistic patterns, such as generally ascending optimal power and wing-to-vertical angles. In this manner, the adversarial loss based on the GAN architecture facilitates the training by automatically filtering out unrealistic trajectory profiles.
The regGAN-TL method consists of two key steps. First, we train regGAN models with only $L_{\text{MSE}}$ by setting $w_{\text{MSE}} = 1$ and $w_{\text{BC}} = 0$; these models are in effect direct DNN surrogates. We save the best model, i.e., the one with the highest validation accuracy on 200 validation samples during the training. Second, we integrate $L_{\text{BC}}$ into $L_{\text{regGAN}}$ by setting $w_{\text{BC}} = 0.01$, $0.001$, and $0.0001$, respectively, and train the regGAN-TL models starting from the pre-trained best regGAN model for enhanced predictive performance. The core idea of regGAN-TL is to harness the predictive power of DNNs and fine-tune the trained model by further matching the predicted shapes with the training data patterns. The training observations used by the generator are scaled to the range [0, 1] via MinMaxScaler within Scikit-learn.
To summarize, the key difference between regGAN-TL and conventional surrogates (such as MOGP) lies in the introduction of a discriminator to facilitate the model training. Within the GAN-series variants, regGAN-TL directly captures the mapping between input and output spaces rather than relying on the conditioning information of cGAN and enhances vanilla regGAN performance via the TL strategy. For fair comparisons, we implement the same architectures and setups (Table 1) to train cGAN, regGAN, and regGAN-TL surrogates and predict the power profile, wing angle profile, and total takeoff time, separately. Please refer to Section 3 for a detailed comparison between the proposed regGAN-TL and the reference surrogates.

2.6. Verification Metrics

The mean $L_1$ relative error ($\bar{\epsilon}_{L_1}$) is commonly used to evaluate surrogate predictive performance. The $L_1$ relative error, also known as the mean absolute percentage error, measures the relative $L_1$-norm difference between predictions and actual values,
$$\bar{\epsilon}_{L_1} = \frac{1}{N_{\text{test}}} \sum_{i=1}^{N_{\text{test}}} \frac{\left\lVert \mathbf{y}_{\text{pred}, i} - \mathbf{y}_{\text{true}, i} \right\rVert_1}{\left\lVert \mathbf{y}_{\text{true}, i} \right\rVert_1} \times 100\%,$$
where $N_{\text{test}}$ is the number of testing samples, $\mathbf{y}_{\text{true}, i}$ and $\mathbf{y}_{\text{pred}, i}$ are the true observations and predicted values for the $i$th set of design requirements, and $\lVert \cdot \rVert_1$ is the $L_1$ norm operation. The relative accuracy is calculated as
$$ACC_{L_1} = 1 - \bar{\epsilon}_{L_1}.$$
The L 1 relative accuracy measures the average relative match between predicted values and true observations, expressed as a percentage. It gives an indication of the degree to which surrogate predictions agree with true observations, relative to true observations, providing deeper insights through the errors relative to model response magnitudes. In this work, we compute relative accuracy for each testing sample and quantify the predictive performance using the mean and standard deviation of the computed accuracy.
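The per-sample accuracy and its summary statistics can be computed as in the sketch below. The use of the population standard deviation is an assumption, since the text does not say which estimator is used, and the function names are illustrative.

```python
import statistics


def relative_l1_accuracy(y_true, y_pred):
    """1 minus the relative L1 error of one testing sample (as a fraction)."""
    err = sum(abs(p - t) for p, t in zip(y_pred, y_true))
    return 1.0 - err / sum(abs(t) for t in y_true)


def summarize(samples_true, samples_pred):
    """Mean and standard deviation of the accuracy over the testing set."""
    acc = [relative_l1_accuracy(t, p)
           for t, p in zip(samples_true, samples_pred)]
    return statistics.mean(acc), statistics.pstdev(acc)
```

Multiplying the returned values by 100 gives the percentages reported as testing accuracy and its standard deviation.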

3. Results and Discussion

In this section, we demonstrate the performance of regGAN-TL on the optimal takeoff trajectory predictions and compare its performance against MOGP, cGAN, and vanilla regGAN surrogates. MOGP represents a promising traditional surrogate method while cGAN and vanilla regGAN both adopt GAN architectures. We elaborate on the problem formulation and result discussion as follows.

3.1. Problem Formulation

Table 2 presents the formulation of the takeoff trajectory optimization problem. We aim to minimize the electrical energy consumed to reach a minimum vertical displacement of 305 m and a minimum horizontal speed of 67 m/s. The design requirements (i.e., design constraints and flight conditions) consist of the angle of attack range constraint $\alpha_{\text{lim}} \in [10, 15]$ deg, the maximum acceleration magnitude constraint $a_{\max} \in [0.2g, 0.4g]$ ($g$ is the gravitational acceleration), the propeller-induced velocity factor $k_{\text{in}} \in [0.3, 1.0]$, the electrical and mechanical loss factor $k_{\text{elec}} \in [0.7, 0.9]$, and the wing size factor $S_{\text{ref}} \in [0.9, 1.0]$. Note that the $a_{\max}$ constraint addresses passenger comfort for future air transportation systems, although eVTOL vehicles are not yet widely used for these applications. The design variables are the time-sequence electrical power ($P$) and wing angle to vertical ($\theta$), each of which is parameterized using 21 quadratic curve control points, and the total takeoff time ($t_{\text{flight}}$). As mentioned in Section 2, we solve this takeoff trajectory design problem using the open-source Dymos package within OpenMDAO.

3.2. Surrogate-Based Inverse Mapping

We use 1000 random Latin hypercube sampling (LHS) points [72] as training data for the MOGP and cGAN surrogates and 300 LHS testing samples to verify predictive performance. Additionally, we use 200 LHS validation samples to evaluate the predictive performance and save the model with the lowest validation error during the training. We build MOGP models with the Matérn 3/2 and SE correlation functions paired with constant and linear basis functions (Section 2.3). Results show that the SE correlation function provides higher testing accuracy ($\mu_{\text{test}}$) than the Matérn 3/2 function, while the basis function exerts only minor effects (Table 3). As the Matérn parameter $\nu$ approaches infinity, the Matérn function converges to the SE function, which is infinitely differentiable. Thus, a smooth mapping via MOGP fits well with the eVTOL takeoff trajectory predictions. In addition, the MOGP M2 surrogate (SE correlation and linear basis function) in Table 3 significantly reduces the standard deviations ($\sigma_{\text{test}}$) in testing accuracy compared with the Matérn-based MOGP surrogates. A lower $\sigma_{\text{test}}$ represents more consistent predictive accuracy over the testing dataset, which further confirms the better performance of the SE-based MOGP. Moreover, we train the cGAN models with the MSE loss or the BC loss, both of which improve $\mu_{\text{test}}$ to above 98%, while the best MOGP $\mu_{\text{test}}$ values on $P$ and $\theta$ are 92.5% and 94.7%, respectively (Table 4). The cGAN surrogates also further reduce $\sigma_{\text{test}}$, giving more robust predictive performance.
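Latin hypercube sampling over the five design-requirement ranges can be sketched in plain Python as below. The dictionary keys, the seed, and the function name are assumptions made for illustration; the paper does not state which LHS implementation was used.

```python
import random

# Design-requirement ranges quoted in Section 3.1 (a_max expressed in units of g).
BOUNDS = {
    "alpha_lim_deg": (10.0, 15.0),
    "a_max_g": (0.2, 0.4),
    "k_in": (0.3, 1.0),
    "k_elec": (0.7, 0.9),
    "s_ref": (0.9, 1.0),
}


def latin_hypercube(n, bounds, seed=0):
    """Draw n samples so that each variable hits each of n strata exactly once."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds.values():
        strata = list(range(n))
        rng.shuffle(strata)  # random pairing of strata across variables
        columns.append([lo + (hi - lo) * (s + rng.random()) / n
                        for s in strata])
    return [dict(zip(bounds, row)) for row in zip(*columns)]
```

Each returned dictionary is one set of design requirements, and in the inverse mapping setup each such set requires one trajectory optimization to produce its training label.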
For a clearer and more convincing comparison, we select an arbitrary case from the testing data for visualization (Table 5). The predictive performances of MOGP M2 and cGAN surrogates (Table 6) agree well with testing accuracy counterparts. Under the design requirement setup in Table 5, we complete simulation-based design as a reference (Figure 6). The simulation-based optimal design shows that the P and θ profiles follow ascending trends in general. The eVTOL aircraft tends to start with lower powers at low wing angles (close to vertical displacement direction). Especially in the first 7.5 s, the optimal trajectory design gradually increases power to the maximum and almost linearly turns the wings towards the forward direction, which maintains the acceleration at the upper bound constraint. From 7.5 s to 12.5 s, the aircraft maintains the maximum power and minor changes of wing angle and achieves a decreasing acceleration until the latter is close to zero. The vertical displacement plot shows that the aircraft completes most of the vertical displacement in the first half of the trajectory, i.e., 0∼12.5 s. In the second half of the trajectory, i.e., 12.5∼25 s, the power remains at the maximum level while the wings almost linearly turn forward until reaching the horizontal direction. The increasing acceleration and minor change in vertical displacement reveal that the second phase is mainly concerned with reaching the required horizontal speed constraint. The surrogate predictions show that the best MOGP model has obvious differences with the reference optimal while the cGAN surrogates could match the optimal P and θ profiles well. These results confirm that cGAN outperforms MOGP, meaning that the GAN-architecture-enabled surrogates are capable of regression tasks. 
However, the computed quantities, such as acceleration, exhibit unreasonable wiggles and even violate the acceleration constraint during the flight task, which would lead to poor passenger comfort experience. Therefore, we must seek novel strategies that more effectively leverage the GAN architecture for predictions.
To further improve the predictive performance, we implement the proposed regGAN-TL (Section 2.5.3). Table 7 indicates that the vanilla regGAN with $w_{\text{MSE}} = 1$ and $w_{\text{BC}} = 0$, which is in effect a feed-forward neural network, achieves over 99% accuracy using only 200 training samples and 99.5% accuracy using 400 samples. As introduced in Section 2.5.3, we aim to integrate the predictive power of deep neural networks and the pattern-matching feature of GAN architectures. The regGAN-TL surrogates enhance the predictive performance of the CL1 model (regGAN with $w_{\text{BC}} = 0$, where CL stands for combined loss) to achieve over 99.5% accuracy on $P$ and $\theta$ predictions (CL4 TL in Table 7) using only 200 training samples. In view of the outstanding predictive capability of the model, we completed a parametric study with fewer training samples while using the same validation and testing datasets. The results show that CL4 TL extensively and consistently improves the predictive performance. Specifically, CL4 TL reaches over 95% accuracy using only 10 samples, 98% using 50 samples, and 99% using 100 training samples. Since the cGAN prediction results reveal that an accuracy just above 98% is not sufficient to satisfy the constraints, we set the accuracy threshold to 99% or 99.5%. The regGAN-TL CL4 surrogate reduces the training cost by half to achieve either accuracy threshold, 99% or 99.5%, compared with regGAN CL1. In addition, regGAN-TL CL4 consistently and extensively lowers $\sigma_{\text{test}}$ at different numbers of training samples compared with the regGAN CL1 counterparts. We also conducted a parametric study by setting the weighting factor $w_{\text{BC}}$ to 0.01, 0.001, or 0.0001 (Table 7). The results show that the accuracy of the regGAN-TL surrogates increases with decreasing $w_{\text{BC}}$. This is because the GAN architecture tends to match data patterns rather than predict observations, such that a relatively low $w_{\text{BC}}$ better respects the prediction feature of feed-forward neural networks.
To achieve a more thorough comparison, we completed another parametric study on the vanilla regGAN surrogates by setting $w_{\text{BC}}$ to 0.01, 0.001, and 0.0001 (Table 8). The results indicate that regGAN with $w_{\text{BC}} = 0.001$ achieves over 99% accuracy using 200 training samples, while only regGAN with $w_{\text{BC}} = 0.01$ can achieve 99.5% accuracy on $P$ and $\theta$ using 400 samples. This further confirms the outstanding predictive performance of regGAN-TL and a 50% computational cost reduction in reaching either the 99% or the 99.5% accuracy threshold. Note that the accuracy comparison is mainly on $P$ and $\theta$ since $t_{\text{flight}}$ exerts lower effects and regGAN-TL handles the model response with data patterns, such as ascending trends of trajectory profiles.
Similar to the comparison between the MOGP and cGAN surrogates, we select an arbitrary case in the testing dataset for visualization (Table 9). Visualization case 2 has a lower $a_{\max}$ of ∼0.29g and a higher $k_{\text{elec}}$ of 0.819, while visualization case 1 has $a_{\max} = 0.388g$ and $k_{\text{elec}} = 0.761$. A lower $a_{\max}$ demands a smoother and more gradual transition, and a higher $k_{\text{elec}}$ allows higher power efficiency, which saves energy. The simulation-based optimal design shows that the eVTOL aircraft starts with a lower power than in visualization case 1, which is due to the lower $a_{\max}$ constraint and higher $k_{\text{elec}}$ (Figure 7). We note a smoother $\theta$ transition from the vertical to the horizontal direction, with the instantaneous acceleration staying at $a_{\max}$ for the whole takeoff process. Visualization case 1 exhibits two “separate” phases, with the first phase focused on vertical displacement and the second on horizontal speed. In contrast, visualization case 2 presents no obvious separation; rather, the optimal takeoff design exhibits a smooth transition, especially in $\theta$. In both visualization cases, we note that the optimal takeoff designs tend to proceed at $a_{\max}$, which makes $a_{\max}$ essential when considering passenger comfort.
Table 10 shows that regGAN-TL CL4 achieves 99.5% accuracy on $P$ and $\theta$, which agrees with the mean testing accuracy (Table 7). Figure 7 shows that all trajectories predicted by the regGAN-TL CL4 surrogates match the simulation-based reference optimal design with almost no noticeable difference. Even for the reconstructed acceleration profile, which cGAN struggles with, regGAN-TL CL4 matches this almost straight-line profile very well. In contrast, the vanilla regGAN surrogates trained with 200 samples cannot perform as well as regGAN-TL CL4 with 200 training samples. We thus conclude that the proposed regGAN-TL method exhibits outstanding predictive potential.

4. Conclusions

In this paper, we investigated surrogate-based optimal takeoff trajectory predictions with minimum energy consumption for electric vertical takeoff and landing (eVTOL) aircraft. We proposed and demonstrated the regGAN-TL surrogate, i.e., regression generative adversarial network (regGAN) enhanced by transfer learning (TL). The proposed regGAN-TL outperformed reference surrogates to achieve over 99.5% accuracy with a lower training cost. The main contribution of this work can be summarized as follows.
Firstly, we introduced the surrogate-based inverse mapping optimization architecture into eVTOL optimal trajectory design. In particular, we leveraged surrogate models to predict optimal takeoff trajectories directly from design requirements, including flight conditions and design constraints. Well-trained surrogates enabled real-time eVTOL takeoff trajectory designs with no need to execute any optimization. However, each training sample requires a simulation-based trajectory design optimization, which necessitates developing novel surrogate modeling strategies.
Secondly, we proposed the regGAN-TL surrogate to further enhance regGAN predictive performance. Results showed that regGAN-TL achieved a 99.5% accuracy in predicting optimal design variables using only 200 training samples. In addition, visualization cases verified that the regGAN-TL trajectory exhibited no noticeable differences with simulation-based optimal designs. Moreover, regGAN-TL extensively outperformed representative reference surrogate models, including the multi-output Gaussian process, conditional generative adversarial networks, and a vanilla regGAN. The best reference surrogate required at least 400 samples to achieve a 99.5% accuracy, denoting a 50% computational cost reduction by the use of regGAN-TL.
Thirdly, the successful implementation of regGAN-TL in eVTOL aircraft optimal takeoff trajectory design enriches the literature in this field. We introduced various surrogate modeling strategies into eVTOL takeoff trajectory design, with the proposed regGAN-TL exhibiting the best predictive performance. The energy efficiency achieved in this work pushes eVTOL aircraft another step forward toward broad, real-world applications. The inverse mapping architecture and the regGAN-TL method are also extendable toward other relevant engineering optimizations.
At the same time, it is acknowledged that, despite the promising performance by regGAN-TL, the simulation models used in this work are not high-fidelity but simply effective for describing the physics. We will increase the fidelity of the simulation models in future work, which may lead to a more challenging mapping and higher training cost for surrogate modeling, especially when strong nonlinear characteristics (such as strong shocks in high-speed flow mechanics) exist. In addition, the inverse mapping optimization architecture requires each training sample to be an optimal design through design optimization, which may make the data acquisition process computationally more intensive. So, we plan to investigate and develop other advanced deep learning architectures within the structure of regGAN-TL. Finally, we would also like to compare our method with more reference approaches and validate its performance via flight tests with real drones.

Author Contributions

Conceptualization, X.D.; Methodology, S.-T.Y. and X.D.; Software, X.D.; Validation, S.-T.Y.; Investigation, S.-T.Y.; Resources, X.D.; Data curation, S.-T.Y.; Writing—original draft, S.-T.Y.; Writing—review & editing, X.D.; Supervision, X.D.; Project administration, X.D. All authors have read and agreed to the published version of the manuscript.

Funding

The authors did not receive support from any organization for the submitted work.

Data Availability Statement

The datasets and surrogate models generated during the current study are available from the corresponding author upon reasonable request for the purpose of replication of results.

Acknowledgments

The authors would like to thank NASA for making available the open-source Dymos framework on eVTOL takeoff trajectory design for data acquisition.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Nomenclature

$C$ user-selected correlation function
$\mathcal{GP}$ zero-mean, unit-variance, stationary GP
$\mathcal{L}$ maximum-likelihood estimation
$\mathcal{L}_{\text{MSE}}$ mean square error loss function
$\mathcal{L}_{\text{BC}}$ binary cross-entropy loss function
$\mathcal{L}_{\text{GAN}}$ GAN model loss function
$\mathcal{L}_{\text{cGAN}}$ cGAN model loss function
$\mathcal{MVN}$ multi-variate normal distribution
$\alpha$ wing angle of attack in aerodynamics
$\alpha_{\text{airfoil}}$ airfoil lift-curve slope
$\alpha_{\text{EFS}}$ effective free-stream angle of attack
$\alpha_{\text{in}}$ incidence angle
$\alpha_{\infty}$ free-stream angle of attack
$\alpha_{\text{lim}}$ angle of attack constraint value
$\alpha_{s}$ wing angle of attack at stall
$\alpha_{\text{wing}}$ finite-wing lift-curve slope
$\beta$ blade pitch angle
$\beta$ unknown coefficient vector
$\Gamma$ gamma function
$\epsilon$ small positive constant
$\bar{\epsilon}_{L_1}$ mean L1 relative error
$\theta$ wing angle to vertical
$\theta$ unknown kernel length scale
$\kappa$ correction factor in momentum theory
$\mu_{\text{test}}$ testing accuracy
$\nu$ smoothness parameter
$\pi$ the ratio of the circumference of a circle to its diameter
$\rho$ air density
$\sigma$ solidity
$\sigma^2$ variance of the GP
$\sigma_{e}$ effective solidity
$\sigma_{\text{relu}}$ rectified linear unit (relu) activation function
$\sigma_{\text{sigmoid}}$ sigmoid activation function
$\sigma_{\text{test}}$ standard deviations of testing accuracy
$\tau$ Euclidean distance
$\Omega$ angular speed

References

  1. Johnson, W.; Silva, C.; Solis, E. Concept Vehicles for VTOL Air Taxi Operations. In Proceedings of the Proceedings of the AHS technical conference on Aeromechanics Design for Transformative Vertical Flight, San Francisco, CA, USA, 16–19 January 2018. [Google Scholar]
  2. Bacchini, A.; Cestino, E. Electric VTOL Configurations Comparison. Aerospace 2019, 6, 26. [Google Scholar] [CrossRef]
  3. Electric VTOL News. Airbus CityAirbus NextGen (Technology Demonstrator). 2021. Available online: https://evtol.news/airbus-cityairbus-nextgen (accessed on 24 September 2023).
  4. Electric VTOL News. Aurora Flight Sciences Pegasus PAV. 2019. Available online: https://evtol.news/aurora/ (accessed on 27 June 2023).
  5. Wikipedia. EHang. 2023. Available online: https://en.wikipedia.org/wiki/EHang (accessed on 27 June 2023).
  6. Wikipedia. Airbus A³ Vahana. 2022. Available online: https://en.wikipedia.org/wiki/Airbus_A%C2%B3_Vahana (accessed on 27 June 2023).
  7. Yeh, S.T.; Yan, G.; Du, X. Inverse Machine Learning Prediction for Optimal Tilt-Wing eVTOL Takeoff Trajectory. In Proceedings of the AIAA AVIATION 2023 Forum, San Diego, CA, USA, 12–16 June 2023. [Google Scholar] [CrossRef]
  8. Electric VTOL News. Dufour Aerospace VTOL Technology Demonstrator. 2020. Available online: https://evtol.news/dufour-aerospace-vtol-technology-demonstrator (accessed on 24 September 2023).
  9. Electric VTOL News. Joby Aviation S4 2.0 (Prototype). 2023. Available online: https://evtol.news/joby-s4 (accessed on 27 June 2023).
  10. Yang, X.G.; Liu, T.; Ge, S.; Rountree, E.; Wang, C.Y. Challenges and key requirements of batteries for electric vertical takeoff and landing aircraft. Joule 2021, 5, 1644–1659. [Google Scholar] [CrossRef]
  11. Kubo, D.; Suzuki, S. Tail-Sitter Vertical Takeoff and Landing Unmanned Aerial Vehicle: Transitional Flight Analysis. J. Aircr. 2008, 45, 292–297. [Google Scholar] [CrossRef]
  12. Li, B.; Sun, J.; Zhou, W.; Wen, C.Y.; Low, K.H.; Chen, C.K. Transition Optimization for a VTOL Tail-Sitter UAV. IEEE/Asme Trans. Mechatronics 2020, 25, 2534–2545. [Google Scholar] [CrossRef]
  13. Pradeep, P.; Wei, P. Energy Efficient Arrival with RTA Constraint for Urban eVTOL Operations. In Proceedings of the 2018 AIAA Aerospace Sciences Meeting, Kissimmee, FL, USA, 8–12 January 2018. [Google Scholar]
  14. Chauhan, S.S.; Martins, J.R.R.A. Tilt-wing eVTOL takeoff trajectory optimization. J. Aircr. 2020, 57, 93–112. [Google Scholar] [CrossRef]
  15. Panish, L.; Bacic, M. Transition Trajectory Optimization for a Tiltwing VTOL Aircraft with Leading-Edge Fluid Injection Active Flow Control. In Proceedings of the AIAA SCITECH 2022 Forum, San Diego, CA, USA, 3–7 January 2022. [Google Scholar]
  16. Queipo, N.V.; Haftka, R.T.; Shyy, W.; Goel, T.; Vaidyanathan, R.; Kevin Tucker, P. Surrogate-based analysis and optimization. Prog. Aerosp. Sci. 2005, 41, 1–28. [Google Scholar] [CrossRef]
  17. Zhang, X.; Xie, F.; Ji, T.; Zhu, Z.; Zheng, Y. Multi-fidelity deep neural network surrogate model for aerodynamic shape optimization. Comput. Methods Appl. Mech. Eng. 2021, 373, 113485. [Google Scholar] [CrossRef]
  18. Li, J.; Du, X.; Martins, J.R. Machine learning in aerodynamic shape optimization. Prog. Aerosp. Sci. 2022, 134, 100849. [Google Scholar] [CrossRef]
  19. Shen, Y.; Huang, W.; Yan, L.; tian Zhang, T. Constraint-based parameterization using FFD and multi-objective design optimization of a hypersonic vehicle. Aerosp. Sci. Technol. 2020, 100, 105788. [Google Scholar] [CrossRef]
  20. Alba, C.; Elham, A.; German, B.J.; Veldhuis, L.L.M. A surrogate-based multi-disciplinary design optimization framework modeling wing-propeller interaction. Aerosp. Sci. Technol. 2018, 78, 721–733. [Google Scholar] [CrossRef]
  21. Raul, V.; Leifsson, L. Surrogate-based aerodynamic shape optimization for delaying airfoil dynamic stall using Kriging regression and infill criteria. Aerosp. Sci. Technol. 2021, 111, 106555. [Google Scholar] [CrossRef]
  22. Li, M.; Wang, Z. Surrogate model uncertainty quantification for reliability-based design optimization. Reliab. Eng. Syst. Saf. 2019, 192, 106432. [Google Scholar] [CrossRef]
  23. Du, X.; Leifsson, L.; Koziel, S.; Bekasiewicz, A. Airfoil Design Under Uncertainty Using Non-Intrusive Polynomial Chaos Theory and Utility Functions. Procedia Comput. Sci. 2017, 108, 1493–1499. [Google Scholar] [CrossRef]
  24. Iuliano, E.; Quagliarella, D. Aerodynamic shape optimization via non-intrusive POD-based surrogate modelling. In Proceedings of the 2013 IEEE Congress on Evolutionary Computation, Cancun, Mexico, 20–23 June 2013; pp. 1467–1474. [Google Scholar] [CrossRef]
  25. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; Adaptive Computation and Machine Learning; MIT Press: Cambridge, MA, USA, 2006; p. 248. [Google Scholar]
  26. Liu, M.; Chowdhary, G.; Castra da Silva, B.; Liu, S.Y.; How, J.P. Gaussian Processes for Learning and Control: A Tutorial with Examples. IEEE Control. Syst. Mag. 2018, 38, 53–86. [Google Scholar] [CrossRef]
  27. de Wolff, T.; Cuevas, A.; Tobar, F. MOGPTK: The Multi-Output Gaussian Process Toolkit. Neurocomputing 2020, 424, 49–53. [Google Scholar] [CrossRef]
  28. Liu, H.; Cai, J.; Ong, Y. Remarks on multi-output Gaussian process regression. Knowl. Based Syst. 2018, 144, 102–121. [Google Scholar] [CrossRef]
  29. Thelen, A.S.; Bryson, D.E.; Stanford, B.K.; Beran, P.S. Multi-Fidelity Gradient-Based Optimization for High-Dimensional Aeroelastic Configurations. Algorithms 2022, 15, 131. [Google Scholar] [CrossRef]
  30. Tao, J.; Sun, G. Application of Deep Learning Based Multi-Fidelity Surrogate Model to Robust Aerodynamic Design Optimization. Aerosp. Sci. Technol. 2019, 92, 722–737. [Google Scholar] [CrossRef]
  31. Renganathan, S.A.; Maulik, R.; Ahuja, J. Enhanced data efficiency using deep neural networks and Gaussian processes for aerodynamic design optimization. Aerosp. Sci. Technol. 2021, 111, 106522. [Google Scholar] [CrossRef]
  32. O’Leary-Roseberry, T.; Du, X.; Chaudhuri, A.; Martins, J.R.; Willcox, K.; Ghattas, O. Learning high-dimensional parametric maps via reduced basis adaptive residual networks. Comput. Methods Appl. Mech. Eng. 2022, 402, 115730. [Google Scholar] [CrossRef]
  33. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets. In Advances in Neural Information Processing Systems 27; Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q., Eds.; Curran Associates, Inc.: New York, NY, USA, 2014; pp. 2672–2680. [Google Scholar]
  34. Chen, W.; Chiu, K.; Fuge, M.D. Airfoil design parameterization and optimization using bézier generative adversarial networks. Aiaa J. 2020, 58, 4723–4735. [Google Scholar] [CrossRef]
  35. Du, X.; He, P.; Martins, J.R.R.A. A B-Spline-based Generative Adversarial Network Model for Fast Interactive Airfoil Aerodynamic Optimization. In Proceedings of the AIAA SciTech Forum, AIAA, Orlando, FL, USA, 5 January 2020. [Google Scholar] [CrossRef]
  36. Du, X.; Martins, J.R. Super Resolution Generative Adversarial Networks for Multi-Fidelity Pressure Distribution Prediction. In Proceedings of the AIAA SCITECH 2023 Forum, National Harbor, MD, USA, 23–27 January 2023; p. 0533. [Google Scholar]
  37. Mirza, M.; Osindero, S. Conditional Generative Adversarial Nets. arXiv 2014, arXiv:1411.1784. Available online: http://arxiv.org/abs/1411.1784 (accessed on 10 December 2022).
  38. Aggarwal, K.; Kirchmeyer, M.; Yadav, P.; Keerthi, S.S.; Gallinari, P. Regression with conditional gan. Technical Report. arXiv 2019, arXiv:1905.12868. [Google Scholar]
  39. Ha, T.H.; Lee, K.; Hwang, J.T. Large-scale multidisciplinary optimization under uncertainty for electric vertical takeoff and landing aircraft. In Proceedings of the AIAA Scitech 2020 Forum, Orlando, FL, USA, 6–10 January 2020. [Google Scholar]
  40. Chinthoju, P.; Lee, Y.H.; Das, G.K.; James, K.A.; Allison, J.T. Optimal Design of eVTOLs for Urban Mobility using Analytical Target Cascading (ATC). In Proceedings of the AIAA SCITECH 2024 Forum, Orlando, FL, USA, 8–12 January 2024; p. 2235. [Google Scholar]
  41. Rostami, M.; Bardin, J.; Neufeld, D.; Chung, J. EVTOL Tilt-Wing Aircraft Design under Uncertainty Using a Multidisciplinary Possibilistic Approach. Aerospace 2023, 10, 718. [Google Scholar] [CrossRef]
  42. Office of NextGen. Urban Air Mobility (UAM) Concept of Operations 2.0; Technical Report; Federal Aviation Adminstration: Columbia, WA, USA, 2023.
  43. Ye, K.; Wang, Z.; Chen, P.; Piao, Y.; Zhang, K.; Wang, S.; Jiang, X.; Cui, X. A novel GAN-based regression model for predicting frying oil deterioration. Sci. Rep. 2022, 12, 10424. [Google Scholar] [CrossRef]
  44. Yeh, S.T.; Du, X. Optimal Tilt-Wing eVTOL Takeoff Trajectory Prediction Using Regression Generative Adversarial Networks. Mathematics 2023, 12, 26. [Google Scholar] [CrossRef]
  45. Weiss, K.; Khoshgoftaar, T.M.; Wang, D. A survey of transfer learning. J. Big Data 2016, 3, 1–40. [Google Scholar] [CrossRef]
  46. Liao, P.; Song, W.; Du, P.; Zhao, H. Multi-fidelity convolutional neural network surrogate model for aerodynamic optimization based on transfer learning. Phys. Fluids 2021, 33, 127121. [Google Scholar] [CrossRef]
  47. Falck, R.; Gray, J.S.; Ponnapalli, K.; Wright, T. dymos: A Python package for optimal control of multidisciplinary systems. J. Open Source Softw. 2021, 6, 2809. [Google Scholar] [CrossRef]
  48. Gray, J.S.; Hwang, J.T.; Martins, J.R.R.A.; Moore, K.T.; Naylor, B.A. OpenMDAO: An open-source framework for multidisciplinary design, analysis, and optimization. Struct. Multidiscip. Optim. 2019, 59, 1075–1104. [Google Scholar] [CrossRef]
  49. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv 2016, arXiv:1603.04467. [Google Scholar]
  50. Jerray, J.; Fribourg, L.; Andre, E. Robust optimal periodic control using guaranteed Euler’s method. In Proceedings of the 2021 American Control Conference (ACC), New Orleans, LA, USA, 25–28 May 2021; pp. 986–991. [Google Scholar] [CrossRef]
  51. Le Coent, A.; Alexandre Sandrett, J.; Chapoutot, A.; Fribourg, L.; De Vuyst, F.; Chamoin, L. Distributed Control Synthesis Using Euler’s Method. In Proceedings of the Reachability Problems, London, UK, 7–9 September 2017; Hague, M., Potapov, I., Eds.; Springer: Cham, Switzerland, 2017; pp. 118–131. [Google Scholar]
  52. Herman, A.L.; Conway, B.A. Direct optimization using collocation based on high-order Gauss–Lobatto quadrature rules. J. Guid. Control. Dyn. 1996, 19, 592–599. [Google Scholar] [CrossRef]
  53. Wächter, A.; Biegler, L.T. On the Implementation of an Interior Point Filter Line-Search Algorithm for Large-Scale Nonlinear Programming. Math. Program. 2006, 106, 25–57. [Google Scholar] [CrossRef]
  54. Wu, N.; Kenway, G.; Mader, C.A.; Jasa, J.; Martins, J.R.R.A. pyOptSparse: A Python framework for large-scale constrained nonlinear optimization of sparse systems. J. Open Source Softw. 2020, 5, 2564. [Google Scholar] [CrossRef]
  55. Tangler, J.L.; Ostowari, C. Horizontal axis wind turbine post stall airfoil characteristics synthesization. In Proceedings of the Horizontal-Axis wind Turbine Technology Conference, Cleveland, OH, USA, 8–10 May 1984. [Google Scholar]
  56. Glauert, H. Airplane propellers. In Aerodynamic Theory; Springer: Berlin/Heidelberg, Germany, 1935. [Google Scholar]
  57. Ypma, T.J. Historical development of the Newton–Raphson method. SIAM Rev. 1995, 37, 531–551. [Google Scholar] [CrossRef]
  58. McCormick, B.W. Aerodynamics of V/STOL Flight, 1st ed.; Academic Press: New York, NY, USA, 1967. [Google Scholar]
  59. Leishman, J.G. Principles of Helicopter Aerodynamics, 1st ed.; The Press Syndicate of the University of Cambridge: Cambridge, UK, 2000. [Google Scholar]
  60. Biswas, B.; Chatterjee, S.; Mukherjee, S.; Pal, S. A discussion on Euler method: A review. Electron. J. Math. Anal. Appl. 2013, 1, 2090–2792. [Google Scholar]
  61. Sacks, J.; Schiller, S.B.; Welch, W.J. Designs for Computer Experiments. Technometrics 1989, 31, 41–47. [Google Scholar] [CrossRef]
  62. Lataniotis, C.; Wicaksono, D.; Marelli, S.; Sudret, B. UQLab User Manual—Kriging (Gaussian Process Modeling); Technical Report; Chair of Risk, Safety and Uncertainty Quantification; ETH Zurich: Zurich, Switzerland, 2022; Report UQLab-V2.0-105. [Google Scholar]
  63. de G. Matthews, A.G.; van der Wilk, M.; Nickson, T.; Fujii, K.; Boukouvalas, A.; León-Villagrá, P.; Ghahramani, Z.; Hensman, J. GPflow: A Gaussian Process Library using TensorFlow. J. Mach. Learn. Res. 2017, 18, 1–6. [Google Scholar]
  64. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  65. Agarap, A.F. Deep learning using rectified linear units (relu). arXiv 2018, arXiv:1803.08375. [Google Scholar]
  66. Han, J.; Moraga, C. The influence of the sigmoid function parameters on the speed of backpropagation learning. In International Workshop on Artificial Neural Networks; Springer: Berlin/Heidelberg, Germany, 1995; pp. 195–201. [Google Scholar]
  67. Allen, D.M. Mean square error of prediction as a criterion for selecting variables. Technometrics 1971, 13, 469–475. [Google Scholar] [CrossRef]
  68. Ho, Y.; Wookey, S. The real-world-weight cross-entropy loss function: Modeling the costs of mislabeling. IEEE Access 2019, 8, 4806–4813. [Google Scholar] [CrossRef]
  69. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  70. Qian, N. On the Momentum Term in Gradient Descent Learning Algorithms. Neural Netw. 1999, 12, 145–151. [Google Scholar] [CrossRef]
  71. Tieleman, S.; Hinton, G. Lecture 6.5—RMSProp: Divide the Gradient by a Running Average of Its Recent Magnitude. COURSERA Neural Netw. Mach. Learn. 2012, 4, 26–31. [Google Scholar]
  72. McKay, M.D.; Beckman, R.J.; Conover, W.J. A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code. Technometrics 1979, 21, 239–245. [Google Scholar]
Figure 1. The test case considered in this work: (a) Airbus A³ Vahana drone (https://www.airbus.com/, accessed on 10 April 2024); (b) optimal takeoff trajectory profile of design variables: electrical power ($P$) and wing angle to vertical ($\theta$), comparing regGAN-TL predictions with simulation-based optimal design.
Figure 2. Definition of angles and forces on the aircraft.
Figure 3. GANs contain a discriminator that competes with the generator. The generator generates shapes ($g$) from random variables ($z$), which typically follow user-defined distributions (uniform distributions in this work). The discriminator distinguishes between the existing data ($e$) and the generated data ($g$) by predicting probabilities. During training, the discriminator adjusts its weights ($w_d$) to make the probability $p_g$ (i.e., of $g$ being existing data) approach zero, while increasing the probability $p_e$ (i.e., of $e$ being existing data) towards one. In contrast, the generator adjusts its weights ($w_g$) to increase $p_g$ towards one. After training, the generator generates shapes similar to the existing data.
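
The adversarial objective described in the caption of Figure 3 can be illustrated with a minimal pure-Python sketch. It assumes the standard binary cross-entropy form of the GAN losses; the function names (`bce`, `discriminator_loss`, `generator_loss`) and the sample probabilities are hypothetical and are not taken from the authors' implementation.

```python
import math


def bce(p, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and a 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def discriminator_loss(p_e, p_g):
    """The discriminator wants p_e -> 1 (existing data) and p_g -> 0 (generated data)."""
    real = sum(bce(p, 1) for p in p_e) / len(p_e)
    fake = sum(bce(p, 0) for p in p_g) / len(p_g)
    return real + fake


def generator_loss(p_g):
    """The generator wants the discriminator to output p_g -> 1 for generated data."""
    return sum(bce(p, 1) for p in p_g) / len(p_g)


# Hypothetical discriminator outputs for a batch of three samples.
p_e = [0.9, 0.8, 0.95]  # probabilities assigned to existing data
p_g = [0.2, 0.1, 0.3]   # probabilities assigned to generated data

print(f"discriminator loss: {discriminator_loss(p_e, p_g):.4f}")
print(f"generator loss:     {generator_loss(p_g):.4f}")
```

In this sketch the two losses pull $p_g$ in opposite directions, which is the competition the caption describes: lowering the discriminator loss requires small $p_g$, while lowering the generator loss requires large $p_g$.
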
Figure 4. cGAN has a structure similar to that of the vanilla GAN (Figure 3). cGAN takes design requirements ($d$) and random variables ($z$) as input for the generator network. The true optimal takeoff trajectories ($y$), the corresponding $d$, and the generated data ($g$) are the inputs for the discriminator. The discriminator outputs the probabilities of $y$ and $g$ being existing data.
Figure 5. regGAN includes a generator and a discriminator, in common with the vanilla GAN and cGAN. The generator produces time-sequence design variables from the corresponding design requirements through a DNN. The discriminator takes the generated optimal designs and the true time-sequence optimal trajectories and tries to distinguish between them. This results in an adversarial competition, which yields an adversarial loss via $\mathcal{L}_{\text{BC}}$ to train the generator and the discriminator. Additionally, we minimize the contextual loss via $\mathcal{L}_{\text{MSE}}$ between generated and true design variables to explicitly train the generator for regression tasks.
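
The combined generator loss described in the caption of Figure 5 can be sketched as follows. This is a minimal illustration that assumes the combination is a weighted sum with weights $w_{\text{MSE}}$ and $w_{\text{BC}}$ (the symbols used in the captions of Tables 7 and 8); the function names and numerical values are hypothetical, and the exact form used by the authors may differ.

```python
import math


def mse(pred, true):
    """Mean squared error between generated and true design variables."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def generator_combined_loss(g, y, p_g, w_mse=1.0, w_bc=1e-4):
    """Assumed weighted sum: w_mse * L_MSE(g, y) + w_bc * L_BC(p_g, 1).

    g   : generated design variables
    y   : true optimal design variables
    p_g : discriminator probability that g is existing data
    """
    return w_mse * mse(g, y) + w_bc * bce(p_g, 1)


# Hypothetical normalized values, for illustration only.
g = [0.50, 0.62, 0.71]
y = [0.52, 0.60, 0.70]
print(generator_combined_loss(g, y, p_g=0.4))
```

Setting `w_bc=0` recovers a pure MSE regression loss, which matches the $w_{\text{BC}} = 0$ row labeled CL1 in Table 7.
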
Figure 6. Optimal takeoff trajectory profile verification against simulation-based ground truth shows that cGAN surrogates capture the general trends well and outperform MOGP. However, cGAN predictions still miss minor features and exhibit unreasonable shapes on computed quantities such as acceleration.
Figure 7. Optimal takeoff trajectory profiles predicted by regGAN-TL CL4 using 200 training samples show almost no visible differences from the simulation-based ground truth and outperform the other surrogates. In addition, the regGAN-TL CL4 surrogate has over 99.5% accuracy and is able to match the computed quantities to guarantee constraint satisfaction.
Table 1. Neural architectures and training setups of the generator and the discriminator within regGAN.

| | Generator | Discriminator |
|---|---|---|
| Input layer | 5 neurons, no activation | 21 neurons, no activation |
| Hidden layer 1 | 100 neurons, relu activation | 100 neurons, relu activation |
| Hidden layer 2 | 100 neurons, relu activation | 100 neurons, relu activation |
| Output layer | 21 neurons/1 neuron, sigmoid activation | 1 neuron, sigmoid activation |
| Training algorithm | Adam optimizer | Adam optimizer |
| Training parameters | $\beta_1 = 0.9$, $\beta_2 = 0.999$ | $\beta_1 = 0.9$, $\beta_2 = 0.999$ |
| Learning rate | 0.001 | 0.001 |
| Batch size | 20 | 20 |
| Epochs | 1000 | 1000 |

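
The layer sizes in Table 1 can be made concrete with a small pure-Python forward pass. This is only a sketch under stated assumptions: it uses the 21-neuron generator output, the weight initialization and the input vector are hypothetical placeholders, and no training is performed. It is not the authors' implementation.

```python
import math
import random

# (inputs, outputs, activation) per layer, following Table 1.
GENERATOR_LAYERS = [(5, 100, "relu"), (100, 100, "relu"), (100, 21, "sigmoid")]
DISCRIMINATOR_LAYERS = [(21, 100, "relu"), (100, 100, "relu"), (100, 1, "sigmoid")]


def init_params(layers, rng):
    """Random weights and zero biases for each layer (hypothetical initialization)."""
    params = []
    for n_in, n_out, _ in layers:
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def forward(x, layers, params):
    """Propagate an input vector through the fully connected layers."""
    for (_, _, activation), (weights, biases) in zip(layers, params):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if activation == "relu":
            x = [max(0.0, v) for v in z]
        else:  # sigmoid
            x = [1.0 / (1.0 + math.exp(-v)) for v in z]
    return x


def count_params(layers):
    """Number of trainable weights and biases."""
    return sum(n_in * n_out + n_out for n_in, n_out, _ in layers)


rng = random.Random(0)
gen_params = init_params(GENERATOR_LAYERS, rng)
disc_params = init_params(DISCRIMINATOR_LAYERS, rng)

d = [0.5, 0.5, 0.5, 0.5, 0.5]  # placeholder for five scaled design requirements
profile = forward(d, GENERATOR_LAYERS, gen_params)            # 21 outputs in (0, 1)
prob = forward(profile, DISCRIMINATOR_LAYERS, disc_params)    # 1 probability

print(len(profile), len(prob))
print(count_params(GENERATOR_LAYERS), count_params(DISCRIMINATOR_LAYERS))
```

Because both output layers use a sigmoid, the generated profile lies in (0, 1), so this sketch implicitly assumes that the design variables are scaled to that range before training.
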
Table 2. Takeoff trajectory optimization problem formulation.

| | Function or Variable | Description | Quantity |
|---|---|---|---|
| minimize | $E$ | Electrical energy consumed | |
| w.r.t. | $P$ | Electrical power using 21 quadratic curve control points | 21 |
| | $\theta$ | Wing angle to vertical using 21 quadratic curve control points | 21 |
| | $t_{\text{flight}}$ | Takeoff time | 1 |
| | | Total design variables | 43 |
| subject to | $y_{\text{final}} \geq 305$ m | Final vertical displacement constraint | 1 |
| | $x_{\text{final}}$ 1400 m | Final horizontal displacement constraint | 1 |
| | $v_x = 67$ m/s | Final horizontal speed constraint | 1 |
| | $y \geq 0$ m | Vertical displacement constraint | 1 |
| | $a \leq a_{\max}$ | Acceleration constraint | 1 |
| | $\alpha \leq \alpha_{\text{lim}}$ | Positive stall-angle constraint | 1 |
| | $\alpha \geq -\alpha_{\text{lim}}$ | Negative stall-angle constraint | 1 |
| | | Total constraints | 7 |
| Conditions | $k_{\text{in}}$ | Propeller-induced velocity factor | |
| | $\alpha_{\text{lim}}$ | Angle of attack constraint value | |
| | $a_{\max}$ | Maximum acceleration magnitude | |
| | $k_{\text{elec}}$ | Electrical and mechanical losses factor | |
| | $S_{\text{ref}}$ | Wing size factor | |

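
The objective and design-variable count in Table 2 can be illustrated with a short sketch. It assumes that the consumed energy is the time integral of the electrical power and approximates it with the trapezoidal rule on 21 uniformly spaced points; the simulation models used in the paper may integrate differently, and the power profile and takeoff time below are hypothetical.

```python
def energy_consumed(power, t_flight):
    """Trapezoidal estimate of E = integral of P dt over [0, t_flight].

    power    : power values in W at uniformly spaced time points
    t_flight : takeoff time in s
    """
    n = len(power)
    dt = t_flight / (n - 1)
    return sum(0.5 * (power[i] + power[i + 1]) * dt for i in range(n - 1))


N_POINTS = 21                 # control points per profile (Table 2)
power = [300e3] * N_POINTS    # hypothetical constant 300 kW profile
t_flight = 30.0               # hypothetical takeoff time in s

E = energy_consumed(power, t_flight)

# P (21) + theta (21) + t_flight (1) design variables, as listed in Table 2.
n_design_variables = 2 * N_POINTS + 1

print(f"E = {E / 3.6e6:.2f} kWh, design variables = {n_design_variables}")
```
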
Table 3. MOGP model configurations and mean ± standard deviation of testing accuracy on the testing data set. SE stands for the squared exponential kernel function.

| Model | Correlation Function | Basis Function | $\text{ACC}_{L_1, t_{\text{flight}}}$ (%) | $\text{ACC}_{L_1, P}$ (%) | $\text{ACC}_{L_1, \theta}$ (%) |
|---|---|---|---|---|---|
| M1 | SE | constant | 99.6 ± 0.313 | 92.5 ± 2.93 | 94.7 ± 2.30 |
| M2 | SE | linear | 99.5 ± 0.372 | 92.5 ± 2.89 | 94.7 ± 2.32 |
| M3 | Matérn | constant | 96.7 ± 2.65 | 88.7 ± 4.85 | 94.6 ± 2.20 |
| M4 | Matérn | linear | 96.7 ± 2.61 | 91.4 ± 2.73 | 94.6 ± 2.20 |

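
The two correlation-function families compared in Table 3 can be written down in a few lines. The sketch below uses the textbook squared exponential form and the closed-form Matérn correlation for smoothness $\nu = 5/2$, with $\tau$ the Euclidean distance and $\theta$ the kernel length scale as in the Nomenclature. The choice $\nu = 5/2$ is an assumption made for illustration; the general Matérn form involves the gamma function $\Gamma$ and a modified Bessel function, and the value of $\nu$ used for models M3 and M4 is not stated here.

```python
import math


def se_correlation(tau, theta):
    """Squared exponential correlation for distance tau and length scale theta."""
    return math.exp(-0.5 * (tau / theta) ** 2)


def matern52_correlation(tau, theta):
    """Matérn correlation with smoothness nu = 5/2 (closed form)."""
    r = math.sqrt(5.0) * tau / theta
    return (1.0 + r + r * r / 3.0) * math.exp(-r)


# Compare how quickly the correlation decays with distance (theta = 1).
for tau in (0.0, 0.5, 1.0, 2.0):
    print(f"tau={tau:.1f}  SE={se_correlation(tau, 1.0):.4f}  "
          f"Matern-5/2={matern52_correlation(tau, 1.0):.4f}")
```
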
Table 4. cGAN models exhibit better predictive performance than MOGP (Table 3) in terms of mean and standard deviation of testing accuracy.

| Model | Loss Function | $\text{ACC}_{L_1, t_{\text{flight}}}$ (%) | $\text{ACC}_{L_1, P}$ (%) | $\text{ACC}_{L_1, \theta}$ (%) |
|---|---|---|---|---|
| cGAN BC | $\mathcal{L}_{\text{BC}}$ | 98.4 ± 0.858 | 99.0 ± 0.653 | 98.6 ± 0.424 |
| cGAN MSE | $\mathcal{L}_{\text{MSE}}$ | 98.3 ± 0.843 | 98.8 ± 0.745 | 98.7 ± 0.426 |

Table 5. Design requirements for visualization case 1.

| Design Requirements | Values |
|---|---|
| Propeller-induced velocity factor, $k_{\text{in}}$ | 87.75 (%) |
| Angle of attack constraint, $\alpha_{\text{lim}}$ | ±12.25833333 (deg) |
| Acceleration, $a_{\max}$ | 0.38766667 (g) |
| Electrical and mechanical losses factor, $k_{\text{elec}}$ | 0.761 |
| Wing size, $S_{\text{ref}}$ | 0.9285 |

Table 6. Predictive accuracy of visualization case 1 using MOGP M2 and cGAN surrogates with $\mathcal{L}_{\text{BC}}$ and $\mathcal{L}_{\text{MSE}}$ agrees well with the mean testing accuracy.

| Model | $\text{ACC}_{L_1, t_{\text{flight}}}$ (%) | $\text{ACC}_{L_1, P}$ (%) | $\text{ACC}_{L_1, \theta}$ (%) |
|---|---|---|---|
| MOGP M2 | 99.8 | 94.7 | 91.0 |
| cGAN BC | 98.8 | 99.8 | 98.7 |
| cGAN MSE | 98.9 | 99.7 | 98.7 |

Table 7. Parametric study for regGAN-TL models further confirms the outstanding performance of regGAN-TL in terms of the mean and standard deviation of testing accuracy with respect to the number of training samples for each design variable group. Note that we only vary $w_{\text{BC}}$ while keeping $w_{\text{MSE}}$ as 1. CL represents a combined loss for model training.

| Model | $w_{\text{BC}}$ | Samples | $\text{ACC}_{L_1, t_{\text{flight}}}$ (%) | $\text{ACC}_{L_1, P}$ (%) | $\text{ACC}_{L_1, \theta}$ (%) |
|---|---|---|---|---|---|
| CL1 | 0 | 10 | 94.4 ± 3.96 | 92.1 ± 5.15 | 96.9 ± 1.13 |
| | | 20 | 93.6 ± 3.80 | 96.2 ± 1.91 | 97.3 ± 1.19 |
| | | 50 | 96.3 ± 2.33 | 97.7 ± 1.20 | 98.0 ± 0.771 |
| | | 100 | 98.4 ± 0.748 | 98.8 ± 0.785 | 99.1 ± 0.377 |
| | | 200 | 99.2 ± 0.468 | 99.3 ± 0.434 | 99.5 ± 0.226 |
| | | 400 | 99.5 ± 0.314 | 99.5 ± 0.389 | 99.6 ± 0.167 |
| CL2 TL | 0.01 | 10 | 93.7 ± 3.62 | 93.7 ± 2.52 | 96.6 ± 1.21 |
| | | 20 | 93.4 ± 3.57 | 96.0 ± 2.24 | 96.8 ± 1.34 |
| | | 50 | 95.3 ± 0.993 | 97.4 ± 1.37 | 97.3 ± 0.835 |
| | | 100 | 98.4 ± 0.528 | 98.2 ± 1.21 | 98.2 ± 0.323 |
| | | 200 | 99.0 ± 0.326 | 98.2 ± 0.712 | 98.6 ± 0.301 |
| CL3 TL | 0.001 | 10 | 94.3 ± 3.13 | 95.4 ± 3.12 | 97.5 ± 1.16 |
| | | 20 | 95.7 ± 1.99 | 97.8 ± 1.27 | 98.1 ± 0.908 |
| | | 50 | 97.3 ± 0.871 | 98.4 ± 1.21 | 98.6 ± 0.476 |
| | | 100 | 99.0 ± 0.471 | 98.7 ± 0.830 | 99.3 ± 0.274 |
| | | 200 | 99.1 ± 0.347 | 99.0 ± 0.447 | 99.3 ± 0.213 |
| CL4 TL | 0.0001 | 10 | 95.1 ± 3.36 | 95.4 ± 3.01 | 97.5 ± 1.20 |
| | | 20 | 96.8 ± 1.95 | 97.8 ± 1.27 | 98.1 ± 0.950 |
| | | 50 | 98.1 ± 1.05 | 98.5 ± 1.06 | 98.7 ± 0.526 |
| | | 100 | 99.2 ± 0.530 | 99.1 ± 0.828 | 99.4 ± 0.267 |
| | | 200 | 99.3 ± 0.364 | 99.5 ± 0.296 | 99.6 ± 0.179 |

Table 8. A parametric study on $w_{\text{BC}}$ of the vanilla regGAN in terms of testing accuracy mean ± standard deviation confirms that regGAN-TL (Table 7) possesses better performance when provided with the same training data. We keep $w_{\text{MSE}}$ as 1.

| Model | $w_{\text{BC}}$ | Samples | $\text{ACC}_{L_1, t_{\text{flight}}}$ (%) | $\text{ACC}_{L_1, P}$ (%) | $\text{ACC}_{L_1, \theta}$ (%) |
|---|---|---|---|---|---|
| CL2 | 0.01 | 200 | 99.3 ± 0.443 | 97.8 ± 0.980 | 98.9 ± 0.427 |
| | | 400 | 99.6 ± 0.251 | 99.6 ± 0.257 | 99.6 ± 0.145 |
| CL3 | 0.001 | 200 | 99.3 ± 0.380 | 99.0 ± 0.557 | 99.1 ± 0.308 |
| | | 400 | 99.6 ± 0.343 | 99.4 ± 0.378 | 99.4 ± 0.230 |
| CL4 | 0.0001 | 200 | 99.2 ± 0.384 | 98.8 ± 0.706 | 98.9 ± 0.494 |
| | | 400 | 99.4 ± 0.346 | 99.4 ± 0.358 | 99.3 ± 0.249 |

Table 9. Design requirements for visualization case 2.

| Design Requirements | Values |
|---|---|
| Propeller-induced velocity factor, $k_{\text{in}}$ | 82.38333333 (%) |
| Angle of attack constraint, $\alpha_{\text{lim}}$ | ±12.85833333 (deg) |
| Acceleration, $a_{\max}$ | 0.29166667 (g) |
| Electrical and mechanical losses factor, $k_{\text{elec}}$ | 0.819 |
| Wing size, $S_{\text{ref}}$ | 0.92916667 |

Table 10. Testing accuracy for visualization case 2 of regGAN models CL1 and CL4 TL.

| Model | $w_{\text{BC}}$ | Samples | $\text{ACC}_{L_1, t_{\text{flight}}}$ (%) | $\text{ACC}_{L_1, P}$ (%) | $\text{ACC}_{L_1, \theta}$ (%) |
|---|---|---|---|---|---|
| CL1 | 0 | 100 | 98.4 | 98.7 | 99.2 |
| CL1 | 0 | 200 | 99.3 | 99.1 | 99.7 |
| CL4 TL | 0.0001 | 100 | 98.4 | 98.8 | 99.6 |
| CL4 TL | 0.0001 | 200 | 99.4 | 99.5 | 99.7 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yeh, S.-T.; Du, X. Transfer-Learning-Enhanced Regression Generative Adversarial Networks for Optimal eVTOL Takeoff Trajectory Prediction. Electronics 2024, 13, 1911. https://doi.org/10.3390/electronics13101911

Chicago/Turabian Style

Yeh, Shuan-Tai, and Xiaosong Du. 2024. "Transfer-Learning-Enhanced Regression Generative Adversarial Networks for Optimal eVTOL Takeoff Trajectory Prediction" Electronics 13, no. 10: 1911. https://doi.org/10.3390/electronics13101911
