Study on the Generalization of a Data-Driven Methodology for Damage Detection in an Aircraft Wing Using Reduced FE Models

Bacharidis, Emmanouil; Seventekidis, Panagiotis; Arailopoulos, Alexandros

doi:10.3390/applmech7010009

by Emmanouil Bacharidis ¹, Panagiotis Seventekidis ¹,² and Alexandros Arailopoulos ¹,*

¹ Department of Mechanical Engineering, University of Western Macedonia, GR-50100 Kozani, Greece
² Department of Mechanical Engineering, Aristotle University of Thessaloniki, GR-54124 Thessaloniki, Greece
* Author to whom correspondence should be addressed.

Appl. Mech. 2026, 7(1), 9; https://doi.org/10.3390/applmech7010009

Submission received: 3 December 2025 / Revised: 14 January 2026 / Accepted: 16 January 2026 / Published: 22 January 2026

(This article belongs to the Special Issue Early Career Scientists 2025 (ECS 2025) Contributions to Applied Mechanics (3rd Edition))

Abstract

This work investigates a data-driven approach for detecting structural damage in the wing of a Cessna 172 aircraft using reduced-order finite element (FE) models. This study focuses on the ability of machine learning methods to generalize across different structural conditions, aiming to support reliable Structural Health Monitoring (SHM) in aeronautical applications. The wing was first modeled in detail using the Finite Element Method, followed by the development of a simplified FE model to reduce computational cost while maintaining accuracy. The similarity between the two models was evaluated through modal analysis and the Modal Assurance Criterion (MAC). Dynamic excitation representing turbulence effects was applied to simulate healthy and damaged conditions, producing acceleration data used to train one-dimensional and two-dimensional neural network classifiers. The 1D models processed raw vibration signals, while the 2D models used image representations of the same data. Both architectures were tested against results from the detailed FE model to assess their generalization capability. The 2D networks achieved higher classification accuracy, demonstrating improved robustness in identifying both minor and severe damage. The findings highlight the potential of combining reduced FE models with data-driven methods for efficient and accurate aircraft wing damage detection.

Keywords:

structural health monitoring (SHM); FEM; machine learning (ML); artificial neural networks (ANNs); deep neural networks (DNNs); convolutional neural networks (CNNs); modal assurance criterion (MAC)

1. Introduction

In recent years, the aviation industry has increasingly focused on technologies that reduce environmental impact and operational costs. One of the most effective ways to achieve this goal is by minimizing maintenance time and frequency, which can lead to lower overall transportation costs and increased aircraft availability. Within this direction, Structural Health Monitoring (SHM) has emerged as a key technological approach aimed at assessing the condition of aircraft structures in service.

SHM is defined as a continuous or periodic automated process used to monitor the condition of a component, following the principles of condition-based maintenance described in DIN ISO 17359 [1,2]. The method relies on permanently installed or embedded sensors that record physical responses of the structure, such as accelerations, strains, displacements, temperature variations, or acoustic/ultrasonic signals, depending on the application and the type of damage expected [1,2,3,4,5,6,7]. The objective is to detect damage at an early stage, enabling timely intervention and preventing critical failures. A commonly used analogy to illustrate SHM is the human skin, where pain receptors provide rapid and localized feedback about external loads and potential injury.

A crucial aspect of SHM is the processing and interpretation of sensor measurements. These signals can be analyzed using a wide range of techniques, from conventional statistical indicators such as RMS, kurtosis, and spectral entropy [8] to more advanced approaches involving Machine Learning (ML) and Deep Learning (DL) [9]. Such methods can identify hidden patterns and irregularities in structural response signals, enabling not only damage detection but also localization and classification [10]. In some cases, hybrid strategies combine physics-based models, such as finite element simulations, with data-driven algorithms to enhance accuracy and reliability [11].

A complete SHM system consists of sensors, data acquisition and transmission units, storage infrastructure, and diagnostic software. Its output may range from basic stress monitoring to precise assessment of damage type, location, and severity, as well as its effect on structural integrity and safety [1].

In aeronautics, SHM is particularly important for critical load-bearing components such as landing gear, wings, tail structures, spars, and rotary wing systems [12,13]. These elements experience continuous stress due to aerodynamic forces, vibrations, and thermal fluctuations, and therefore require reliable monitoring to ensure flight safety. Traditional scheduled inspections, although effective, can miss early damage and lead to unnecessary downtime. In contrast, SHM enables real-time assessment of structural conditions, reducing uncertainty and inspection time while improving safety and maintenance efficiency [13].

The implementation of data-driven predictors trained on numerical simulations is a cornerstone of modern Operational Load Monitoring (OLM). This methodology is critical for maximizing the in-service life of aerospace components while ensuring structural integrity. For instance, Dziendzikowski et al. [14] demonstrated how OLM systems can map strain sensor measurements to fatigue damage accumulation in safety-critical components, such as the landing gear attachment frames of military aircraft. Furthermore, hybrid strategies utilize high-fidelity Finite Element (FE) models to simulate complex damage scenarios that are difficult to replicate experimentally. As shown in the comparison of various machine learning algorithms for state prediction in OLM [15], these model-based training approaches allow for the development of robust classifiers capable of operating under variable load conditions, forming the theoretical basis for the dual-model investigation presented in this study.

In this context, the present work aims to develop and evaluate a data-driven Structural Health Monitoring methodology for an aircraft wing using reduced-order finite element (FE) models. This study investigates whether simplified FE representations can reliably support damage detection when combined with neural network-based classification. To this end, acceleration responses from healthy and damaged wing configurations are generated and used to train two distinct deep-learning architectures: a one-dimensional convolutional neural network operating directly on raw acceleration signals, and a two-dimensional convolutional neural network trained on image-formatted representations of the same data. By comparing their performance and generalization capability against results from a high-fidelity analytical FE model, the work provides a clear assessment of the benefits and limitations of using reduced models for efficient, accurate, and computationally affordable damage detection. The goal is to demonstrate how the combination of simplified physics-based modeling and machine learning can lead to reliable SHM tools for aeronautical structures.

2. Materials and Methods

2.1. Wing Configuration and Material

Detailed geometric data for the Cessna 172 wing, including key design parameters and internal structural schematics, are widely available in the literature. The Cessna 172 was selected due to its global popularity as a training and light aircraft, which ensures abundant technical documentation and reliable reference data. This availability supports the accurate reconstruction and analysis of the wing geometry.

Based on the collected data, the primary geometric characteristics were identified, including wingspan, wing area, mean aerodynamic chord, and the wing airfoil profile [16]. Additionally, the surface areas of the ailerons and flaps were recorded, as these components directly influence aerodynamic behavior and are essential for both aerodynamic and structural analysis. Table 1 lists the geometric characteristics used.

The wing model was developed using a surface-modeling approach in which all components were generated as surfaces rather than solid bodies. This method facilitates geometry cleanup during preprocessing in ANSA [17] and minimizes intersection issues. A mid-surface extraction technique was applied to ensure a clean, consistent, and easily manageable model.

The wing model includes eleven ribs, a main spar carrying the primary bending loads, a secondary spar connected to the flaps, a rear spar supporting the ailerons, stringers for additional stiffness, and the external aluminum skin. All structural components were modeled using aluminum alloy 2024-T3. The mechanical properties for the Al 2024-T3 alloy used in the finite element simulations were defined with a Young’s modulus of 73.1 GPa, a Poisson’s ratio of 0.33, and a mass density of 2780 kg/m³. Movable surfaces such as flaps and ailerons were excluded to avoid unnecessary computational cost and modeling complexity, as their representation is not required for the structural objectives of this study. The resulting model contains only the fixed structural components, providing an accurate and efficient representation of the wing’s mechanical behavior. The model is illustrated in Figure 1.

2.2. Analytical and Simplified Model Development

The wing geometry was imported in STEP format to initiate the finite element model setup. Individual Property IDs (PID) were assigned to each component to reflect their respective thicknesses: the spars were modeled with a thickness of 3 mm, the ribs and stringers with 2 mm, and the wing skin with 1.5 mm. All parts were defined using aluminum alloy 2024-T3.

The mesh was generated using 8 mm quad shell elements to achieve an optimal balance between computational efficiency and model accuracy, given the overall size of the wing. The structural assembly was performed within ANSA following standard aerospace modeling practices. The wing structure includes approximately 2100 rivet connections, which were represented using RBE2 elements to accurately simulate load transfer between components.

Boundary conditions were applied by defining two support points near the wing root to represent fuselage attachment locations. These points were constrained with SPC1 boundary conditions restricting all six degrees of freedom and connected to the main and secondary spars using RBE2 elements to ensure proper load introduction into the structure. The meshed analytical model is shown in Figure 2, and the boundary conditions and the connection are represented in Figure 3.

For the simplified wing model, the original wing geometry from the detailed model was retained, with the only modification being an increase in the mesh size from 8 mm quad elements to 20 mm quad elements. The detailed model consisted of 404,152 finite elements, whereas the simplified model included 84,966 elements, a reduction by a factor of about 4.8. This reduction significantly decreased computational cost, making the simplified model suitable for the training and validation of neural networks. Subsequent comparison of mode shapes demonstrated high correlation, confirming the suitability of this simplified configuration. The meshed simplified model is illustrated in Figure 4.

To evaluate the reliability of the simplified model relative to the detailed wing model, a comparison of their eigenmodes was carried out using SOL 103 modal analysis. The correlation assessment was performed using the Modal Assurance Criterion (MAC), a widely adopted method in structural dynamics for quantifying the similarity between mode shapes. MAC values range from 0 to 1, where values close to 1 indicate a strong match between modal vectors. In practice, MAC values above 0.9 denote excellent agreement, while values between 0.8 and 0.9 indicate acceptable similarity with minor deviations [18,19].
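For reference, the MAC between two real-valued mode shapes φ_a and φ_b is commonly written as |φ_aᵀφ_b|² / ((φ_aᵀφ_a)(φ_bᵀφ_b)). The following is a minimal sketch of that formula, not the authors' implementation. It assumes the mode shapes of both models have already been sampled at a common set of degrees of freedom (the two meshes differ, so some mapping step is needed), and the array names and the toy mode shapes are purely illustrative.

```python
import numpy as np


def mac_matrix(phi_a: np.ndarray, phi_b: np.ndarray) -> np.ndarray:
    """MAC between every column of phi_a and every column of phi_b.

    Rows are shared degrees of freedom, columns are (real-valued) modes.
    """
    cross = np.abs(phi_a.T @ phi_b) ** 2          # |phi_a^T phi_b|^2
    norm_a = np.sum(phi_a * phi_a, axis=0)        # phi_a^T phi_a per mode
    norm_b = np.sum(phi_b * phi_b, axis=0)        # phi_b^T phi_b per mode
    return cross / np.outer(norm_a, norm_b)


# Hypothetical data: three mutually orthogonal "modes" sampled at four DOFs.
phi_detailed = np.array([[1.0, 1.0, 1.0],
                         [1.0, -1.0, 1.0],
                         [1.0, 1.0, -1.0],
                         [1.0, -1.0, -1.0]])
# Same shapes with a different scale and sign; MAC is insensitive to both.
phi_simplified = -2.0 * phi_detailed

mac = mac_matrix(phi_detailed, phi_simplified)
print(np.round(mac, 3))
```

Because the MAC is normalized, a rescaled or sign-flipped copy of a mode still scores 1 on the diagonal, which is why it suits comparisons between models with different meshes and normalization conventions.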

In this study, the MAC results demonstrate a high level of correlation between the simplified and detailed models across all critical modes, confirming that the simplified model accurately reproduces the dynamic characteristics of the full-detail wing despite its significantly lower element count. This validates its suitability for subsequent dynamic analyses and for generating the training dataset for neural network-based modeling.

Figure 5 presents the MAC matrix for the two models. The dominant diagonal values approaching unity indicate strong correspondence between the respective mode shapes, while lower off-diagonal values reflect minor modal interactions without affecting the primary matching. Overall, the simplified model captures the essential modal behavior of the detailed wing geometry and offers a computationally efficient alternative for the purposes of this work.

2.3. Dynamic Loading of the Aircraft Wing

A dynamic load case for the simplified aircraft wing was developed to simulate atmospheric turbulence. The primary aerodynamic forces acting on the wing are lift and drag, expressed by

$$L = \frac{1}{2} \rho V^{2} S C_{L}$$ (1)

$$D = \frac{1}{2} \rho V^{2} S C_{D}$$ (2)

where $\rho$ is the air density, $V$ the flight velocity, $S$ the wing reference area, and $C_{L}$, $C_{D}$ the lift and drag coefficients. The parameter values were taken from the literature [16]. A cruising velocity of 124 kt (≈64 m/s) and a wing area of 8.1 m² were assumed. CFD analysis for angles of attack of 2–4°, representing steady level flight, yielded $C_{L} = 0.4$ and $C_{D} = 0.0319$. The air density at 10,000 ft was taken as $\rho \approx 1\ \mathrm{kg/m^{3}}$. Substituting these values into Equations (1) and (2) gives $L = 6635.52$ N and $D = 529.18$ N.
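The quoted force values can be reproduced directly from Equations (1) and (2). The short check below uses only the numbers stated in the text; the function and variable names are illustrative.

```python
def aero_force(rho: float, velocity: float, area: float, coeff: float) -> float:
    """Generic aerodynamic force 0.5 * rho * V^2 * S * C (Equations (1) and (2))."""
    return 0.5 * rho * velocity ** 2 * area * coeff


RHO = 1.0      # air density at 10,000 ft, kg/m^3 (value used in the text)
V = 64.0       # cruise speed, m/s (about 124 kt)
S = 8.1        # wing reference area used in the text, m^2
C_L = 0.4      # lift coefficient
C_D = 0.0319   # drag coefficient

lift = aero_force(RHO, V, S, C_L)
drag = aero_force(RHO, V, S, C_D)
print(f"L = {lift:.2f} N, D = {drag:.2f} N")
```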

Drag was applied uniformly along the leading edge. Lift was assumed to follow an elliptical distribution along the span, consistent with classical aerodynamics:

$$L(y) = \frac{4 L_{\mathrm{right}}}{\pi b_{\mathrm{half}}} \sqrt{1 - \left(\frac{y}{b_{\mathrm{half}}}\right)^{2}}$$ (3)

where $L(y)$ is the lift per unit span at position $y$ (N/m), $L_{\mathrm{right}}$ is the total lift on the right wing half (N), $b_{\mathrm{half}}$ is the wing half-span (m), and $y$ is the spanwise coordinate measured from the root (m). Applying Equation (3) gives the lift distribution along the right wing shown in Figure 6.
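A quick numerical check confirms that the elliptical distribution of Equation (3) integrates to the total lift on the half-wing. The sketch assumes that the 6635.52 N computed earlier is the lift carried by the right wing half and uses the 5.5 m half-span quoted in the text; the midpoint-rule integration and all names are illustrative.

```python
import math


def elliptical_lift(y: float, total_lift: float, half_span: float) -> float:
    """Lift per unit span (N/m) at spanwise position y, Equation (3)."""
    peak = 4.0 * total_lift / (math.pi * half_span)
    return peak * math.sqrt(1.0 - (y / half_span) ** 2)


L_RIGHT = 6635.52   # N, assumed to be the lift on the right wing half
B_HALF = 5.5        # m, wing half-span

# Midpoint-rule integration of L(y) from the root (y = 0) to the tip.
n_strips = 20_000
dy = B_HALF / n_strips
integrated_lift = sum(
    elliptical_lift((i + 0.5) * dy, L_RIGHT, B_HALF) for i in range(n_strips)
) * dy

root_loading = elliptical_lift(0.0, L_RIGHT, B_HALF)
print(f"root loading = {root_loading:.1f} N/m, integral = {integrated_lift:.1f} N")
```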

The spanwise center of pressure of this lift distribution is given by

$$y_{\mathrm{cp}} = \frac{4}{3\pi} b_{\mathrm{half}}$$ (4)

where $y_{\mathrm{cp}}$ is the center-of-pressure location along the half-span (m) and $b_{\mathrm{half}}$ is the wing half-span (m). For $b_{\mathrm{half}} = 5.5$ m, Equation (4) places the center of pressure at 2.33 m from the root.
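Equation (4) follows from taking the first moment of the elliptical distribution in Equation (3) about the root. The derivation below is a sketch of that step, writing $b$ for the half-span and $L_{\mathrm{right}}$ for the total lift on the half-wing.

```latex
\begin{aligned}
y_{\mathrm{cp}}
  &= \frac{\int_{0}^{b} y\,L(y)\,dy}{\int_{0}^{b} L(y)\,dy}
   = \frac{1}{L_{\mathrm{right}}}\cdot\frac{4 L_{\mathrm{right}}}{\pi b}
     \int_{0}^{b} y\sqrt{1-\frac{y^{2}}{b^{2}}}\,dy \\
  &= \frac{4}{\pi b}\cdot\frac{b^{2}}{2}\int_{0}^{1}\sqrt{u}\,du
     \qquad \left(u = 1-\frac{y^{2}}{b^{2}}\right) \\
  &= \frac{4}{\pi b}\cdot\frac{b^{2}}{3}
   = \frac{4}{3\pi}\,b \approx 0.424\,b .
\end{aligned}
```

With $b = 5.5$ m this gives $y_{\mathrm{cp}} \approx 2.33$ m.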

To simulate atmospheric turbulence, a time-varying aerodynamic load was generated and applied to the wing model over 10 s. A white-noise signal was first produced in MATLAB R2020b to represent random atmospheric disturbances and then filtered using a 0.01–10 Hz band-pass filter to retain only the frequency content associated with realistic turbulence [20], as shown in Figure 7.
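The band-limited excitation can be sketched as follows. The filter type used in MATLAB is not stated, so this example swaps in a simple FFT mask (an ideal brick-wall band-pass) rather than reproducing the original filter. The 1024 Hz rate is inferred from the 10,240 timesteps over 10 s mentioned later in this section; the seed and variable names are arbitrary.

```python
import numpy as np

FS = 1024.0                 # Hz, inferred from 10,240 samples over 10 s
DURATION = 10.0             # s
F_LOW, F_HIGH = 0.01, 10.0  # Hz, pass band stated in the text

rng = np.random.default_rng(seed=0)   # illustrative seed
n_samples = int(FS * DURATION)
white_noise = rng.standard_normal(n_samples)

# Zero every FFT bin outside the pass band, then transform back.
spectrum = np.fft.rfft(white_noise)
freqs = np.fft.rfftfreq(n_samples, d=1.0 / FS)
in_band = (freqs >= F_LOW) & (freqs <= F_HIGH)
spectrum[~in_band] = 0.0
turbulence = np.fft.irfft(spectrum, n=n_samples)

print(turbulence.shape, int(in_band.sum()), "bins kept")
```

A brick-wall mask is convenient for a finite, offline record like this one; a causal IIR or FIR design would give a different transition band and phase response.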

To ensure that the sampling rate and the total number of timesteps were sufficient for reliable damage detection, two main criteria were adopted:

Nyquist–Shannon Criterion and Physical Relevance: The turbulence excitation was bandpass filtered between 0.01 and 10 Hz to represent realistic atmospheric conditions. The final sampling rate of 204.8 Hz is more than 20 times the highest frequency of interest (10 Hz), which effectively prevents aliasing and ensures a high-fidelity representation of the wing’s low-frequency vibration modes.
Frequency Resolution: A 10 s signal duration was selected to achieve a frequency resolution of 0.1 Hz. This high resolution is critical for the CNN architectures to detect the subtle shifts in spectral signatures caused by structural damage, such as spar cracks or fastener loosening, which might otherwise be lost with shorter sampling windows.

The filtered signal was used to dynamically modulate the lift force over time, instead of applying a constant value, allowing the model to capture fluctuating aerodynamic effects. Drag was assumed to remain constant, since its variation under moderate turbulence is significantly smaller. To reduce computational cost while preserving the essential dynamic characteristics, the filtered signal (10,240 timesteps) was then subsampled, resulting in a compact time series (2048 timesteps) suitable for finite element simulation and neural network dataset creation, as shown in Figure 8.
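The sampling figures quoted above are mutually consistent, as the arithmetic below shows. Plain striding is used here to illustrate the subsampling step; whether the original workflow used striding or an anti-aliased decimation routine is not stated, so that choice is an assumption.

```python
DURATION_S = 10.0      # signal length, s
N_FILTERED = 10_240    # timesteps of the filtered signal
N_SUBSAMPLED = 2_048   # timesteps after subsampling
F_MAX_HZ = 10.0        # upper edge of the turbulence band

fs_filtered = N_FILTERED / DURATION_S         # 1024 Hz before subsampling
fs_final = N_SUBSAMPLED / DURATION_S          # 204.8 Hz after subsampling
decimation = N_FILTERED // N_SUBSAMPLED       # keep every 5th sample
oversampling = fs_final / F_MAX_HZ            # ratio to highest frequency of interest
nyquist_hz = fs_final / 2.0                   # highest representable frequency
freq_resolution_hz = 1.0 / DURATION_S         # spacing of FFT bins for a 10 s record

# Stand-in for the filtered signal: subsample by plain striding.
filtered = list(range(N_FILTERED))
subsampled = filtered[::decimation]

print(fs_final, decimation, oversampling, nyquist_hz, freq_resolution_hz)
```

Since the excitation is already band-limited to 10 Hz, well below the 102.4 Hz Nyquist frequency of the subsampled signal, striding does not introduce aliasing of the excitation itself.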

This process enabled the generation of a realistic and computationally efficient turbulence-induced loading scenario for the wing. Table 2 summarizes the loading scenario applied to the wing model, including the lift and drag magnitudes, their application locations, and the temporal characteristics of the turbulence-induced excitation.

2.4. Dataset Generation

Following the development of the simplified wing model and the definition of the dynamic loading conditions, the subsequent phase focused on generating the dataset required for training and evaluating the neural networks. To emulate a structural-health-monitoring configuration, 15 nodes on the wing were selected to function as virtual sensors. At each of these locations, the vertical (y-axis) acceleration response was recorded during the dynamic simulation, resulting in time-series data that represent the vibration behavior of the wing under aerodynamic excitation. The definition of the virtual acceleration sensor channels, consisting of 15 nodes on the simplified wing model, is illustrated in Figure 9.

For each sensor, an $n \times t$ matrix was constructed, where the $n$ rows correspond to the responses obtained from the individual simulations, and the $t$ columns represent the time samples of each response signal. After all dynamic analyses were completed, each matrix was normalized with respect to the maximum absolute value of its corresponding channel. This procedure produced 15 normalized matrices, each containing the time-domain responses of the selected sensors across all $n$ simulations. For the purpose of signal processing, the channels were organized according to the schematic layout illustrated in Figure 10.
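The per-channel normalization can be sketched as below, assuming the responses are stacked in a single array of shape (runs, timesteps, sensors). The synthetic data and the small number of runs are placeholders chosen only to keep the example light.

```python
import numpy as np

rng = np.random.default_rng(0)
n_runs, n_steps, n_sensors = 6, 2048, 15   # small n_runs for illustration only

# Synthetic responses with a different amplitude scale on each sensor channel.
channel_scale = rng.uniform(0.1, 50.0, size=n_sensors)
responses = rng.normal(size=(n_runs, n_steps, n_sensors)) * channel_scale

# Maximum absolute value per channel, taken over all runs and timesteps.
channel_peak = np.max(np.abs(responses), axis=(0, 1), keepdims=True)
normalized = responses / channel_peak

print(normalized.shape, np.max(np.abs(normalized), axis=(0, 1)))
```

Dividing by one peak per channel preserves the relative amplitudes between runs on the same sensor, which matters when the healthy and damaged responses are later compared.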

To establish a baseline response for subsequent damage assessment, the simplified wing model was first examined in its healthy (undamaged) configuration. The aerodynamic loading scenario defined in the previous section was applied as follows:

The drag load was introduced as a static distribution along the leading edge, applied at 11 nodes aligned with the wing ribs.
The time-varying lift load was applied at 2.33 m from the wing root along the main spar.

Figure 11 illustrates the wing loading induced by turbulence, showing both the static drag forces applied to the ribs (Figure 11a) and the transient lift load on the main spar (Figure 11b).

To assess the system’s ability to detect structural deterioration, two damage scenarios were introduced in addition to the healthy configuration:

Damage 1 (Severe—Spar Crack): Real-world fatigue cracks create a physical separation of material that prevents the transfer of tensile and shear stresses across the fracture surface. This was modeled using node decoupling (releasing meshed lines), which creates a geometric discontinuity in the mesh. This approach accurately captures the localized reduction in the structural stiffness matrix and the resulting shift in natural frequencies, which is physically representative of a breathing crack in a primary structural member like a wing spar.
Damage 2 (Moderate—Fastener Failure): Fasteners in aerospace structures are the primary load-transfer mechanism between the skin and the spar. A failure or loss of integrity in these joints results in a discrete loss of connectivity. This was simulated by deleting the RBE2 (Rigid Body) elements that define the rivet connections. This method accurately represents the loss of a discrete load path, leading to localized changes in structural damping and stiffness, which is physically representative of joint degradation in metallic wing assemblies.

The structural modifications for the severe and moderate damage scenarios are illustrated in Figure 12a and Figure 12b, respectively.

Accordingly, three structural states were considered:

Healthy;
Damage 1—crack in main spar;
Damage 2—removed fastener connections.

To generate the vibration-response data required for training and evaluating the neural networks, the simplified wing model was subjected to dynamic simulations under prescribed aerodynamic excitation. SOL 112 (Modal Transient Response) analyses were conducted to obtain acceleration time histories, enabling the emulation of a structural health-monitoring scenario.

To efficiently manage the large number of simulation cases and ensure consistent data formatting, the analysis workflow was fully automated using MATLAB R2022b. The automation procedure, illustrated in Figure 13, generated the excitation files, assigned structural health conditions (Healthy, Damage 1, Damage 2), launched the Nastran solver, and extracted the nodal accelerations from the HDF5 output. Responses were subsequently normalized with respect to their maximum absolute amplitude and exported in .npy format.

Three datasets were produced, corresponding to the healthy and damaged configurations. Each dataset is structured as a 3D tensor of dimensions $(1200 \times 2048 \times 15)$, representing 1200 simulations, 2048 time samples, and 15 sensor channels, respectively.

2.5. Data Pruning and Preprocessing Strategy

To optimize model generalization and training efficiency, a data-pruning procedure was implemented prior to neural network training. Although the initial objective was to use the full dataset, the large number of channels (sensors) and time samples, coupled with the presence of regions with negligible signal variation between healthy and damaged states, led to severe overfitting tendencies. The network tended to memorize low-information patterns rather than learning meaningful discriminatory features. Therefore, a feature-selection and temporal reduction strategy was applied to retain only the most informative data [21].

To quantify the sensitivity of each sensor to structural damage, a comparative analysis was performed between healthy and damaged responses. Specifically, the absolute difference between the healthy signal $H_{r,t,s}$ and each damaged scenario $D_{r,t,s}^{(\cdot)}$ was computed across all runs $R$, timesteps $T$, and sensors $S$:

$$\Delta_{r,t,s}^{(\cdot)} = \left| H_{r,t,s} - D_{r,t,s}^{(\cdot)} \right|$$ (5)

The mean absolute difference for each sensor was then evaluated as

$$\overline{\Delta}_{s}^{(\cdot)} = \frac{1}{R \cdot T} \sum_{r=1}^{R} \sum_{t=1}^{T} \Delta_{r,t,s}^{(\cdot)}$$ (6)

These mean values formed a sensor-sensitivity bar diagram (Figure 14), providing a direct measure of each sensor’s ability to distinguish healthy from damaged conditions. Based on this criterion, the five most discriminative sensors (4, 7, 9, 14, 15) were selected for subsequent modeling, effectively reducing noise and improving learning stability.
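The sensitivity ranking of Equations (5) and (6) reduces to a mean over the run and time axes. The sketch below assumes healthy and damaged arrays of shape (runs, timesteps, sensors) in which run r of each set shares the same excitation, as the paired indices in Equation (5) imply. The synthetic data are constructed so that the sensors named in the text come out on top; they are not simulation results.

```python
import numpy as np


def sensor_sensitivity(healthy: np.ndarray, damaged: np.ndarray) -> np.ndarray:
    """Mean absolute healthy-vs-damaged difference per sensor (Equations (5)-(6))."""
    return np.mean(np.abs(healthy - damaged), axis=(0, 1))


rng = np.random.default_rng(1)
n_runs, n_steps, n_sensors = 20, 256, 15      # reduced sizes for illustration
healthy = rng.normal(size=(n_runs, n_steps, n_sensors))

# Synthetic "damage": shift only the channels the text reports as informative.
informative = np.array([4, 7, 9, 14, 15])     # 1-based sensor numbers
damaged = healthy.copy()
damaged[:, :, informative - 1] += 0.5

scores = sensor_sensitivity(healthy, damaged)
top5 = np.sort(np.argsort(scores)[-5:] + 1)   # back to 1-based numbering
print(top5.tolist())
```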

In addition to spatial pruning, temporal reduction was also applied [22]. A heat-map analysis of the average damage-induced deviation per timestep and sensor revealed that the most informative region lies within the first ~400 timesteps, where the excitation response is strongest and damage effects are most pronounced. Beyond this region, damping dominates, and the signal-to-noise ratio decreases markedly. Thus, only the first 400 timesteps were retained (Figure 15), further reducing data dimensionality while preserving discriminative content.

Consequently, the original dataset of dimensions $(1200 \times 2048 \times 15)$ was reduced to $(1200 \times 400 \times 5)$, achieving a substantial reduction in computational cost and mitigating overfitting risk. This refined dataset retains the essential structural health information needed for robust classification of healthy and damaged states, ensuring improved model accuracy and generalization.
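In array terms, the pruning amounts to a single slicing operation. The sketch below uses the sensor numbers and the 400-step window given in the text; the array holds only 12 runs to keep the example small, and the names are illustrative.

```python
import numpy as np

SELECTED_SENSORS = [4, 7, 9, 14, 15]   # 1-based sensor numbers from the text
N_KEEP = 400                           # retained timesteps


def prune(dataset: np.ndarray) -> np.ndarray:
    """Keep the first N_KEEP timesteps and the selected sensor channels."""
    columns = [s - 1 for s in SELECTED_SENSORS]   # convert to 0-based indices
    return dataset[:, :N_KEEP, columns]


# Placeholder array: 12 runs instead of the 1200 used in the study.
full = np.zeros((12, 2048, 15), dtype=np.float32)
pruned = prune(full)

# Per-sample size reduction: (2048 * 15) values down to (400 * 5).
reduction = (2048 * 15) / (N_KEEP * len(SELECTED_SENSORS))
print(pruned.shape, reduction)
```

Each sample therefore shrinks from 30,720 values to 2,000, a factor of 15.36.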

2.6. Binary Damage Classification Using 1D and 2D CNNs

Following preprocessing, two types of neural network architectures were trained and evaluated: one-dimensional convolutional neural networks (1D-CNNs) operating directly on the filtered time-series signals [22,23], and two-dimensional convolutional neural networks (2D-CNNs) [24,25] trained on image-formatted representations of the acceleration data [26]. This dual approach enabled a comparative assessment of temporal and image-based learning strategies for structural state classification. To facilitate supervised learning, the dataset generated from the simplified FE model was partitioned into training and validation sets using a stratified 70/30 split. Specifically, of the 1200 samples generated per structural state (Healthy, Damage 1, and Damage 2), 840 were allocated for training and 360 for validation. This split was chosen to provide a sufficiently diverse training base to capture the stochastic nature of the excitation while maintaining a robust validation set for hyperparameter tuning and early stopping.
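A stratified 70/30 split of a balanced binary dataset can be sketched with the standard library alone. The function below is an illustrative stand-in, not the authors' code; it shuffles indices within each class and cuts each class at 70%, which reproduces the 840/360 per-class counts quoted above.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_indices, val_indices) with the class ratio preserved."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                       # shuffle within the class
        n_train = round(train_fraction * len(indices))
        train_idx.extend(indices[:n_train])
        val_idx.extend(indices[n_train:])
    return train_idx, val_idx


# One binary task: 1200 healthy (label 0) and 1200 damaged (label 1) samples.
labels = [0] * 1200 + [1] * 1200
train_idx, val_idx = stratified_split(labels)

train_healthy = sum(1 for i in train_idx if labels[i] == 0)
print(len(train_idx), len(val_idx), train_healthy)
```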

The selection of the CNN architectures followed a structured grid search process, where various hyperparameters were iteratively evaluated to ensure stable convergence and high classification performance. This procedure allowed for the identification of optimized layer configurations tailored to the specific temporal features of the wing’s vibration data. Although the current models achieved robust generalization, more advanced techniques such as Genetic Algorithms [27] for architecture selection or Broad Bayesian Learning (BBL) [28] for optimized configuration represent the state of the art in systematic network refinement. These methodologies provide a promising framework for further enhancing the robustness of SHM systems in future research.

2.6.1. Training for Wing_Spar_Damage_1D_CNN

The 1D-CNN was trained using the preprocessed dataset, consisting of five selected sensors and 400 time steps per sample, with 1200 healthy and 1200 damaged simulations. The data were normalized per sensor and assigned binary labels (0 = healthy, 1 = damaged). Training was conducted for up to 100 epochs with a batch size of 64, employing the Adam optimizer with a learning rate of 0.0001 (1 × 10⁻⁴). The learning rate and batch size were selected through iterative trial-and-error to ensure stable convergence and minimize binary cross-entropy loss. Early stopping (patience = 20) was employed to avoid overfitting. To ensure robustness, the training process was repeated five times with different random weight initializations, and the dataset was split into 70% for training and 30% for validation while maintaining balanced class distribution. The convolutional layer utilized a kernel size of 9 and a stride of 1, followed by ReLU activation for hidden layers and a Sigmoid function for the output layer to facilitate binary classification. The network architecture and parameters are summarized in Figure 16. As illustrated in Figure 17, the Wing_Spar_Damage_1D_CNN model achieved stable accuracy and loss results for both the training (Figure 17a) and validation (Figure 17b) sets.

The results indicate that the model achieved 100% accuracy on both the training and validation sets.
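The early-stopping rule with a patience of 20 can be illustrated independently of any deep-learning framework. The sketch below is a simplified version of patience-based stopping (no minimum-improvement threshold, no weight restoration); the validation-loss sequence is synthetic and the function name is hypothetical.

```python
def early_stopping_epoch(val_losses, patience=20):
    """Return (stop_epoch, best_epoch), both 0-based.

    stop_epoch is None if training runs through every epoch without
    `patience` consecutive epochs lacking improvement.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Synthetic curve: improves for 30 epochs, then stays flat at a worse value.
val_losses = [1.0 / (epoch + 1) for epoch in range(30)] + [0.05] * 70
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=20)
print(stop_epoch, best_epoch)
```

In this synthetic case training halts 20 epochs after the last improvement, well before the 100-epoch cap.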

2.6.2. Training for Wing_Rivet_Damage_1D_CNN

The healthy vs. Damage 2 (H–D2) model was trained using the same framework as the healthy vs. Damage 1 (H–D1) model of Section 2.6.1, with 2400 preprocessed samples (1200 healthy and 1200 rivet-damage cases), normalized per sensor and labeled for binary classification. A 1D-CNN was trained using Adam and binary cross-entropy with early stopping, repeated five times with a 70/30 train–validation split. The H–D2 architecture was expanded to include two convolutional layers with a smaller kernel size of 3 and a stride of 1, incorporating Batch Normalization and L2 regularization (1 × 10⁻⁴) to enhance feature extraction and prevent overfitting in the more complex rivet-damage scenario. ReLU activation was applied to all hidden layers. Learning curves were used to evaluate performance, and the final model was exported in HDF5 format. The hyperparameters, including the 0.0001 (1 × 10⁻⁴) learning rate and batch size of 64, were kept consistent with H–D1 to maintain a controlled comparison between the two damage detection tasks. After defining the Wing_Rivet_Damage_1D_CNN network architecture (Figure 18), the model was trained and evaluated. The resulting training and validation performance metrics are illustrated in Figure 19.

The model achieved 93% accuracy on both the training and validation sets, indicating consistent performance without evident overfitting.

2.6.3. Training of 2D Convolutional Neural Networks

To enable 2D-CNN training, the preprocessed time-series data (1200 healthy and 1200 damaged samples, 5 sensors, 400 time steps) were converted into grayscale images. Each sample was reshaped into a $5 \times 400$ matrix, where rows represent sensors and columns represent time.

To evaluate the effect of input resolution on model performance, the original $5 \times 400$ images were resized to $64 \times 64$ and $32 \times 128$ pixels, ensuring compatibility with standard CNN architectures while reducing computational cost.

Conversion steps:

Load healthy/damaged signals;
Keep 5 sensors and first 400 time steps;
Apply z-score normalization per sensor, using the global mean (μ) and standard deviation (σ) calculated across the entire training dataset to ensure consistent feature scaling;
Reshape to $5 \times 400$ (sensor × time);
Resize to $5 \times 400$ , $64 \times 64$ , and $32 \times 128$ ;
Scale to $[0, 255]$ using a sample-wise linear min–max mapping rule, chosen to preserve the physical proportionality of the vibration amplitudes within each image, and save as grayscale PNG;
Organize into class directories (Healthy, Damage 1, Damage 2).

This process produced structured image datasets for comparative 2D-CNN training across three input resolutions, maintaining the integrity of the structural response energy through linear intensity mapping. Figure 20 provides a visual comparison of the various image formats utilized for the 2D-CNN input channels.
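The normalization and intensity-mapping steps of the conversion can be sketched as follows. The example covers z-scoring with training-set statistics, the reshape to sensor × time, and the sample-wise min–max mapping to 8-bit gray levels; resizing and PNG export are omitted because the image library used is not stated. The random input and all names are placeholders.

```python
import numpy as np


def to_grayscale_image(sample: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Convert one (time, sensor) sample into a uint8 image of shape (sensor, time)."""
    z_scored = (sample - mean) / std            # per-sensor z-score
    image = z_scored.T                          # rows = sensors, columns = time
    lo, hi = image.min(), image.max()
    if hi == lo:                                # constant sample: avoid division by zero
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = (image - lo) / (hi - lo) * 255.0   # sample-wise linear min-max mapping
    return np.round(scaled).astype(np.uint8)


rng = np.random.default_rng(2)
train_signals = rng.normal(size=(8, 400, 5))    # placeholder for the training set

# Global per-sensor statistics computed over the whole training set.
mu = train_signals.mean(axis=(0, 1))
sigma = train_signals.std(axis=(0, 1))

image = to_grayscale_image(train_signals[0], mu, sigma)
print(image.shape, image.dtype, image.min(), image.max())
```

Because the min–max mapping is applied per sample, every image spans the full 0 to 255 range, so absolute amplitude differences between samples are not encoded in the pixel intensities.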

2.6.4. Training for Wing_Damage_2D_CNN_5_400

For both the spar-damage and rivet-damage cases, the 5 × 400 pixel images generated from the five most informative sensors (1200 healthy and 1200 damaged samples per scenario, normalized to [0, 1]) were used to train individual 2D-CNN models. The network structure included convolutional and pooling stages with regularization, followed by a dense classification layer with sigmoid activation for binary output (see Figure 21 for architecture and hyperparameters). Specifically, for the spar-damage case (H–D1), the model utilized two convolutional layers with a kernel size of (1, 15). For the rivet-damage case (H–D2), the kernel size was adjusted to (1, 5) to better capture localized signal variations. In both architectures, a stride of (1, 1) was maintained, and ReLU activation was applied to all hidden layers to facilitate non-linear feature mapping.

Training was performed with binary cross-entropy loss and the Adam optimizer for up to 100 epochs (batch size 64), applying early stopping to prevent overfitting. The learning rate was set to 0.001 for H–D1 and 0.0003 for H–D2, with values selected through an iterative tuning process to achieve a balance between convergence speed and stability. Additionally, the H–D2 training incorporated a ReduceLROnPlateau callback (factor = 0.5, patience = 3) to refine the learning process as the validation loss plateaued. A stratified 70/30 train–validation split was used, and each experiment was repeated across 10 runs with different random seeds to account for training stochasticity. Model performance was evaluated through accuracy and loss curves for both scenarios, enabling assessment of convergence behavior and robustness. In all cases, the best-performing models were saved in HDF5 format for subsequent testing on the high-fidelity wing model. The performance of the Wing_Spar_Damage_2D_CNN_5_400 model and the Wing_Rivet_Damage_2D_CNN_5_400 model is illustrated through their respective accuracy and loss curves in Figure 22 and Figure 23.

For the spar-damage model (Figure 22), the network attained 100% accuracy on the training set and 98% accuracy on the validation set, demonstrating strong classification performance.

For the rivet-damage model (Figure 23), the network achieved 94% training accuracy and 93% validation accuracy.

2.6.5. Training for Wing_Damage_2D_CNN_64_64

For the 2D-CNN experiments, the time-series data from the five selected sensors (1200 healthy and 1200 damaged samples) were converted into grayscale images and normalized to [0, 1]. These images were resized to 64 × 64 pixels to serve as input for separate models trained for spar-damage and rivet-damage classification. A common 2D-CNN structure was used, consisting of convolutional and pooling blocks with regularization, followed by a dense layer and sigmoid output for binary classification (see Figure 24 for full architecture and hyperparameters). Specifically, the H–D1 spar-damage model utilized a kernel size of 3 × 3, while the H–D2 rivet-damage model employed a 2 × 2 kernel size with a stride of (1, 1) to capture different scales of structural features. ReLU activation was applied to all hidden convolutional and dense layers.
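The conversion of a five-sensor record into a normalized grayscale image can be sketched as below. The text does not state whether scaling is applied per image or over the whole dataset, so per-sample min-max scaling is an assumption here, and `to_grayscale_image` and the synthetic sine input are hypothetical stand-ins for the simulated accelerations.

```python
import math


def to_grayscale_image(signals):
    """Min-max scale a sensors x time array of accelerations to [0, 1]."""
    flat = [v for row in signals for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:  # constant input: avoid division by zero
        return [[0.0 for _ in row] for row in signals]
    return [[(v - lo) / span for v in row] for row in signals]


# Synthetic stand-in for one sample: 5 sensors x 400 time steps.
sample = [[(s + 1) * math.sin(2 * math.pi * t / 50) for t in range(400)]
          for s in range(5)]
image = to_grayscale_image(sample)
print(len(image), len(image[0]))
```

The result keeps the 5 × 400 layout; resizing to 64 × 64 would be a separate interpolation step applied to this array.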

Training was performed with binary cross-entropy and the Adam optimizer for up to 100 epochs (batch size 64), using early stopping to avoid overfitting. The learning rate was set to 0.001 for H–D1 and 0.0003 for H–D2, with values selected through an iterative tuning process to achieve a balance between convergence speed and stability. For the H–D2 scenario, a ReduceLROnPlateau callback was also integrated (factor = 0.5, patience = 3) to refine the optimization as the validation loss plateaued. Each case was trained over 10 runs with a stratified 70/30 train–validation split to account for variability due to random initialization. Performance was assessed via accuracy and loss curves, and the best models were saved in HDF5 format for subsequent validation on the high-fidelity simulation model. The accuracy and loss curves for the 64 × 64 models are shown in Figure 25 (spar damage) and Figure 26 (rivet damage).
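The effect of a reduce-on-plateau rule with factor = 0.5 and patience = 3 can be illustrated with a small simulation. The function below is a simplified re-implementation of the idea, written for this example rather than taken from any library source; the `min_delta` threshold and the validation-loss values are hypothetical.

```python
def lr_schedule_on_plateau(val_losses, lr, factor=0.5, patience=3,
                           min_delta=1e-4):
    """Return the learning rate in effect after each epoch."""
    best = float("inf")
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best - min_delta:  # meaningful improvement
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:  # plateau detected: shrink the step size
                lr *= factor
                wait = 0
        history.append(lr)
    return history


# Hypothetical validation losses: early improvement, then two plateaus.
val_losses = [0.70, 0.55, 0.50, 0.50, 0.51, 0.50, 0.49, 0.49, 0.49, 0.49]
history = lr_schedule_on_plateau(val_losses, lr=0.0003)
print(history)
```

Starting from the H–D2 learning rate of 0.0003, each plateau of three epochs without improvement halves the rate, which is the behavior the callback settings in the text describe.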

For the spar-damage model (Figure 25), the network achieved 99% training accuracy and 98% validation accuracy.

For the rivet-damage model (Figure 26), the model attained 76% accuracy on the training set and 71% on the validation set.

2.6.6. Training for Wing_Damage_2D_CNN_32_128

For the 2D-CNN experiments, images generated from the five-sensor time-series signals were resampled to 32 × 128 pixels and normalized to the range [0, 1]. Each dataset comprised 1200 healthy and 1200 damaged images, used separately for the spar- and rivet-damage classification tasks. A unified 2D-CNN architecture was employed, consisting of convolutional and pooling blocks with dropout regularization, followed by a dense layer and a sigmoid output neuron for binary classification (see Figure 27 for architecture and hyperparameters). In the spar-damage scenario (H–D1), the model utilized two convolutional layers with a kernel size of 3 × 3. For the more complex rivet-damage scenario (H–D2), the kernel size was reduced to 2 × 2 with a stride of (1, 1) to improve the detection of localized structural changes. ReLU activation was maintained across all hidden layers for both models.

Training was conducted using binary cross-entropy and the Adam optimizer for up to 100 epochs (batch size 64), with early stopping to prevent overfitting. The learning rate was optimized to 0.001 for the H–D1 model and 0.0003 for the H–D2 model. These values, alongside the batch size, were selected through iterative trial-and-error to ensure stable convergence. Furthermore, the H–D2 training process utilized a ReduceLROnPlateau callback (factor = 0.5, patience = 3) to dynamically adjust the learning rate and avoid local minima. Each model was trained over 10 runs with different random seeds and a stratified 70/30 train–validation split. Model performance was evaluated through training and validation accuracy/loss curves, and the best models for both damage scenarios were stored in HDF5 format for subsequent testing. The accuracy and loss curves for the 32 × 128 models are presented in Figure 28 (spar damage) and Figure 29 (rivet damage).

For the spar-damage model (Figure 28), the network achieved 99% training accuracy and 99% validation accuracy.

For the rivet-damage model (Figure 29), the model attained 88% accuracy on the training set and 83% on the validation set.

3. Results and Discussion

The performance of the proposed data-driven damage-detection methodology was assessed through a structured set of numerical experiments using both the simplified and the high-fidelity finite element (FE) wing models. The analysis focused on (a) evaluating the effectiveness of different data representations for CNN-based classification and (b) assessing the generalization capability of the trained networks when applied to the more realistic analytical model.

3.1. Effect of Image Resizing on CNN Performance

A preliminary comparison was conducted to evaluate the influence of image resizing on the performance of the 2D-CNN models. While the base representation of the data as 5 × 400 (sensor × time) grayscale images preserved the natural temporal structure of the signals, the resampled formats (64 × 64 and 32 × 128 pixels) exhibited a notable loss in validation accuracy for the rivet-damage case (71% and 83%, respectively, versus 93% for the 5 × 400 format), whereas spar-damage accuracy remained comparable (98–99%). This degradation is attributed to the distortion of the temporal axis and the compression of information occurring during the resizing process, which weakens the discriminative patterns associated with subtle, localized damage. Consequently, only the natural image format (5 × 400) and the raw 1D signals were retained for final evaluation.
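The distortion introduced by resizing can be quantified by comparing the per-axis scale factors of each target format with the native 5 × 400 layout. The sketch below assumes that the target sizes are given as rows × columns, with rows corresponding to the sensor axis and columns to the time axis; `axis_scale_factors` is a hypothetical helper.

```python
def axis_scale_factors(src_shape, dst_shape):
    """Per-axis resize factors (rows, columns); >1 stretches, <1 compresses."""
    return tuple(d / s for s, d in zip(src_shape, dst_shape))


native = (5, 400)  # sensors x time steps

for target in ((64, 64), (32, 128)):
    row_f, col_f = axis_scale_factors(native, target)
    print(target, f"sensor axis x{row_f:.2f}", f"time axis x{col_f:.2f}")
```

Under this assumption the time axis is compressed to 16% (64 × 64) or 32% (32 × 128) of its original length, while the five sensor rows are interpolated up to 64 or 32 rows, which is in line with the temporal distortion discussed above.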

3.2. Selection of Candidate Models for Final Testing

Based on the results obtained from training on the simplified FE model, two architectures were identified as the most promising:

1D-CNN, trained directly on the filtered acceleration time histories (five sensors, 400 timesteps);
2D-CNN, trained on the unresized 5 × 400 grayscale images.

These two models were selected because they maintained the physical structure of the data—either in raw temporal form (1D-CNN) or as a minimally processed time–sensor image (2D-CNN)—thus providing a fair basis for comparison between sequence-based and image-based representations.

3.3. Generalization Testing Using the High-Fidelity Model

To evaluate generalization capability, both selected models were tested on an independently generated dataset obtained from the analytical FE wing model, which features greater structural detail and more realistic dynamic responses. The same structural conditions—Healthy, Damage 1 (Main Spar Damage), and Damage 2 (Rivet Removal)—were reproduced following the procedure of Section 2.4.

A compact dataset was generated for testing, consisting of 50 runs per condition (50 × 400 × 5), preserving the same sensor configuration and temporal window as in the simplified model. The rationale for this specific testing strategy was to rigorously assess the generalization capability of the networks across different modeling fidelities. By evaluating the CNNs on a high-fidelity representation that was entirely excluded from the training phase, we ensured that the detection accuracy was based on physical damage signatures rather than numerical artifacts or ‘data leakage’ specific to the simplified model’s mesh. This approach provides a realistic measure of how the diagnostic tool would perform when transitioned from a simulation-based training environment to a high-fidelity structural representation.

The observed performance discrepancy is primarily attributed to Domain Shift resulting from the transfer between the Simplified and Detailed FE environments. Rather than standard overfitting, this phenomenon represents a data distribution mismatch, where the features extracted from the low-fidelity mesh do not fully generalize to the high-fidelity domain [8]. This challenge is central to the Transfer Learning paradigm in model-based SHM, requiring architectures that can identify damage-sensitive features that remain invariant to modeling discretization and systematic numerical artifacts.

The observed discrepancy between the near-perfect accuracy on the simplified model’s validation set and the reduced performance on the detailed (analytical) model highlights the impact of model error. This performance gap serves as evidence that the DL classifiers did not simply overfit to the data; rather, they successfully learned the physics of the simplified representation.

Performance was assessed through confusion matrices, enabling a detailed quantification of correct/incorrect class assignments for each damage type.

3.4. Performance on Main Spar Damage

For the Main Spar Damage scenario, both models correctly classified 100% of the healthy samples. However, classification of the damaged condition proved more challenging, with damage-detection rates of 72% (1D-CNN) and 80% (2D-CNN). The performance comparison for main spar damage detection is summarized in Table 3, with the corresponding confusion matrices for the 1D-CNN and 2D-CNN (5 × 400) models provided in Figure 30.

The discrepancy between validation accuracy on the simplified model (≈100%) and testing accuracy on the analytical model (a drop of 14% to 20%) indicates the presence of model error, stemming from structural and dynamic differences between the two FE models. Nevertheless, the 2D-CNN outperformed the 1D-CNN by approximately 4 percentage points in overall accuracy (90% versus 86%, Table 3), suggesting that the image-based representation enhances feature extraction for large-scale structural defects.
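As a consistency check on Table 3, the overall accuracy can be recovered from the per-class detection rates. The short derivation below assumes a balanced test set of 50 healthy and 50 damaged runs per scenario (inferred from the 50 runs per condition described in Section 3.3); the symbols $r_H$, $r_D$, $N_H$ and $N_D$ (detection rates and sample counts for the healthy and damaged classes) are introduced here for illustration only.

```latex
\begin{align}
\text{Acc} &= \frac{N_H\, r_H + N_D\, r_D}{N_H + N_D}
            = \frac{r_H + r_D}{2} \quad (N_H = N_D), \\
\text{Acc}_{\text{1D}} &= \frac{1.00 + 0.72}{2} = 0.86, \qquad
\text{Acc}_{\text{2D}} = \frac{1.00 + 0.80}{2} = 0.90 .
\end{align}
```

The same relation applied to Table 4 gives 0.82 and 0.85 for the rivet-damage scenario, matching the reported overall accuracies.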

3.5. Performance on Rivet Damage

The models were next evaluated on the Rivet Damage scenario, which exhibits subtler dynamic signatures due to the localized nature of the defect. The performance comparison for rivet damage detection is summarized in Table 4, while the resulting confusion matrices for the 1D-CNN and 2D-CNN (5 × 400) models are shown in Figure 31.

Compared with Main Spar Damage, both networks demonstrated reduced accuracy in detecting the rivet-related defect, confirming the increased difficulty of discerning localized perturbations. Once again, the 2D-CNN yielded slightly superior performance (85% versus 82% overall accuracy, Table 4), reinforcing the advantage of the 2D representation in capturing distributed spatiotemporal patterns.

The observed misclassifications are primarily driven by the fidelity gap between the Simplified training domain and the Detailed testing domain. This error manifests as a shift in structural eigenvalues and damping characteristics, creating a feature shift where the dynamic differences between the models are misinterpreted by the network [29]. As discussed in recent surveys on Domain Generalization, this type of data shift occurs when a model is required to generalize to an unseen domain with different distribution properties. Consequently, the CNN’s accuracy is bounded by the numerical discretization and modeling assumptions inherent in the FE framework, where inherent modeling discrepancies can either mask actual faults or be incorrectly identified as structural damage.

To provide a rigorous assessment of the model performance beyond classification accuracy, F1-scores were calculated from the test data confusion matrices. For the Main Spar damage, the 1D-CNN and 2D-CNN achieved F1-scores of 0.837 and 0.889, respectively. In the more complex Rivet Damage scenario, the models maintained robust performance with F1-scores of 0.791 (1D-CNN) and 0.828 (2D-CNN). These scores reflect high precision (few false alarms on the healthy state) combined with lower recall, so missed detections of the damaged state remain the dominant error under stochastic excitation.
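The reported F1-scores can be reproduced from confusion-matrix counts. In the sketch below the counts are reconstructed from the percentages in Tables 3 and 4 under the assumption of 50 healthy and 50 damaged test samples per scenario, with the damaged state treated as the positive class; they are an inference, not counts quoted from Figures 30 and 31, and `f1_from_counts` is a hypothetical helper.

```python
def f1_from_counts(tp, fp, fn):
    """Precision, recall and F1 for the 'damaged' (positive) class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# (TP, FP, FN) reconstructed from Tables 3 and 4, assuming 50 healthy and
# 50 damaged test samples per scenario (an inference, not stated counts).
counts = {
    ("spar", "1D-CNN"): (36, 0, 14),
    ("spar", "2D-CNN"): (40, 0, 10),
    ("rivet", "1D-CNN"): (34, 2, 16),
    ("rivet", "2D-CNN"): (36, 1, 14),
}

scores = {key: f1_from_counts(*c)[2] for key, c in counts.items()}
for key, f1 in scores.items():
    print(key, round(f1, 3))
```

Under these assumptions the values round to 0.837, 0.889, 0.791 and 0.828, in agreement with the scores quoted above. Precision stays at or above 0.94 in all four cases, while recall ranges from 0.68 to 0.80.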

3.6. Limitations and Future Research

Despite the high performance of the 1D and 2D-CNN models, some limitations remain that define the scope for future research. While this study successfully demonstrates the transferability of DL classifiers from reduced-order to high-fidelity models, it is acknowledged that the current training set utilized fixed damage parameters, such as location and severity. This approach was chosen to isolate the impact of ‘model error’ between different finite element (FE) representations. Future work will incorporate a stochastic approach to damage modeling—varying crack lengths, orientations, and multi-site damage—to ensure the networks learn robust physical ‘damage features’ rather than specific ‘damage templates’ tied to a single structural state.

Electronic sensor noise was not explicitly added to the simulated signals. It is important to emphasize, however, that the primary barrier to ‘realism’ in model-based SHM remains systematic modeling error rather than stochastic sensor noise. This study addresses this challenge by testing the architectures across different FE model fidelities (Simplified vs. Detailed), which introduces structural ‘noise patterns’ and spectral shifts that are physically more representative of real-world discrepancies than simple Gaussian augmentation. This tests whether the methodology learns robust features invariant to numerical discretization and modeling assumptions, which are the dominant factors limiting the simulation-to-reality transition in aeronautical diagnostics.

Furthermore, the transition to real-world deployment introduces limitations related to data acquisition and environmental variability. The proposed framework currently assumes ideal sensor synchronization and the absence of electronic noise. In practice, sensor synchronization errors could disrupt the spatial–temporal features captured by the 2D-CNN images. Additionally, the impact of Environmental and Operational Conditions (EOC), such as temperature fluctuations, remains a critical area for investigation. Future iterations of the model will incorporate ‘EOC-robust’ features or domain adaptation techniques to ensure that changes in the vibration profile are correctly attributed to structural damage rather than varying flight conditions.

4. Conclusions

In the present study, a methodology for structural damage detection in an aircraft wing was developed and evaluated using reduced finite element models combined with deep learning techniques. A comparative assessment was carried out between two neural network architectures, a 1D-CNN operating directly on vibration time-series data and a 2D-CNN trained on image-based representations of sensor responses, in order to investigate their ability to generalize across different structural conditions. The main outcomes of this comparative evaluation are summarized below.

Across all experiments, several consistent trends emerge, revealing clear differences in model behavior and representation effectiveness. Resized image inputs (64 × 64 and 32 × 128) systematically underperform, indicating that distortion of the temporal axis compromises the discriminative patterns required for reliable damage identification. Models trained on the original 5 × 400 representation exhibit superior behavior, with 2D-CNNs outperforming 1D-CNNs by approximately 3–4%, particularly in the more challenging damage scenarios. Both architectures classify the healthy state with near-perfect accuracy (>95%), demonstrating strong robustness in detecting the absence of damage, while damaged-state classification remains comparatively lower. This reduction is attributed to discrepancies between the simplified and analytical finite element models, introducing a clear domain shift. The observed gap between validation (simplified model) and testing (high-fidelity model) further underscores the influence of this shift on generalization. Overall, the findings confirm that although both CNN architectures show substantial capability for structural damage detection, the 2D-CNN achieves more reliable performance—especially when the full spatiotemporal structure of the original signals is preserved.

Author Contributions

Conceptualization, P.S.; methodology, E.B., P.S. and A.A.; software, E.B.; validation, E.B. and P.S.; formal analysis, E.B.; investigation, E.B., P.S. and A.A.; resources, E.B.; data curation, E.B., P.S. and A.A.; writing—original draft preparation, E.B.; writing—review and editing, P.S. and A.A.; visualization, E.B., P.S. and A.A.; supervision, P.S. and A.A.; project administration, P.S. and A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to their large size and the associated storage and transfer limitations.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

FE	Finite Element
FEM	Finite Element Method
SHM	Structural Health Monitoring
ML	Machine Learning
DL	Deep Learning
CNN	Convolutional Neural Network
FFT	Fast Fourier Transform
RMS	Root Mean Square
CFD	Computational Fluid Dynamics
MAC	Modal Assurance Criterion
PID	Property ID (Individual Property IDs in the FE model)

References

1. Sause, M.G.R.; Jasiūnienė, E. Structural Health Monitoring Damage Detection Systems for Aerospace; Springer Aerospace Technology: Singapore, 2021; Available online: http://www.springer.com/series/8613 (accessed on 2 January 2025).
2. DIN ISO 17359:2018-05; Condition Monitoring and Diagnostics of Machines—General Guidelines (ISO 17359:2018). Deutsches Institut für Normung: Berlin, Germany, 2018.
3. Seventekidis, P.; Karyofyllas, G.; Giagopoulos, D. Parametric Study on Structural Damage Classification with Numerically Simulated Vibration Data. J. Phys. Conf. Ser. 2024, 2647, 022004.
4. Alvarez-Montoya, J.; Carvajal-Castrillón, A.; Sierra-Pérez, J. In-flight and wireless damage detection in a UAV composite wing using fiber optic sensors and strain field pattern recognition. Mech. Syst. Signal Process. 2020, 136, 106526.
5. Cianci, E.; Civera, M.; de Biagi, V.; Chiaia, B. Physics-informed machine learning for the structural health monitoring and early warning of a long highway viaduct with displacement transducers. Mech. Syst. Signal Process. 2026, 242, 113659.
6. Tunca, E.; Köksal, E.S.; Çetin Taner, S. Calibrating UAV thermal sensors using machine learning methods for improved accuracy in agricultural applications. Infrared Phys. Technol. 2023, 133, 104804.
7. Yuan, Y.; Lu, Y.; Sun, J.; Wang, C. Research on the optimal selection method of sensors/actuators in active structural acoustic control for helicopter based on machine learning. Measurement 2025, 245, 116631.
8. West, B.M.; Locke, W.R.; Andrews, T.C.; Scheinker, A.; Farrar, C.R. Applying concepts of complexity to structural health monitoring. Conf. Proc. Soc. Exp. Mech. Ser. 2019, 6, 205–215.
9. Seventekidis, P.; Giagopoulos, D.; Arailopoulos, A.; Markogiannaki, O. Structural Health Monitoring using deep learning with optimal finite element model generated data. Mech. Syst. Signal Process. 2020, 145, 106972.
10. Lin, M.; Guo, S.; He, S.; Li, W.; Yang, D. Structure health monitoring of a composite wing based on flight load and strain data using deep learning method. Compos. Struct. 2022, 286, 115305.
11. Kambrath, S.; Selvakumar, I. Machine Learning Applications in Aircraft Structural Health Monitoring. Int. J. Innov. Res. Technol. 2024, 11, 1331–1348.
12. Tserpes, K.I.; Karachalios, V.; Giannopoulos, I.; Prentzias, V.; Ruzek, R. Strain and damage monitoring in CFRP fuselage panels using fiber Bragg grating sensors. Part I: Design, manufacturing and impact testing. Compos. Struct. 2014, 107, 726–736.
13. Zhu, Z. Physics-informed machine learning for near real-time stress prediction and DT-enabled SHM systems in safety-critical aerospace structures. Composite Structures. Eng. Appl. Artif. Intell. 2025, 162, 112532.
14. Dziendzikowski, M.; Niedbała, P.; Kurnyta, A.; Dragan, K.; Leski, A. Application of Operational Load Monitoring System for Fatigue Estimation of Main Landing Gear Attachment Frame of an Aircraft. Materials 2021, 14, 6564.
15. Sbaruffatti, N.; Corbetta, M.; Giglio, M.; Cadini, F. Comparison of Machine Learning Algorithms for Structure State Prediction in Operational Load Monitoring. Appl. Sci. 2021, 20, 7087.
16. Marek, C. Cessna 172—Flight Simulation Data (Rev. 9); Technical Report; Air Force Institute of Technology: Warsaw, Poland, 2019.
17. BETA CAE Systems. ANSA—Pre-Processing Software for Finite Element Analysis (Version 24.0). BETA CAE Systems S.A. 2024. Available online: https://www.beta-cae.com/ansa.htm (accessed on 2 January 2025).
18. SVIBS. Generic Modal Assurance Criterion Window. ARTeMIS Modal Help. Available online: https://www.svibs.com/resources/ARTeMIS_Modal_Help/Generic%20Modal%20Assurance%20Criterion%20Window.html (accessed on 2 January 2025).
19. Pastor, M.; Binda, M.; Harčarik, T. Modal assurance criterion. Procedia Eng. 2012, 48, 543–548.
20. Balatti, D.; Haddad Khodaparast, H.; Friswell, M.I.; Manolesos, M.; Castrichini, A. Aircraft turbulence and gust identification using simulated in-flight data. Aerosp. Sci. Technol. 2021, 115, 106805.
21. Ying, X. An overview of overfitting and its solutions. J. Phys. Conf. Ser. 2019, 1168, 022022.
22. Zhang, W.; Su, C.; Zhang, Y.; Zhang, Y.; Yuan, P.; Gao, W. Damage identification of honeycomb sandwich structures based on Lamb waves and 1D-CNN. Mater. Today Commun. 2024, 40, 109717.
23. Zhou, Y.; Zheng, Y.; Liu, Y.; Pan, T.; Zhou, Y. A hybrid methodology for structure damage detection uniting FEM and 1D-CNNs: Demonstration on typical high-pile wharf. Mech. Syst. Signal Process. 2022, 168, 108738.
24. Huang, M.; Xu, Z.; Cai, C.; Hu, C.; Wei, X.; Yin, W.; He, X. Multi-structure transfer damage detection for composite materials based on singular value decomposition and 2D-CNN network. Structures 2025, 74, 108656.
25. Yan, X.; Mohammadian, A.; Ao, R.; Liu, J.; Yang, N. Two-dimensional convolutional neural network outperforms other machine learning architectures for water depth surrogate modeling. J. Hydrol. 2023, 616, 128812.
26. Bharali, K.; Saharia, M.; Roy, M.; Debnath, N. Structural health monitoring of building structures using novel acceleration-based signal-to-image technique and 2D convolutional neural networks. Appl. Soft Comput. 2025, 181, 113457.
27. Domashova, J.V.; Emtseva, S.S.; Fail, V.S.; Gridin, A.S. Selecting an optimal architecture of neural network using genetic algorithm. Procedia Comput. Sci. 2021, 190, 263–273.
28. Shao, H.; Lin, J.; Xia, M.; Zhao, X. Broad Bayesian learning (BBL) for nonparametric probabilistic modeling with optimized architecture configuration. Comput. Aided Civ. Infrastruct. Eng. 2021, 36, 1270–1287.
29. Wang, J.; Lan, C.; Liu, C.; Ouyang, Y.; Qin, T.; Lu, W. Generalizing to Unseen Domains: A Survey on Domain Generalization. IEEE Trans. Knowl. Data Eng. 2023, 35, 8052–8072.

Figure 1. Wing structure: (a) wing configuration without skin; (b) wing configuration with skin.

Figure 2. Meshed analytical model of airplane wing (404,152 CQUAD4 elements).

Figure 3. Meshed analytical model of airplane wing: (a) boundary conditions; (b) rivet connections.

Figure 4. Meshed simplified model of airplane wing (84,966 CQUAD4 elements).

Figure 5. Mode shape correlation between the detailed and simplified models using MAC.

Figure 6. Elliptical lift distribution along the span of the aircraft wing.

Figure 7. Turbulence excitation signals: (a) filtered vs. unfiltered lift-load time history; (b) frequency-domain comparison (FFT) of filtered and unfiltered signals.

Figure 8. Lift excitation signal processing results: (a) time-domain comparison between filtered and subsampled lift-load signals; (b) frequency-domain (FFT) comparison demonstrating preservation of spectral content after subsampling.

Figure 9. Definition of virtual acceleration sensor channels (15 nodes) on the simplified wing model.

Figure 10. Schematic representation of channels.

Figure 11. Turbulence-induced wing loading: (a) application of static drag forces along ribs; (b) application of transient lift load on main spar.

Figure 12. Damage modeling scenarios for the wing structure: (a) severe damage case with a crack introduced at the root of the main spar; (b) moderate damage case with removal of fastener connections near the wing root.

Figure 13. Automation algorithm for dynamic analyses.

Figure 14. Sensor sensitivity bar diagram.

Figure 15. Heatmaps illustrating the mean absolute difference between healthy and damaged structural responses across sensors and time: (a) Healthy vs. Main Spar Damage; (b) Healthy vs. Rivet Damage.

Figure 16. Wing_Spar_Damage_1D_CNN network architecture.

Figure 17. Accuracy and loss curves for (a) Training and (b) Validation sets for the Wing_Spar_Damage_1D_CNN model.

Figure 18. Wing_Rivet_Damage_1D_CNN network architecture.

Figure 19. Accuracy and loss curves for (a) Training and (b) Validation sets for the Wing_Rivet_Damage_1D_CNN model.

Figure 20. Visualization of training image formats used for 2D-CNN input: 5 × 400, 64 × 64, and 32 × 128 pixels.

Figure 21. Architecture of the neural networks: (a) Wing_Spar_Damage_2D_CNN_5_400; (b) Wing_Rivet_Damage_2D_CNN_5_400.

Figure 22. Accuracy and loss curves for (a) Training and (b) Validation sets for the Wing_Spar_Damage_2D_CNN_5_400 model.

Figure 23. Accuracy and loss curves for (a) Training and (b) Validation sets for the Wing_Rivet_Damage_2D_CNN_5_400 model.

Figure 24. Architecture of the neural networks: (a) Wing_Spar_Damage_2D_CNN_64_64; (b) Wing_Rivet_Damage_2D_CNN_64_64.

Figure 25. Accuracy and loss curves for (a) Training and (b) Validation sets for the Wing_Spar_Damage_2D_CNN_64_64 model.

Figure 26. Accuracy and loss curves for (a) Training and (b) Validation sets for the Wing_Rivet_Damage_2D_CNN_64_64 model.

Figure 27. Architecture of the neural networks: (a) Wing_Spar_Damage_2D_CNN_32_128; (b) Wing_Rivet_Damage_2D_CNN_32_128.

Figure 28. Accuracy and loss curves for (a) Training and (b) Validation sets for the Wing_Spar_Damage_2D_CNN_32_128 model.

Figure 29. Accuracy and loss curves for (a) Training and (b) Validation sets for the Wing_Rivet_Damage_2D_CNN_32_128 model.

Figure 30. Confusion matrices for the Main Spar Damage scenario for (a) the 1D-CNN and (b) the 2D-CNN (5 × 400).

Figure 31. Confusion matrices for the Rivet Damage scenario for (a) the 1D-CNN and (b) the 2D-CNN (5 × 400).

Table 1. Geometric Characteristics of Cessna 172 Wing.

Characteristic	Wingspan	Wing Area	Airfoil	Flap Area	Aileron Area	Mean Aerodynamic Chord
Value	11.00 m	16.17 m²	NACA 2412	1.98 m²	1.70 m²	1.49 m

Table 2. Parameters of the turbulence-induced wing loading.

Parameter	Description	Value
Lift force (L)	Nominal vertical aerodynamic load	6635.52 N
Drag force (D)	Longitudinal aerodynamic load	529.18 N
Lift application	Applied at main spar, 2.33 m from wing root	Dynamic
Drag application	Uniformly distributed along leading edge	Static
Turbulence frequency band	Frequency range of lift variation	0.01–10 Hz
Simulation duration	Total time of dynamic load	10 s
Time steps	Final number after subsampling	2048

Table 3. Performance comparison of 1D-CNN and 2D-CNN (5 × 400) on Main Spar Damage detection.

Metric	1D-CNN	2D-CNN
Overall Accuracy	86%	90%
Healthy Detection	100%	100%
Damage Detection	72%	80%

Table 4. Performance comparison of 1D-CNN and 2D-CNN (5 × 400) on Rivet Damage detection.

Metric	1D-CNN	2D-CNN
Overall Accuracy	82%	85%
Healthy Detection	96%	98%
Damage Detection	68%	72%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

