1. Introduction
According to Mueller et al. [1], the approach for performance assessment of a structural health monitoring (SHM) system can be described in pyramidal form with four main contributing factors:
- (1) The structure itself as the monitoring object;
- (2) The requirements definition, including the level of SHM (e.g., detection or localization) as well as the environmental and operational conditions (EOCs) in which the structure is used;
- (3) The setup with the sensing approach, including data analytics;
- (4) The probability of detection (POD) assessment and evaluation of the receiver operating characteristic (ROC).
Each level is a prerequisite for reaching the next level. The POD is the final level, in which the quality, capability, reliability, and applicability of an SHM system are estimated.
As known from conventional non-destructive testing (NDT), POD curves indicate the size of any type of damage, also known as the flaw size, which can be detected with a given probability [2]. The use of the POD in various NDT methods was first published in 1974 by Rummel et al. [3]. Since then, further approaches have been developed, adapted, and expanded for diverse applications [4].
There are different POD methods, and the choice may depend on the specific application. They are listed and explained in Table 1 based on the review by Tai et al. [5]. In binary hit/miss analysis, an existing flaw is either detected (hit) or not detected (miss) [4]. The â versus a representation uses a signal response function for the POD assessment. A linear, semi-logarithmic, or double-logarithmic regression model is often applied to represent the signal response. Adjustments are necessary, depending on the NDT method, the scattering of the measurement data, and time-dependent correlations, as well as environmental and operational changes [2].
There are four common functions for transforming the POD into a generalized linear model depending on the flaw size: log-odds (logit), log-normal (probit), complementary loglog (cloglog) and loglog [
6]. Typically, a POD of
with a
confidence level is required for damage detection [
4]. For SHM systems that enable damage localization, the probability of localization can be determined within a tolerance radius [
2].
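The four link functions named above can be illustrated with a minimal sketch. It assumes a linear predictor of the form eta = b0 + b1 * ln(a); the function names, the coefficients `b0` and `b1`, and the use of the logarithmic flaw size are illustrative assumptions and are not taken from the cited references. Sign conventions for the loglog link vary in the literature, and the version below is one common choice.

```python
import math
from statistics import NormalDist

# Inverse link functions: map a linear predictor eta to a probability in (0, 1).
def pod_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def pod_probit(eta):
    return NormalDist().cdf(eta)

def pod_cloglog(eta):
    return 1.0 - math.exp(-math.exp(eta))

def pod_loglog(eta):
    # One common sign convention; other texts use a mirrored form.
    return math.exp(-math.exp(-eta))

INVERSE_LINKS = {
    "logit": pod_logit,
    "probit": pod_probit,
    "cloglog": pod_cloglog,
    "loglog": pod_loglog,
}

def pod(a, b0, b1, link="logit"):
    """POD for flaw size a > 0, assuming eta = b0 + b1 * ln(a) (hypothetical model)."""
    eta = b0 + b1 * math.log(a)
    return INVERSE_LINKS[link](eta)

if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to show the different curve shapes.
    for name in INVERSE_LINKS:
        print(name, [round(pod(a, -1.0, 2.0, name), 3) for a in (0.5, 1.0, 2.0, 4.0)])
```

The logit and probit links are symmetric around a POD of 0.5, whereas cloglog and loglog are asymmetric, which is why the choice of link can shift the estimated detectable flaw size.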
In ROCs, the POD is plotted versus the probability of false alarms (PFAs). Hits and false alarms are therefore directly compared in order to qualify an SHM-capable system [4]. An optimal classifier for damage classification, e.g., a threshold value, is characterized by maximizing the POD and minimizing the PFAs [7]. Maes et al. [8] used the definition of the Youden index (YI) to evaluate the binary damage classification of a railway bridge in the context of vibration-based SHM. This metric is also used in this work for optimal decision making.
Composite materials have an increasing impact in many applications, because they are lightweight, cost-effective, and have a high stiffness [
9]. Studies on the POD relating to SHM of carbon fiber reinforced polymer (CFRP) structures have been conducted, in particular, using ultrasonic methods. Tschöke et al. [
10] investigated a CFRP plate with piezo sensors and tested the feasibility of a model-assisted POD approach. A special case of a CFRP structure is the omega stringer for airframes, which was investigated by Mueller et al. [
1] in a climatic chamber with applied artificial damage. For the POD estimation of delamination sizes in CFRP, Kim et al. [
6] used ultrasound imaging and Falcetelli et al. [
11] optical fiber sensors.
An interesting approach is predictive POD, applied by Orellana et al. [
12] to estimate the POD of two setups using analytical models without considering damaged states: SHM of a polyamide cuboid using contact ultrasonic testing and SHM of a CFRP plate using air-coupled ultrasonic testing. The work of Jiang et al. [
13] presents a logistic regression model for the visual inspection of low-velocity impact damage on laminates, whose performance is evaluated using ROC curves and compared with that of machine learning architectures. The baseline-free identification and localization of delamination in a simulated glass fiber reinforced polymer (GFRP) plate is proposed by Jagadeeshwar et al. [
14] in order to reduce the sensor density evaluated through ROCs.
Broadband microwave and millimeter-wave radar systems are useful for contact and non-contact inspections of composite structures [
15]. Their first application in the SHM field as interferometers in the 1990s was aimed at detecting structural changes in civil infrastructure, e.g., a traffic bridge, the Leaning Tower of Pisa, and a wind turbine [
16]. SHM of wind turbine blades (WTBs) is of interest in current research. They generally consist of GFRP. Due to the transparency of microwaves through GFRP, structural defects can be detected inside the composite layers [
9].
Embedded frequency modulated continuous wave (FMCW) radars showed promising results during a full-scale fatigue test of a
long WTB performed by Simon et al. [
17]. The WTB in the test hall at Fraunhofer IWES is shown in
Figure 1. In a further study by Streser et al. [
18], the measurement data was used to train, validate, and test a convolutional neural network. Rao et al. [
19] performed measurements with an FMCW radar on a GFRP sandwich with a modeled delamination. Based on the extracted physical properties, a damage model consisting of solid rigid foam with a thickness of
and erosion protection tape was developed. This damage model was applied to a WTB section and detected with an FMCW radar mounted on the main web [
9].
Delamination thicknesses are typically specified in fractions of millimeters. Mandell and Cairns [
20] designed a skin-stiffener specimen for WTB fatigue loading in order to plot a displacement curve for delamination produced under different loads. The final separation of orthophthalic polyester 63-AX-051 was achieved at an actuator displacement of approximately
. Li et al. [
21] investigated delamination on a finite element model of a WTB spar cap with depths of
,
, and
. Fang et al. [
22] performed measurements with a vector network analyzer on GFRP plates with thicknesses of
and
. The delamination thicknesses were
,
,
, and
.
In CFRP, Wallentine et al. [
23] monitored unidirectional CFRP matrix composite laminate plates using ultrasonic testing and serial sectioning microscopy. The delamination thickness was verified in a micrograph to be less than
. Li et al. [
24] created a finite element model of a CFRP structure in order to simulate delamination due to buckling with depths of
,
, and
. The detection of hollow and material-filled holes in E-glass epoxy composites with thicknesses between
and
was investigated by Gokul et al. [
25] with a vector network analyzer.
Wind turbines have rotor blades ranging in length from
to
. WTBs with greater lengths can generate more energy due to their larger frontal area for incident wind. However, this also increases the load levels, which leads to greater damage sizes. The literature refers to the lateral dimensions in this context [
26]. In the full-scale fatigue test by Simon et al. [
17] mentioned above, a
long crack spans almost the entire lateral length from the trailing edge to the leading edge. In the full-scale fatigue test by Al-Khudairi et al. [
27] of a
long WTB, the length of an induced crack along the web was 1 m, and the length of the delamination was
. Samareh-Mousavi et al. [
28] investigated fatigue delamination growth in a
long WTB. The total delaminated area in the spar cap increased to
. Desmond et al. [
29] tested actuator displacements on two
long WTBs with different loads. The first WTB, consisting of a fiberglass spar cap, reached a displacement range of
. The second WTB, consisting of a carbon fiber spar cap, reached a displacement range up to
.
Apart from radar systems, the POD is plotted as a function of flaw size in other electromagnetic (EM) approaches. Pulsed thermography was used by Liu et al. [
30] for the detection of artificial flat-bottom holes in a CFRP specimen. In the numerical simulation performed by Bao et al. [
31], a coil installed over a conductive plate was detected using eddy current. Guided EM waves were coupled into a long pipe by Chen et al. [
32] to characterize pipe wall thinning. Xu et al. [
33] analyzed the POD as a function of the signal-to-clutter ratio for weak target detection. Moreover, Memmolo et al. [
34] studied omega stringer debondings through a microwave leakage approach.
The main contribution of this work is the successful implementation of a nonlinear POD approach for the performance assessment of a radar-based SHM methodology. Damage indicators (DIs) are calculated from experimental data using an FMCW radar at , according to the root mean square deviation (RMSD) and Mahalanobis distance (MD) methods. Numerical simulations are performed to compare DI trends with experimentally determined DIs.
The numerical model and experimental setup are both a sandwich of two rectangular GFRP plates, which are shifted from each other from to in steps to simulate typical delamination thicknesses over the entire plates. Based on a specified threshold, POD curves are calculated with confidence bounds in order to obtain statistical information about the minimum detectability of a delamination using the DI approaches. To optimize the threshold, the maximum YI is taken into account, which is determined from the respective ROC curve.
The following aspects describe the novelty of this article:
- POD assessment of a radar-based SHM technique for delamination detection in composite structures using an idealized delamination model for GFRP plates;
- Comparison of different nonlinear regression models for reproducing the DI trends with a finer step width of the flaw size;
- Presentation and discussion of different methods for optimal threshold decision and their practical applicability;
- Physical explanation of the POD curves for minimal delamination detection, according to the slope, horizontal shift and distance to 95% confidence bounds in the context of electromagnetic testing;
- Outlook for future investigations with the delamination model on more complex composite structures.
The remainder of this paper is structured as follows:
Section 2 provides an introduction to the mathematical formalism of the nonlinear POD, taking into account the ROC curves, as well as the numerical model and the experimental setup.
Section 3 discusses the experimental POD results in relation to the accuracy of binary damage classification with optimally selected thresholds and
confidence bounds. Finally,
Section 4 provides a summary of this work and an outlook for future research.
3. Results and Discussion
The procedure for calculating and interpreting the POD curves is shown in
Figure 6. First, simulation data for one waveguide port distance and experimental data with four different radar distances
L are recorded. Afterwards, DIs are calculated according to the RMSD and MD methods to obtain the signal responses in the time domain. To find the optimal threshold
for each measurement series, ROC curves are plotted, and the YIs are calculated. The maximum YI indicates the best threshold. After applying a suitable fit function, the standard deviation of the signal response
and the normal PDF per structural state are calculated from the DIs.
Using , and the regression points , the POD is obtained with the parameters and . These parameters indicate a detected delamination with a POD and an additional confidence. Finally, the predicted structural conditions are compared with the true label to determine the accuracy of the damage identification using the proposed DI approach.
3.1. Representation of the Signal Response Through Damage Indicators
The DIs are calculated according to Equations (
1) and (
4) and are plotted in
Figure 7 for the simulation and experimental data as a function of the delamination thickness
d. The DIs of a structural state are shown within one bar. Overall, the DI trends are similar between the simulation, experiment and the applied DI methods, and increase from the undamaged state to
delamination thickness. Plateaus for a small
d can be recognized in the simulation due to the absence of statistical deviations and random noise per measurement. In addition, lower modes propagate in this one-dimensional problem due to the much smaller geometry that leads to different slopes. For comparability, all DIs are normalized to one.
In the undamaged case, the experimentally determined DIs fluctuate more, suggesting that the plates were not completely parallel to each other in reality. In particular, for
, the
values of the reference state and damaged states until
overlap, meaning that this state cannot be clearly assigned to a damaged state. Therefore, higher thresholds must be chosen for the POD assessment. When comparing
Figure 7a,b, the MD is more robust to these fluctuations.
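The two DI families can be sketched as follows. Equations (1) and (4) are not reproduced in this section, so the sketch uses textbook forms as assumptions: an RMSD normalized by the baseline energy, and a Mahalanobis distance simplified to a diagonal covariance (per-sample variances only). All function names are hypothetical, and the article's own definitions may differ, for example in the use of a super-baseline and a full covariance matrix.

```python
import math

def rmsd_di(signal, baseline):
    """RMSD-type DI: energy of the difference signal relative to the baseline energy."""
    num = sum((s - b) ** 2 for s, b in zip(signal, baseline))
    den = sum(b ** 2 for b in baseline)
    return math.sqrt(num / den)

def md_di_diagonal(signal, baseline_set):
    """MD-type DI with a diagonal covariance (assumption: samples treated as uncorrelated).

    baseline_set is a list of reference signals; each time sample needs a
    non-zero variance across the reference signals.
    """
    n = len(baseline_set)
    m = len(signal)
    mean = [sum(row[k] for row in baseline_set) / n for k in range(m)]
    var = [sum((row[k] - mean[k]) ** 2 for row in baseline_set) / (n - 1) for k in range(m)]
    return math.sqrt(sum((signal[k] - mean[k]) ** 2 / var[k] for k in range(m)))

def normalize(dis):
    """Scale a series of DIs so that the largest value equals one."""
    peak = max(dis)
    return [d / peak for d in dis]

if __name__ == "__main__":
    # Toy signals, not measurement data.
    references = [[1.0, 0.9, 0.2], [1.1, 1.0, 0.1], [0.9, 1.1, 0.3]]
    measured = [1.4, 0.6, 0.5]
    print(rmsd_di(measured, references[0]))
    print(md_di_diagonal(measured, references))
```

Dividing each squared deviation by the reference variance is what makes the MD-type indicator less sensitive to samples that already fluctuate strongly in the intact state.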
3.2. Threshold Decision
The number of regression points is set to
= 100,000. Since the normalized DIs range within
, the threshold
is increased in steps of
within this interval. The DIs are classified as shown in
Table 2. The TPR and FPR are calculated according to Equations (
5) and (
6) in order to plot the ROC curves. These are shown in
Figure 8 for the RMSD and MD method separately. An intersection with the diagonal represents only random processes and is referred to as a POD of
. A perfect classification is given by a constant TPR value of
.
Afterwards, the YIs are calculated using Equation (
8). This formulation was used due to the small number of reference DIs (20) compared to the number of DIs for damaged states (2000). The YIs are plotted in
Figure 9 as a function of
. The optimal threshold that is used for calculating the POD is derived from the maximum YI.
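The threshold sweep described above can be sketched in a few lines. Equation (8) is not reproduced in this section, so the sketch assumes the standard form of the Youden index, YI = TPR - FPR, which works on rates and is therefore not dominated by the larger class. The function names and the toy DI values are hypothetical.

```python
def roc_points(reference_dis, damaged_dis, thresholds):
    """Return (threshold, FPR, TPR) for each threshold; a DI above the threshold counts as 'damaged'."""
    points = []
    for t in thresholds:
        tp = sum(d > t for d in damaged_dis)
        fp = sum(d > t for d in reference_dis)
        points.append((t, fp / len(reference_dis), tp / len(damaged_dis)))
    return points

def youden_index(fpr, tpr):
    """Standard Youden index (assumed form): distance of the ROC point above the diagonal."""
    return tpr - fpr

def best_threshold(points):
    """Pick the ROC point with the maximum Youden index (first one in case of ties)."""
    return max(points, key=lambda p: youden_index(p[1], p[2]))

if __name__ == "__main__":
    # Toy normalized DIs, not measurement data.
    reference = [0.05, 0.10, 0.12]
    damaged = [0.2, 0.5, 0.9]
    grid = [k / 20 for k in range(21)]
    print(best_threshold(roc_points(reference, damaged, grid)))
```

When several thresholds reach the same maximum, `max` returns the lowest one on the grid; whether the lowest or the highest tied threshold is preferable depends on the cost of false alarms and is a design choice outside this sketch.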
For
,
occurs for one
, which indicates perfect classification. For the measurement series with
, the YI trends look quite similar for both DI methods. Compared to
and
, more misclassifications that are recognized in the respective ROC curves lead to a smaller maximum of the YI. In addition, a higher threshold has to be selected.
produces the worst damage classification results. The physical reasons are discussed later in
Section 3.4.
Since the same number of ramps was measured for all structural states, the ratio between intact and damaged structures is highly unbalanced. To counteract this, difference signals relative to a super-baseline were used according to Equation (
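1), and the baseline signals were considered as noise in the POD assessment. To examine the positive predictions for unbalanced datasets in more detail, precision–recall analysis is often performed. The positive predictive value (PPV), or precision, is defined by [36]
$$\mathrm{PPV} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}},$$
where TP and FP denote the numbers of true positives and false positives, respectively.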
In
Figure 10, the curves for all radar distances and DI methods are shown. In particular, for the RMSD and measurement series with
, it is striking that a higher recall strongly decreases the precision due to the increase in FPs.
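The trade-off between precision and recall can be reproduced with a small sketch. It assumes that a DI above the threshold is classified as damaged; the function name, the toy values, and the convention of returning a precision of 1.0 when no positive prediction is made are assumptions for illustration.

```python
def precision_recall(reference_dis, damaged_dis, threshold):
    """Precision (PPV) and recall (TPR) for one threshold."""
    tp = sum(d > threshold for d in damaged_dis)
    fn = len(damaged_dis) - tp
    fp = sum(d > threshold for d in reference_dis)
    # Convention (assumption): precision is 1.0 if nothing is classified as damaged.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn)
    return precision, recall

if __name__ == "__main__":
    # Toy normalized DIs, not measurement data.
    reference = [0.1, 0.3]
    damaged = [0.2, 0.4, 0.6]
    for t in (0.05, 0.25, 0.45):
        print(t, precision_recall(reference, damaged, t))
```

Lowering the threshold raises the recall but admits more reference DIs as false positives, which is the mechanism behind the drop in precision.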
3.3. Regression Model
The regression model uses a fit function that covers the measurement points as continuously as possible. In addition to the fit curve, upper and lower
confidence bounds represent the acceptance range of a two-sided Gaussian test, which correspond to approximately
times the standard deviation [
38]. A well-chosen fit function is characterized by upper and lower confidence bounds being as close as possible to the fit curve.
Looking at the DI trends in
Figure 7a,b, it becomes apparent that the trend is nonlinear and closely resembles a saturation function. Four regression models were tested empirically and compared via the mean squared error (MSE)
for suitability: a linear, ninth order polynomial, hyperbolic tangent and logistic regression function.
Figure 11 plots the different regression lines in a joint graph for comparison. The polynomial function yields the smallest MSE. However, because of its low interpretability and because the DI trend resembles a saturation function, the hyperbolic tangent is selected instead, according to the following equation:
The result of the nonlinear regression with
confidence bounds is shown in
Figure 12 for
and
values with a radar distance of
. In addition, the normal PDF per structural state, calculated according to Equation (
11), and the optimal threshold are also plotted. For viewing purposes, DIs for eleven structural states are plotted, and the Gaussian bell curves are normalized to
. Only the width of the distribution is important in order to estimate the scattering of the measurement points. The Gaussian bell curves are broader for the DIs calculated with the MD method. This means that the regression function is more suitable for the RMSD method. Since the threshold
is below all DIs of the damaged states, zero FNs are counted.
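The comparison of fit functions by their MSE can be sketched as follows. The sketch assumes a two-parameter saturation model, amplitude * tanh(rate * d), which is a hypothetical parametrization and not necessarily the one used in the article. The rate is found by a simple grid search and the amplitude in closed form for each rate; the polynomial and logistic candidates are omitted for brevity.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between measured and fitted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_linear(a, y):
    """Ordinary least-squares line y = slope * a + intercept."""
    n = len(a)
    mean_a, mean_y = sum(a) / n, sum(y) / n
    slope = sum((ai - mean_a) * (yi - mean_y) for ai, yi in zip(a, y)) / sum(
        (ai - mean_a) ** 2 for ai in a
    )
    return slope, mean_y - slope * mean_a

def fit_tanh(a, y, rates):
    """Fit y = amplitude * tanh(rate * a): grid search over rate, closed-form amplitude."""
    best = None
    for rate in rates:
        basis = [math.tanh(rate * ai) for ai in a]
        denom = sum(b * b for b in basis)
        if denom == 0.0:
            continue
        amplitude = sum(b * yi for b, yi in zip(basis, y)) / denom
        err = mse(y, [amplitude * b for b in basis])
        if best is None or err < best[2]:
            best = (amplitude, rate, err)
    return best  # (amplitude, rate, mse)

if __name__ == "__main__":
    # Synthetic saturating trend, not measurement data.
    d = [k / 2 for k in range(11)]
    di = [math.tanh(0.8 * x) for x in d]
    slope, intercept = fit_linear(d, di)
    print("linear MSE:", mse(di, [slope * x + intercept for x in d]))
    print("tanh fit (amplitude, rate, MSE):", fit_tanh(d, di, [k / 10 for k in range(1, 31)]))
```

A general-purpose nonlinear least-squares routine would replace the grid search in practice; the grid keeps the sketch dependency-free.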
3.4. Probability of Detection Assessment
The experimental POD curves with lower
confidence bounds are plotted in a joint graph in
Figure 13. They are calculated using Equations (
10)–(
13). The mathematical formulation of the regression model gives negative values for the flaw size
a, which are non-physical. The area below
is shaded gray in the POD graphs. The first physical value is
for the intact condition. The POD levels of
for random processes and
for the minimum detectability of damage are plotted as well. If the intersections of the POD curves with these levels are below the physical limit, this means that there are no random processes, and damage detection is possible from the first damaged state.
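How a POD curve follows from the regression line, the scatter, and the threshold can be sketched with the standard signal-response formulation: the DI at flaw size a is assumed to be normally distributed around the fit value with standard deviation sigma, and the POD is the probability that it exceeds the threshold. Equations (10)–(13) are not reproduced in this section, so this form, the tanh parametrization, and all names and numbers are assumptions. Confidence bounds are not computed here.

```python
import math
from statistics import NormalDist

def signal_response(a, amplitude, rate):
    """Hypothetical saturating regression line for the DI as a function of flaw size a."""
    return amplitude * math.tanh(rate * a)

def pod(a, threshold, sigma, amplitude, rate):
    """Probability that a normally scattered DI at flaw size a exceeds the threshold."""
    mean = signal_response(a, amplitude, rate)
    return 1.0 - NormalDist(mu=mean, sigma=sigma).cdf(threshold)

def smallest_detectable(flaw_sizes, target, **model):
    """Smallest flaw size on the grid whose POD reaches the target level, or None."""
    for a in sorted(flaw_sizes):
        if pod(a, **model) >= target:
            return a
    return None

if __name__ == "__main__":
    # Illustrative parameters only.
    model = dict(threshold=0.5, sigma=0.05, amplitude=1.0, rate=1.0)
    grid = [k / 10 for k in range(21)]
    print([round(pod(a, **model), 3) for a in grid[:10]])
    print(smallest_detectable(grid, target=0.9, **model))
```

In this picture, a smaller sigma steepens the curve toward a step function, and a higher threshold shifts it to larger flaw sizes, which matches the two aspects discussed for the measured curves.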
Two aspects need to be discussed in order to assess the quality of the measurements:
The slope of the respective POD curve and the position of the confidence bounds depend on the suitability of the regression model.
The shift along the horizontal axis depends on the suitability of the threshold and measured flaw sizes.
The simulation does not show any scattering of the DIs within a structural state and is therefore excluded in this section. Nevertheless, the threshold can be set close to . This enables unambiguous classification, and the POD curve resembles a step function.
The regression model seems to be a good approximation for both DI methods and all measurement series. Therefore, the lower
confidence bound does not differ significantly from the fit curves. It can be seen that the RMSD and MD methods result in similar POD curves. Looking at
Figure 13a, the POD curve for
is shifted along the horizontal axis, since the structural state
is still classified as undamaged with the selected threshold. The first state classified as damage is
. The POD curve for the DIs determined with
reveals some misclassifications compared to smaller radar distances
and
.
The comparison of the relevant parameters
,
and
for the experimental POD curves can be found in
Table 4 and
Table 5. The results of the damage classification with accuracy determined according to Equation (
7) are listed in
Table 6 and
Table 7. The overall accuracy ranges from
to
and the minimal detectable delamination thickness from
to
. The increase in
for the largest radar distance to the first interface
is striking. To avoid reflections at aluminum profiles and unevenness of the GFRP plates, among other things, a stronger focusing with the radar becomes important.
For both the RMSD and MD method, shows perfect accuracy to distinguish between the intact and damaged structure due to the presence of delamination in the GFRP sandwich model. The accuracies below for the other POD curves are explained by the overlap of DIs of damaged states with the intact structure. Strong fluctuations in the reference signal for lead to higher overlaps, which degrade damage detection at an early stage.
One reason for the large fluctuations, in particular in the DIs of the intact structure, is the suboptimal calibration of the zero point when the micrometer screw gauges are reset for the next measurement series. The four-point mounting on the aluminum profiles results in partial unevenness of the GFRP plates. Therefore, the setup needs to be optimized to improve reproducibility, e.g., by using precise stepper motors. Another reason for the increase in FNs with a larger L lies in trigonometric considerations. With a larger L and a constant radiation pattern of the radar, the covered area on the GFRP plate increases. Unless the plate is completely flat, greater random scattering affects the detectability of the delamination model.
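The trigonometric argument can be made concrete with a short sketch. It assumes a conical beam at normal incidence on a flat plate, so that the illuminated diameter is 2 L tan(theta / 2); the beamwidth theta and the distances are hypothetical values and are not taken from the radar used in this work.

```python
import math

def footprint_diameter(distance, beamwidth_deg):
    """Diameter of the illuminated spot for a cone with full opening angle beamwidth_deg."""
    return 2.0 * distance * math.tan(math.radians(beamwidth_deg) / 2.0)

def footprint_area(distance, beamwidth_deg):
    """Area of the circular spot; grows with the square of the distance."""
    radius = footprint_diameter(distance, beamwidth_deg) / 2.0
    return math.pi * radius ** 2

if __name__ == "__main__":
    # Hypothetical beamwidth and distances in arbitrary but consistent length units.
    for distance in (1.0, 2.0, 4.0):
        print(distance, footprint_diameter(distance, 30.0), footprint_area(distance, 30.0))
```

Doubling L doubles the spot diameter and quadruples the covered area, so more of the plate unevenness and of the surrounding mounting contributes to the received signal.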
Interestingly, the classification results for and are equal for both DI methods, for , they are similar, and for , they are different. Discrepancies between the methods may appear due to different assumptions. The similarity between RMSD and MD is the use of a difference signal, but the fundamental difference and advantage of the MD method is the use of variances, which dampen strong fluctuations in the reference signal for .
The strength of the RMSD lies in its simple implementation, short computation time, and intuitive interpretability through successive differencing against a baseline signal. However, assigning equal weight to all signals, associated with neglecting noise, has a disadvantageous effect on SHM. For varying EOCs, this can lead to misclassification of damage. The MD is used in the literature as a distance measure for identifying outliers in multivariate statistics under variable EOCs [
35]. For statistically fluctuating signals that correlate with other signals of the same structural state, the MD is superior to the RMSD due to the use of a covariance matrix. Statistical stability is thus given greater weight, which also increases the gain by increasing the number of measurements. However, this can be a disadvantage in real-time applications, as it increases the computational load.
This laboratory study has limitations when the problem is addressed in real-world scenarios. The primary focus is on validating an idealized damage model for the qualification of SHM systems without damaging the composite structure itself. Full-scale fatigue tests or loaded specimens in a more controlled environment are two options for characterizing real damage. As soon as signal changes caused by EOCs play a role, a more in-depth analysis is necessary. Alipek et al. [
39] demonstrate the application of GuidedGradCAM, which is an explainable machine learning technique for classifying local changes in radar reflections caused by wind speed, rotational speed, pitch angle, or nacelle orientation in radar images. The main task in that work was ensuring robust differentiation in all three rotor blades of a wind turbine using a tower radar.
4. Conclusions
This work deals with the radar-based identification of a simulated delamination with thicknesses from to in a sandwich model with GFRP. Data was obtained experimentally with an FMCW radar under laboratory conditions and numerically in CST Microwave Studio at frequencies ranging from to . DIs were calculated in the time domain using the RMSD and MD methods. The simulations were only performed to verify the similarities of DI trends. The POD framework was used to evaluate the detectability of a delamination with the proposed DI algorithms.
The innovation was the combination of a nonlinear regression function and optimal thresholding methods. The DI trends exhibited a periodic, nonlinear behavior. For the POD analysis, the DI trend up to the first maximum was taken into account to unambiguously assign the DIs to a delamination thickness. Within this local interval, the trend could be approximated by a saturation function. Moreover, the hyperbolic tangent produced fit results with a low MSE to calculate high-resolution POD curves for experimentally acquired datasets. The most suitable method for optimal thresholding was determining the maximum of the YI, since data was recorded for each structural state with the same number of ramps under laboratory conditions.
With a POD of and additional lower confidence bounds, the minimum detectability of damage is given in the versus a representation. Different slopes, standard deviations, and shifts on the horizontal axis were obtained for the normal CDFs. Detectability became more difficult with increasing radar distances from delamination thicknesses to . Potential reasons are the unevenness of the GFRP plates, mechanical inaccuracies of the experimental setup and statistical fluctuations in the radar measurements. The accuracy of the binary damage classification in the hit/miss analysis ranges from to .
Based on these laboratory studies and findings, the delamination model will be investigated in a climate chamber under variable but controllable temperature and humidity settings in future research. Since the radar signals are affected by changes in the permittivity of the materials or by changes in the resistance of the electronics, the implementation of compensation methods is essential.
Furthermore, a field study on a wind turbine is planned, which involves a realistic variation in EOCs. A wide radar network will be installed inside two rotor blades to acquire data on the intact structure, with a lightweight version of the delamination model. Since wind turbine operating companies prohibit damaging the structure itself for the qualification of SHM systems, the conception, testing and evaluation of the damage model is crucial. The functional dependence of radar signals on numerous time-varying environmental and operational parameters is a highly complex issue and requires, in many cases, the use of machine learning to classify structural states from EOCs.