Data Enhancement via Low-Rank Matrix Reconstruction in Pulsed Thermography for Carbon-Fibre-Reinforced Polymers

Pulsed thermography is a commonly used non-destructive testing method and is increasingly studied for the assessment of advanced materials such as carbon-fibre-reinforced polymer (CFRP). Different processing approaches have been proposed to detect and characterize anomalies that may be generated in structures during the manufacturing cycle or service period. In this study, matrix decomposition using Robust PCA via Inexact-ALM is investigated as a pre- and post-processing approach in combination with state-of-the-art approaches (i.e., PCT, PPT and PLST) on pulsed thermography thermal data. An academic sample with several artificial defects of different types, i.e., flat-bottom holes (FBH), pull-outs (PO) and Teflon inserts (TEF), was employed to assess and compare the defect detection and segmentation capabilities of the different processing approaches. For this purpose, the contrast-to-noise ratio (CNR) and similarity coefficient were used as quantitative metrics. The results show a clear improvement in CNR when Robust PCA is applied as a pre-processing technique: CNR values for FBH, PO and TEF improve by up to 164%, 237% and 80%, respectively, when compared to principal component thermography (PCT), whilst the CNR improvement with respect to pulsed phase thermography (PPT) was 77%, 101% and 289%, respectively. In the case of partial least squares thermography (PLST), Robust PCA improved the results not only when used as a pre-processing technique but also when used as a post-processing technique; however, this improvement is higher for FBHs and POs after pre-processing. Pre-processing increases CNR scores for FBHs and POs with a ratio from 0.43% to 115.88% and from 13.48% to 216.63%, respectively. Similarly, post-processing enhances the FBH and PO results with a ratio from 9.62% to 296.9% and from 16.98% to 92.6%, respectively. A low-rank matrix computed from Robust PCA as a pre-processing technique on raw data before using PCT and PPT can enhance the results of 67% of the defects. Using low-rank matrix decomposition from Robust PCA as a pre- and post-processing technique outperforms PLST results for 69% and 67% of the defects. These results clearly indicate that pre-processing pulsed thermography data by Robust PCA can elevate the defect detectability of advanced processing techniques, such as PCT, PPT and PLST, while post-processing using the same methods can, in some cases, deteriorate the results.


Introduction
Due to the unique features of carbon-fibre-reinforced polymers (CFRP), namely their low density and high-performance physico-chemical properties, interest in using these lighter products to replace conventional materials (steel, aluminum, etc.) has increased. The increasing demand for CFRP structures in the aerospace industry is leading to

Literature Review
The presence of excessive noise in raw thermal data always urges researchers to develop new IRNDT processing approaches. Although limited research work has been done on the improvement of PCA methods to deal with corrupted data, RPCA has been the most promising approach in recent years. RPCA is widely used in separating dynamic variations from the static feature of interest, such as video surveillance data analysis to extract foreground and background [13]. Infrared dim small target detection has been a hot and difficult research topic in infrared search and tracking systems. Later, Fan et al. [14] introduced a novel detection algorithm based on RPCA to solve the difficulty of small target detection.
Substantial progress has been made in moving object detection, for which RPCA has been demonstrated to be very effective. RPCA has been used in infrared moving target tracking [15] and hyper-spectral image processing for anomaly detection [16]. Moreover, RPCA has been used for pre-processing in the machine learning method proposed by Zhu et al. [17]. They utilized RPCA to detect regions of interest (ROIs) in a novel classification model based on the CNN model in eddy current testing (ECT), and the percentage of defects correctly identified increased to almost 100%. Draganov et al. [18] used several decomposition techniques, such as RPCA with Go implementation (GoDec), to estimate the wild animal population using videos captured by thermographic cameras. They reported promising results in terms of accuracy and execution times. Later, they carried out a comparative analysis of the performance of several tensor decomposition algorithms, including high-order robust principal component analysis solved by the Singleton model (HoRPCA-S) [19]. They reported that among the selected methods, HoRPCA-S has a lower detection rate but high precision. Furthermore, Liang et al. [20] demonstrated the feasibility of sparse tensor decomposition theory on an ECPT data sequence, and they concluded that Tensor RPCA (TRPCA) can extract defects with high accuracy. The same year, Li et al. [21] introduced the weighted contraction IALM (WIALM) algorithm based on low-rank matrix recovery for online applications. It has been used for tire inspection on radiographic images captured by tire X-ray inspection machines. They improved the efficiency of the algorithm by optimizing the incremental multiplier parameter. Wu et al. [22] proposed a novel hierarchical low-rank and sparse tensor decomposition method to detect anomalies in the induction thermography stream. This approach can suppress the interference of a strong background and sharpen the visual features of defects. Furthermore, it overcame the over- and under-sparseness problem suffered by similar state-of-the-art methods.
Surface defect detection is important for product quality control. A visual detection method based on low-rank and sparse matrices extracted with the RPCA approach was proposed for surface defect detection of wind turbine blades [23]. This method outperformed several state-of-the-art methods in terms of robustness and accuracy. Recently, Wang et al. [24] proposed a methodology based on RPCA that can separate anomalies in a sparse matrix from a low-rank background for photovoltaic systems using thermography imaging. They successfully overcame the difficulties arising from real data and built an automatic online monitoring system for anomaly detection. Ebrahimi et al. [10] proposed the orthogonal inexact augmented Lagrange multiplier (OIALM). That study demonstrated its defect enhancement capabilities over the mixed and various types of defects typically addressed in IRT of composite materials. In addition, Kaur et al. [25] conducted a comparative study between PCA and RPCA to evaluate their effectiveness in defect detection. They demonstrated that although PCA proved to be better in detection capability, the sparse matrix provides better detectability than the data reconstructed from the low-rank matrix. In the medical field, for 3D segmentation of lungs, Sun et al. [26] achieved good segmentation results for lungs with juxta-pleural tumours using the active shape model (ASM) based on RPCA.
Many research works have reported the applicability of IRNDT approaches, including PCT, PPT and PLST. The first implementation of PCT was introduced by Rajic [27] for defect detection in composite materials. Lara et al. reported that optical effects, such as heating non-uniformities, surface reflection and emissivity variations, appear in the first component, while the thermal effect is retrieved in one of the secondary components [28]. Furthermore, PCA is a linear decomposition that is more sensitive to over-illumination and non-uniform heating than to other types of noise. In our previous research, we showed that Robust PCT [10] can improve the detectability of deeper defects in composites. Moreover, PLST is sensitive to gradient. An approach that is less sensitive to noise and applicable to other IRNDT approaches is therefore of interest for improving defect detection. As indicated by the literature, low-rank matrices from RPCA have less noise, and in this study, we investigate the use of this matrix with different IRNDT approaches.
The following section introduces the methods and materials regarding this study.

Robust Principal Component Analysis (RPCA)
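The Robust PCA problem can be solved via convex optimization that minimizes a combination of the nuclear norm and the $\ell_1$-norm. The augmented Lagrange multiplier (ALM) method is one way to solve this convex program. Equation (1) introduces the general constrained optimization problem addressed by ALM [29]:
$$\min_{X} f(X) \quad \text{subject to} \quad h(X) = 0, \tag{1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ and $h : \mathbb{R}^n \to \mathbb{R}^m$. Candès et al. [30] used a convex optimization; the formulation they used is known as principal component pursuit (PCP). The observation matrix $D$ is assumed to be the sum of a low-rank matrix $A$ and a sparse matrix $E$:
$$D = A + E. \tag{2}$$
To minimize the energy function, the $\ell_0$-norm is used:
$$\min_{A,E} \ \operatorname{rank}(A) + \lambda \|E\|_0 \quad \text{subject to} \quad D = A + E, \tag{3}$$
where $\lambda$ is a positive balance parameter that determines the contribution of $A$ and $E$ in minimizing the objective function. Since Equation (3) is an NP-hard problem, i.e., at least as hard as the hardest problems in non-deterministic polynomial (NP) time, Candès et al. [30] reformulated this equation into a similar convex optimization problem as follows:
$$\min_{A,E} \ \|A\|_* + \lambda \|E\|_1 \quad \text{subject to} \quad D = A + E, \tag{4}$$
where $\|A\|_*$ and $\|E\|_1$ are the nuclear norm of $A$ and the $\ell_1$-norm of $E$, respectively. For $D \in \mathbb{R}^{m \times n}$, the balance parameter $\lambda$ is defined as:
$$\lambda = \frac{1}{\sqrt{\max(m, n)}}. \tag{5}$$
The low-rank minimization, owing to the correlation between the frames, provides a framework for background modelling. Lin et al. [31] solved Equation (4) using a generic ALM method. The Lagrange function of Equation (1) can be defined as:
$$L(X, Y, \mu) = f(X) + \langle Y, h(X) \rangle + \frac{\mu}{2} \|h(X)\|_F^2. \tag{6}$$
The Lagrange function of Equation (4) is defined as:
$$L(A, E, Y, \mu) = \|A\|_* + \lambda \|E\|_1 + \langle Y, D - A - E \rangle + \frac{\mu}{2} \|D - A - E\|_F^2, \tag{7}$$
where $Y$ is the Lagrange multiplier and the penalty parameter $\mu$ is a positive scalar. The inexact augmented Lagrange multiplier (IALM) method used to solve the RPCA problem is shown in Algorithm 1. $Y_0$ has been initialized to $Y_0 = D/J(D)$ [32], making the objective function value $\langle Y_0, D \rangle$ reasonably large. In addition,
$$J(D) = \max\left(\|D\|_2, \ \lambda^{-1} \|D\|_\infty\right), \tag{8}$$
where $\|\cdot\|_2$ is the spectral norm and $\|\cdot\|_\infty$ is the maximum absolute value of the entries of the input matrix.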
In Step 1 of Algorithm 1, $\rho$ is the learning rate, and $\mu_0$ is the initialization of the penalty parameter that influences the convergence speed. In [31], it is proven that the objective function of the RPCA problem (Equation (4)), which is non-smooth, has an excellent convergence property. In addition, it has been proven that to converge to an optimal solution $(A^*, E^*)$ of the RPCA problem, it is necessary for $\mu_k$ to be non-decreasing and $\sum_{k=1}^{+\infty} \mu_k^{-1} = +\infty$. The proposed algorithm steps are detailed in the following table.
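As a rough illustration of the IALM iteration described above, the following NumPy sketch alternates singular value thresholding for the low-rank part with element-wise soft thresholding for the sparse part. It is a minimal sketch, not the implementation used in this study: the function name `rpca_ialm`, the defaults for `rho`, `tol` and `max_iter`, the initial penalty `1.25 / ||D||_2`, the cap on `mu` and the default `lam = 1/sqrt(max(m, n))` are assumptions borrowed from common practice.

```python
import numpy as np

def rpca_ialm(D, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split D into a low-rank part A and a sparse part E (D ~ A + E)."""
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))        # assumed default balance parameter
    norm_two = np.linalg.norm(D, 2)           # spectral norm of D
    norm_inf = np.abs(D).max()                # largest absolute entry of D
    Y = D / max(norm_two, norm_inf / lam)     # Y0 = D / J(D)
    mu = 1.25 / norm_two                      # assumed initial penalty mu_0
    mu_max = mu * 1e7                         # assumed cap on the penalty
    d_norm = np.linalg.norm(D, "fro")
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(max_iter):
        # A-step: singular value thresholding with threshold 1/mu
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # E-step: element-wise soft thresholding with threshold lam/mu
        R = D - A + Y / mu
        E = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        # Multiplier and penalty updates
        Z = D - A - E
        Y = Y + mu * Z
        mu = min(rho * mu, mu_max)
        if tol * d_norm > np.linalg.norm(Z, "fro"):
            break
    return A, E
```

For a thermal sequence stored as an array of shape (frames, height, width), one option (again an assumption) is to reshape it to a (frames, height × width) matrix before calling the function and to reshape `A` back to the original shape afterwards.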

State-of-the-Art
Pulsed thermography has been extensively investigated as a means to detect defects in a wide variety of applications. Several processing techniques have been proposed and thoroughly reported. References [33-35] provide a detailed review of various methods. Principal component thermography (PCT) [27], pulsed phase thermography (PPT) [9] and partial least squares thermography (PLST) [11] are among the most effective.
In this paper, a computed low-rank matrix was used prior to or after the application of PCT, PPT and PLST in the PT regime for comparative purposes.

PCT
PCT was introduced by Rajic et al. [6,27] based on the popular multivariate statistical method, principal component analysis (PCA) [36]. This method constructs a set of empirical orthogonal functions (EOFs), which are strong representations of complex input signals. In IRNDT, PCT projects the data onto the orthogonal space that maximizes the variance of the projected data. The EOFs capture the most important variability of the data, in decreasing order of variance. In general, the given sequence can be represented with a few EOFs. Typically, a thermal sequence of thousands of frames can be replaced by a maximum of ten EOFs.
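The projection described above can be sketched with a singular value decomposition. This is an illustrative sketch only: the function name `pct`, the (frames, height, width) array layout and the per-pixel standardisation step are assumptions, not details taken from this study.

```python
import numpy as np

def pct(sequence, n_components=10):
    """Return the leading EOF images and their explained-variance fractions."""
    seq = np.asarray(sequence, dtype=float)
    n_frames, h, w = seq.shape
    X = seq.reshape(n_frames, h * w)        # one row per frame, one column per pixel
    # Standardise each pixel's temporal profile (assumed convention)
    X = X - X.mean(axis=0)
    std = X.std(axis=0)
    X = X / np.where(std > 0, std, 1.0)
    # Rows of Vt are orthonormal spatial modes, ordered by singular value
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(n_components, Vt.shape[0])
    eofs = Vt[:k].reshape(k, h, w)
    explained = s[:k] ** 2 / np.sum(s ** 2)
    return eofs, explained
```

Each returned EOF can be displayed as an image; `explained` indicates how much of the sequence variance each one carries.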

PPT
Pulsed phase thermography was introduced by Maldague et al. [9]. The temporal profile of each pixel in the thermal data sequence is transformed using the one-dimensional discrete Fourier transform (DFT) to extract amplitude and phase information from PT data. Unlike raw thermal data, the phase $\varphi$ is less sensitive to environmental reflections, emissivity variations, non-uniform heating, surface geometry and orientation. The most important characteristic of this method is that it can provide both qualitative and quantitative analysis. For instance, a straightforward formulation for depth estimation ($z$) using the thermal diffusion length $\mu$ and the blind frequency $f_b$ is:
$$z = C_1 \mu = C_1 \sqrt{\frac{\alpha}{\pi f_b}}, \tag{9}$$
where $\alpha$ is the thermal diffusivity, $f_b$ is the frequency at which a given defect has enough contrast to be detected, and $C_1$ is an empirical constant determined from a series of experiments. It has been observed that $C_1 \approx 1$ for amplitude data, whereas values in the range of 1.5 to 2 are reported for phase data, with $C_1 = 1.82$ typically adopted in research similar to that presented in [37]. Therefore, the ability to probe deeper defects makes the phase more interesting than the amplitude. More information regarding PPT can be found in [9].
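A minimal sketch of the pixel-wise transform and the depth estimate follows. It assumes the sequence is stored as a (frames, height, width) array and that the depth relation has the commonly used form z = C1 · sqrt(α / (π · f_b)); the function names `ppt` and `depth_from_blind_frequency` are hypothetical.

```python
import numpy as np

def ppt(sequence, frame_rate):
    """Pixel-wise one-dimensional DFT along the time axis."""
    seq = np.asarray(sequence, dtype=float)
    spectrum = np.fft.rfft(seq, axis=0)                      # non-negative frequencies only
    freqs = np.fft.rfftfreq(seq.shape[0], d=1.0 / frame_rate)
    return freqs, np.abs(spectrum), np.angle(spectrum)       # Hz, amplitude, phase (rad)

def depth_from_blind_frequency(alpha, f_b, c1=1.82):
    """Depth estimate z = c1 * sqrt(alpha / (pi * f_b)) (assumed form)."""
    return c1 * np.sqrt(alpha / (np.pi * f_b))
```

The phase image at a chosen frequency is `phase[i]`, where `freqs[i]` is the bin closest to that frequency. With `alpha` in m²/s and `f_b` in Hz, the depth is returned in metres.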

PLST
PLST [12] is based on a statistical correlation method known as partial least squares regression (PLSR). PLST decomposes the predictor $X$ ($n \times N$) and predicted $Y$ ($n \times M$) matrices into loading ($P$ and $Q$) and score ($T$ and $U$) vectors and residuals ($E$ and $F$). The predictor matrix corresponds to the thermal profile, while $Y$ is defined by the observation time during which the thermal sequence was acquired. Mathematically, the PLS model is expressed as:
$$X = T P^{\mathsf{T}} + E, \qquad Y = U Q^{\mathsf{T}} + F. \tag{10}$$
In order to select the appropriate number of PLS components, two parameters, i.e., the root mean square error (RMSE) and the percentage of variance explained in the $X$ matrix, must be taken into consideration.
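To make the decomposition concrete, the sketch below runs a single-response PLS regression (the NIPALS PLS1 variant) with the acquisition times as the predicted variable. This is a simplification chosen for illustration: the function name `plst`, the centring step and the use of PLS1 rather than a general multi-response PLS are assumptions, and the study may rely on a different PLSR implementation.

```python
import numpy as np

def plst(sequence, times, n_components=3):
    """PLS1 (NIPALS) with pixel temperature histories as predictors and time as response."""
    seq = np.asarray(sequence, dtype=float)
    n, h, w = seq.shape
    X = seq.reshape(n, h * w)
    X = X - X.mean(axis=0)                  # centre each predictor column (pixel)
    y = np.asarray(times, dtype=float)
    y = y - y.mean()                        # centre the response
    scores, loadings, rmse = [], [], []
    for _ in range(n_components):
        weights = X.T @ y                   # direction of maximal covariance with y
        norm = np.linalg.norm(weights)
        if norm == 0.0:
            break
        weights = weights / norm
        t = X @ weights                     # score vector
        tt = t @ t
        p = X.T @ t / tt                    # X loading
        q = (y @ t) / tt                    # y loading (scalar for PLS1)
        X = X - np.outer(t, p)              # deflate predictors
        y = y - q * t                       # deflate response
        scores.append(t)
        loadings.append(p)
        rmse.append(np.sqrt(np.mean(y ** 2)))
    T = np.array(scores).T                  # (frames, components)
    P = np.array(loadings).reshape(-1, h, w)  # loading images
    return T, P, np.array(rmse)
```

The loading images in `P` are what would be inspected for defects, and the `rmse` trace gives one of the two criteria mentioned above for choosing the number of components.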

Data Acquisition
The experiments were carried out on an academic carbon-fibre-reinforced polymer (CFRP) plate (30.8 cm × 46 cm × 2.57 mm) with 73 defects of 3 different types, i.e., 23 round flat-bottom holes (FBH), 25 triangular Teflon inserts, and pullouts. To manufacture a pullout defect, a metallic sheet is removed after polymer curing. Therefore, a pullout can only be located at the edge of the part (Figure 1c). A Teflon insert is made of Teflon sheets inserted between plies (Figure 1b). In the case of FBH manufacturing, a hole is drilled from the backside of the sample so as to leave a flat reflecting surface at the hole bottom (Figure 1a). One of the important defects in non-destructive inspection is delamination, which occurs between plies during manufacturing or through fatigue, bearing damage, impact, etc., during the life-cycle. The academic plate used in this study was prepared to investigate the differences in the thermal response of different artificial defect types. Strictly speaking, all artificial defects are at best an approximation of a real delamination. A pull-out seems to be closer to a real delamination (thermally speaking) but is difficult to produce anywhere other than on the borders of the specimen (which implies that the sample must have an open border). Teflon inserts, traditionally employed for other NDT techniques (e.g., ultrasound), are also used in thermography. However, Teflon behaves significantly differently from a real delamination (air). Lastly, flat-bottom holes are easier to produce, though they are open on the rear side of the specimen and possess a much larger volume than a real delamination. The surface of the specimen possesses a fairly good emissivity, so environmental reflections were negligible. Non-uniform heating had a greater impact on all techniques, as can be seen in Section 4. The defects vary in size, depth and thickness and are presented in Table 1, and the schematic of the plate in Figure 2a shows their respective locations.
The thermophysical properties of CFRP involved in the NDE are the thermal conductivity $k$ (W/m/K), the density $\rho$ (kg/m³) and the specific heat capacity $c$ (J/kg/K). The other important thermal properties are the thermal diffusivity $\alpha = k/(\rho c)$ and the thermal effusivity $e = \sqrt{k \rho c}$. The thermophysical information of the CFRP plate is shown in Table 2. In the PT experimental setup, two flash lamps (Balcar, France; 6.4 kJ/flash) sent a 5 ms thermal pulse to the specimen; a cooled infrared camera (FLIR Phoenix (FLIR Systems, Inc., Wilsonville, Oregon, USA), InSb, midwave, 3-5 µm, Stirling cooling) with a frame rate of 180 Hz was used to record the temperature profile in the reflection mode (Figure 2b). The technical specifications of the thermal camera are presented in Table 3. The data processing was performed on a PC with 56 GB memory and an Intel(R) Core(TM) i7-4820K central processing unit. Infrared images were taken from a distance of 70 cm by the IR camera without pan or tilt in a controlled environment.
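The two derived properties follow directly from k, ρ and c, using the standard definitions α = k/(ρc) and e = sqrt(kρc). The short sketch below shows the arithmetic; the numerical inputs are placeholder values chosen only for illustration and are not the Table 2 values.

```python
import math

def thermal_diffusivity(k, rho, c):
    """alpha = k / (rho * c), in m^2/s for SI inputs."""
    return k / (rho * c)

def thermal_effusivity(k, rho, c):
    """e = sqrt(k * rho * c), in W s^0.5 / (m^2 K) for SI inputs."""
    return math.sqrt(k * rho * c)

# Placeholder inputs (hypothetical, not taken from Table 2)
k, rho, c = 1.0, 1600.0, 1200.0
alpha = thermal_diffusivity(k, rho, c)
e = thermal_effusivity(k, rho, c)
print(f"alpha = {alpha:.3e} m^2/s, e = {e:.1f} W s^0.5/(m^2 K)")
```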

Metrics
In this section, we introduce two metrics: one yields a thermal score indicating thermal anomalies, and the other measures the segmentation potential.

Contrast-to-Noise Ratio (CNR)
The signal-to-noise ratio (SNR) is a metric that quantitatively assesses the desired signal quality by estimating the signal level with respect to the background noise. The contrast-to-noise ratio (CNR) is similar to SNR, but it measures the image quality based on the contrast between a defective area and its neighbourhood. Usamentiaga [38] proposed a definition of SNR, which is more robust against noise and image enhancement operations. Equation (11) shows this definition, which has been used in this study. For this purpose, two areas are considered: an area in the defect area (carea) and a region around the defect region as a reference region (narea).
where $\mu_{carea}$ and $\mu_{narea}$ are the average levels of contrast in carea and narea, respectively; $\sigma_{carea}$ and $\sigma_{narea}$ are the standard deviations of the contrast in carea and narea, respectively.
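The sketch below shows how such a score could be computed from an image and two boolean masks. The exact expression is an assumption: it uses a commonly cited CNR form, |µ_carea − µ_narea| divided by sqrt((σ²_carea + σ²_narea) / 2), which may differ from Equation (11). The function name `cnr` and the mask-based interface are likewise hypothetical.

```python
import numpy as np

def cnr(image, defect_mask, sound_mask):
    """Contrast-to-noise ratio between a defect area and a reference (sound) area."""
    img = np.asarray(image, dtype=float)
    c = img[np.asarray(defect_mask, dtype=bool)]   # pixels in carea
    n = img[np.asarray(sound_mask, dtype=bool)]    # pixels in narea
    pooled = np.sqrt((c.var() + n.var()) / 2.0)    # assumed pooled noise term
    return abs(c.mean() - n.mean()) / pooled
```

A perfectly uniform pair of regions gives a zero denominator, so callers may want to guard against that case.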

Jaccard Similarity Coefficient Score
The Jaccard similarity coefficient [39] (also known as the Jaccard index or Intersection-over-Union (IoU)) is a statistic that quantifies the similarity between two finite sets (as illustrated in Figure 3):
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}. \tag{12}$$
Equation (12) is formally defined as the number of shared members/pixels between the two sets (intersection) divided by the total number of members in either set (union); the result is multiplied by 100 when expressed as a percentage. Before scaling, $J(A, B)$ takes a value between 0 (no similarity) and 1 (identical sets). Hence, the higher the value of IoU, the higher the level of similarity between the two sets (Figure 3b).
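For binary segmentation masks, the intersection-over-union ratio can be computed as in the following sketch. The function name `jaccard_index` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np

def jaccard_index(mask_a, mask_b):
    """Intersection over union of two boolean masks, as a value in [0, 1]."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0                      # assumed convention: two empty masks are identical
    return np.logical_and(a, b).sum() / union
```

Multiplying the returned value by 100 gives the score as a percentage.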
For the remainder of this article, we will refer to the low-rank matrix A as the LRM.

Analysis
The previous section recalls the RPCA we used in our experiments. As described in Figure 4a,b, we conducted two experiments. The main difference between them is that in the first experiment (Figure 4a), the LRM is computed directly from the raw data, while in the second (Figure 4b), the LRM is computed from the output of the processing methods. For the remainder of this article, we refer to the first experiment as the pre-processing experiment and to the second as the post-processing experiment. We chose to compare our approach with three state-of-the-art approaches, principal component thermography (PCT) [6,27], pulsed phase thermography (PPT) [9] and partial least-squares thermography (PLST) [11,12], due to the popularity and simplicity of these methods.
The metrics are computed using different protocols. The defective areas were labelled using LabelMe © [40]. From the border of the defective region, n pixels are considered as a transient region, and from the boundaries of this area, n pixels are automatically counted as a non-defective or sound area. Figure 5 illustrates the aforementioned regions so as to estimate the CNR score. According to Equation (11) and the labelled regions, the average and standard deviation values are obtained for all data.
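The region construction described above can be sketched with two binary dilations of the labelled defect mask. The square neighbourhood used for the dilation and the function names `dilate` and `cnr_regions` are assumptions; the study does not state which structuring element or value of n was used.

```python
import numpy as np

def dilate(mask, n):
    """Binary dilation by n pixels using a square (2n+1) x (2n+1) neighbourhood."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    padded = np.pad(mask, n, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(2 * n + 1):
        for dx in range(2 * n + 1):
            out |= padded[dy:dy + h, dx:dx + w]   # OR over all shifts within n pixels
    return out

def cnr_regions(defect_mask, n):
    """Return (defect, transient, sound) masks for one labelled defect."""
    defect = np.asarray(defect_mask, dtype=bool)
    inner = dilate(defect, n)           # defect plus n-pixel transient band
    outer = dilate(defect, 2 * n)       # plus a further n-pixel sound band
    transient = inner & ~defect
    sound = outer & ~inner
    return defect, transient, sound
```

When defects lie close together, the sound band of one defect may overlap a neighbouring defect; masking out all labelled defects from `sound` would be a sensible extra step in that case.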
Regarding the second metric, Figure 6 depicts the automatic segmentation approach and Jaccard index calculation. In our segmentation approach, after the image's contrast correction, a bilateral filter [41] smoothed the image. Then, after applying local thresholding, the small artifacts are removed from the image. The obtained mask from the segmentation step can be compared with the ground truth in order to compute the metric score.

Results
The original data acquired by pulsed thermography (raw data) were processed with the different approaches, with the LRM applied as a pre- and post-processing step. Figure 7 shows some representative results (selected arbitrarily) of the different methods. In each row of Figure 7, the first image presents the selected technique (PCT, PPT or PLST) applied to the raw data, the second image shows the effect of using the LRM as a pre-processing method, and the third image shows the effect of using the LRM as a post-processing method.
Figure 7. (1st row) The 3rd component of PCT data on raw data, after using a low-rank matrix for pre-processing, and after using it for post-processing, respectively. (2nd row) PPT data at 0.135 Hz on raw data, after using a low-rank matrix for pre-processing, and after using it for post-processing, respectively. (3rd row) The 3rd component of PLST data on raw data, after using a low-rank matrix for pre-processing, and after using it for post-processing, respectively.
Figure 15a,b illustrate the numbers of enhanced defects using pre- and post-processing, respectively. The numbers inside the columns represent the enhanced defects when using different techniques, and the numbers above the columns are the total number of defects in each case. The best Jaccard index for all data sequences for different methods is shown in Table 7.
Our segmentation approach was evaluated by the Jaccard index presented in Table 7. Figure 7 implies that although pre-processing can reduce the impact of non-uniform heating, post-processing accentuates this effect. Thermal profiles of different methods across the different lines are shown in Figures 8-10. As depicted in the graphs, flat thermal profiles correspond to non-defective (sound) areas, whereas an increase or decrease in amplitude indicates a discontinuity. The application of pre-processing before the PCT and PPT approaches improved the defect detection; in the case of PLST, both pre- and post-processing can increase the detection of anomalies. In addition, the graphs agree with the quantitative metrics, which will be explained later. From Tables 4-6 and Figures 11-14, one can note that the results from the pre-processing experiments are noticeably better than those obtained from the post-processing experiment. Note that these results are compared with results obtained without using low-rank matrices for both experiments. For the PCT method, one can note: Moreover, as indicated in Figures 11-14, regarding the relative depths, in all cases (FBHs, POs and TEFs), the deeper the defect, the lower the CNR value (as expected). Comparing the two experiments, one can observe that the pre-processing experiment leads to a larger number of detected defective regions for the PCT and PPT methods than the post-processing experiment. Nevertheless, this observation is not valid for the PLST method, where the results are similar in both experiments. For the PO defects, the increase in CNR score is higher in the pre-processing experiment; the mean ratio of improvement is 2.6 times higher than it is for the post-processing experiment. Similarly, the mean ratio of improvement for the Teflon defects is 1.7 times higher in the pre-processing experiment than in the post-processing experiment. Nonetheless, for the FBH defects, the mean improvement ratio is 2.5 times higher in the post-processing experiment than in the pre-processing experiment. To conclude, our results show that computing an LRM from the raw data before applying any state-of-the-art method significantly improves the results of the method. In the particular case of FBH defects, one can consider computing an LRM before and after the method.

Discussion
As one can note in Table 7 and see in Figure 15b, using the LRM prior to the state-of-the-art processing method leads to better Jaccard index scores, and therefore better segmentation, in all cases. One can also note that the Jaccard index score for the PLST method does not change much between the pre-processing and post-processing experiments. The Jaccard index score for the PCT and PPT methods decreases noticeably for the segmentation of the post-processing experiment results compared with the segmentation of the raw data. This indicates that the results of the segmentation worsen.

Conclusions
The present study investigates the benefits of low-rank matrices for pulsed thermography. The investigation focuses on enhancing defective regions located within a reference sample of CFRP. The sample we used had three types of defects. Two experiments were conducted: during the first experiment, the low-rank matrix was computed from the raw data before applying any processing. During the second experiment, the low-rank matrix was computed from the output of a method after it was applied to the raw data. For both experiments, we used PPT [9], PCT [6,27] and PLST [11,12]. Two figures of merit, the contrast-to-noise ratio (CNR) and the Jaccard similarity coefficient, were used to evaluate the results quantitatively.
Our results show that a low-rank matrix, when used as a pre-processing method, noticeably improves the results of all of the techniques. The low-rank matrix reconstruction effectively reduces the noise and non-uniform heating. When used as a post-processing method, the results vary from one method to another. The results indicate that pre-processing can improve 67.12% of PCT results more than post-processing, especially regarding FBHs (the detectability of FBHs, pullouts and Teflon inserts was increased to 92.86%, 88% and 82.35%, respectively). Furthermore, pre-processing has a better effect on PPT results (67.12% of the defects were detected) than post-processing. For FBHs, pullouts and Teflon inserts, the detectability of defects reached 71.43%, 100% and 82.35%, respectively. The detectability of pullouts and Teflon insert defects in both pre- and post-processing has improved, reaching 100% and 76.47%, respectively; however, the detectability is better after using pre-processing in the PLST method. In addition, when used on the output of PLST, the low-rank matrix reconstruction still shows better results than PLST alone. Nonetheless, this conclusion does not hold for PPT or PCT. The Jaccard index showed that pre-processing can improve the segmentation potential in all aforementioned methods. In the case of PLST, improvements were made for both pre-processing and post-processing.
This study presents very promising results regarding the improvement of anomaly detection in pulsed thermography of CFRPs. To make the proposed approach more practical for NDT, future research will be directed towards the application of pre- and post-processing to a wider range of materials.

Conflicts of Interest:
The funders had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.