Article

Enhancing FTIR Spectral Feature Construction for Aero-Engine Hot Jet Remote Sensing via Integrated Peak Refinement and Higher-Order Statistical Fusion

Department of Electronic and Optical Engineering, Space Engineering University, Beijing 101416, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(13), 2185; https://doi.org/10.3390/rs17132185
Submission received: 16 May 2025 / Revised: 23 June 2025 / Accepted: 24 June 2025 / Published: 25 June 2025
(This article belongs to the Special Issue Recent Progress in Hyperspectral Remote Sensing Data Processing)

Abstract

To address the construction of Fourier transform infrared (FTIR) spectral features for aero-engine hot jets, this paper presents a feature construction algorithm that integrates staged refined processing with statistical feature fusion. First, a remote-sensing FTIR spectrometer was used to collect hot jet data from two distinct types of aero-engines, establishing a measured spectral dataset. Subsequently, a multi-dimensional feature vector construction algorithm is proposed, comprising a peak feature extraction algorithm based on staged refined processing and a higher-order statistical feature extraction algorithm. The peak feature extraction algorithm consists of four steps: coarse detection, local optimization, dynamic screening, and intelligent merging. It adopts an adaptive threshold for the initial coarse detection of peaks, improves the positioning accuracy through local gradient optimization, dynamically screens the locally strongest peaks according to intensity information, and resolves overlapping peaks via an intelligent merging strategy based on the physical characteristics of spectral lines, achieving accurate and robust peak feature extraction. The higher-order statistical feature extraction algorithm captures the intensity distribution and waveform symmetry of the spectral signal by fusing the kurtosis and skewness statistics. Compared with traditional feature construction algorithms, the proposed multi-dimensional feature vector provides a higher-dimensional and more comprehensive representation. In the experiments, a Gaussian mixture model (GMM), an unsupervised clustering algorithm, was used for validation. With this model, the features extracted by the proposed algorithm reached a classification accuracy of 82.42%, validating the effectiveness of the algorithm.

1. Introduction

The precise diagnosis of faults in aero-engines is a core link to enhance their operational reliability and reduce operation and maintenance costs, which is of profound significance to the safety and efficiency of the aviation field. Compared with traditional detection equipment, the Fourier transform infrared spectrometer has significant advantages: it does not require cumbersome sampling operations or complex pretreatment of samples and it can monitor target substances in real time and at a distance. Furthermore, this device can also conduct rapid analysis of multiple components simultaneously, significantly enhancing the detection efficiency [1,2,3]. Particularly in passive remote sensing mode, the Fourier transform infrared spectrometer does not require additional active infrared light sources and can directly receive infrared radiation signals from the environment, which makes the selection of detection locations more flexible. In practical applications, its detection range can reach several kilometers at the farthest, which can meet the detection requirements of various complex scenarios. When the Fourier transform infrared spectrometer is working, the broadband infrared light emitted by the light source is divided into two beams by the beam splitter, and after interference, it illuminates the sample. The sample selectively absorbs light of specific frequencies [4,5]. The detector acquires the interference pattern, and then through the computer’s Fourier transform it is converted from the time domain to the frequency domain to obtain the infrared spectrum, based on which the elements, components, and molecular structure of the substance can be analyzed.
In recent years, with the rapid development of artificial intelligence technology, deep learning, owing to its powerful feature extraction and pattern recognition capabilities, has gradually become a leading technical means for fault diagnosis in this field. Seiari et al. proposed a fault diagnosis method for aero-engine actuators using a combination of model observers. They designed a Luenberger model observer that detects actuator faults through the observer residuals [6]. However, this method relies heavily on the accuracy of the established aero-engine model and can only detect relatively obvious actuator faults. For complex, multi-factor-induced faults or faults of mild severity, detection based solely on the observer residuals may not be sensitive or accurate enough, making it difficult to diagnose the specific type and severity of the fault. Chen et al. proposed an improved SUKF algorithm and applied it to the estimation of engine performance degradation, which effectively improved the accuracy with which the onboard adaptive model estimates the actual health and performance parameters of the degraded engine [7]. However, the improved estimation accuracy usually comes at the cost of a higher computational load. Because onboard computing resources are often limited, this may impair the real-time performance of the algorithm and delay the estimates of engine performance degradation, hindering timely maintenance. Fan et al. designed an adaptive law based on interval observer theory that effectively reduces the error range of the output estimation and enhances the sensitivity to sudden failures. Based on the adaptive interval observer (AIO), a flexible event-triggered fault detection mechanism was designed [8]. Although this adaptive law can compress the error range of the output estimation, its design may be relatively complex, and it may require extensive parameter adjustment and optimization for different engine models and operating conditions.
Feature extraction can reduce dimensionality, remove noise, enhance data interpretability and model performance [9], and also help uncover hidden information in the data. It is an important step in data analysis [10,11,12]. Yang et al. combined the adaptive spectral pattern extraction (ASPE) theory and proposed a fast Fourier transform (FFT) method to quantitatively assess the degree of bearing damage. They established an early fault identification mechanism to determine the optimal time for the first prediction [13]. Gao et al. proposed an autocorrelation multi-head attention transformer algorithm for infrared spectral sequence deconvolution. They used the attention mechanism for feature extraction and the autocorrelation function for attention calculation. The autocorrelation attention model was used to utilize the inherent sequential properties of spectral data and effectively restore the spectrum by capturing the autocorrelation patterns in the sequence. This model was trained using supervised learning and showed good performance in infrared spectral restoration [14]. Sun et al. proposed a HAD background reconstruction method based on contrast self-supervised learning. By constructing a self-supervised pre-training model based on a pixel-block-level masking strategy and a dual attention network (DAN) encoder, this pre-training model can learn general background representations without generating labeled samples. The DAN encoder consists of a visual transformer and a channel attention module, extracting global context information and the correlation between the spectrum [15].
During the process of extracting features using deep learning, some original information may be compressed, filtered, or lost, making it difficult to restore the details of the data. Moreover, redundant features may be extracted, increasing the complexity of the model, reducing the training efficiency, and easily leading to overfitting, causing the model to perform well on the training data but have poor generalization ability on new data and be unable to accurately extract effective features. In the field of spectral data analysis, spectral peaks [16,17,18] serve as key characteristic parameters, which can intuitively reflect the advantages of the absorption or emission properties of substances, and they have become one of the most representative and widely applied features. They are commonly used in important research and practical scenarios such as substance composition identification, concentration determination, and structural analysis. Kurtosis and skewness are two important statistical quantities used to describe the distribution characteristics of data. Kurtosis is used to describe the peak shape of the data distribution, that is, the degree of concentration of the data around the mean value, and is usually used to measure the steepness or flatness of the distribution compared to a normal distribution [19,20]. Skewness is a statistical quantity used to describe the symmetry of the data distribution, and it measures the degree to which the data distribution deviates from a symmetrical distribution (such as a normal distribution) [21,22]. Through cluster analysis, the hidden natural grouping structure in the data can be discovered, revealing the intrinsic connections and similarities between data points [23,24,25,26].
The main contents of this article can be summarized as follows:
  • FTIR spectroscopy was employed to conduct precise field measurements of hot jets from two types of aero-engines, obtaining spectral data independently generated by each engine model.
  • A multi-stage refinement-based peak feature extraction algorithm is proposed. This algorithm establishes a four-level processing architecture of “coarse detection—local optimization—dynamic screening—intelligent merging,” providing a basis for hot jet characteristic analysis of different types of engines.
  • By integrating the statistical measures of kurtosis and skewness, this study proposes a multi-dimensional feature vector-based algorithm for feature construction. This algorithm systematically integrates the fundamental attributes and distribution patterns of spectral data, enabling structured, high-dimensional, and comprehensively representative feature extraction for hot jet data from two distinct aero-engine models. The proposed method establishes a robust data foundation for subsequent research, including a hot jet characteristic comparison and fault diagnosis.
This paper consists of five main sections. Section 1 reviews the current state of aero-engine fault diagnosis and deep learning-based feature extraction methods and briefly introduces the proposed approach, contributions, and framework. Section 2 presents the field experimental design for aero-engine hot jet spectral measurements and details the structure and methodology of the multi-dimensional feature extraction approach. Section 3 elaborates on the experimental procedures and results, followed by a comprehensive analysis of the findings. Section 4 discusses the implications of the experimental results, analyzing the strengths and limitations of the proposed method, along with potential directions for future improvements. Section 5 provides a systematic summary of the entire study.

2. Material and Methods

2.1. Data Collection

We conducted an outdoor field experiment to collect hot jet data from the aero-engines. The data collection setup is shown in Figure 1. The distance between the spectrometer and the aero-engine ranged from 127 to 280 m. During the measurements, the outdoor temperature was 20–25 °C and the relative humidity was 40–73%. The measurement equipment was a Fourier transform infrared spectrometer developed by the Institute of Aerospace Information Innovation of the Chinese Academy of Sciences, operated in passive mode. The spectral resolution was 1 cm−1, the spectral measurement range was 2.5–12 μm, each spectrum was obtained by superimposing 8 scans, and the full viewing angle can reach 30 mrad.
To ensure the accuracy and reliability of spectral data, it is necessary to perform radiometric calibration on the FTIR spectrometer. This involves calibrating the instrument’s wavelength, intensity, and other parameters in combination with the radiation characteristics of a blackbody. After calibration was completed, the background spectrum was collected. The instrument automatically performed background subtraction when acquiring the hot jet spectrum data of the aero-engine. Given the significant temperature difference between the hot jet and the background, this method is currently used to achieve a rough background subtraction. Then we calculate the brightness temperature based on Planck’s formula [27].
$$T(v) = \frac{h c v}{k \ln\left\{\left[L(v) + 2 h c^{2} v^{3}\right] / L(v)\right\}}$$
where $h$ represents the Planck constant, $h = 6.62607015 \times 10^{-34}\ \mathrm{J \cdot s}$; $c$ represents the speed of light, $c = 2.998 \times 10^{8}\ \mathrm{m/s}$; $v$ represents the wave number, with the unit of cm−1; $k$ represents the Boltzmann constant, $k = 1.380649 \times 10^{-23}\ \mathrm{J/K}$; and $L(v)$ represents the measured radiance (the radiant flux of a unit beam).
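The inversion of Planck's formula can be checked numerically with a short sketch. The function names, the use of SI units throughout (wave number converted from cm−1 to m−1, radiance per unit wave number in m−1), and the 600 K test temperature are assumptions made for illustration; the units delivered by the instrument may differ and would need converting first.

```python
import math

H = 6.62607015e-34   # Planck constant, J*s
C = 2.998e8          # speed of light, m/s (value quoted in the text)
K_B = 1.380649e-23   # Boltzmann constant, J/K


def planck_radiance(wavenumber_cm, temperature_k):
    """Blackbody radiance per unit wave number (SI units) at a wave number in cm^-1."""
    v = wavenumber_cm * 100.0  # cm^-1 -> m^-1 (assumed unit convention)
    return 2.0 * H * C**2 * v**3 / math.expm1(H * C * v / (K_B * temperature_k))


def brightness_temperature(wavenumber_cm, radiance):
    """Invert Planck's formula: T = h c v / (k ln((L + 2 h c^2 v^3) / L))."""
    v = wavenumber_cm * 100.0
    a = 2.0 * H * C**2 * v**3
    return H * C * v / (K_B * math.log((radiance + a) / radiance))


# Round trip: the radiance of a 600 K blackbody maps back to 600 K.
L_2300 = planck_radiance(2300.0, 600.0)
T_2300 = brightness_temperature(2300.0, L_2300)
```

Because the brightness temperature is simply the temperature of the blackbody that would emit the measured radiance, the round trip recovers the input temperature to within floating-point error.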
In the field experiment, we collected data on the hot jet of two different types of aero-engines, which are denoted as type 0 and type 1. We collected 280 samples for the hot jet of type 0 and 522 samples of type 1. From the collected data of each type, the first three samples were selected for visualization, and the results are shown in Figure 2.

2.2. Multi-Dimensional Feature Construction Algorithm Design

This study, based on the collected FTIR data of the aero-engine hot jet, combined the low-order peak positions and intensities as well as statistical features to construct a multi-dimensional feature vector F . This provides a structured and comprehensive feature for the subsequent analysis of two different types of aero-engine hot jets. Its specific form is as follows:
$$F = [P, V, k, \gamma]^{T}$$
where $P$ represents the position characteristic of the peak, $V$ represents the intensity characteristic corresponding to $P$, $k$ represents the kurtosis characteristic, and $\gamma$ represents the skewness characteristic.

2.2.1. Extraction of Peak Position and Intensity Characteristics

To meet the high-precision detection requirements for the hot jet characteristics of aero-engines, this study proposes a peak feature extraction algorithm based on multi-stage refined processing. This algorithm establishes a four-level processing framework of “coarse detection—local optimization—dynamic screening—intelligent merging” to provide a basis for the analysis of different types of engine hot jet characteristics.
Each row of the spectral data represents one sample. For the spectral signal of each sample, $s = (s_1, s_2, \ldots, s_n)$, the following operations are performed to extract the peak position and intensity information:
Step 1: Conduct a preliminary detection of the peaks. The peak positions $p = \{p_1, p_2, \ldots, p_m\}$ are those that meet the following condition, and the intensities corresponding to the peak positions are $V = \{s_{p_1}, s_{p_2}, \ldots, s_{p_m}\}$.
$$p_i = \{\, p \mid s_{p_i} > s_{p_i \pm 1} \,\}$$
Step 2: For each peak $p_i$ detected in the initial stage, search within a specific range centered on this peak to find the true maximum point. This range is determined by two parameters, $\Delta l$ and $\Delta r$, where $\Delta l$ represents the leftward extension range and $\Delta r$ the rightward extension range. The maximum value is sought within the interval $[p_i - \Delta l,\ p_i + \Delta r]$. However, this interval cannot exceed the range of the spectral data. Therefore, the actual search interval is limited to $[\max(0, p_i - \Delta l),\ \min(n, p_i + \Delta r + 1))$, where $n$ represents the length of the spectral data.
$$p_i = \underset{j \in [\max(0,\, p_i - \Delta l),\ \min(n,\, p_i + \Delta r + 1))}{\arg\max}\ s_j$$
where $s_j$ represents the intensity value of the $j$-th point in the spectral data. The argmax() function returns the value of the independent variable at which the function reaches its maximum. In this formula, it iterates through all $j$ values within the interval $[\max(0, p_i - \Delta l),\ \min(n, p_i + \Delta r + 1))$, evaluates the corresponding $s_j$, and returns the $j$ value that gives the maximum $s_j$. This $j$ value is the optimized peak position $p_i$.
Step 3: For each optimized peak $p_i$, the corresponding spectral intensity value is $s_p$. Here, $s$ represents the spectral data sequence, and the intensity value at the corresponding peak is obtained through the index $p_i$. The argsort() function sorts the indices of the optimized peaks by intensity and returns the resulting index sequence. Because the peaks with the highest intensity are required, the indices are arranged in descending order of intensity, that is, from largest to smallest. The first $N$ indices of the descending sequence correspond to the $N$ peaks with the strongest intensity.
$$p_i = \operatorname{argsort}(s_p)$$
where $p_i$ represents the index set of the strongest peaks that are finally selected. By itself, $\operatorname{argsort}(s_p)$ sorts the intensities $s_p$ corresponding to the optimized peaks (with index $p$) in ascending order, so the returned sequence is reversed before the first $N$ indices are taken.
Step 4: Based on the previous three steps, determine the position and intensity (i.e., brightness temperature) of the peak.
$$\mathrm{pos} = W_p$$
$$\mathrm{int} = s_p$$
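Steps 1 to 4 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `detect_peaks`, the argument `w` (taken here to be the wave number axis used for the position lookup), and the default values are hypothetical, and the coarse detection uses only the strict local-maximum condition of Step 1 without any additional thresholding.

```python
def detect_peaks(s, w, dl=3, dr=3, n_top=5):
    """Steps 1-4: coarse detection, local refinement, top-N screening, lookup."""
    n = len(s)
    # Step 1: coarse detection of strict local maxima.
    coarse = [i for i in range(1, n - 1) if s[i] > s[i - 1] and s[i] > s[i + 1]]
    # Step 2: move each candidate to the maximum inside its clipped search window.
    refined = set()
    for p in coarse:
        lo, hi = max(0, p - dl), min(n, p + dr + 1)
        refined.add(max(range(lo, hi), key=lambda j: s[j]))
    # Step 3: rank by intensity in descending order and keep the n_top strongest.
    strongest = sorted(refined, key=lambda j: s[j], reverse=True)[:n_top]
    # Step 4: report (position, intensity) pairs ordered by position.
    return [(w[j], s[j]) for j in sorted(strongest)]


# Toy spectrum: the weak peak at index 11 lies within 3 points of the peak at 13.
spectrum = [0] * 22
spectrum[1], spectrum[11], spectrum[13], spectrum[20] = 1, 4, 6, 2
wavenumbers = list(range(500, 522))
peaks = detect_peaks(spectrum, wavenumbers, dl=3, dr=3, n_top=2)
# peaks == [(513, 6), (520, 2)]
```

In the toy spectrum, the candidate at index 11 is pulled onto the stronger neighbour at index 13 by the local refinement, and the weakest candidate (index 1) is dropped by the top-N screening.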
Step 5: Assess two adjacent peak points $(\mathrm{pos}_a, \mathrm{int}_a)$ and $(\mathrm{pos}_b, \mathrm{int}_b)$, where $\mathrm{pos}$ represents the position of the peak (e.g., wave number) and $\mathrm{int}$ represents the intensity of the peak. Whether to merge the two peaks is determined by comparing the absolute difference between the two peak positions, $|\mathrm{pos}_a - \mathrm{pos}_b|$, with a pre-defined threshold $m$. If $|\mathrm{pos}_a - \mathrm{pos}_b| \le m$, the two adjacent peaks are close enough to be considered the same physical feature, and they are merged. The new peak point after merging is $\left(\frac{\mathrm{pos}_a + \mathrm{pos}_b}{2},\ \max(\mathrm{int}_a, \mathrm{int}_b)\right)$. That is, the new peak position is the average of the two original peak positions, and the new peak intensity is the maximum of the two original peak intensities. Taking the average position integrates the position information of the two peaks, while taking the maximum intensity retains the most significant signal feature. If $|\mathrm{pos}_a - \mathrm{pos}_b| > m$, the two adjacent peaks are far apart and represent different physical features, so no merging is performed.
$$(\mathrm{pos}_a, \mathrm{int}_a) \oplus (\mathrm{pos}_b, \mathrm{int}_b) = \begin{cases} \left(\dfrac{\mathrm{pos}_a + \mathrm{pos}_b}{2},\ \max(\mathrm{int}_a, \mathrm{int}_b)\right), & \text{if } |\mathrm{pos}_a - \mathrm{pos}_b| \le m \\ \text{separate}, & \text{otherwise} \end{cases}$$
where $\oplus$ represents the merge operation.
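The merging rule of Step 5 can be sketched as a single pass over the peaks sorted by position. The function name `merge_peaks` and the example peak values are hypothetical; the sketch also assumes that, when more than two peaks lie close together, each new peak is compared with the already merged position, a detail the text does not specify.

```python
def merge_peaks(peaks, m=10.0):
    """Merge adjacent (position, intensity) peaks whose positions differ by at most m."""
    merged = []
    for pos, inten in sorted(peaks):
        if merged and abs(pos - merged[-1][0]) <= m:
            prev_pos, prev_int = merged[-1]
            # Average the positions, keep the larger intensity.
            merged[-1] = ((prev_pos + pos) / 2.0, max(prev_int, inten))
        else:
            merged.append((float(pos), float(inten)))
    return merged


# Two close peaks collapse into one; the distant peak is kept separate.
result = merge_peaks([(2281, 3.0), (2283, 4.0), (2391, 9.0)], m=10)
# result == [(2282.0, 4.0), (2391.0, 9.0)]
```

With a threshold smaller than the spacing of the two close peaks (for example `m=1`), all three peaks are returned unchanged.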

2.2.2. Extraction of Peak Kurtosis and Skewness Characteristics

Traditional data feature extraction methods usually only focus on the basic statistics of the data, such as mean and variance, while ignoring the higher-order features of the data. This study focuses on the kurtosis and skewness features of the peak, aiming to achieve in-depth exploration of the higher-order features of the hot jet FTIR data of the aero-engine, thereby more comprehensively and accurately revealing the characteristics of the hot jet.
Kurtosis is a statistical measure that quantifies how sharply peaked a data distribution is compared with a normal distribution. In the FTIR data of an aero-engine hot jet, the peak kurtosis reflects how concentrated the spectral absorption characteristic peaks of a specific component in the hot jet are. Specifically, if the kurtosis value is greater than that of a normal distribution, the peak is sharper, meaning that the characteristic peaks of the specific component are more concentrated; if the kurtosis value is less than that of a normal distribution, the peak is flatter and the data distribution is more dispersed, that is, the range of data values near and on both sides of the peak is wider, which implies that the composition of the hot jet is complex and contains more substances.
$$k = \frac{\mu_4}{\mu_2^{2}} - 3 = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(s_i - \bar{s}\right)^{4}}{\left[\frac{1}{n}\sum_{i=1}^{n}\left(s_i - \bar{s}\right)^{2}\right]^{2}} - 3$$
where $\mu_4$ represents the fourth-order central moment; $\mu_2$ represents the second-order central moment; $n$ represents the number of data points; $s_i$ represents the value of the $i$-th data point; and $\bar{s}$ represents the mean of the data. The "−3" sets the kurtosis of the normal distribution to 0, which enables a more intuitive comparison between the kurtosis of a given data distribution and that of a normal distribution.
Skewness is used to describe the asymmetry of data distribution. In the FTIR data of the hot jet of an aero-engine, the peak skewness can reflect the direction in which certain special substances in the hot jet affect the spectrum. When the skewness value is positive, the peak shows a right-skewed distribution, meaning that the long tail on the right side (in the direction of larger values) is longer, indicating that certain special substances in the hot jet have an influence on the spectrum that is biased towards the high-value direction; when the skewness value is negative, the peak shows a left-skewed distribution, with the long tail on the left side (in the direction of smaller values) being longer, indicating that certain special substances in the hot jet have an influence on the spectrum that is biased towards the low-value direction; when the skewness value is close to 0, the peak is approximately symmetrically distributed, indicating that the data distribution on both sides of the peak is relatively balanced.
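$$\gamma = \frac{\mu_3}{\mu_2^{3/2}} = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(s_i - \bar{s}\right)^{3}}{\left[\frac{1}{n}\sum_{i=1}^{n}\left(s_i - \bar{s}\right)^{2}\right]^{3/2}}$$
where $\mu_3$ represents the third-order central moment and $\mu_2$ represents the second-order central moment.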
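Both statistics follow directly from the central moments with the $1/n$ normalization used in the formulas. The sketch below is illustrative only: the function names are hypothetical, and it assumes the statistics are computed over all points of one sample, since the text does not state whether a window around each peak is used instead.

```python
def central_moment(s, order):
    """Central moment of the given order with 1/n normalization."""
    mean = sum(s) / len(s)
    return sum((x - mean) ** order for x in s) / len(s)


def excess_kurtosis(s):
    """k = mu_4 / mu_2^2 - 3, which is 0 for a normal distribution."""
    return central_moment(s, 4) / central_moment(s, 2) ** 2 - 3.0


def skewness(s):
    """gamma = mu_3 / mu_2^(3/2); positive for a longer right tail."""
    return central_moment(s, 3) / central_moment(s, 2) ** 1.5


flat = [1, 2, 3, 4, 5]       # symmetric and flat: gamma = 0, k = -1.3
spiky = [0, 0, 0, 0, 10]     # one large value: gamma = 1.5, k = 0.25
```

For the symmetric sequence the skewness vanishes and the kurtosis is negative (flatter than a normal distribution), whereas the single large value in the second sequence produces a positive skewness, that is, a longer right tail.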

3. Experiment and Results

The overall experimental process of this article is shown in Figure 3. The experiment is mainly divided into four parts: collection of hot jet spectral data of the aero-engine, data preprocessing, construction of multi-dimensional features, and clustering.

3.1. Spectral Analysis

We calculated the average hot jet spectrum for each type of engine. The average spectra of the hot jets of the two types of aero-engines are shown in Figure 4, in which the curves show how the brightness temperature varies with wave number. Over the vast majority of wave number intervals, the brightness temperature curves of the two hot jets almost overlap, indicating that their radiation characteristics at these wave numbers are highly similar. This suggests that, over most of the spectral range, the material composition, content ratio, and physical and chemical states of the two hot jets are similar, resulting in a consistent interaction with electromagnetic radiation. For example, in the low wave number region (such as 500–1500 cm−1), the differences between the two curves are within the measurement error range, indicating that the hot jet radiation in this region is not significantly affected by the engine type and may be dominated by common basic substances (such as carbon dioxide, water, and other common combustion products). Around 2000–2500 cm−1, the differences between the two curves are obvious. The hot jet radiation of the type 1 aero-engine increases significantly in this region, possibly because specific high-temperature radiating substances (such as isocyanate-based compounds and nitrogen oxides) are generated during combustion, or because its higher combustion efficiency generates more high-temperature gases, causing these substances to exhibit intense spectral absorption and re-radiation at 2000–2500 cm−1. In contrast, the radiation of the type 0 engine in this region is relatively weak, possibly indicating that its combustion products contain less of such radiating substances or that the energy distribution during combustion results in a lower radiation efficiency in this region.

3.2. Data Preprocessing

The Savitzky–Golay filter was selected to smooth the spectral data collected from the external field experiment. The Savitzky–Golay filter smooths the spectral data by fitting polynomials within a local window, eliminating noise while retaining the features of the data well, providing a more reliable data basis for subsequent analyses (such as peak detection and feature extraction). The window parameter of the filter was set to 11, meaning that the current data point and the preceding and following five data points were selected for each smoothing operation. The order of the polynomial used for fitting was set to 3, that is, a third-order polynomial was used to fit the data points within the window. The coefficients of the polynomial were determined using the least squares method to achieve data smoothing.
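For an 11-point window and a third-order polynomial, the least-squares fit reduces to a fixed set of convolution weights, which makes the filter easy to sketch. The weights below are the standard tabulated values for this window and order; the function name and the choice to leave the first and last five points unchanged are assumptions for illustration, since the text does not say how the spectrum edges were treated. In practice a library routine such as `scipy.signal.savgol_filter(x, window_length=11, polyorder=3)` would typically be used, although the text does not name the implementation.

```python
# Standard tabulated Savitzky-Golay smoothing weights for an 11-point window
# and a quadratic/cubic fit (both orders give the same smoothing weights).
SG_11_3 = [c / 429.0 for c in (-36, 9, 44, 69, 84, 89, 84, 69, 44, 9, -36)]


def savgol_smooth_11_3(s):
    """Smooth s with the 11-point cubic filter; edge points are left unchanged."""
    half = len(SG_11_3) // 2
    out = list(s)
    for i in range(half, len(s) - half):
        # Weighted sum over the current point and its five neighbours on each side.
        out[i] = sum(c * s[i + j - half] for j, c in enumerate(SG_11_3))
    return out


# A cubic signal passes through unchanged, while an isolated spike is damped.
cubic = [0.5 * x**3 - 2.0 * x**2 + x + 3.0 for x in range(30)]
spike = [0.0] * 21
spike[10] = 1.0
smoothed_cubic = savgol_smooth_11_3(cubic)
smoothed_spike = savgol_smooth_11_3(spike)  # centre value becomes 89/429
```

The two examples show the property that motivates the filter: any signal that is locally a polynomial of order three or lower is preserved, so genuine peak shapes are retained better than with a plain moving average, while point-like noise is strongly attenuated.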

3.3. Multi-Dimensional Feature Construction Experiment

A multi-stage refined processing algorithm for extracting peak features was employed to identify the characteristic peaks in the FTIR spectral data of the two types of aero-engines. Given the spectral resolution of 1 cm−1, and considering the influence of instrument noise during the measurement and the need to avoid excessive merging, both $\Delta l$ and $\Delta r$ were set to 3 when detecting the peaks. Different vibration–rotation transition lines of the same component may produce adjacent peaks. At the same time, incomplete combustion in the aero-engine may produce new weak peaks, and the peaks of different components must not be merged. We therefore set the merging threshold $\Delta m$ to 10. The parameter settings are shown in Table 1.
The results of the peak feature extraction algorithm are shown in Figure 5. The FTIR spectrum of the type 0 engine hot jet has a distinct and stable characteristic peak at 668 cm−1. This peak matches the specific out-of-plane bending vibration mode of the benzene ring, thus inferring that the type 0 engine hot jet contains aromatic compounds with benzene ring structures such as benzene, toluene, and phenol. At the same time, the spectrum shows the strongest characteristic absorption peak at 2391 cm−1, which is highly consistent with the typical absorption peak position of carbon dioxide gas, indicating that carbon dioxide is one of the main gas components of the hot jet. Additionally, in the 2281–2283 cm−1 wavelength range, two relatively concentrated characteristic absorption peaks appeared; combined with spectral analysis theory, this absorption feature is consistent with the asymmetric stretching vibration of the isocyanate group (–N=C=O), indicating that there may be compounds containing isocyanate groups in the hot jet. In the FTIR spectrum of the type 1 engine hot jet, the positions of the characteristic peaks are relatively concentrated. They mainly appear at 2276 cm−1, 2287 cm−1, and 2290 cm−1, and the stretching vibration of the carbon–nitrogen triple bond in the amide compounds is generally around 2240–2280 cm−1; 2276 cm−1 is within this range, which may indicate a compound containing an amide group, such as hexanediamide. The peaks at 2287 cm−1 and 2290 cm−1 indicate that compounds containing isocyanate groups also exist in the hot jet of type 1 engines. The peak at 2386–2387 cm−1 indicates that carbon dioxide also exists in the hot jet of type 1 engines.
The results after extracting the features based on the peak height of the wave are shown in Figure 6. Figure 6a presents the frequency distribution of the peak height values of the hot jets of the two types of aero-engines, and the KDE curve smoothly depicts the probability density distribution of the peak height values. It can be seen that there are differences in the peak height distribution of the hot jets of the two types of aero-engines. The distribution of the peak height values of type 0 is more scattered, and the spectral characteristics have a broader and more complex distribution. The peak height values of type 1 are concentrated in the higher positive region, corresponding to the aggregation characteristics of spectral absorption peaks in specific substances or states. Through Figure 6b, the distribution characteristics of the peak height data of the hot jets of the two types of aero-engines can be intuitively compared, such as the median position and the degree of data dispersion. The median of type 0 is lower and the data distribution is more scattered; the median of type 1 is higher and the data is relatively concentrated.
The results after extracting the features through the peak skewness are shown in Figure 7. Figure 7a presents the frequency distribution of the skewness values of the two types of engine hot jets, and the KDE curve depicts the probability density distribution of the skewness values. The skewness distributions of type 0 and type 1 are significantly different. The skewness distribution of type 0 is relatively dispersed, covering a wide range from negative values to positive values, indicating that the spectral data of this type of hot jet has greater diversity in the asymmetry of the peak shape. It suggests that type 0 includes various different engine operating states or hot jet components, resulting in different asymmetric manifestations of spectral features. For example, in some operating conditions, there may be more impurities or incompletely burned substances in the wake flow, causing the peak to show different degrees of left or right bias. As shown in Figure 7b, in contrast, the skewness values of type 1 are mainly concentrated in the positive region, indicating that the data distribution is mostly right-skewed, and the distribution is relatively concentrated. This implies that the spectral data of type 1 hot jets has relatively similar peak asymmetry characteristics, possibly corresponding to the wake flow generated under a relatively stable engine operating condition, or the spectral characteristics of the main components in the wake flow are relatively consistent, causing the peak shape to tend towards a right-tailed distribution.

3.4. Multi-Dimensional Feature Verification Experiment

The key features (peak position, intensity, kurtosis, and skewness) were extracted from the FTIR spectral data of the aero-engine hot jet to construct a multi-dimensional feature vector. To verify the effectiveness of the feature vector, an unsupervised clustering algorithm was used. The features of each sample were combined into a vector, and these vectors form the multi-dimensional feature matrix $F = [f_1, f_2, \ldots, f_N]$, where $N$ represents the number of samples and $f_i$ is the multi-dimensional feature vector of the $i$-th sample. After being standardized using StandardScaler, the feature matrix $F$ was input into the GMM for clustering.
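The assembly and standardization of the feature matrix can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names and example values are hypothetical, every sample is assumed to contribute the same number of peaks so that all rows have equal length, and the scaling is a plain column-wise z-score with the population standard deviation, in which constant columns are mapped to zero.

```python
import math


def build_feature_vector(positions, intensities, kurt, skew):
    """Concatenate peak positions, peak intensities, kurtosis and skewness into one row."""
    return list(positions) + list(intensities) + [kurt, skew]


def standardize(rows):
    """Column-wise z-score using the population standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - mu) ** 2 for x in c) / n) for c, mu in zip(cols, means)]
    stds = [sd if sd > 0 else 1.0 for sd in stds]  # constant columns become all zeros
    return [[(x - mu) / sd for x, mu, sd in zip(row, means, stds)] for row in rows]


# Two hypothetical samples, each with two peaks (values are made up).
rows = [
    build_feature_vector([668, 2391], [1.0, 3.0], 0.5, -0.2),
    build_feature_vector([670, 2391], [2.0, 5.0], 1.5, 0.2),
]
z = standardize(rows)
```

Standardization matters here because peak positions (hundreds to thousands of cm−1) and the statistical features (values of order one) live on very different scales; without it, the position columns would dominate any distance-based or likelihood-based clustering.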
The GMM iteratively estimates the parameters (mean, covariance, and weight) of each Gaussian distribution through the EM algorithm, maximizing the likelihood function of the data. After training, the model can also perform clustering predictions on new data points. Each aero-engine hot jet sample is assigned to the cluster whose Gaussian distribution gives it the highest probability.
In this study, the probability distribution model of the multi-dimensional feature matrix F can be expressed as follows:
$$P(f \mid \theta) = \sum_{k=1}^{K} \alpha_k \, \phi(f \mid \theta_k)$$
where $K$ represents the number of Gaussian distributions, which is also the number of clusters. $\alpha_k$ denotes the coefficient of the $k$-th Gaussian distribution, satisfying $\sum_{k=1}^{K} \alpha_k = 1$, and indicates the contribution proportion of the $k$-th Gaussian distribution in the mixture model. $\phi(f \mid \theta_k)$ is the probability density function of the $k$-th Gaussian distribution, $\phi(f \mid \theta_k) = \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left(-\frac{(f - \mu_k)^2}{2\sigma_k^2}\right)$, where $\mu_k$ is the mean of the $k$-th Gaussian distribution and $\sigma_k$ is the standard deviation. $\theta_k = (\mu_k, \sigma_k)$ represents the parameters of the $k$-th Gaussian distribution.
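As a small numerical check of the mixture density, the sketch below evaluates a univariate two-component mixture. The function names and the parameter values (weights, means, standard deviations) are hypothetical and are not taken from the fitted model.

```python
import math


def gaussian_pdf(f, mean, sigma):
    """Univariate normal density phi(f | mean, sigma)."""
    z = (f - mean) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def mixture_pdf(f, weights, means, sigmas):
    """P(f | theta) = sum over k of alpha_k * phi(f | theta_k)."""
    return sum(a * gaussian_pdf(f, u, s) for a, u, s in zip(weights, means, sigmas))


# Hypothetical two-component mixture (K = 2); the weights sum to 1.
weights = [0.4, 0.6]
means = [-1.0, 2.0]
sigmas = [0.5, 1.0]

density_at_zero = mixture_pdf(0.0, weights, means, sigmas)
print(f"P(f = 0) = {density_at_zero:.4f}")
```

Because the weights sum to one and each component integrates to one, the mixture itself is a valid probability density.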
Suppose the features in the multi-dimensional feature matrix follow Gaussian distributions with mean $\mu_k$ and standard deviation $\sigma_k$. The latent parameters of the spectra in the feature matrix can then be solved using the Expectation–Maximization (EM) algorithm. That is, for the multi-dimensional feature matrix $[f_1, f_2, \ldots, f_N]$, the Gaussian model parameters $\theta = (\theta_1, \theta_2, \ldots, \theta_K)$ are estimated with the EM algorithm. First, initialize the model parameters. Then, using the current model parameters $\theta$, calculate the probability that sample $f_j$ comes from the $k$-th sub-model:
$$\gamma_{jk} = \frac{\alpha_k \, \phi(f_j \mid \theta_k)}{\sum_{l=1}^{K} \alpha_l \, \phi(f_j \mid \theta_l)}, \quad j = 1, 2, \ldots, N; \; k = 1, 2, \ldots, K$$
where $\gamma_{jk}$ represents the probability of sample $f_j$ belonging to the $k$-th cluster.
Next, update the model parameters with the equations below and repeat the calculation iteratively. Finally, the clustering results of the multi-dimensional feature matrix are obtained from the resulting parameters.
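$$\mu_k = \frac{\sum_{j=1}^{N} \gamma_{jk} f_j}{\sum_{j=1}^{N} \gamma_{jk}}, \quad k = 1, 2, \ldots, K$$
$$\sigma_k^2 = \frac{\sum_{j=1}^{N} \gamma_{jk} (f_j - \mu_k)^2}{\sum_{j=1}^{N} \gamma_{jk}}, \quad k = 1, 2, \ldots, K$$
$$\alpha_k = \frac{n_k}{N} = \frac{\sum_{j=1}^{N} \gamma_{jk}}{N}, \quad k = 1, 2, \ldots, K$$
where $n_k = \sum_{j=1}^{N} \gamma_{jk}$ represents the effective number of samples in the $k$-th cluster and $\alpha_k$ represents the proportion of the $k$-th cluster among all samples.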
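The E-step and M-step updates above can be sketched for the univariate case as follows. This is a minimal illustration on synthetic data, not the implementation used in this study: the function name `em_gmm_1d`, the initialization (component means at the data minimum and maximum), the fixed iteration count, and the synthetic samples are all assumptions.

```python
import math
import random


def gaussian_pdf(f, mean, sigma):
    """Univariate normal density phi(f | mean, sigma)."""
    z = (f - mean) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def em_gmm_1d(samples, means, sigmas, weights, n_iter=50):
    """Run n_iter EM iterations; return the parameters and the last responsibilities."""
    n, k_count = len(samples), len(means)
    means, sigmas, weights = list(means), list(sigmas), list(weights)
    gamma = []
    for _ in range(n_iter):
        # E-step: gamma[j][k] = alpha_k * phi(f_j | theta_k) / sum_l alpha_l * phi(f_j | theta_l)
        gamma = []
        for f in samples:
            joint = [weights[k] * gaussian_pdf(f, means[k], sigmas[k]) for k in range(k_count)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            gamma.append([p / total for p in joint])
        # M-step: update mean, standard deviation, and weight of each component
        for k in range(k_count):
            n_k = sum(g[k] for g in gamma)  # effective number of samples in cluster k
            means[k] = sum(g[k] * f for g, f in zip(gamma, samples)) / n_k
            var = sum(g[k] * (f - means[k]) ** 2 for g, f in zip(gamma, samples)) / n_k
            sigmas[k] = max(math.sqrt(var), 1e-6)  # keep sigma strictly positive
            weights[k] = n_k / n
    return means, sigmas, weights, gamma


# Synthetic, well-separated data standing in for one standardized feature.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(6.0, 1.0) for _ in range(200)]

means, sigmas, weights, gamma = em_gmm_1d(
    samples, means=[min(samples), max(samples)], sigmas=[1.0, 1.0], weights=[0.5, 0.5]
)
# Hard assignment: each sample goes to the component with the highest responsibility.
labels = [max(range(2), key=lambda k: g[k]) for g in gamma]
```

In the multi-dimensional case the scalar variance is replaced by a covariance matrix per component, but the structure of the two steps is the same.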
We compared the GMM with two other common clustering models: divisive clustering and agglomerative clustering [28]. Figure 8 shows the clustering results. Compared with divisive clustering and agglomerative clustering, the GMM clustering algorithm distinguishes the two types of aero-engine hot jets most clearly, although the separation is only partial.
We used accuracy, AUC, and F1 score as evaluation metrics. Table 2 shows the results of each clustering algorithm. GMM achieved the highest value on all three metrics (accuracy 82.42%, AUC 79.37%, F1 score 82.16%), while divisive clustering and agglomerative clustering scored slightly lower. Accuracy, precision, recall, and F1 score are defined as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
where TP represents true positive cases (where the prediction is correct and the actual outcome is positive), TN represents true negative cases (where the prediction is correct and the actual outcome is negative), FP represents false positive cases (where the prediction is incorrect and the actual outcome is negative), and FN represents false negative cases (where the prediction is incorrect and the actual outcome is positive).
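The following sketch shows how these metrics can be computed from cluster assignments. Because cluster indices produced by an unsupervised method are arbitrary, the sketch first chooses the 0/1 labeling that agrees best with the reference labels; this alignment step, the function names, and the toy labels are assumptions for illustration, and the text does not state how cluster indices were matched to engine types.

```python
def align_binary_clusters(y_true, cluster_ids):
    """Return cluster ids, flipped if the flipped labeling matches y_true better."""
    flipped = [1 - c for c in cluster_ids]
    agree = sum(t == c for t, c in zip(y_true, cluster_ids))
    agree_flipped = sum(t == c for t, c in zip(y_true, flipped))
    return list(cluster_ids) if agree >= agree_flipped else flipped


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from the confusion counts (class 1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical reference labels and raw cluster ids (the ids happen to be swapped).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
cluster_ids = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

y_pred = align_binary_clusters(y_true, cluster_ids)
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```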

4. Discussion

This study addresses the construction of hot jet characteristics for two types of aero-engines using field measurements. To ensure the comparability of the data, the experiment strictly controlled the environmental parameters (temperature, humidity, etc.) and geometric conditions (measurement distance, angle, etc.) and used a Fourier transform infrared spectrometer to collect spectral data from the central area of the engine's tail nozzle. The temperature at the tail nozzle of an aero-engine is extremely high (1500 to 2000 K), so direct measurement poses safety risks. Considering the safety of the measurement personnel and the stability of the spectrometer, the spectrometer was deployed to the side of the aero-engine, at a distance of 127–280 m from the hot jet source. Although this measurement angle and distance ensured operational safety, they also attenuated the hot jet radiation signal to some degree; in particular, atmospheric transmission along the measurement path absorbs and attenuates the spectrum. During field acquisition, the background spectrum was collected after calibration was completed, and the instrument automatically performed background subtraction when acquiring the hot jet spectra of the aero-engine. Because the temperature difference between the hot jet and the background is large, this approach is currently used as a rough background subtraction. In the data analysis stage, the radiation brightness temperature spectra in the 400–4000 cm−1 characteristic band were selected for modeling. A multi-stage refined peak extraction algorithm was then applied to the FTIR spectral data of the hot jets of the two engine types and successfully extracted the key characteristic peaks.
The results show that the components of the hot jets of both engine types are closely related to CO2 and isocyanate-based compounds, but their content distributions differ significantly. This difference may be attributed to several factors. At the combustion level, the type and composition of the fuel and the combustion conditions (temperature, pressure, air–fuel ratio, etc.) directly affect the generation of combustion products. In terms of engine structural design, differences in the shape and size of the combustion chamber and in the nozzle design alter gas flow mixing and fuel injection, thereby affecting combustion efficiency and product distribution. In terms of operating conditions, changes in load and speed alter the fuel supply, air flow, combustion temperature, and pressure, which in turn affect the combustion process. In addition, changes in ambient temperature and pressure influence the intake state of the engine and thus indirectly affect the content distribution of combustion products.
This study extracts multiple key features from the FTIR spectral data of aero-engine hot jets, including peak position, intensity, kurtosis, and skewness. These features describe the spectra from different perspectives and, when combined, form a multi-dimensional feature vector. The peak position gives the location of the peak in the spectrum, which is related to the absorption characteristics of specific substances; the peak intensity indicates the content or concentration of the substance in the exhaust gas; kurtosis describes the sharpness of the peak, reflecting how concentrated or dispersed the data distribution is; and skewness characterizes the asymmetry of the peak, which helps to elucidate the direction in which particular substances influence the spectrum. The kurtosis and skewness of the brightness temperature spectrum can be used to infer the component concentration gradient, temperature field symmetry, and flow state of the hot jet without direct contact, providing a spectroscopic basis for engine fault prediction. For example, during normal combustion, the radiation peak of the CO2 component shows a narrow-band distribution and has a high kurtosis. If unburned kerosene mixes with the combustion products, the kurtosis decreases. Under normal operating conditions, the overall distribution of the brightness temperature spectrum is approximately symmetrical. If the concentration of the oxidant (such as O2) in the hot jet decreases along the flow direction, the downstream combustion is insufficient and the CO concentration increases; correspondingly, the brightness temperature distribution at 2143 cm−1 becomes right-skewed. This multi-dimensional integration enhances the robustness of the model against noise and measurement fluctuations, providing a richer and more discriminative structured input for subsequent classifiers.
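The link between band width and kurtosis can be illustrated with synthetic data. The sketch below compares the excess kurtosis of the sampled intensity values of a narrow and a broadened Gaussian-shaped band over the same window. Treating the intensity values within a window as the sample, the Gaussian band shape, and the widths are all assumptions made for illustration; the spectra and the kurtosis estimator used in this study may differ.

```python
import math


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        return 0.0  # constant input: kurtosis is undefined, report 0
    return m4 / m2 ** 2 - 3.0


def gaussian_band(width, half_span=50):
    """Synthetic band of unit height sampled on an integer grid around its center."""
    return [math.exp(-0.5 * (x / width) ** 2) for x in range(-half_span, half_span + 1)]


kurt_narrow = excess_kurtosis(gaussian_band(width=2.0))   # sharp, narrow band
kurt_broad = excess_kurtosis(gaussian_band(width=15.0))   # broadened band
print(f"narrow: {kurt_narrow:.2f}, broad: {kurt_broad:.2f}")
```

In this toy setting the narrow band yields the larger kurtosis, consistent with the qualitative statement that broadening of the band lowers the kurtosis.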
The dataset in this study exhibits a significant class imbalance, with only 280 samples of type 0. To address this problem, white noise was added to the type 0 samples for data augmentation, increasing the sample size to 560. The feature processing method presented in this paper was then applied. The clustering results based on the Gaussian mixture model (GMM) showed an accuracy of 77.08%, an AUC of 0.77, and an F1 score of 76.79%. When a generative adversarial network (GAN) was used for data augmentation instead, the clustering performance decreased. In-depth analysis revealed that the features of the two types of aero-engine hot jets have extremely small inter-class distances in the high-dimensional space, which makes this a low-discrimination feature scenario. Although the GAN-generated samples increased the number of type 0 samples, the feature manifolds of the two types overlap so strongly that the generated samples could not maintain inter-class discriminability, resulting in a large amount of pattern overlap in the feature space after augmentation. The current research is limited by the sample size. If more hot jet data from different types of aero-engines can be obtained in the future, we will use the existing feature construction method to mine the data features and achieve precise classification and identification of different engine types. This exploration is expected to open up a new path in the field of aero-engine fault diagnosis. Building a more comprehensive feature library and classification model will help improve the accuracy and timeliness of fault diagnosis and provide innovative solutions for aero-engine health management.
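White-noise augmentation of the minority class can be sketched as follows. The scheme shown here (one noisy copy per original spectrum, which doubles 280 samples to 560), the noise standard deviation, and the toy spectra are assumptions; the text states only that white noise was added and that the sample size became 560.

```python
import random


def augment_with_white_noise(spectra, noise_std, seed=0):
    """Return the original spectra followed by one Gaussian-noise copy of each."""
    rng = random.Random(seed)
    noisy = [[v + rng.gauss(0.0, noise_std) for v in spectrum] for spectrum in spectra]
    return spectra + noisy


# Hypothetical stand-in for 280 type 0 spectra with 8 spectral points each.
type0 = [[float(i + j) for j in range(8)] for i in range(280)]

augmented = augment_with_white_noise(type0, noise_std=0.01)
print(len(type0), "->", len(augmented))
```

The noise level controls a trade-off: too little noise adds near-duplicates, while too much can blur the small inter-class differences discussed above.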

5. Conclusions

This paper proposes a multi-stage refined processing and statistical feature fusion algorithm for the remote sensing feature construction of aero-engine hot jets. First, a remote-sensing Fourier transform infrared spectrometer is used to collect data on the hot jets of two different types of aero-engines. Then, a multi-dimensional feature vector construction algorithm is proposed, which includes a peak feature extraction algorithm based on multi-stage refined processing and a high-order statistical feature extraction algorithm. The peak feature extraction algorithm consists of a four-level processing architecture: coarse detection, local optimization, dynamic screening, and intelligent merging. It first uses an adaptive threshold for initial coarse peak detection, then improves the positioning accuracy through local gradient optimization, uses intensity information to dynamically screen the local strongest peak, and finally resolves overlapping peaks with an intelligent merging strategy based on the physical characteristics of spectral lines, achieving high-precision and high-robustness peak feature extraction. The high-order statistical feature extraction algorithm extracts the intensity distribution information and waveform symmetry information of the spectral signal by fusing the kurtosis and skewness statistics. Compared with traditional feature construction algorithms, the proposed multi-dimensional feature vector construction algorithm has a higher-dimensional comprehensive representation ability. To verify the effectiveness of the algorithm, a GMM, an unsupervised clustering algorithm, is used, and the accuracy reaches 82.42%.
The method proposed in this paper provides a new approach to constructing hot jet features of aero-engines and is expected to support engine condition monitoring and early fault warning in the field of aero-engine fault diagnosis.

Author Contributions

Formal analysis, Y.L.; investigation, Z.K. and Z.L.; software, X.Y.; validation, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 62005320.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Lewiner, F.; Klein, J.P.; Puel, F.; Févotte, G. On-line ATR FTIR measurement of supersaturation during solution crystallization processes. Calibration and applications on three solute/solvent systems. Chem. Eng. Sci. 2001, 56, 2069–2084.
  2. Frederik, V.D.V.; Furness, D.; Viset, M. Titrimetric-comparable BN results determined for in-service lubricants using quantitative FTIR spectroscopy. Lubr. Sci. 2023, 35, 93–102.
  3. Jackson, M.; Mantsch, H.H. The Use and Misuse of FTIR Spectroscopy in the Determination of Protein Structure. Crit. Rev. Biochem. Mol. Biol. 1995, 30, 95–120.
  4. Aygun, A.; Tiri, R.N.E.; Bayat, R.; Sen, F. Hydrothermal synthesis of BCQD@g-C3N4 nanocomposites supporting environmental sustainability: Organic dye removal and bacterial inactivation. Microelectron. J. 2024, 16, 100464.
  5. Nicolet, Y.; de Lacey, A.L.; Vernède, X.; Fernandez, V.M.; Hatchikian, E.C.; Fontecilla-Camps, J.C. Crystallographic and FTIR spectroscopic evidence of changes in Fe coordination upon reduction of the active site of the Fe-only hydrogenase from Desulfovibrio desulfuricans. J. Am. Chem. Soc. 2001, 123, 1596–1601.
  6. Seiari, S.A.; Cen, Z.; Youssef, H.; Tsoutsanis, E. Aeroengine Actuator Fault Detection and Estimation Via Combined Model Observers. IFAC Pap. 2024, 58, 121–125.
  7. Chen, Q.; Sheng, H.; Zhang, T. An improved nonlinear onboard adaptive model for aero-engine performance control. Chin. J. Aeronaut. 2023, 36, 317–334.
  8. Fan, Q.Y.; Ren, H.; Xu, B.; Han, W. Fault detection based on adaptive interval observer and its application in aeroengine. J. Frankl. Inst. 2023, 360, 11243–11269.
  9. Duan, Y.; Chen, C.; Fu, M.; Gong, X.; Niu, Y.; Luo, F. GITANet: Group Interactive Threshold-based Attention Network for Hyperspectral Image Classification. IEEE Trans. Multimed. 2025, 27, 3571–3584.
  10. Shin, J.; Lee, G.; Kim, T.H.; Cho, K.H.; Hong, S.M.; Kwon, D.H.; Pyo, J.; Cha, Y. Deep learning-based efficient drone-borne sensing of cyanobacterial blooms using a clique-based feature extraction approach. Sci. Total Environ. 2024, 912, 16.
  11. Jeong, S.; Chung, H. Combining two-trace two-dimensional correlation analysis and convolutional autoencoder-based feature extraction from an entire correlation map to enhance vibrational spectroscopic discrimination of geographical origins of agricultural products. Talanta 2025, 285, 127385.
  12. Zhou, T.; Yu, X.; Chen, S.; Zhang, J.; Xu, H. An intelligent identification method of draft tube vortex rope based on dynamic feature extraction and random forest: Application to a prototype pump-turbine. J. Energy Storage 2024, 102 Pt B, 114227.
  13. Yang, N.; Zhang, W.; Zhang, J.; Wang, K.; Su, Y.; Liu, Y. A Method for Remaining Useful Life Prediction and Uncertainty Quantification of Rolling Bearings Based on Fault Feature Gain. IEEE Trans. Instrum. Meas. 2025, 74, 3507414.
  14. Gao, L.; Cui, L.; Chen, S.; Deng, L.; Wang, X.; Yan, X.; Zhu, H. AMTrans: Auto-Correlation Multi-Head Attention Transformer for Infrared Spectral Deconvolution. Tsinghua Sci. Technol. 2025, 30, 1329–1341.
  15. Sun, X.; Zhang, Y.; Dong, Y.; Du, B. Contrastive Self-Supervised Learning-Based Background Reconstruction for Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5504312.
  16. Han, S.; Kang, T.; Lee, J.; Kim, N.; Won, H.; Kim, Y.H.; Gong, W.; Kwak, I.Y. A deep neural network approach to heart murmur detection using spectrogram and peak interval features. Eng. Appl. Artif. Intell. 2024, 137 Pt A, 109156.
  17. Ma, H.; Wang, D.; Zhou, C.G. A simple conductivity measurement method using a peak-frequency feature of ferrite-cored eddy current sensor. NDT E Int. Indep. Nondestruct. Test. Eval. 2024, 142, 103024.
  18. Deng, X.; Wu, M.; Yang, W.; Tang, X.; Cao, Y. Fault detection of multimode chemical processes using weighted density peak clustering and trend slow feature analysis. Process Saf. Environ. Prot. 2025, 196, 106941.
  19. Vashishtha, G.; Chauhan, S.; Zimroz, R.; Kumar, R.; Gupta, M.K. Optimization of spectral kurtosis-based filtering through flow direction algorithm for early fault detection. Measurement 2025, 241, 115737.
  20. Gioffrè, M.; Gusella, V.; Grigoriu, M. Non-Gaussian Wind Pressure on Prismatic Buildings. I: Stochastic Field. J. Struct. Eng. 2001, 127, 981–989.
  21. Guttal, V.; Jayaprakash, C. Changing skewness: An early warning signal of regime shifts in ecosystems. Ecol. Lett. 2010, 11, 450–460.
  22. Bendel, R.B.; Higgins, S.S.; Pyke, J.E.T.A. Comparison of skewness coefficient, coefficient of variation, and Gini coefficient as inequality measures within populations. Oecologia 1989, 78, 394–400.
  23. Ordaz, M.; Arroyo, D.; Singh, S.K.; Salgado-Gálvez, M.A. A PSHA for Mexico City based solely in Fourier-based GMM of the response spectra. Soil Dyn. Earthq. Eng. 2024, 187, 109025.
  24. Hajihosseinlou, M.; Maghsoudi, A.; Ghezelbash, R. A comprehensive evaluation of OPTICS, GMM and K-means clustering methodologies for geochemical anomaly detection connected with sample catchment basins. Geochemistry 2024, 84, 18.
  25. Zhu, X.; Yang, C.; Yang, C.; Gao, D.; Lou, S. An unsupervised fault monitoring framework for blast furnace: Information extraction enhanced GRU-GMM-autoencoder. J. Process Control. 2023, 130, 103087.
  26. Li, J.; Li, G.; Lin, L. Terraced compression method with automated threshold selection using GMM algorithm for heterogeneous bodies detection. Measurement 2024, 238, 115415.
  27. Ade, P.; Aghanim, N.; Arnaud, M.; Arroja, F.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Ballardini, M.; Banday, A.J.; Barreiro, R.B.; et al. Planck 2015 results: XX. Constraints on inflation. Astron. Astrophys. 2015, 594, A20.
  28. Kerr, W.R.; Kominers, S.D. Agglomerative Forces and Cluster Shapes; Social Science Electronic Publishing: Rochester, NY, USA, 2012; Volume 97, pp. 877–899.
Figure 1. Data Acquisition. (a) Schematic diagram of the field experiment scene; (b) picture of the experimental site.
Figure 2. Original data collected. (a) Type 0; (b) type 1.
Figure 3. The experimental flowchart.
Figure 4. Average spectral diagram of the hot jets of the two types of aero-engines.
Figure 5. Results of extracting the peak positions of the hot jet waves for the two types of aero-engines. The red circles indicate the positions of the extracted features. (a) Type 0; (b) type 1.
Figure 6. Results of peak height feature extraction for the hot jet waves of the two types of aero-engines. (a) Histogram; (b) box plot.
Figure 7. Results of extracting the peak skewness characteristics of the hot jet waves of the two types of aero-engines. (a) Histogram; (b) box plot.
Figure 8. Results after clustering. (a) Divisive clustering; (b) agglomerative clustering; (c) GMM.
Table 1. Settings of each parameter.
Symbol    Value
Δl        3
Δr        3
N         5
Δm        10
Table 2. Results after clustering of each model.
Model                       Accuracy (%)    AUC (%)    F1 Score (%)
Divisive clustering         81.29           77.76      80.92
Agglomerative clustering    81.54           77.95      81.15
GMM                         82.42           79.37      82.16

Share and Cite

MDPI and ACS Style

Kang, Z.; Liao, Y.; Yang, X.; Li, Z. Enhancing FTIR Spectral Feature Construction for Aero-Engine Hot Jet Remote Sensing via Integrated Peak Refinement and Higher-Order Statistical Fusion. Remote Sens. 2025, 17, 2185. https://doi.org/10.3390/rs17132185
