Aircraft Landing Gear Retraction/Extension System Fault Diagnosis with 1-D Dilated Convolutional Neural Network

Faults of the landing gear retraction/extension (R/E) system can degrade an aircraft's maneuvering conditions; identifying faults of the landing gear R/E system has therefore become a key issue for ensuring aircraft take-off and landing safety. In this paper, we aim to solve this problem by proposing the 1-D dilated convolutional neural network (1-DDCNN). To address the limited feature information extraction and inaccurate diagnosis of the traditional 1-DCNN with a single feature, the 1-DDCNN selects multiple feature parameters to realize feature integration. The performance of the 1-DDCNN in feature extraction is explored. Importantly, by using padded dilated convolution to multiply the receptive field of the convolution kernel, the 1-DDCNN can completely retain the feature information in the original signal. Experimental results demonstrate that the proposed method has high accuracy and robustness, which provides a novel idea for feature extraction and fault diagnosis of the landing gear R/E system.


Introduction
The landing gear R/E system is a significant subsystem of an aircraft. After long-term operation under complex and variable conditions, with heavy loads and strong impacts, the key parts of the landing gear R/E system will inevitably develop multifarious faults, which may affect take-off, landing, and flight safety.
Firstly, Hinton proposed a deep learning method in 2006, which set off a new wave of research on artificial intelligence and its applications [1]. In particular, deep learning models have shown significant success in image processing, speech recognition, target detection, information retrieval, natural language processing, and so on [2]. Moreover, as an important network structure, CNNs are widely applied in computer vision and natural language processing [3]. Machine learning methods have made great progress in the field of fault diagnosis. For example, Gligorijevic et al. proposed a method for rolling bearing fault diagnosis: through five-level wavelet decomposition of the vibration signals, the standard deviations of the wavelet coefficients from six sub-bands were extracted as representative features; feature dimensionality reduction was then performed, and the diagnosis accuracy reached 98.9% [4]. Meanwhile, some scholars gradually introduced CNNs into the field of fault diagnosis. By converting 1-D time series vibration signals into 2-D input matrices, some experts and scholars constructed 2-D convolutional neural network models for fault diagnosis of rotating machinery. Janssens et al. performed a short-time Fourier transform on the vibration information of rotating machinery, then input the transformed coefficient map into a constructed CNN model for feature extraction to achieve CNN-based multi-fault identification [5]. Jing et al. proposed an adaptive multi-sensor data fusion method based on deep convolutional neural networks for fault diagnosis. The main contributions of this paper are as follows:
1. The aircraft landing gear R/E system's main fault modes are analyzed in conjunction with its working principle.
2. A conventional 1-DCNN was constructed to classify faults based on the actuator's displacement; experimental results show the average accuracy of the test set reached 91.80%.
3. The receptive field size affects the extraction of the original information by the model, and there is a nonlinear relationship between the receptive field and the convolution kernel size for the dilated convolution. The convolution kernel size's influence on classification accuracy is explored.
4. Classification results with a single feature are poor; consequently, the 1-DDCNN selects multi-feature parameters comprising the system pressure and the pressures at the right and left ends of the actuator cylinder. Experimental results reveal that the average test accuracy of the model reaches 99.80%.

System Working Principle and Composition
The landing gear system of a typical aircraft mainly includes the following: front landing gear and cabin door, main undercarriage and cabin door, landing gear R/E system, wheels and brakes, turning control system, landing gear position indication system, etc. Among them, the R/E system mainly completes normal R/E and emergency R/E functions, and provides the landing gear position indication signal.
The front landing gear R/E process is similar to that of the main landing gear; therefore, only the front landing gear retraction process is described here. The working status of the front landing gear is shown in Figure 1. When the plane takes off, and the landing gear wheels are off the ground, the pilot sets the landing gear R/E control switch to the "UP" position, and the current flows to the landing gear R/E electromagnetic switch and the accumulator charging electromagnetic switch. The hydraulic fluid from the three-position four-way directional valve enters the front landing gear lower lock and actuating cylinder. As the accumulator charging solenoid switch is turned on, oil from the pump is supplied to the actuating cylinder, and the oil in the accumulator is also released to aid the landing gear retraction. When the aircraft is about to land, the pilot moves the control switch to the "DOWN" position, the three-position four-way directional valve is switched to the down circuit, and the fluid enters the front landing gear upper lock. Once the lock is opened, the oil enters the lowering chamber of the R/E actuator to lower the landing gear.
The landing gear R/E system mainly includes the following components: constant pressure variable pump, tank, hydraulic motor, filter, accumulator, actuator, pressure control valve, throttle valve, and three-position four-way directional control valve.
By providing a certain pressure and oil flow, the pump converts mechanical energy into hydraulic energy. The tank is used to store hydraulic oil. The filter is used to filter the hydraulic oil and remove its impurities. The accumulator not only supplies oil at both low and high flow rates, it also compensates for leakage and maintains constant pressure. The actuator is a device that converts hydraulic energy into mechanical energy for linear reciprocating motion, which overcomes the load (including friction) and maintains the speed of motion using pressure-driven liquid flow. The relief valve is one of the common pressure valves used to regulate or limit the pressure in a hydraulic system. The throttle valve is a hydraulic component that regulates and controls the flow of oil in a hydraulic system. The function of the one-way throttle valve is to ensure that the oil flows in one direction, with no backflow, using the throttling effect. The solenoid directional valve is one of the frequently used hydraulic components in hydraulic systems, and is used to switch the direction of the hydraulic circuit. This article uses a three-position four-way solenoid directional valve. The "position" refers to the number of working positions of the spool, and the "way" refers to the number of oil ports on the valve body [15].

Failure Mode and Effect Analysis
The failure mode and effect analysis (FMEA), derived from typical civil aircraft design data in this subsection, can be used for parameter selection and fault injection into the simulation model, whereby fault datasets are obtained. The FMEA was carried out on the system's main components for subsequent fault diagnosis, the results of which are shown in Table 1 below. The failure analysis in this paper focuses on the component level of the landing gear R/E system and does not explore the specific internal failure of each component. For excessive noise from the pump, the failure threshold can be obtained by changing the air content of the oil. Since clogging of the throttle valves at the two ends of the actuator cylinder has different effects on the system, it is necessary to change the throttle valve's diameter at each end of the actuator cylinder to obtain the failure threshold. Regarding system failure caused by constant pressure variable pump leakage, a throttle valve should be connected in parallel with the constant pressure variable pump, and the throttle valve's diameter should be changed to simulate different degrees of pump leakage. Actuator cylinder leakage also affects the normal operation of the system, and the failure threshold can be obtained by changing the actuator leakage coefficient.

1-DCNN
The receptive field is defined as the area size mapped on the original image by each pixel on the feature map output from each layer in the CNN. The neuron's receptive field value decides the original range it can cover, meaning that it may contain more global features. Figure 2a shows the range of neuron receptive fields in the third layer of the 1-DCNN, with a convolutional kernel size of 3 × 1 and a step size of 1 × 1. The marked blue neurons in the third layer are mapped from the blue regions in the first layer; that is, the receptive field size of the input sequence data corresponding to a neuron in output feature map 2 is 5 × 1.
Before inputting 1-D time series signals into a 2-DCNN, the common method is to rearrange and combine the signal sampling points, using a simple procedure, to convert them into 2-D matrix form. The 1-DCNN has the advantage that 1-D time series signals can be input directly, without the need for cumbersome conversions.
The output receptive field of the n-th layer is:

$$r_n = r_{n-1} + (k_n - 1)\prod_{i=1}^{n-1} s_i \quad (1)$$

where $r_n$ is the receptive field size of the n-th layer, $k_n$ is the filter size of the n-th layer, and $s_i$ is the movement step size of the i-th layer filter. According to the receptive field's design principle, the size of the neuron receptive field in the last layer should be close to the input signal's length; that is, it satisfies the condition $r_n = L$, where $L$ is the length of the input signal. The convolution kernel size is $k$, and the sliding step size of the convolution kernel is $s = 1$. Each convolutional layer is followed by a maximum pooling layer, whose sliding window and step size are both $k_{pool} = 2$. When $n > 2$ and the n-th network layer is a convolutional layer (n odd), its receptive field is:

$$r_n = r_{n-1} + (k - 1) \cdot 2^{(n-1)/2} \quad (2)$$

and when the n-th network layer is a pooling layer (n even), its receptive field is:

$$r_n = r_{n-1} + 2^{n/2 - 1} \quad (3)$$

Thus, when $n > 2$ and the n-th layer is a convolutional layer (n odd), the difference between the front and back receptive fields is $(k - 1) \cdot 2^{(n-1)/2}$. The receptive field $r_n$ of the neurons in the last pooling layer with respect to the input signal, when $n$ is an even number, is:

$$r_n = \left(2^{n/2} - 1\right)k + 1 \quad (4)$$

According to the condition $r_n = L$, the value of $k$ is obtained from:

$$k = \frac{L - 1}{2^{n/2} - 1} \quad (5)$$
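The recurrence above can be checked numerically. The following is a minimal sketch in plain Python (the function names are illustrative, not from the paper) that tracks the receptive field through alternating stride-1 convolutions and 2 × 1 max-pooling layers, and solves Equation (5) for the kernel size:

```python
# Receptive-field bookkeeping for the alternating conv/pool 1-DCNN described
# above (stride-1 convolutions, 2 x 1 max pooling with stride 2).

def receptive_field(n_layers, k):
    """Receptive field after n_layers alternating conv/pool layers (n even)."""
    r, jump = k, 1          # r: receptive field; jump: product of strides so far
    for n in range(2, n_layers + 1):
        if n % 2 == 0:      # max-pooling layer: window 2, stride 2
            r += jump       # (k_pool - 1) * jump = 1 * jump
            jump *= 2
        else:               # convolutional layer: kernel k, stride 1
            r += (k - 1) * jump
    return r

def kernel_for_length(L, n_layers):
    """Solve r_n = L for k via the closed form r_n = (2**(n/2) - 1)*k + 1."""
    return (L - 1) / (2 ** (n_layers // 2) - 1)

# For the signal length L = 3201 and n = 12 layers used in the paper,
# Equation (5) gives k ~= 50.8, matching the kernel size of 50 chosen there.
print(receptive_field(12, 50))             # (2**6 - 1)*50 + 1 = 3151
print(round(kernel_for_length(3201, 12), 1))
```

The layer-by-layer recurrence and the closed form of Equation (4) agree, which is a quick sanity check on the derivation.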

1-DDCNN
Dilated convolution is also called expanded convolution; it replaces the traditional CNN pooling operation by introducing the expansion ratio, which can completely retain the feature information in the original signal so that the convolution kernel of the same size can obtain a larger receptive field.
The receptive field of the n-th dilated convolutional layer is:

$$r_n = r_{n-1} + (k_n - 1) \cdot l_n \quad (6)$$

where $r_n$ is the receptive field size of the n-th layer network structure, $k_n$ is the convolutional kernel size of the n-th layer network structure, and $l_n$ is the expansion rate of the n-th layer network structure. Figure 2b shows the receptive fields' range in the third layer of the 1-DDCNN (output feature map 2) for the first layer (input sequence data) and the second layer (output feature map 1). The convolutional kernel size is 3 × 1 ($k = 3$), the step size is 1 × 1, and the expansion rate is 2 ($l_1 = l_2 = 2$). The receptive field size in the second layer corresponding to output feature map 2 is 5 × 1, and the receptive field size in the first layer is 9 × 1.
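The Figure 2b example can be reproduced directly from Equation (6); a short sketch (helper name is illustrative):

```python
# Receptive field of stacked 1-D dilated convolutions (stride 1), following
# r_n = r_{n-1} + (k_n - 1) * l_n. Reproduces the Figure 2b example:
# kernel 3 x 1 and dilation rate 2 in both layers.

def dilated_receptive_field(kernels, dilations):
    r = 1                              # a single input point
    for k, l in zip(kernels, dilations):
        r += (k - 1) * l
    return r

print(dilated_receptive_field([3], [2]))        # one layer:  5 x 1
print(dilated_receptive_field([3, 3], [2, 2]))  # two layers: 9 x 1
```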

1-DDCNN Fault Diagnosis Model Framework
The structural framework of the proposed fault diagnosis method based on the 1-DDCNN is shown in Figure 3. The model fault diagnosis process was as follows:

Step 1: Datasets were divided into training set, validation set, and test set.
Step 2: According to the structure and parameters of the traditional 1-DCNN model, the 1-DDCNN was preliminarily designed.
Step 3: The diagnostic accuracy of the multi-feature 1-DDCNN model under different convolutional kernel sizes was investigated to determine the final model hyper-parameters.
Step 4: The proposed model was trained and tested with a test set to obtain the fault diagnosis accuracy.

Data Description and Operating Environment
Due to the insufficiency of fault data from real operating conditions, there is an incentive to use AMESim® to model the landing gear R/E system and obtain fault datasets. According to the mutual logical relationship between components, the landing gear R/E system model was established, and is presented in Figure 4. The blue section represents the hydraulic subsystem, the green section signifies the mechanical subsystem, and the red section denotes the external load of the system. Component parameter settings in the model are shown in Table 2.

The specific parameters of the FMEA in Section 2.2 are shown in Table 3. The failure status 1 curve in subgraphs a, b, c, and d of Figure 5 shows the main parameters' variation trends under normal conditions: the entire landing gear R/E process takes 32 s, during which the landing gear retraction time is 7.5 s and the extension time is 10.8 s. These times are similar to those specified in the manual, and the manual requires that the R/E time shall not exceed the specified time by 1 s, or it will be regarded as a fault [16]. According to the fault thresholds in Table 3, 300 simulations were conducted for each of the six fault states, and the four parameters (actuator cylinder displacement, system pressure, and the pressures at the right and left ends of the actuating cylinder) were sampled. The sampling interval was set as 0.01 s, and 1800 samples were obtained. Training, validation, and test sets were divided as 8:1:1, respectively.
The details of the single-feature and multi-feature datasets are shown in Tables 4 and 5, respectively, and the operating environment for the simulation is described in Table 6.
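The dataset bookkeeping above can be sketched in a few lines. The 0.01 s sampling interval producing 3201 points per channel over the 32 s R/E process is an inference from the sequence length quoted later, so treat it as an assumption:

```python
# Simulated dataset bookkeeping: 6 failure statuses x 300 runs, split 8:1:1.
# The per-sample sequence length of 3201 is consistent with sampling the
# 32 s R/E process at a 0.01 s interval (assumption about the setup).

n_statuses, runs_per_status = 6, 300
n_samples = n_statuses * runs_per_status        # 1800 samples in total
train = n_samples * 8 // 10                     # 1440 training samples
val = test = n_samples * 1 // 10                # 180 validation, 180 test

seq_len = int(round(32 / 0.01)) + 1             # 3201 points per channel
print(n_samples, train, val, test, seq_len)
```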

Experimental Model
Zhou [17] analyzed the following three important factors that affect the performance of a CNN: network organization structure, network depth, and number of feature maps. On the one hand, increasing the network depth can improve the recognition accuracy; on the other hand, increasing the number of feature maps can also improve the recognition accuracy. Therefore, it is necessary to conduct separate comparative studies to determine the final model parameters. From Section 4.1, it is known that the sequence length of a single sample is 3201, and the total number of convolutional and pooling layers is n = 12 (excluding the dropout layer). On the basis of Equation (5), the convolution kernel size of the first convolutional layer is 50. From the comparative test, the model with convolution kernel size 50, convolution number 4, and moving step size 1 at the first convolution layer has the best diagnostic effect. The specific parameters of the traditional 1-DCNN model are shown in Table 7. Due to the limited feature information extraction and inaccurate classification of the traditional 1-DCNN model with a single-feature parameter (actuator cylinder displacement), three further features, i.e., the system pressure and the pressures at the right and left ends of the actuating cylinder, are selected to jointly characterize the six failure statuses.
Referring to the traditional 1-DCNN, in the 1-DDCNN we initially set the convolution kernel size as 50, the step size as 1, and the expansion factor as $l_n = 2^{n-1}$; the calculation formula for the receptive field is:

$$r_n = (k - 1)\left(2^n - 1\right) + 1 \quad (7)$$

The network structure and initial settings are shown in Figure 6 and Table 8, respectively. The design principle of the model is that the output feature map size of the last convolution layer is similar to, or exactly the same as, the size of the input data. The proposed 1-DDCNN has the following advantages: firstly, it constructs the convolutional kernel to obtain a larger receptive field and completely retain the feature information in the original signal; secondly, it can act as a dropout layer to prevent over-fitting.
According to Equation (7), once the expansion factor is determined, the parameter that has a decisive influence on the receptive field size is the convolution kernel size. Since the output size of each dilated convolution layer in the 1-DDCNN is 3201 × 1, the convolution kernel size does not affect the output features' size of the dilated convolution layer, but has a great influence on the feature extraction degree of the original data. Therefore, it is necessary to investigate the convolutional kernel size's effect on the classification accuracy. The convolution kernel size was set to 30, 40, 50, 60, and 70 in turn to investigate the diagnostic accuracy of the 1-DDCNN under different conditions, and to determine the final model hyper-parameters. Table 9 and Figure 7 show the detailed diagnosis results for the effect of convolution kernel size on the test samples in each trial. As the convolution kernel size increases, the total number of training parameters rises, and the model running time expands accordingly. The average accuracy of the 1-DDCNN with different convolution kernel sizes reached more than 90%. In particular, when the convolution kernel size was 40, the highest average accuracy of 99.80% was achieved in five training sessions.
When the convolution kernel size was 50, the standard deviation was the lowest, at 0.0000, which indicates that the model had the highest stability under this condition. Figure 8 shows that accuracies were close to 100% at the 10th iteration with convolution kernel sizes ranging from 30 × 1 to 50 × 1, and fluctuated around 94.5% from the 50th iteration onwards with convolution kernel sizes ranging from 60 × 1 to 70 × 1. In fact, when the convolution kernel size was 60 × 1, the accuracy actually dropped. It can be seen from Figure 9 that the loss values within the convolution kernel size range of 30 × 1 to 50 × 1 approach 0 at the 10th iteration. The loss value at convolution kernel size 60 × 1 remained around 0.92 after the 10th iteration, which indicates over-fitting. The loss value of the model corresponding to convolution kernel size 70 × 1 remained around 0.14 after the 40th iteration. In particular, at the 65th iteration, the training and validation loss values at convolution kernel size 40 × 1 were both less than 1.0 × 10⁻⁵.
Considering the accuracy, stability, and training cost comprehensively, results are optimal when convolution kernel size is set as 40 × 1.

Comparative Experiment of Three Models under Different Datasets
To show the dilated convolution's advantages in feature extraction and information loss prevention, comparative experiments on three models, i.e., the traditional 1-DCNN, the 1-DDCNN, and the 1-DDCNN II, with dataset A and dataset B, were conducted. Depending on whether the same size is maintained between the feature map and the input data, two types of convolution operations exist: VALID (without padding) and SAME (with padding). Compared to the 1-DDCNN, the 1-DDCNN II's dilated convolution layers were VALID, and the convolution kernel size was uniformly set to 51 × 1. Inputting the dataset into the 1-DDCNN II, the output size of the flattening layer was 3264 × 1 after six dilated convolution layers, which is slightly larger than the sequence length of the input samples. The model structure is similar to the 1-DDCNN, and specific parameter settings are shown in Table 10. It can be seen from Table 11 and Figure 10 that, compared with dataset A, the diagnostic accuracies of the 1-DCNN, 1-DDCNN, and 1-DDCNN II with dataset B were higher, and the total training parameters and training time increased slightly. The average accuracies of the 1-DDCNN and 1-DDCNN II reached more than 99%, and the standard deviation of both was 0.0045, which indicates that both models are stable. The training processes and confusion matrices for the three models are shown in Figures 11-13.
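The flatten size of the 1-DDCNN II can be checked from the VALID-convolution output-length rule, under which each dilated layer shrinks the sequence by (k − 1)·l. The dilation schedule l_n = 2^(n−1) and the 64 feature maps in the last layer are assumptions used here to reproduce the quoted value:

```python
# Output length of stacked VALID (no padding) dilated convolutions, as in
# the 1-DDCNN II: each layer shrinks the sequence by (k - 1) * l.
# Assumptions (not stated explicitly in the text): dilation schedule
# l_n = 2**(n-1) over six layers, 64 feature maps in the last layer.

def valid_output_length(L, k, dilations):
    for l in dilations:
        L -= (k - 1) * l
    return L

length = valid_output_length(3201, 51, [2 ** n for n in range(6)])
print(length)        # 51 remaining time steps
print(length * 64)   # flatten size 51 * 64 = 3264, matching the text
```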
In Figure 13, the rows represent the predicted class (output class) and the columns represent the true class (target class). The diagonal cells represent the observations that were correctly classified, while the off-diagonal cells represent incorrectly classified observations. Both the number of observations and the percentage of the total number of observations are shown in each cell. The far-right column in the plot shows the percentages of all the examples predicted to belong to each class that were correctly and incorrectly classified; these metrics are often called the precision and false discovery rate, respectively. The bottom row in the plot shows the percentages of all the examples belonging to each class that were correctly and incorrectly classified; these metrics are often called the recall (or true positive rate) and false negative rate, respectively. The cell in the bottom right of the plot shows the overall accuracy.
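The metrics read off the confusion matrix can be sketched as follows; the 2-class counts are purely illustrative, not the paper's results:

```python
# Reading a confusion matrix laid out as described above:
# rows = predicted class, columns = true class.

def precision_recall(cm):
    """cm[i][j]: count of samples predicted as class i whose true class is j."""
    n = len(cm)
    precision = [cm[i][i] / sum(cm[i]) for i in range(n)]                    # per row
    recall = [cm[i][i] / sum(cm[j][i] for j in range(n)) for i in range(n)]  # per column
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    return precision, recall, accuracy

cm = [[28, 2],   # predicted class 0: 28 correct, 2 actually class 1
      [4, 26]]   # predicted class 1: 26 correct, 4 actually class 0
p, r, acc = precision_recall(cm)
print(p, r, acc)
```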
When the 1-DCNN model with dataset A reached the 100th iteration, the training accuracy was 98.63% and the training loss value was 0.0441; the validation accuracy was 98.61% and the validation loss value was 0.0609. The confusion matrix for the corresponding test set is shown in Figure 13a: four fault samples caused by "excessive noise from the pump" were incorrectly classified as "normal", corresponding to 2.2% of all 180 samples in dataset A. Similarly, six fault samples caused by "throttle valve blocking at left end of actuating cylinder" were incorrectly classified as "excessive noise from the pump", corresponding to 3.3% of all samples. Overall, 93.9% of the classifications were correct and 6.1% were incorrect. A likely reason for the identification errors is the high similarity between sample sequences of different failure statuses, as shown in Figure 5a.
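The per-cell percentages quoted above are each cell count divided by the total test-set size, which can be checked directly (only the counts stated in the text are used):

```python
total = 180           # test samples in dataset A

# Misclassified counts reported for Figure 13a:
noise_as_normal = 4   # "excessive noise from the pump" -> "normal"
valve_as_noise = 6    # "throttle valve blocking" -> "excessive noise from the pump"

pct_noise = round(100 * noise_as_normal / total, 1)  # 2.2%
pct_valve = round(100 * valve_as_noise / total, 1)   # 3.3%
```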
It can be seen in Figure 11 that the accuracies of both 1-DDCNNs with dataset B were close to 100% at the 10th iteration, while the accuracy of the traditional 1-DCNN model with datasets A and B did not reach 100% even after 100 epochs, which indicates that the traditional 1-DCNN model loses a large amount of information during the pooling process. Figure 12 shows that the loss values of the 1-DDCNN and 1-DDCNN II with dataset B were both close to 0 at the 10th iteration, whereas the loss value of the 1-DCNN with dataset B converged more slowly than that of the other models. The training and validation loss convergence curves of the 1-DCNN showed a significant gap after the first 20 iterations. In particular, at the 63rd iteration, the training and validation loss values of the 1-DDCNN with dataset B were both less than 1.0 × 10⁻⁵. To limit training cost, the number of iterations was therefore set to 63 in subsequent model training.
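The stopping criterion described above (halt once both the training and validation losses drop below 1.0 × 10⁻⁵) can be sketched as follows; the loss curves here are synthetic placeholders, not the actual training logs:

```python
def stopping_epoch(train_loss, val_loss, tol=1e-5):
    """Return the first epoch (1-based) at which both losses fall below tol."""
    for epoch, (t, v) in enumerate(zip(train_loss, val_loss), start=1):
        if t < tol and v < tol:
            return epoch
    return None  # criterion never met within the given epochs

# Synthetic exponentially decaying loss curves for illustration:
train = [0.5 * 0.8 ** e for e in range(100)]
val = [0.6 * 0.8 ** e for e in range(100)]
epoch = stopping_epoch(train, val)
```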
It can be seen in Figure 13 that the test accuracies of subgraphs b, d, and f are higher than those of subgraphs a, c, and e. In particular, the test accuracy of the 1-DDCNN (subgraph d) with dataset B reached 100%. Figure 14 shows the visualization of the output features of the Conv1 and Flatten layers in the 1-DDCNN for the six failure statuses. The feature similarity of the four-channel output of Conv1 was relatively high across the six failure statuses. After the convolution and flattening operations, the six failure statuses exhibited unique characteristics, which is conducive to distinguishing the failure status of the model.

Table 3.

Conclusions
Since the 2-DCNN cannot directly process one-dimensional time-series data and often requires complex pre-processing, a novel 1-DDCNN is proposed for landing gear R/E system fault diagnosis in this paper. Dilated convolution can exponentially increase the receptive field of the convolution kernel as convolution layers are added, acquiring more redundant information to alleviate the influence of randomness. The displacement of the actuator cylinder was first selected as the single feature parameter, and diagnosis classification was carried out with the traditional 1-DCNN model, whose average diagnosis accuracy reached 91.80%. Because of the limited feature-information extraction and inaccurate diagnosis of the traditional 1-DCNN with a single feature, multiple feature parameters were selected to jointly represent the fault and were input into the proposed model for feature integration. The influence of the convolution kernel size on classification accuracy was also explored; when the convolution kernel size is 50, the model has the highest stability. The results show that the average diagnostic accuracy of the proposed model is 99.80%, higher than that of the other models.
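The receptive-field growth described above can be sketched with a short calculation; the layer count and kernel sizes here are illustrative, not the paper's exact architecture:

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 1-D convolution layers.

    Each layer with kernel size k and dilation d widens the receptive
    field by (k - 1) * d input samples.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Three kernel-size-3 layers: standard vs. dilation doubling per layer.
standard = receptive_field([3, 3, 3], [1, 1, 1])  # grows linearly -> 7
dilated = receptive_field([3, 3, 3], [1, 2, 4])   # grows exponentially -> 15
```

With dilations doubling each layer, the receptive field roughly doubles per added layer, whereas a standard stack only adds a constant (k − 1) samples per layer.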
Future work will address two aspects. First, the system is subject to noise in the actual working environment, so the robustness of the proposed model on noisy data needs to be verified. Second, this paper only considers the influence of a single fault parameter, such as oil mixing into the air or actuator leakage, on the system operating state. In the future, the more complex situation of simultaneous failures of multiple internal components, and their consequent effects on the system operating state, should be studied.