# Synthetic Aperture Radar Target Recognition with Feature Fusion Based on a Stacked Autoencoder


## Abstract


## 1. Introduction

## 2. Proposed Approach

The input data **X** is preprocessed by ZCA whitening, ${X}_{ZCAWhite}=TX$, where $T=U{P}^{-1/2}{U}^{T}$, and $U$ and $P$ are the eigenvectors and eigenvalues of the covariance matrix of **X**: $\Sigma =\frac{1}{m}{\displaystyle {\sum}_{i=1}^{m}({x}^{(i)}){({x}^{(i)})}^{T}}$.
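As a sanity check on the transform above, the whitening step can be sketched in NumPy as follows (the function name and the small `eps` regularizer on the eigenvalues are our additions for numerical stability, not part of the paper's formulation):

```python
import numpy as np

def zca_whiten(X, eps=1e-5):
    """ZCA-whiten data X of shape (n_features, m_samples).

    Assumes X has already been mean-centered. eps is an assumed
    regularizer added to the eigenvalues to avoid division by zero.
    """
    m = X.shape[1]
    # Covariance matrix: Sigma = (1/m) * sum_i x^(i) x^(i)^T
    sigma = (X @ X.T) / m
    # Eigendecomposition Sigma = U diag(P) U^T (U: eigenvectors, P: eigenvalues)
    P, U = np.linalg.eigh(sigma)
    # Whitening transform: T = U P^(-1/2) U^T
    T = U @ np.diag(1.0 / np.sqrt(P + eps)) @ U.T
    return T @ X
```

After whitening, the sample covariance of the output is approximately the identity matrix, which is the defining property of the transform.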

## 3. Feature Extraction

#### 3.1. Baseline Features

1. NumConRegion: the number of connected regions in the binary or dilated binary image.
2. Area: the total number of pixels with value one in the binary or dilated binary image.
3. Centroid: the center of mass of the binary or dilated binary image.
4. BoundingBox: the smallest rectangle containing the mass of the binary or dilated binary image.
5. MajorLength: the length (in pixels) of the major axis of the ellipse that has the same normalized second central moments as the mass of the binary or dilated binary image.
6. MinorLength: the length (in pixels) of the minor axis of the ellipse that has the same normalized second central moments as the mass of the binary or dilated binary image.
7. Eccentricity: the eccentricity of the ellipse that has the same second central moments as the mass of the binary or dilated binary image. The eccentricity is the ratio of the distance between the foci of the ellipse and its major axis length; the value is between 0 and 1.
8. Orientation: the angle (in degrees, ranging from −90 to 90) between the x-axis and the major axis of the ellipse that has the same second central moments as the mass of the binary or dilated binary image.
9. ConvexHull: the matrix that specifies the smallest convex polygon that can contain the mass of the binary or dilated binary image. Each row of the matrix contains the x- and y-coordinates of one vertex of the polygon. The first row is selected here to construct the feature vector.
10. ConvexHullNum: the number of vertices of the smallest convex polygon that can contain the mass of the binary or dilated binary image.
11. ConvexArea: the number of pixels in the convex hull that specifies the smallest convex polygon that can contain the mass of the binary or dilated binary image.
12. FilledArea: the number of pixels with value one in the filled image, which is a binary (logical) image of the same size as the bounding box of the mass of the binary or dilated binary image, with all holes filled in.
13. EulerNumber: the number of objects in the mass of the binary or dilated binary image minus the number of holes in those objects.
14. Extrema: the matrix that specifies the extrema points in the mass of the binary or dilated binary image. Each row of the matrix contains the x- and y-coordinates of one of the points. The format of the vector is [top-left top-right right-top right-bottom bottom-right bottom-left left-bottom left-top].
15. EquivDiameter: the diameter of a circle with the same area as the mass of the binary or dilated binary image.
16. Solidity: the proportion of the pixels in the convex hull that are also in the mass of the binary or dilated binary image.
17. Extent: the ratio of pixels in the mass of the binary or dilated binary image to pixels in the total bounding box.
18. Perimeter: the distance between each adjoining pair of pixels around the border of the mass of the binary or dilated binary image.
19. WeightCentroid: the center of mass of the binary or dilated binary image based on both location and intensity value. This measure is also based on the power-detected SAR chip.
20. MeanIntensity: the mean of all the intensity values in the mass of the power-detected image as defined by the binary or dilated binary image.
21. MinIntensity: the value of the pixel with the lowest intensity in the mass of the power-detected image as defined by the binary or dilated binary image.
22. MaxIntensity: the value of the pixel with the greatest intensity in the mass of the power-detected image as defined by the binary or dilated binary image.
23. SubarrayIndex: the cell array containing indices of pixels within the bounding box of the binary or dilated binary image. The first and last elements of each cell are selected here to construct the features.
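To make a few of these definitions concrete, here is a small NumPy sketch computing an illustrative subset of the 23 features from a binary mask (the function name and returned dictionary are our own; the paper computes these features with standard region-property tools):

```python
import numpy as np

def baseline_features(binary):
    """Compute an illustrative subset of the baseline region features.

    binary: 2-D boolean array (the binary or dilated binary SAR image).
    Only a handful of the 23 features are shown here.
    """
    ys, xs = np.nonzero(binary)
    area = len(ys)                         # (2) Area: count of one-valued pixels
    centroid = (ys.mean(), xs.mean())      # (3) Centroid: center of mass
    # (4) BoundingBox: top-left corner plus height and width
    top, left = ys.min(), xs.min()
    height = ys.max() - top + 1
    width = xs.max() - left + 1
    # (15) EquivDiameter: diameter of a circle with the same area
    equiv_diameter = np.sqrt(4.0 * area / np.pi)
    # (17) Extent: ratio of region pixels to bounding-box pixels
    extent = area / (height * width)
    return {
        "Area": area,
        "Centroid": centroid,
        "BoundingBox": (top, left, height, width),
        "EquivDiameter": equiv_diameter,
        "Extent": extent,
    }
```

For a solid rectangular mask, Extent is exactly 1 because the region fills its bounding box; targets with irregular outlines yield smaller values, which is what makes the feature discriminative.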

#### 3.2. TPLBP Operators

## 4. Deep Model

#### 4.1. Stacked Autoencoder

The input layer and hidden layer constitute an encoder that transforms the input signal **x** into the hidden representation **a**. Likewise, the hidden layer and output layer constitute a decoder that transforms **a** into the output signal $\widehat{x}$. It can be expressed as follows:

$a=f(Wx+b),\qquad \widehat{x}=g(\widehat{W}a+\widehat{b})$

where **W** and $\widehat{W}$ are the weight matrices of the encoder and decoder, respectively, and $b$ and $\widehat{b}$ are the corresponding bias vectors. Additionally, $f(\cdot )$ and $g(\cdot )$ are mapping functions such as the sigmoid or tanh function. When $\widehat{x}\approx x$, the autoencoder is considered to have reconstructed the input. For a dataset containing $m$ samples, the cost function is defined as follows [20]:

$J(W,b)=\frac{1}{m}{\displaystyle {\sum}_{i=1}^{m}\frac{1}{2}{\Vert {\widehat{x}}^{(i)}-{x}^{(i)}\Vert}^{2}}+\frac{\lambda}{2}{\Vert W\Vert}^{2}$

Let ${a}_{j}^{(2)}(x)$ denote the activation of hidden unit $j$ for input **x**; then ${\widehat{\rho}}_{j}=\frac{1}{m}{\displaystyle \sum _{i=1}^{m}[{a}_{j}^{(2)}({x}_{i})]}$ represents the average activation of that hidden unit. The average activation ${\widehat{\rho}}_{j}$ is constrained to equal $\rho $, the sparsity parameter, which typically takes a small value close to 0 [31].
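The encoder/decoder mapping and the sparsity-regularized cost can be sketched in NumPy as follows (the hyperparameter values `rho`, `beta` and `lam` are illustrative assumptions following Ng's sparse-autoencoder notes [31], not the paper's settings):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_cost(W, W_hat, b, b_hat, X, rho=0.05, beta=3.0, lam=1e-4):
    """Sparse autoencoder cost on data X of shape (n_features, m_samples).

    Illustrative sketch of the Section 4.1 formulation; biases and
    hyperparameters are assumed, not taken from the paper.
    """
    m = X.shape[1]
    A = sigmoid(W @ X + b)              # encoder: a = f(Wx + b)
    X_hat = sigmoid(W_hat @ A + b_hat)  # decoder: x_hat = g(W_hat a + b_hat)
    # Average activation rho_hat_j of each hidden unit over the m samples
    rho_hat = A.mean(axis=1)
    # KL-divergence penalty pushes each rho_hat_j toward the sparsity target rho
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    recon = 0.5 * np.sum((X_hat - X) ** 2) / m                  # reconstruction error
    decay = 0.5 * lam * (np.sum(W ** 2) + np.sum(W_hat ** 2))   # weight decay
    return recon + decay + beta * kl
```

Training minimizes this scalar with back-propagation; stacking then proceeds greedily, feeding each trained hidden representation **a** as the input of the next autoencoder.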

#### 4.2. Softmax Classifier
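A minimal sketch of the softmax classifier's forward pass, which maps the fused SAE features to class probabilities (the `(k_classes, n_features)` parameterization `theta` is an assumption for illustration, not the paper's exact formulation):

```python
import numpy as np

def softmax_predict(theta, X):
    """Class probabilities for samples X under a softmax classifier.

    theta: (k_classes, n_features) weight matrix (assumed layout);
    X: (n_features, m_samples). Returns a (k, m) probability matrix.
    """
    scores = theta @ X
    # Subtract the per-sample max before exponentiating for numerical stability
    scores -= scores.max(axis=0, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=0, keepdims=True)  # each column sums to 1
```

The predicted label of a sample is the row index of the largest probability in its column.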

## 5. Experiments

#### 5.1. The Influence of the SAE Network Structure on Performance

#### 5.2. The Comparison of Different Features

#### 5.3. Comparisons with Other Methods

## 6. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

1. Cong, Y.L.; Chen, B.; Liu, H.W.; Jiu, B. Nonparametric Bayesian Attributed Scattering Center Extraction for Synthetic Aperture Radar Targets. IEEE Trans. Signal Proc. **2016**, 64, 4723–4736.
2. Song, S.L.; Xu, B.; Yang, J. SAR Target Recognition via Supervised Discriminative Dictionary Learning and Sparse Representation of the SAR-HOG Feature. Remote Sens. Lett. **2016**, 8, 863.
3. Pei, J.F.; Huang, Y.L.; Huo, W.B.; Wu, J.J.; Yang, J.Y.; Yang, H.G. SAR Imagery Feature Extraction Using 2DPCA-Based Two-Dimensional Neighborhood Virtual Points Discriminant Embedding. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2016**, 9, 2206–2214.
4. Gao, F.; Mei, J.Y.; Sun, J.P.; Wang, J.; Yang, E.F.; Hussain, A. Target detection and recognition in SAR imagery based on KFDA. J. Syst. Eng. Electron. **2015**, 26, 720–731.
5. El-Darymli, K.; Gill, E.W.; McGuire, P.; Power, D.; Moloney, C. Automatic Target Recognition in Synthetic Aperture Radar Imagery: A State-of-the-Art Review. IEEE Access **2016**, 4, 6014–6058.
6. Mangai, U.; Samanta, S.; Das, S.; Chowdhury, P. A Survey of Decision Fusion and Feature Fusion Strategies for Pattern Classification. IETE Tech. Rev. **2010**, 27, 293–307.
7. Guan, D.D.; Tang, T.; Zhao, L.J.; Lu, J. A feature combining spatial and structural information for SAR image classification. In Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium, Milan, Italy, 26–31 July 2015; pp. 4396–4399.
8. Kim, S.; Song, W.J.; Kim, S.H. Robust Ground Target Detection by SAR and IR Sensor Fusion Using Adaboost-Based Feature Selection. Sensors **2016**, 16, 1117.
9. Zhang, H.S.; Lin, H. Feature Selection for Urban Impervious Surfaces Estimation Using Optical and SAR Images. In Proceedings of the 2015 Joint Urban Remote Sensing Event (JURSE), Lausanne, Switzerland, 30 March–1 April 2015; pp. 1–4.
10. Liu, M.J.; Dai, Y.S.; Zhang, J.; Zhang, X.; Meng, J.M.; Xie, Q.C. PCA-based sea-ice image fusion of optical data by HIS transform and SAR data by wavelet transform. Acta Oceanol. Sin. **2015**, 34, 59–67.
11. Chaudhary, M.D.; Upadhyay, A.B. Fusion of Local and Global Features using Stationary Wavelet Transform for Efficient Content Based Image Retrieval. In Proceedings of the 2014 IEEE Students' Conference on Electrical, Electronics and Computer Science (SCEECS), Bhopal, India, 1–2 March 2014.
12. Guo, H.Y.; Wang, J.Q.; Lu, H.Q. Multiple deep features learning for object retrieval in surveillance videos. IET Comput. Vis. **2016**, 10, 268–272.
13. Ren, Z.Q.; Deng, Y.; Dai, Q.H. Local visual feature fusion via maximum margin multimodal deep neural network. Neurocomputing **2016**, 175, 427–432.
14. Marmanis, D.; Datcu, M.; Esch, T.; Stilla, U. Deep Learning Earth Observation Classification Using ImageNet Pretrained Networks. IEEE Geosci. Remote Sens. Lett. **2016**, 13, 105–109.
15. Cui, Z.; Cao, Z.; Yang, J.; Ren, H. Hierarchical Recognition System for Target Recognition from Sparse Representations. Math. Probl. Eng. **2015**, 2015, 527095.
16. Zhao, Z.; Jiao, L.; Zhao, J.; Gu, J.; Zhao, J. Discriminant deep belief network for high-resolution SAR image classification. Pattern Recognit. **2016**, 61, 686–701.
17. Chen, S.; Wang, H.; Xu, F.; Jin, Y.Q. Target Classification Using the Deep Convolutional Networks for SAR Images. IEEE Trans. Geosci. Remote Sens. **2016**, 54, 4806–4817.
18. Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science **2006**, 313, 504–507.
19. Ni, J.C.; Xu, Y.L. SAR Automatic Target Recognition Based on a Visual Cortical System. In Proceedings of the 2013 6th International Congress on Image and Signal Processing (CISP), Hangzhou, China, 16–18 December 2013; pp. 778–782.
20. Geng, J.; Fan, J.C.; Wang, H.Y.; Ma, X.R.; Li, B.M.; Chen, F.L. High-Resolution SAR Image Classification via Deep Convolutional Autoencoders. IEEE Geosci. Remote Sens. Lett. **2015**, 12, 2351–2355.
21. Chen, Y.; Lin, Z.; Zhao, X.; Wang, G.; Gu, Y. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2014**, 7, 2094–2107.
22. Sun, Z.; Xue, L.; Xu, Y. Recognition of SAR target based on multilayer auto-encoder and SNN. Int. J. Innov. Comput. Inf. Control **2013**, 9, 4331–4341.
23. El-Darymli, K.; Mcguire, P.; Gill, E.W.; Power, D.; Moloney, C. Characterization and statistical modeling of phase in single-channel synthetic aperture radar imagery. IEEE Trans. Aerosp. Electron. Syst. **2015**, 51, 2071–2092.
24. Wolf, L.; Hassner, T.; Taigman, Y. Descriptor Based Methods in the Wild. In Proceedings of the Workshop on Faces in 'Real-Life' Images: Detection, Alignment, and Recognition, Marseille, France, 17–20 October 2008.
25. Ross, T.D.; Mossing, J.C. MSTAR evaluation methodology. In Proceedings of AeroSense'99, International Society for Optics and Photonics, Orlando, FL, USA, 5 April 1999; pp. 705–713.
26. LeCun, Y.; Ranzato, M. Deep learning tutorial. In Proceedings of the Tutorials in International Conference on Machine Learning (ICML'13), Atlanta, GA, USA, 16–21 June 2013.
27. Chen, Y.-W.; Lin, C.-J. Combining SVMs with various feature selection strategies. In Feature Extraction; Springer: Berlin/Heidelberg, Germany, 2006; pp. 315–324.
28. Kapur, J.N.; Sahoo, P.K.; Wong, A.K. A new method for gray-level picture thresholding using the entropy of the histogram. Comput. Vis. Graph. Image Process. **1985**, 29, 273–285.
29. Fundamentals, M. Dilation and Erosion. Matlab Help Version **2009**, 7, R2009a.
30. Heikkilä, M.; Pietikäinen, M.; Schmid, C. Description of Interest Regions with Center-Symmetric Local Binary Patterns. In Proceedings of the 5th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2006), Madurai, India, 13–16 December 2006; Kalra, P.K., Peleg, S., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 58–69.
31. Ng, A. Sparse autoencoder. CS294A Lect. Notes **2011**, 72, 1–19.
32. Bengio, Y.; Lamblin, P.; Popovici, D.; Larochelle, H. Greedy layer-wise training of deep networks. Adv. Neural Inf. Process. Syst. **2007**, 19, 153–160.
33. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature **1986**, 323, 533–536.
34. Bridle, J.S. Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition. In Neurocomputing: Algorithms, Architectures and Applications; Soulié, F.F., Hérault, J., Eds.; Springer: Berlin/Heidelberg, Germany, 1990; pp. 227–236.
35. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. **2008**, 9, 2579–2605.
36. Song, H.; Ji, K.; Zhang, Y.; Xing, X.; Zou, H. Sparse Representation-Based SAR Image Target Classification on the 10-Class MSTAR Data Set. Appl. Sci. **2016**, 6, 26.
37. Morgan, D.A.E. Deep convolutional neural networks for ATR from SAR imagery. In Proceedings of the Algorithms for Synthetic Aperture Radar Imagery XXII, Baltimore, MD, USA, 23 April 2015; p. 94750F.

**Figure 3.** Examples of baseline features. (**a**) A SAR chip processed by energy detection; (**b**) A binary image of SAR; (**c**) A dilated binary image of SAR. In (**b**,**c**), the centroid, bounding box, extrema and center of gravity of the target area are marked in blue, red, green and magenta, respectively.

**Figure 5.** The structure of the autoencoder and the SAE. (**a**) A three-layer autoencoder; (**b**) An SAE composed of two autoencoders.

**Figure 6.** The optical images of ten military targets in the MSTAR dataset. (**a**) BMP2-C21; (**b**) BTR70; (**c**) T72-132; (**d**) BRDM2; (**e**) BTR60; (**f**) T62; (**g**) ZSU234; (**h**) 2S1; (**i**) D7; (**j**) ZIL131.

**Figure 8.** The distribution of cascaded features and fused features. (**a**) The distribution of cascaded features; (**b**) The distribution of the fused features in the same feature space.

**Figure 9.** The classification accuracy of 10-class targets. (**a**) Comparison of the classification accuracy of baseline features, TPLBP features and fused features on 10-class targets; (**b**) Comparison of the classification accuracy of different methods, including SVM, SRC, CNN, SAE and the proposed method.

| Number | Feature Name | Number | Feature Name |
|---|---|---|---|
| 1 | NumConRegion | 13 | EulerNumber |
| 2 | Area | 14 | Extrema |
| 3 | Centroid | 15 | EquivDiameter |
| 4 | BoundingBox | 16 | Solidity |
| 5 | MajorLength | 17 | Extent |
| 6 | MinorLength | 18 | Perimeter |
| 7 | Eccentricity | 19 | WeightCentroid |
| 8 | Orientation | 20 | MeanIntensity |
| 9 | ConvexHull | 21 | MinIntensity |
| 10 | ConvexHullNum | 22 | MaxIntensity |
| 11 | ConvexArea | 23 | SubarrayIndex |
| 12 | FilledArea | | |

| Targets | BMP2 | BTR70 | T72 | BTR60 | 2S1 | BRDM2 | D7 | T62 | ZIL131 | ZSU234 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 17° | 233 | 233 | 232 | 256 | 299 | 298 | 299 | 299 | 299 | 299 | 2747 |
| 15° | 196 | 196 | 196 | 195 | 274 | 274 | 274 | 273 | 274 | 274 | 2426 |

| Categories | BMP2 | BTR70 | T72 | BTR60 | 2S1 | BRDM2 | D7 | T62 | ZIL131 | ZSU234 | Classification Accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 72.96 | 78.57 | 89.80 | 81.54 | 99.64 | 86.13 | 95.62 | 96.70 | 98.54 | 97.08 | 90.81 |
| TPLBP | 61.73 | 77.04 | 85.20 | 78.97 | 92.34 | 97.81 | 98.54 | 97.43 | 98.54 | 88.32 | 89.19 |
| Fused | 88.27 | 90.81 | 90.31 | 90.26 | 98.18 | 97.08 | 98.54 | 98.17 | 99.27 | 97.08 | 95.43 |

| Features | Baseline | TPLBP | Cascaded | Fused |
|---|---|---|---|---|
| Classification accuracy (%) | 85.70 | 83.80 | 91.01 | 93.61 |

| Algorithms | SVM [36] | SRC [36] | CNN [37] | SAE | Proposed Method |
|---|---|---|---|---|---|
| Classification accuracy (%) | 86.73 | 89.76 | 92.30 | 93.61 | 95.43 |

| SAE trained on images (%) | BMP2 | BTR70 | T72 | BTR60 | 2S1 | BRDM2 | D7 | T62 | ZIL131 | ZSU234 |
|---|---|---|---|---|---|---|---|---|---|---|
| BMP2 | 88.3 | 2.1 | 0.5 | 0.5 | 6.1 | 0.5 | 1.0 | 1.0 | 0.0 | 0.0 |
| BTR70 | 0.5 | 92.9 | 0.0 | 0.5 | 5.6 | 0.0 | 0.0 | 0.0 | 0.5 | 0.0 |
| T72 | 0.5 | 0.5 | 90.3 | 1.0 | 3.7 | 0.0 | 0.5 | 2.0 | 1.0 | 0.5 |
| BTR60 | 1.5 | 4.2 | 0.5 | 87.7 | 5.1 | 0.0 | 0.5 | 0.5 | 0.0 | 0.0 |
| 2S1 | 0.0 | 1.8 | 0.4 | 0.0 | 90.5 | 1.8 | 0.4 | 0.7 | 4.4 | 0.0 |
| BRDM2 | 1.0 | 0.4 | 0.4 | 0.4 | 0.0 | 96.7 | 0.7 | 0.0 | 0.0 | 0.4 |
| D7 | 0.0 | 0.0 | 0.0 | 0.4 | 0.0 | 0.7 | 98.5 | 0.4 | 0.0 | 0.0 |
| T62 | 0.0 | 0.4 | 0.4 | 0.4 | 0.4 | 0.7 | 0.7 | 90.5 | 0.4 | 6.1 |
| ZIL131 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.4 | 0.4 | 99.2 | 0.0 |
| ZSU234 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.8 | 1.1 | 0.4 | 96.7 |

| The proposed method (%) | BMP2 | BTR70 | T72 | BTR60 | 2S1 | BRDM2 | D7 | T62 | ZIL131 | ZSU234 |
|---|---|---|---|---|---|---|---|---|---|---|
| BMP2 | 88.3 | 1.5 | 4.1 | 4.6 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.5 |
| BTR70 | 1.0 | 90.8 | 0.5 | 7.2 | 0.0 | 0.5 | 0.0 | 0.0 | 0.0 | 0.0 |
| T72 | 7.7 | 0.5 | 90.3 | 1.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| BTR60 | 1.0 | 5.1 | 0.5 | 90.3 | 0.5 | 2.6 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2S1 | 0.0 | 0.0 | 0.0 | 0.0 | 98.2 | 0.0 | 0.3 | 1.5 | 0.0 | 0.0 |
| BRDM2 | 2.1 | 0.4 | 0.0 | 0.4 | 0.0 | 97.1 | 0.0 | 0.0 | 0.0 | 0.0 |
| D7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 98.5 | 0.4 | 0.7 | 0.4 |
| T62 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.7 | 98.2 | 1.1 | 0.0 |
| ZIL131 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.4 | 0.4 | 99.2 | 0.0 |
| ZSU234 | 0.0 | 0.0 | 0.0 | 0.0 | 0.4 | 0.0 | 1.8 | 0.7 | 0.0 | 97.1 |

| Consumed Time | Training Time (s) | Testing Time (s) |
|---|---|---|
| SAE trained on images | 4254.53 | 3.19 |
| Proposed method | 340.69 | 0.044 |

© 2017 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

Kang, M.; Ji, K.; Leng, X.; Xing, X.; Zou, H. Synthetic Aperture Radar Target Recognition with Feature Fusion Based on a Stacked Autoencoder. *Sensors* **2017**, *17*, 192. https://doi.org/10.3390/s17010192