Article

A Residual Optronic Convolutional Neural Network for SAR Target Recognition

Ziyu Gu, Zicheng Huang, Xiaotian Lu, Hongjie Zhang and Hui Kuang
1 Chinese Academy of Space Technology, Beijing 100048, China
2 Shanghai Jiao Tong University, Shanghai 200030, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Photonics 2025, 12(7), 678; https://doi.org/10.3390/photonics12070678
Submission received: 30 March 2025 / Revised: 14 May 2025 / Accepted: 26 May 2025 / Published: 5 July 2025

Abstract

Deep learning (DL) has shown great capability in remote sensing and automatic target recognition (ATR). However, huge computational costs and power consumption are challenging the development of current DL methods. Optical neural networks have recently been proposed as a new computing mode to replace electronic implementations of artificial neural networks. Here, we develop a residual optronic convolutional neural network (res-OPCNN) for synthetic aperture radar (SAR) recognition tasks. We implement almost all computational operations in optics and thereby significantly decrease the network's computational cost. Compared with digital DL methods, res-OPCNN offers ultra-fast speed, low computational complexity, and low power consumption. Experiments on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset demonstrate the lightweight nature of the optronic method and its feasibility for SAR target recognition.

1. Introduction

In recent decades, with the development of synthetic aperture radar (SAR) information processing, a large body of research and a wide range of algorithms have emerged for automatic target recognition (ATR) [1]. Recently, deep learning (DL) has demonstrated strong capability in SAR image processing and ATR tasks by extracting target features through deep neural networks [2,3,4,5,6]. However, owing to the explosion of data, data-driven deep learning methods face serious challenges in processing speed and power consumption. At present, artificial intelligence algorithms represented by convolutional neural networks mainly rely on hardware architectures with electronics as the physical carrier [7,8,9,10,11]. The development of digital processors depends on chip performance, and chip manufacturing is constrained by Moore’s Law, so the limitations of processor performance are gradually emerging. According to Moore’s Law, the number of transistors that can be accommodated on an integrated circuit doubles approximately every 18 to 24 months; in other words, the performance of electronic processors roughly doubles only every two years. Since 2012, however, the floating-point computation required by convolutional neural networks has grown exponentially, which would require processor performance to double every 3.4 months. The demand for computing power therefore far outpaces the actual growth of processor performance, and computer hardware will be unable to achieve real-time processing of large-scale neural networks. Hence, the strong demand for alternatives to electronic computing has motivated the emergence of computing on optical platforms [12,13,14].
Recently, the optical neural network (ONN) has proved to be an effective way to implement deep learning methods on optical platforms. Compared with digital processing, optical processing provides ultra-fast speed, low power consumption, and inherent parallelism. Researchers have successfully implemented several ONNs. In 1978, the optical vector-matrix multiplier proposed by J.W. Goodman provided the basis for the development of optical matrix multiplication and free-space optical neural networks [15]. In digital neural networks, matrix multiplication consumes the most energy and time, whereas in optical neural networks it can be performed at the speed of light. By introducing nonlinear optical components, the nonlinear modulation of digital neural networks can also be realized optically. Therefore, a trained optical neural network is capable of performing all-optical computation without additional energy input. Lin et al. first proposed the diffractive deep neural network (D2NN), and Fourier-space diffractive DNNs and all-optical neural networks were subsequently proposed by Yan and Zuo [16,17,18,19,20]. In 2017, Marin Soljačić et al. reported an on-chip integrated optical neural network based on 56 programmable Mach–Zehnder interferometers (MZIs), and in the same year Paul R. Prucnal et al. reported an on-chip silicon-based optical recurrent neural network based on a parallel cascaded microring resonator (MRR) structure [21,22,23,24,25,26,27]. In our previous work, we proposed an optronic convolutional neural network (OPCNN) that uses optical 4f lens systems to perform optical convolution and applied it to SAR target classification [28,29]. However, the methods above all face the same problem: gradient calculation on both the phase and amplitude components can cause gradient vanishing and exploding, leading to loss of gradient information and instability during network training [30].
In this work, we propose a residual optronic convolutional neural network (res-OPCNN) to overcome this difficulty and develop an opto-electronic method for SAR ATR. Specifically, we design an Encoder/Classifier optronic convolutional structure with residual connections. In the Encoder, we implement optical convolutional layers to extract image features and introduce trainable light shortcut connections between optical convolutional layers, providing direct optical field mapping that skips the intermediate convolutional layers. Moreover, we use optical lens demagnification systems to implement down-sampling in optics. In the Classifier, a global average pooling (GAP) layer transforms the extracted features of the hidden layers into the output layer [31]. The light intensity of the Airy disk detected by the camera represents the classification confidence of each candidate class. The Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset is used to test the performance of res-OPCNN on SAR ATR tasks. According to the experimental results, res-OPCNN obtains 95.3% classification accuracy with only 4% of the computational cost of a comparable digital CNN, while implementing almost all computational operations in optics. Moreover, power consumption is decreased by more than 75% compared with digital DL methods running on computer servers. res-OPCNN demonstrates the feasibility and efficiency of opto-electronic methods and provides a new lightweight technique for SAR target recognition. Compared with digital methods, res-OPCNN requires fewer computational operations, less processing time, and less power, so operation complexity, system latency, and power requirements can all be reduced in optical applications.

2. Method

The res-OPCNN comprises an Encoder and a Classifier, and its architecture is shown in Figure 1. The Encoder extracts the features of the input image and consists of optical convolutional layers with residual connections and down-sampling layers. The GAP layer in the Classifier transforms the extracted features of the hidden layers into the output layer. The class whose confidence value (light intensity) is highest is taken as the classification result.

2.1. The Encoder

2.1.1. Optical Convolutional Layer with Residual Connection

In the Encoder, we implement optical convolutional layers to extract the features of the input image. To avoid gradient vanishing and exploding, we design an optical residual connection that links two optical convolutional layers and provides direct optical field mapping by skipping one convolutional layer. In res-OPCNN, beam splitter cubes and reflecting mirrors realize the optical residual connection. The optical convolutional layers with a residual connection are shown in Figure 2a. The output light field $L_{\mathrm{out}}$ through the residual connection comprises the incident light field after free-space diffractive propagation plus the result of the two optical convolutional layers, which can be expressed as follows:

$$ L_{\mathrm{out}} = L_{\mathrm{in}} \ast h_k + \tilde{F}(L_{\mathrm{in}}) \tag{1} $$
where $h_k$ denotes the free-space diffractive propagation kernel for the incident light field and $\tilde{F}$ denotes the complex transform function of the two optical convolutional layers. The free-space propagation kernel $h_k$ can also be characterized as a linear spatial-frequency-domain filter for propagation distance $d_k$:

$$ H_k(f_X, f_Y) = \exp\!\left[ j\,\frac{2\pi d_k}{\lambda} \sqrt{1 - (\lambda f_X)^2 - (\lambda f_Y)^2} \right] \tag{2} $$

where $H_k(f_X, f_Y)$ denotes the frequency-domain spectrum of the free-space propagation kernel $h_k$, $d_k$ is the free-space propagation distance, $\lambda$ is the light wavelength, and $(f_X, f_Y)$ are the spatial frequency coordinates. The complex transform function $\tilde{F}$ comprises two optical convolutional layers (OP-Conv1 and OP-Conv2), i.e., two convolutions with nonlinear activation.
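For illustration, the residual connection can be simulated numerically with the angular spectrum method, as in the training code described in Section 3. The following Python sketch is an assumption-laden illustration, not the authors' implementation: the function names, the `conv_branch` placeholder for the OP-Conv1/OP-Conv2 transform $\tilde{F}$, and the sampling parameters are all illustrative.

```python
import numpy as np

def angular_spectrum_kernel(shape, pixel_pitch, wavelength, distance):
    """Frequency-domain free-space propagation kernel H_k(f_X, f_Y) of Eq. (2)."""
    ny, nx = shape
    fx = np.fft.fftfreq(nx, d=pixel_pitch)           # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=pixel_pitch)           # spatial frequencies along y
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    H = np.exp(1j * 2 * np.pi * distance / wavelength * np.sqrt(np.maximum(arg, 0.0)))
    H[arg < 0] = 0.0                                 # suppress evanescent components
    return H

def residual_block(L_in, conv_branch, pixel_pitch, wavelength, distance):
    """Eq. (1): free-space shortcut of L_in plus the two-layer convolutional branch."""
    H = angular_spectrum_kernel(L_in.shape, pixel_pitch, wavelength, distance)
    shortcut = np.fft.ifft2(np.fft.fft2(L_in) * H)   # L_in * h_k computed in the spectrum domain
    return shortcut + conv_branch(L_in)              # fields combined by the beam splitter cube
```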
In the optical convolutional layer, two-dimensional convolution is implemented through the optical Fourier transform. Specifically, spatial light modulators (SLMs) and an optical 4f lens system are used to implement the Fourier transform optically. As shown in Figure 2b, the convolution of two matrices becomes an element-wise product of their frequency spectra. The phase information of the kernel is set as the trainable parameter, and the Fourier transform is performed by the lens. The phase-only SLM placed at the 2f position loads the phase kernel, and the camera placed at the 4f position collects the convolution result. As shown in Figure 2b, we load the input image $I(x_f, y_f)$ on the amplitude-only SLM and each phase kernel $K_j(f_x, f_y)$ on the phase-only SLM. The convolution results $O_j(x_f, y_f)$ can be expressed as follows:

$$ O_j(x_f, y_f) = \mathcal{F}^{-1}\!\left[ \mathcal{F}\left[ I(x_f, y_f) \right] \cdot K_j(f_x, f_y) \right] \tag{3} $$

where $(\cdot)$ and $\mathcal{F}$ denote the two-dimensional element-wise matrix product and the optical Fourier transform, respectively, and $I(x_f, y_f)$, $K_j(f_x, f_y)$, and $O_j(x_f, y_f)$ denote the input light field, the $j$-th channel phase kernel, and the $j$-th channel convolution output.
$(x_f, y_f)$ denote the spatial coordinates, and $(f_x, f_y)$ denote the spatial frequency coordinates. Here, the phase kernel modulation values vary from 0 to $2\pi$:

$$ K(f_x, f_y) = \exp\!\left[ j\,2\pi (f_x, f_y) \right] = \exp\!\left[ j\,\frac{2\pi}{\lambda f}(x_f, y_f) \right] \tag{4} $$
where $\lambda$ denotes the light wavelength and $f$ denotes the focal length of the optical 4f lens system.
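As a minimal sketch of how one channel of this 4f convolution can be simulated, assuming a square input already matched to the SLM resolution (the function name and the intensity readout convention are illustrative, not the authors' code):

```python
import numpy as np

def optical_4f_convolution(image, phase_kernel):
    """Simulate one channel of the 4f optical convolution of Eq. (3).

    `image` is the amplitude-encoded input I(x_f, y_f); `phase_kernel` holds the
    trainable phase values in [0, 2*pi) loaded on the phase-only SLM at the 2f plane.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(image))            # first lens: optical Fourier transform
    modulated = spectrum * np.exp(1j * phase_kernel)          # phase-only modulation K_j(f_x, f_y)
    output_field = np.fft.ifft2(np.fft.ifftshift(modulated))  # second lens: back to the 4f plane
    return np.abs(output_field) ** 2                          # camera records the intensity
```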

2.1.2. Optical Down-Sampling Layer

In res-OPCNN, the optical down-sampling layer is used to decrease the spatial dimension of the extracted features; reducing the spatial dimension improves computational efficiency and saves network parameters. An optical demagnification system is used in this work to perform strided convolution and constitutes the down-sampling layer. The optical demagnification system comprises a pair of lenses whose focal lengths have a ratio of 2:1. The optical down-sampling layer is shown in Figure 3. According to Fresnel diffraction theory and the lens imaging system, the input light field $S_i(x, y)$ and the output light field $S_o(x, y)$ satisfy the following coordinate transformation:

$$ S_o(x, y) \approx \frac{A^2}{\lambda^2 f_1 f_2}\, S_i\!\left( -\frac{f_1}{f_2}x,\; -\frac{f_1}{f_2}y \right) \tag{5} $$

where $S_i(x, y)$ and $S_o(x, y)$ are the input and output light fields of the optical lens demagnification system and $A^2/(\lambda^2 f_1 f_2)$ collects the irrelevant coefficient terms. Therefore, if we ignore the coefficient terms, the output of the system is the flipped image of the input signal, with the coordinates scaled by the factor $f_2/f_1$. In res-OPCNN, we use lenses L1 and L2 with focal lengths $f_1 = 100$ mm and $f_2 = 50$ mm to form the optical lens demagnification system. The system output is the flipped input down-sampled at two-pixel intervals.
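Ignoring the constant coefficient, the effect of this demagnification stage on a sampled field can be sketched as follows; this is a simplified model assuming an integer demagnification factor, with an illustrative function name rather than the authors' procedure:

```python
import numpy as np

def optical_demagnify(field, f1=100e-3, f2=50e-3):
    """Approximate the lens demagnification system of Eq. (5) as a flip plus down-sampling.

    With f1 = 100 mm and f2 = 50 mm the magnification is f2/f1 = 1/2, so the output
    is the flipped input sampled at two-pixel intervals; constant factors are ignored.
    """
    factor = int(round(f1 / f2))        # 2 for the lens pair used in res-OPCNN
    flipped = field[::-1, ::-1]         # image inversion introduced by the lens system
    return flipped[::factor, ::factor]  # down-sampling at `factor`-pixel intervals
```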

2.2. The Classifier

In res-OPCNN, the Classifier transforms the extracted features of the hidden layer into the output layer. We first apply one optical convolutional layer to reduce the feature maps to ten channels, one per target class. Then, the global average pooling (GAP) layer transforms the ten-channel feature maps into a ten-element classification confidence vector. The GAP layer accounts for the average intensity of the whole image. Specifically, we construct the GAP layer in optics with an optical 2f lens system. As shown in Figure 4, a collimating lens focuses the input light at its focal point, and a camera collects the Airy disk in the central region. The Airy disk of the input light at the zero-order frequency component represents the summation of the input light field:

$$ F(x, y) = \iint t(m, n)\, \exp\!\left[ -j\,\frac{2\pi}{\lambda f}(mx + ny) \right] \mathrm{d}m\, \mathrm{d}n, \qquad \mathrm{Intensity}_{\mathrm{Airy}} \propto F(0, 0) = \iint t(m, n)\, \mathrm{d}m\, \mathrm{d}n \tag{6} $$

where $t(m, n)$ and $F(x, y)$ denote the input matrix and the light field on the focal plane of the lens. We take the light intensity of the Airy disk of each of the ten channel feature matrices as the ten confidence values of the target class vector output by the res-OPCNN Classifier.
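A minimal sketch of this optical GAP readout, assuming the ten feature maps are available as a complex-valued array of shape (channels, H, W); the function name and the implicit proportionality constant are illustrative assumptions:

```python
import numpy as np

def optical_gap(feature_maps):
    """Simulate the optical GAP layer of Eq. (6).

    The on-axis value F(0, 0) of the lens Fourier transform equals the sum of the
    input field, so the detected Airy-disk intensity of each channel is proportional
    to |sum of that channel's field|^2. Returns one confidence value per channel.
    """
    channel_sums = feature_maps.reshape(feature_maps.shape[0], -1).sum(axis=1)
    return np.abs(channel_sums) ** 2   # ten-element confidence vector read by the camera
```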

3. Experimental Results

Here, we implement res-OPCNN on an optical platform. The detailed res-OPCNN architecture is shown in Figure 5. Specifically, we use an amplitude-only SLM (Holoeye Pluto, 8 μm pixel pitch) as the input plane to encode the target images. A laser (10 mW, 532 nm) produces coherent light, which is expanded into a collimated beam by a beam expander. As shown in Figure 5, we implement two optical convolutional layers in the Encoder and use a reflecting mirror to realize the residual connection. A beam splitter cube combines the incident light and the convolution results into the output light, which is detected by an sCMOS camera (Hamamatsu C14440; 2048 × 2048 pixels, 6.5 μm pixel pitch). Moreover, we implement the optical demagnification lens system, consisting of two lenses with different focal lengths, as the down-sampling layer. In the Encoder, the number of output feature channels after the optical convolutional layers is 40. In the Classifier, we first use an optical convolutional layer to reduce the feature channels from 40 to 10, the number of target classes. Then, the optical GAP layer transforms the feature maps into the confidence vector: a collimating lens focuses the incident light, the Airy disk is collected by an sCMOS camera, and the light intensity of the ten channels is calculated, with the brightest channel giving the classification result.
For res-OPCNN, we used the MSTAR dataset as the recognition target, with 2747 images for training and 2425 images for validation. The training process was carried out on a computer. First, we set the phase of the frequency spectrum as the training variable; the phase values were not quantized during the training stage. When training our networks, we accounted for the wavelength, the focal lengths of the lenses, and the pixel pitch of the SLMs in the training code, and we simulated the optical part using the fast Fourier transform algorithm and angular spectrum propagation. After obtaining a well-trained phase matrix, we quantized the phase values to gray values: the SLMs address 8-bit gray-level patterns, so the trained weights had to be quantized to 256 gray levels before encoding. For comparison, we trained a digital convolutional network with a structure similar to res-OPCNN, composed of the same convolutional layers, residual connection, down-sampling layer, and global average pooling layer; the difference was that all matrix calculations were executed digitally. We trained both res-OPCNN and the digital network on the same computer server with two graphics processing units (NVIDIA A6000) using PyTorch and Python 3.8. Training used a batch size of 50 and 100 epochs, taking up to 40 h. We used the stochastic gradient descent (SGD) optimizer with an initial learning rate of 2.5 × 10⁻⁴ to minimize the loss until the model converged.
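The post-training quantization step can be sketched as follows; this is a simple illustration of mapping trained phases to the SLM's 256 gray levels, with the function name and rounding scheme assumed rather than taken from the authors' code:

```python
import numpy as np

def quantize_phase_to_gray(phase, levels=256):
    """Quantize a trained phase matrix (radians) to 8-bit SLM gray levels."""
    wrapped = np.mod(phase, 2 * np.pi)                      # wrap phases into [0, 2*pi)
    gray = np.round(wrapped / (2 * np.pi) * (levels - 1))   # map to 0..255
    return gray.astype(np.uint8)                            # pattern addressed on the phase-only SLM
```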
Here, we validate res-OPCNN's target recognition performance on the MSTAR dataset. The loss function and recognition accuracy during training are shown in Figure 6a. res-OPCNN achieves approximately 95.3% recognition accuracy on the MSTAR validation targets, and the confusion matrix of its recognition performance is shown in Figure 6b. Compared with the digital CNN, res-OPCNN requires only 4% of the computational cost. res-OPCNN also has lower power consumption: since most of its computational operations are performed in optics, the power consumption comprises mainly the laser, the SLMs, and the cameras. The overall power consumption of res-OPCNN is at most 250 W, whereas the digital CNN costs almost 1000 W on the computer servers. The validation results, computational costs, and power consumption of res-OPCNN and the digital CNN are listed in Table 1. Hence, res-OPCNN offers slightly higher classification accuracy with lower computational complexity and lower power consumption.

4. Conclusions and Discussion

This paper proposed a residual optronic convolutional neural network (res-OPCNN) for SAR target recognition. We developed an Encoder/Classifier optronic structure with a residual connection. res-OPCNN implements almost all computational operations in optics, significantly reducing the network's computational cost and power consumption. We used MSTAR dataset objects as the classification targets to validate res-OPCNN's recognition performance. Compared with the digital CNN, res-OPCNN achieved 95.3% classification accuracy with only 4% of the computational cost. Moreover, the power consumption of the optical computing platform is decreased by more than 75% compared with digital DL methods running on computer servers. The recognition results on the MSTAR dataset demonstrate the efficiency and lightweight nature of the proposed res-OPCNN for SAR ATR tasks.
However, for practical application, our method still needs improvement in future work. Considering the low SNR (signal-to-noise ratio) and speckle noise in the MSTAR dataset, we need to take measures to suppress speckle noise. We also need to explore a down-sampling method that is more robust against misalignment in the optical system: although strided convolution improves the performance of res-OPCNN, it is sensitive to position variance, which affects the portability of res-OPCNN. Finally, the structure of res-OPCNN should be made more miniaturized and integrated for ease of extension and deployment; this is key to the application of res-OPCNN and is also the general development trend of ONNs.

Author Contributions

Conceptualization, Z.G.; Methodology, Z.H.; Validation, Z.G.; Resources, X.L.; Data curation, H.Z.; Writing—original draft, Z.H.; Funding acquisition, H.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (U2341202 and 62205371) and the Beijing Nova Program (20230484285).

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Authors Ziyu Gu, Xiaotian Lu, Hongjie Zhang and Hui Kuang were employed by the Chinese Academy of Space Technology. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Chen, Y.; Lin, Z.; Zhao, X.; Wang, G.; Gu, Y. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 2094–2107.
2. Shi, M.; Gao, Y.; Chen, L.; Liu, X. Dual-Branch Multiscale Channel Fusion Unfolding Network for Optical Remote Sensing Image Super-Resolution. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6519105.
3. Shi, M.; Gao, Y.; Chen, L.; Liu, X. Dual-Resolution Local Attention Unfolding Network for Optical Remote Sensing Image Super-Resolution. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6016105.
4. Shi, M.; Gao, Y.; Chen, L.; Liu, X. Double Prior Network for Multidegradation Remote Sensing Image Super-Resolution. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 3131–3147.
5. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems 25, Lake Tahoe, NV, USA, 3–6 December 2012.
6. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
7. Jouppi, N.P.; Young, C.; Patil, N.; Patterson, D.; Agrawal, G.; Bajwa, R.; Bates, S.; Bhatia, S.; Boden, N.; Borchers, A.; et al. In-datacenter performance analysis of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture, Toronto, ON, Canada, 24–28 June 2017; pp. 1–12.
8. Chen, T.; Du, Z.; Sun, N.; Wang, J.; Wu, C.; Chen, Y.; Temam, O. DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. ACM SIGARCH Comput. Archit. News 2014, 42, 269–284.
9. Zhang, S.; Du, Z.; Zhang, L.; Lan, H.; Liu, S.; Li, L.; Guo, Q.; Chen, T.; Chen, Y. Cambricon-X: An accelerator for sparse neural networks. In Proceedings of the 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Taipei, Taiwan, 15–19 October 2016; pp. 1–12.
10. Pei, J.; Deng, L.; Song, S.; Zhao, M.; Zhang, Y.; Wu, S.; Wang, G.; Zou, Z.; Wu, Z.; He, W.; et al. Towards artificial general intelligence with hybrid Tianjic chip architecture. Nature 2019, 572, 106–111.
11. Merolla, P.A.; Arthur, J.V.; Alvarez-Icaza, R.; Cassidy, A.S.; Sawada, J.; Akopyan, F.; Jackson, B.L.; Imam, N.; Guo, C.; Nakamura, Y.; et al. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 2014, 345, 668–673.
12. Waldrop, M.M. More than Moore. Nature 2016, 530, 144–148.
13. Moore, J.F. Predators and prey: A new ecology of competition. Harv. Bus. Rev. 1993, 71, 75–86.
14. Lin, X.; Yang, W.; Wang, K.L.; Zhao, W. Two-dimensional spintronics for low-power electronics. Nat. Electron. 2019, 2, 274–283.
15. Goodman, J.W.; Dias, A.R.; Woody, L.M. Fully parallel, high-speed incoherent optical method for performing discrete Fourier transforms. Opt. Lett. 1978, 2, 1–3.
16. Lin, X.; Rivenson, Y.; Yardimci, N.T.; Veli, M.; Luo, Y.; Jarrahi, M.; Ozcan, A. All-optical machine learning using diffractive deep neural networks. Science 2018, 361, 1004–1008.
17. Mengu, D.; Luo, Y.; Rivenson, Y.; Ozcan, A. Analysis of diffractive optical neural networks and their integration with electronic neural networks. IEEE J. Sel. Top. Quantum Electron. 2019, 26, 1–14.
18. Li, J.; Mengu, D.; Luo, Y.; Rivenson, Y.; Ozcan, A. Class-specific differential detection in diffractive optical neural networks improves inference accuracy. Adv. Photonics 2019, 1, 046001.
19. Yan, T.; Wu, J.; Zhou, T.; Xie, H.; Xu, F.; Fan, J.; Fang, L.; Lin, X.; Dai, Q. Fourier-space diffractive deep neural network. Phys. Rev. Lett. 2019, 123, 023901.
20. Zuo, Y.; Li, B.; Zhao, Y.; Jiang, Y.; Chen, Y.-C.; Chen, P.; Jo, G.-B.; Liu, J.; Du, S. All-optical neural network with nonlinear activation functions. Optica 2019, 6, 1132–1137.
21. Tait, A.N.; De Lima, T.F.; Zhou, E.; Wu, A.X.; Nahmias, M.A.; Shastri, B.J.; Prucnal, P.R. Neuromorphic photonic networks using silicon photonic weight banks. Sci. Rep. 2017, 7, 7430.
22. Zhang, H.; Gu, M.; Jiang, X.D.; Thompson, J.; Cai, H.; Paesani, S.; Santagati, R.; Laing, A.; Zhang, Y.; Yung, M.H.; et al. An optical neural chip for implementing complex-valued neural network. Nat. Commun. 2021, 12, 457.
23. Tahersima, M.H.; Kojima, K.; Koike-Akino, T.; Jha, D.; Wang, B.; Lin, C.; Parsons, K. Deep neural network inverse design of integrated photonic power splitters. Sci. Rep. 2019, 9, 1368.
24. Shen, Y.; Harris, N.C.; Skirlo, S.; Prabhu, M.; Baehr-Jones, T.; Hochberg, M.; Sun, X.; Zhao, S.; Larochelle, H.; Englund, D.; et al. Deep learning with coherent nanophotonic circuits. Nat. Photonics 2017, 11, 441–446.
25. Hughes, T.W.; Minkov, M.; Shi, Y.; Fan, S. Training of photonic neural networks through in situ backpropagation and gradient measurement. Optica 2018, 5, 864–871.
26. Jiao, S.; Liu, J.; Zhang, L.; Yu, F.; Zuo, G.; Zhang, J.; Zhao, F.; Lin, W.; Shao, L. All-optical logic gate computing for high-speed parallel information processing. Opto-Electron. Sci. 2022, 1, 220010.
27. Qiu, C.; Xiao, H.; Wang, L.; Tian, T. Recent advances in integrated optical directed logic operations for high performance optical computing: A review. Front. Optoelectron. 2022, 15, 1.
28. Gu, Z.; Huang, Z.; Gao, Y.; Liu, X. Training optronic convolutional neural networks on an optical system through backpropagation algorithms. Opt. Express 2022, 30, 19416–19440.
29. Gu, Z.; Shi, M.; Huang, Z.; Gao, Y.; Liu, X. In-Situ Training Optronic Convolutional Neural Network for SAR Target Recognition. In Proceedings of the IGARSS 2022—2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; pp. 3363–3366.
30. Dou, H.; Deng, Y.; Yan, T.; Wu, H.; Lin, X.; Dai, Q. Residual D2NN: Training diffractive deep neural networks via learnable light shortcuts. Opt. Lett. 2020, 45, 2688–2691.
31. Gu, Z.; Gao, Y.; Liu, X. Position-robust optronic convolutional neural networks dealing with images position variation. Opt. Commun. 2022, 505, 127505.
Figure 1. Overall architecture of res-OPCNN.
Figure 2. The implementation of optical convolutional layers with a residual connection. (a) OP-Conv1 and OP-Conv2 are connected with residual connections. (b) The structure of the optical convolutional layer.
Figure 3. The structure of the optical down-sampling layer.
Figure 4. The structure of the Classifier. The optical convolutional layer reduces the input channels to 10 channels. The GAP layer transforms the input features into the Airy disk confidence vector.
Figure 5. The optical implementation of res-OPCNN.
Figure 6. (a) Training loss and classification accuracy curve of res-OPCNN. (b) Confusion matrix of res-OPCNN for MSTAR target recognition.

Table 1. Comparison of the digital CNN and res-OPCNN.

Method        Power Consumption   Computational Complexity   Accuracy
Digital CNN   1000 W              18.56 MMac                 95.1%
res-OPCNN     250 W               748.6 KMac                 95.3%