Article
Peer-Review Record

SBNN: A Searched Binary Neural Network for SAR Ship Classification

Appl. Sci. 2022, 12(14), 6866; https://doi.org/10.3390/app12146866
by Hairui Zhu, Shanhong Guo *, Weixing Sheng and Lei Xiao
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 30 May 2022 / Revised: 1 July 2022 / Accepted: 3 July 2022 / Published: 7 July 2022
(This article belongs to the Special Issue Intelligent Computing and Remote Sensing)

Round 1

Reviewer 1 Report

(1)   The proposed SBNN model outperforms the existing CNN models and other existing BNN models in terms of both accuracy and complexity, which is not an easy job and demonstrates the good performance of the proposed SBNN model. I am curious why the proposed SBNN model achieves such good performance compared to the other CNN and BNN models. Is the major reason the adoption of NAS technology to obtain a good teacher model to teach the SBNN model?

(2)   I am curious which framework is used for training the proposed SBNN model. I suggest the authors give more description of the adopted framework for binary CNN model training.

(3)   For the implementation of the SBNN model, the best choice is a dedicated hardware accelerator or ASIC to exploit its low complexity and power consumption. I am wondering if there is another way to realize the proposed SBNN model in addition to the above-mentioned dedicated hardware accelerator or ASIC.

(4)   Please describe in more detail whether it is necessary to perform any SAR data pre-processing or post-processing tasks before or after the proposed SBNN.

Author Response

Reviewer #1:

Reviewer #1, Comment #1: The proposed SBNN model outperforms the existing CNN models and other existing BNN models in terms of both accuracy and complexity, which is not an easy job and demonstrates the good performance of the proposed SBNN model. I am curious why the proposed SBNN model achieves such good performance compared to the other CNN and BNN models. Is the major reason the adoption of NAS technology to obtain a good teacher model to teach the SBNN model?

Author response: Generally speaking, we believe the searched architecture, the binarization techniques, and the training with a distributional loss all contribute to the good performance of SBNN. The searched floating-point teacher not only teaches SBNN but also shares its network architecture with it: the proposed SBNN is derived from the searched floating-point network by applying several binarization techniques.
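To illustrate the teacher-student training mentioned above: a common form of distributional loss matches the student's softened output distribution to the teacher's via a KL divergence. The sketch below is a minimal NumPy illustration of that general idea, not the exact loss used in the manuscript; the function names and the temperature value are our own for illustration.

```python
import numpy as np

def softmax(z, t=1.0):
    """Temperature-scaled softmax over the last axis (numerically stable)."""
    e = np.exp((z - z.max(axis=-1, keepdims=True)) / t)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, t=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The temperature t > 1 softens both distributions so the student also
    learns the teacher's relative rankings of the wrong classes.
    """
    p = softmax(teacher_logits, t)  # teacher target distribution
    q = softmax(student_logits, t)  # student prediction
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))
```

The loss is zero when the student exactly reproduces the teacher's logits and positive otherwise, which is what makes it usable as a training signal.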

For the counterparts, the CNN-based SAR classification methods and the vision binary networks, we discuss each in detail:

The fine-tuned VGG [1], the Plain CNN [2], and VGG with Hybrid Channel Feature Loss [3] employ outdated backbones without advanced techniques such as re-parameterization. In our opinion, those methods treat the CNN as a pre-defined tool for feature extraction and are not focused on designing network architectures. For example, VGG with Hybrid Channel Feature Loss requires inputs of size 224×224, which is the default size of the well-known vision dataset ImageNet. As a result, SBNN outperforms those methods in both accuracy and computational complexity.

As for HOG-ShipCLSNet [4], we believe their model is not well trained. Firstly, they train for only 100 epochs, far fewer than common vision training scripts use. Secondly, HOG-ShipCLSNet is built with a large number of fully connected layers without reporting any special techniques; such linear neurons easily overfit on a small dataset such as OpenSARShip.

The EfficientNet in Mini Hourglass Region Extraction and Dual-Channel Efficient Fusion Network [5] is a medium-sized vision architecture, which requires more MADDs and weights than SBNN. On the accuracy side, we suspect their learnable data pre-processing increases the task difficulty. This method classifies a target after detecting and cropping it, and the employed detector sub-network is not very reliable. Hence, each target may be damaged or shifted to a new relative position. Notice that targets are located at the center by default in OpenSARShip.

The vision binary networks Bi-Real-Net [7], ReActNet [8], and AdamBNN [9] are designed for ImageNet, which is much more difficult than OpenSARShip. The higher task difficulty requires those vision binary networks to have stronger learning capability and thus more weights. As a result, SBNN outperforms them in model size and computational complexity on the SAR ship classification task. In terms of accuracy, there is no reference on applying vision binary networks to SAR data, so we trained those vision binary networks with the same data as SBNN. Bi-Real-Net does not employ the improved architecture, distribution reshaping, or distributional loss. ReActNet has a training problem, which is reported in [9]. Finally, regarding AdamBNN, we agree that a good teacher is the major reason why SBNN achieves higher accuracy than AdamBNN: the searched teacher of SBNN performs better than the ResNet teacher of AdamBNN.
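For readers unfamiliar with the binarization these networks share, the core operation is a sign function on weights (and activations) in the forward pass, with a straight-through estimator (STE) passing gradients through in the backward pass. The NumPy sketch below shows that standard mechanism in its simplest form; the function names and the clipping threshold of 1.0 are illustrative assumptions, not SBNN's exact formulation.

```python
import numpy as np

def binarize(w):
    """Forward pass: quantize full-precision values to 1 bit via sign
    (zero is mapped to +1, a common convention)."""
    return np.where(w >= 0, 1.0, -1.0)

def ste_grad(w, upstream, clip=1.0):
    """Backward pass (straight-through estimator): the sign function has
    zero gradient almost everywhere, so the gradient is passed through
    unchanged where |w| <= clip and cut off outside that range."""
    return upstream * (np.abs(w) <= clip)
```

Bi-Real-Net, ReActNet, and AdamBNN each refine this basic recipe (e.g., with real-valued shortcuts or learnable activation shifts), which is why the surrounding text compares what each of them does or does not add.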

We cannot comment on GSESCNNs [10] and SE-LPN-DPFF [6], because some core hyper-parameters of those methods are not reported.

 

Reviewer #1:

Reviewer #1, Comment #2: I am curious which framework is used for training the proposed SBNN model. I suggest the authors give more description of the adopted framework for binary CNN model training.

Author response: The search for the floating-point network, the training of the floating-point network, and the training of the binary network are all conducted with the PyTorch neural network framework. PyTorch was mentioned in the first submitted version of our manuscript.

Following the reviewer's suggestion, we have updated the manuscript by replacing the original brief description with more details, including the specific version number (Page 12, Lines 286-291).

 

Reviewer #1:

Reviewer #1, Comment #3: For the implementation of the SBNN model, the best choice is a dedicated hardware accelerator or ASIC to exploit its low complexity and power consumption. I am wondering if there is another way to realize the proposed SBNN model in addition to the above-mentioned dedicated hardware accelerator or ASIC.

Author response: We agree with the reviewer. Nowadays, the best choice for the hardware implementation of SBNN is a dedicated accelerator or an Application-Specific Integrated Circuit (ASIC).

A Neural-network Processing Unit (NPU), a general-purpose neural network accelerator, may become another option in the future. Developers are enhancing the fixed-point and logical computing capabilities of NPUs, which will benefit binary networks.

 

Reviewer #1:

Reviewer #1, Comment #4: Please describe in more detail whether it is necessary to perform any SAR data pre-processing or post-processing tasks before or after the proposed SBNN.

Author response: The OpenSARShip dataset provides uint8 files, which have been processed and generated from SAR products together with the associated Automatic Identification System information. Compared with prior studies that use PCA or a sub-network for data pre-processing, SBNN requires much simpler pre-processing: only center cropping and resizing are employed. Since the targets are surrounded by a large area of clear background, each sample is processed as follows:

In the three-category classification task, targets share a similar size. To reduce the distortion caused by resizing, targets larger than 128×128 are center cropped to 112×112, and targets smaller than 112×112 are resized to 112×112.

In the six-category dual-polarization classification task, target sizes vary over a large range, so the distortion caused by resizing is obvious and unavoidable. All targets are resized to 128×128 and then center cropped to 112×112.

Notice that the pre-processing yields a size of 112×112, slightly larger than the input of SBNN (100×100). This margin is instrumental for the data augmentation policy, in which random cropping produces data of size 100×100.
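The pre-processing steps described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: nearest-neighbour resizing stands in for whatever interpolation the authors use, the handling of targets between 112×112 and 128×128 in the three-category task is not specified in the text and is assumed here to fall into the resize branch, and all function names are our own.

```python
import numpy as np

def center_crop(img, size):
    """Crop a (H, W) array to (size, size) around its center."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def resize_nn(img, size):
    """Nearest-neighbour resize to (size, size); enough to show the pipeline."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def preprocess_three_class(img):
    """Three-category task: crop large targets, resize small ones to 112x112."""
    if min(img.shape[:2]) > 128:   # larger than 128x128 -> center crop
        return center_crop(img, 112)
    return resize_nn(img, 112)     # assumed branch for the remaining sizes

def preprocess_six_class(img):
    """Six-category task: resize all targets to 128x128, then crop to 112x112."""
    return center_crop(resize_nn(img, 128), 112)

def random_crop(img, size=100):
    """Augmentation: random 100x100 crop fed to the 100x100 SBNN input."""
    h, w = img.shape[:2]
    top = np.random.randint(0, h - size + 1)
    left = np.random.randint(0, w - size + 1)
    return img[top:top + size, left:left + size]
```

Both task pipelines end at 112×112, leaving the 12-pixel margin that the random 100×100 crop consumes during training.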

Following the reviewer's suggestion, we have updated the manuscript by adding several new paragraphs describing the data pre-processing in the different tasks (Page 14, Lines 321-335).

No data post-processing is required in the three-category task. For the six-category task, the required data post-processing, decision fusion, was already described in the first submitted version of the manuscript. The details about data post-processing are retained and can be found in the resubmitted manuscript (Page 15, Lines 358-367).
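For readers without access to the manuscript, a common simple form of decision fusion for dual-polarization inputs averages the per-channel class probabilities and takes the argmax. The sketch below shows that generic scheme only; the manuscript's exact fusion rule may differ, and the function and argument names are our own.

```python
import numpy as np

def fuse_decisions(probs_vh, probs_vv):
    """Fuse the class-probability vectors predicted from the two
    polarization channels (e.g. VH and VV) by averaging them, then
    return the index of the winning class."""
    fused = (np.asarray(probs_vh) + np.asarray(probs_vv)) / 2.0
    return int(np.argmax(fused))
```

Averaging lets one confident channel outweigh an uncertain one, which is the usual motivation for fusing at the probability level rather than at the hard-label level.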

 

Reference:

[1] Wang, Y.; Wang, C.; Zhang, H. Ship classification in high-resolution SAR images using deep learning of small datasets. Sensors 2018, 18, 2929.

[2] Hou, X.; Ao, W.; Song, Q.; Lai, J.; Wang, H.; Xu, F. FUSAR-Ship: building a high-resolution SAR-AIS matchup dataset of Gaofen-3 for ship detection and recognition. Science China Information Sciences 2020, 63, 1–19.

[3] Zeng, L.; Zhu, Q.; Lu, D.; Zhang, T.; Wang, H.; Yin, J.; Yang, J. Dual-polarized SAR ship grained classification based on CNN with hybrid channel feature loss. IEEE Geoscience and Remote Sensing Letters 2021, 19, 1–5.

[4] Zhang, T.; Zhang, X.; Ke, X.; Liu, C.; Xu, X.; Zhan, X.; Wang, C.; Ahmad, I.; Zhou, Y.; Pan, D.; et al. HOG-ShipCLSNet: A novel deep learning network with hog feature fusion for SAR ship classification. IEEE Transactions on Geoscience and Remote Sensing 2021, 60, 1–22.

[5] Xiong, G.; Xi, Y.; Chen, D.; Yu, W. Dual-polarization SAR ship target recognition based on mini hourglass region extraction and dual-channel efficient fusion network. IEEE Access 2021, 9, 29078–29089.

[6] Zhang, T.; Zhang, X. Squeeze-and-excitation Laplacian pyramid network with dual-polarization feature fusion for ship classification in SAR images. IEEE Geoscience and Remote Sensing Letters 2021, 19, 1–5.

[7] Liu, Z.; Wu, B.; Luo, W.; Yang, X.; Liu, W.; Cheng, K.T. Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 722–737.

[8] Liu, Z.; Shen, Z.; Savvides, M.; Cheng, K.T. ReActNet: Towards precise binary neural network with generalized activation functions. In Proceedings of the European Conference on Computer Vision. Springer, 2020, pp. 143–159.

[9] Liu, Z.; Shen, Z.; Li, S.; Helwegen, K.; Huang, D.; Cheng, K.T. How do Adam and training strategies help BNNs optimization. In Proceedings of the International Conference on Machine Learning. PMLR, 2021, pp. 6936–6946.

[10] Huang, G.; Liu, X.; Hui, J.; Wang, Z.; Zhang, Z. A novel group squeeze excitation sparsely connected convolutional networks for SAR target classification. International Journal of Remote Sensing 2019, 40, 4346–4360.

Author Response File: Author Response.pdf

Reviewer 2 Report

The given submission proposes a neural network implementation for ship classification using synthetic aperture radar, typically used with and required for ocean surveillance. In general, the paper is well-written and follows a technically sound approach in its description of the techniques and processes used. The initial results look promising. My main complaints regarding the submission are:

- I don't see a proper related work section. The authors depict the core principles used within the proposal (NAS, patch shift processing, etc.), but don't provide an appropriate discussion of and reflection on related publications.
- Numerous of the provided figures are virtually unreadable, due to the extremely small font used in the diagrams. Especially on printed paper, this is absolutely indecipherable.
- The authors describe the general experimental setup, but fail to provide the source code of the implementation; as always in today's world of IT-related publications, providing the source code is the only way for other researchers to independently verify the presented results.

Author Response

Reviewer #2:

Reviewer #2, Comment #1: I don't see a proper related work section. The authors depict the core principles used within the proposal (NAS, patch shift processing, etc.), but don't provide an appropriate discussion of and reflection on related publications.

Author response: Following the reviewer's suggestion, we have updated the manuscript by adding a new section discussing related work on Network Architecture Searching, binary networks, and spatial information processing (Page 3, Lines 106-166). Reflections on related publications are provided there as well.

 

Reviewer #2:

Reviewer #2, Comment #2: Numerous of the provided figures are virtually unreadable, due to the extremely small font used in the diagrams. Especially on printed paper, this is absolutely indecipherable.

Author response: Following the reviewer's suggestion, we have updated the manuscript by redrawing Figure 1 (Page 4), Figure 2 (Page 6), Figure 3 (Page 8), and Figure 5 (Page 12) with larger fonts.

 

Reviewer #2:

Reviewer #2, Comment #3: The authors describe the general experimental setup, but fail to provide the source code of the implementation; as always in today's world of IT-related publications, providing the source code is the only way for other researchers to independently verify the presented results.

Author response: We are supporters of open-source software. However, according to the contract between us and our sponsor, we do not have permission to distribute the source code of SBNN.

We will contact our sponsor and submit a request to release the source code publicly.

Author Response File: Author Response.pdf
