Building Footprint Extraction from Multispectral, Spaceborne Earth Observation Datasets Using a Structurally Optimized U-Net Convolutional Neural Network
Round 1
Reviewer 1 Report
In this paper, the authors assemble different U-Net architectures and determine which one performs best at improving building detection accuracy in multispectral satellite images. I consider that this result exploits the advantages of U-Net and that the architecture could be well suited to damage mapping. However, some passages in the paper are not very clear.
Major issues with this manuscript:
The innovation of this article is not clear. In my view, the method simply uses the U-Net framework and changes some parameters to get a better result; the network structure itself has not been improved. The authors should highlight the real contribution of their work.
The sample size is not specified. The manuscript states that “the training and validation set is composed of a different number of tiles for each area”, so how many training, validation, and test examples are used in the experiment? From the paper I can only obtain the percentages of these sets.
In Section 4.1, some statements are not clear.
(1) Lines 217-219: “it is removed the 1024-depth layer and four different architectures are generated removing and adding additional layers 8-256, 16-256, 16-512, 32-512 and 64-512,” The number of architectures announced (four) is not consistent with the number listed afterwards (five). I would also like to know why the authors chose these additional layer configurations.
(2) Line 220: “the second one stands for the last layer depth”. Is this the last layer of the contraction section or of the whole network? The authors may need to add a flowchart here to make the exact channel design of the network clear.
The comparison between the proposal and the original U-Net architecture.
In recent years, researchers have attempted to apply U-Net to building detection. For example, “Satellite Image Segmentation for Building Detection using U-net” has been discussed:
[23] Chhor, G.; Aramburu, C.B. Satellite Image Segmentation for Building Detection using U-net. 2017
Their approach achieved reasonable accuracy, so I think a comparison between the proposed method and the original U-Net architecture for building detection is essential in the experiments.
Minor problems
1. Page 10, left column, line 238: it should be “In Table 2 …” instead of “In Tab. 2”.
2. The format of the references should be consistent, e.g., [1], [4], [12] and [15].
Author Response
Please see attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The authors present “Building Footprint Extraction from Multispectral, Spaceborne Earth Observation Datasets Using a Structurally Optimized U-Net Convolutional Neural Network”. The topic is interesting. Although the article is interesting in its current form, some aspects should be improved before possible publication and for better understanding by readers. My comments are as follows:
1) It should be noted that the optical or infrared instruments traditionally used for remote sensing are compromised by cloud cover and affected by the availability of sunlight. Synthetic Aperture Radar (SAR) technology does not have this problem: it produces high-resolution images throughout its operation, regardless of weather conditions. Because of these advantages, SAR technology is widely used in remote sensing applications for Earth observation.
The authors should also cite papers dealing with segmentation and/or object detection problems in remote sensing using SAR technology, namely:
[a] "Superpixel Segmentation of Polarimetric Synthetic Aperture Radar (SAR) Images Based on Generalized Mean Shift". Remote Sensing, 2018, vol. 10(10), 1592.
[b] "River channel segmentation in polarimetric SAR images: watershed transform combined with average contrast maximisation". Expert Systems with Applications, 2017, vol. 82, 196-215.
[c] "Individual building extraction from TerraSAR-X images based on ontological semantic analysis". Remote Sensing, 2016, vol. 8(9), 708.
[d] "Automatic detection and reconstruction of building radar footprints from single VHR SAR images". IEEE Transactions on Geoscience and Remote Sensing, 2012, vol. 51(2), 935-952.
2) Page 2, lines 66-70: 'chapter'-->'section'
3) What is the novelty of this work? This should be clearly highlighted in the paper.
4) In the related work section, a more rigorous investigation of the existing methods, such as a comparison of previous approaches in terms of pros and cons, should be given. A summary table could be used for this purpose.
5) Please give a frank account of the strengths and weaknesses of the proposed research method. This should include a theoretical comparison to other approaches in the field.
6) The authors should present and discuss several solid directions for future research.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
All of my remarks have been satisfactorily addressed. I have no further comments.
Reviewer 2 Report
The authors have addressed all the comments.