Article
Peer-Review Record

Image Semantic Segmentation Fusion of Edge Detection and AFF Attention Mechanism

Appl. Sci. 2022, 12(21), 11248; https://doi.org/10.3390/app122111248
by Yijie Jiao, Xiaohua Wang *, Wenjie Wang and Shuang Li
Reviewer 1:
Reviewer 2:
Reviewer 3:
Submission received: 13 September 2022 / Revised: 31 October 2022 / Accepted: 4 November 2022 / Published: 6 November 2022

Round 1

Reviewer 1 Report

I think it would be better to add a short description of the AFF attention mechanism at the end of the Introduction, in order to facilitate comprehension of the rest of the paper.

Comments for author File: Comments.pdf

Author Response

Response to Reviewer Comments

Point 1: I think it would be better to add a short description of the AFF attention mechanism at the end of the Introduction, in order to facilitate comprehension of the rest of the paper.

Response 1: According to your suggestion, we have revised the article. The last paragraph of the Introduction now briefly explains the role of the AFF attention mechanism. In addition, the attention mechanism is described in detail in Section 3 of the article.

Reviewer 2 Report

This paper addresses the semantic segmentation of images in case of difficulties arising from the image quality (imprecise edges, holes) or from the model (slow convergence, underfitting). The topic is very timely, and the datasets used for the experiments show the wide potential of application of the research.

Unfortunately, the article does not read sufficiently well, due to many unstructured and verbless sentences, such as in the abstract. As a result, it fails to give an appropriate view of the state of the art, the scientific questions it addresses, and the newly proposed method.

As such, in spite of its obvious potential, the paper is not suitable for publication in its current form. I would recommend a major and rigorous reworking of the material before further submission.

Author Response

Response to Reviewer Comments

Point 1: This paper addresses the semantic segmentation of images in case of difficulties arising from the image quality (imprecise edges, holes) or from the model (slow convergence, underfitting). The topic is very timely, and the datasets used for the experiments show the wide potential of application of the research.

Unfortunately, the article does not read sufficiently well, due to many unstructured and verbless sentences, such as in the abstract. As a result, it fails to give an appropriate view of the state of the art, the scientific questions it addresses, and the newly proposed method.

As such, in spite of its obvious potential, the paper is not suitable for publication in its current form. I would recommend a major and rigorous reworking of the material before further submission.

Response 1: According to your suggestions, we have revised the language and logic of the full paper, rewriting Chinese-style expressions into standard English. We have also adjusted the structure of the article and described the methods used more clearly.

Reviewer 3 Report

This paper is entitled "Image Semantic Segmentation Fusion of Edge Detection and AFF Attention Mechanism".

This work is about semantic segmentation with added edge information.

The main contribution is not clearly introduced in the abstract. A major revision of the abstract is needed.

L10: Avoid the repetition (segmentation)

L54: Reform the sentence

L97: Add a complete legend for each color used; use "Output image (Segmented Image)" instead of "Out Image"

L98: Reform the title

L127: Use a complete legend

L128: Reform the title

L129: In Figure 3 (Instead In the figure)

L133: By concatenating

L168: 3x3 or 3 by 3

L185: Table 1: See the stage numbering

L233: Verify the given results of FPS (>80 FPS!!)

L278: The given results are very limited (only 4 street images). The authors are invited to present the segmentation results of different high-resolution sequences with micro- and macro-textured regions (forest, sea, cloud, etc.)

L516: label (not lable)

Avoid using Chinese notes (Figure 14, Table 3)

 

The authors are invited to prove the performance of the proposed method by comparing it with recent works on image segmentation using deep learning techniques. A comparative study based on quantitative and qualitative evaluation should be conducted.

 

Author Response

Response to Reviewer Comments

Point 1: L10: Avoid the repetition (segmentation); L54: Reform the sentence; L516: label (not lable)

Response 1: The language of the full article has been revised to conform to standard English expression. Incorrect grammatical expressions have been corrected, and words used repeatedly in the article have been replaced.

Point 2: L133: By concatenating.

Response 2: In Figure 3, the feature maps input to the decoder network have different scales, so the feature maps must be adjusted to a common scale before feature fusion. The f5-layer feature map is input to the PPM to obtain a new feature map, and the f5-layer feature and the new feature map are then fused by concatenation [15].
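As an illustration only, a minimal PyTorch-style sketch of this concatenation-based fusion; the function name, tensor shapes, and the bilinear resizing are assumptions made for the example, not the authors' implementation:

import torch
import torch.nn.functional as F

def fuse_f5_with_ppm(f5: torch.Tensor, ppm_out: torch.Tensor) -> torch.Tensor:
    """Fuse the f5 feature map with the PPM output by channel-wise concatenation."""
    # Resize the PPM output to the spatial size of f5 if the scales differ
    if ppm_out.shape[2:] != f5.shape[2:]:
        ppm_out = F.interpolate(ppm_out, size=f5.shape[2:],
                                mode="bilinear", align_corners=False)
    return torch.cat([f5, ppm_out], dim=1)

# Example with assumed shapes: batch 1, 512 channels, 64x64 and 32x32 feature maps
fused = fuse_f5_with_ppm(torch.randn(1, 512, 64, 64), torch.randn(1, 512, 32, 32))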

Point 3: Add a complete legend for each color used; use "Output image (Segmented Image)" instead of "Out Image"

Response 3: The inappropriate description in the figure has been modified. It can be seen in line 74 of the article.

Figure 1. Improved PSP-Net combined with Improved HED network.

Point 4: L98: Reform the title; L128: Reform the title

Response 4: The titles have been modified. They can be seen in lines 75 and 102 of the article.

Line 75: Figure 1. Improved PSP-Net combined with Improved HED network.

Line 102: Figure 3. Decoder network structure.

Point 5: L127: Use a complete legend

Response 5: This issue has been modified. It can be seen in line 101 of the article.

Figure 3. Decoder network structure.

Point 6: L185: Table 1: See the stage numbering

Response 6: The disordered expressions in Table 1 have been modified, as shown in line 140 of the article.

Table 1. Parameters of each stage in the improved HED network.

Network layer                                     Convolution kernel size
Stage 1 (Pooling 1 + Dilated Conv + Sigmoid)      [3, 3, 64]
Stage 2 (Pooling 2 + Dilated Conv + Sigmoid)      [3, 3, 128]
Stage 3 (Pooling 3 + Dilated Conv + Sigmoid)      [3, 3, 256]
Stage 4 (Pooling 4 + Dilated Conv + Sigmoid)      [3, 3, 512]
Stage 5 (Pooling 5 + Dilated Conv + Sigmoid)      [3, 3, 512]
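For illustration, a minimal sketch of how one such stage could be written in PyTorch (pooling followed by a 3x3 dilated convolution and a sigmoid output); the dilation rate, padding, and input channel counts are assumptions, and this is not the authors' code:

import torch.nn as nn

class HEDStage(nn.Module):
    """One bypass stage: pooling, a 3x3 dilated convolution, and a sigmoid output."""
    def __init__(self, in_channels: int, out_channels: int, dilation: int = 2):
        super().__init__()
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                              padding=dilation, dilation=dilation)
        self.act = nn.Sigmoid()

    def forward(self, x):
        return self.act(self.conv(self.pool(x)))

# Example: Stage 1 with a [3, 3, 64] kernel as listed in Table 1 (input channels assumed)
stage1 = HEDStage(in_channels=3, out_channels=64)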

Point 7: L278: The given results are very limited (only 4 street images). Authors are invited to present the segmentation results of different High-resolution sequences with of micro and macro textured regions (forest, sea, cloud, etc.)

Response 7: In the last part of the experiments, a group of experiments on the SiftFlow dataset has been added, as shown in Figure 19 and Table 6 in Section 5 of the article.

 

5.3. Experimental results and analysis

The model was trained and verified according to the experimental scheme designed in Section 5.2. The experimental results are shown in Table 5.

Table 5. Feature fusion MIOU comparison of different semantic segmentation networks on the Cityscapes dataset.

Experiment    I        II       III      IV       V        VI       VII      VIII
MIOU          79.62    80.37    81.89    82.56    80.45    80.26    83.38    80.54

Table 6. Feature fusion MIOU comparison of different semantic segmentation networks on the SiftFlow dataset.

Experiment    I        II       III      IV       V        VI       VII      VIII
MIOU          78.52    79.94    80.63    82.71    80.24    80.79    83.68    80.27

As shown in Tables 5 and 6, the semantic segmentation performance of PSP-Net and the improved HED network with AFF is better than that of the other networks with AFF. The main reasons are as follows: the PPM in the PSP-Net architecture can extract features very well, and AFF can accelerate feature learning and deepen feature propagation. When AFF is introduced into the encoder of MobileNetV3, the AFF attention mechanism complicates the structure of the model, so unnecessary hyperparameters are generated, which slows model training and wastes resources. When AFF is introduced into the improved HED network, the feature fusion of the five bypass structures with different output scales is accelerated, and the channels are unified.
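For reference, a simplified sketch of the attentional feature fusion idea (two feature maps of the same shape weighted by a learned channel attention map); this follows the general AFF formulation with a global branch only and is not the authors' implementation:

import torch
import torch.nn as nn

class SimpleAFF(nn.Module):
    """Simplified attentional feature fusion: z = m * x + (1 - m) * y,
    where m is a channel attention map computed from x + y."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        mid = max(channels // reduction, 1)
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                  # global context
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        m = self.attention(x + y)         # attention weights in [0, 1]
        return m * x + (1.0 - m) * y      # weighted fusion of the two inputs

# Example: fusing two 256-channel feature maps of the same spatial size
aff = SimpleAFF(channels=256)
z = aff(torch.randn(1, 256, 32, 32), torch.randn(1, 256, 32, 32))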

In Figure 18, Experiment IV and Experiment VII were analyzed by image visualization on the Cityscapes dataset. Experiment VII is the method proposed in this paper.

Figure 18. Images of Experiment IV and Experiment VII on the Cityscapes dataset.

In Figure 19, Experiment IV and Experiment VII were analyzed by image visualization on the SiftFlow dataset.

Figure 19. Images of Experiment IV and Experiment VII on the SiftFlow dataset.

In Figures 18 and 19, in Experiment VII the segmentation edges of "car", "vegetation", "road", and "light pole" (Cityscapes) and of "land", "ocean", and "cloud" (SiftFlow) are continuous, and the small objects "light pole" and "cloud" are finely segmented. In Experiment IV, the edges are well segmented, but there are still some problems in the segmentation of some small objects.

Point 8: L233: Verify the given results of FPS (>80 FPS!!)

Response 8: The FPS results in the article have been verified. The experiments show that the FPS can exceed 80 in the best case, but the value is stable at about 50 in most cases. Therefore, the data in the text have been modified to ensure that they are representative.
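For context, a minimal sketch of how such per-frame times and FPS values (like those listed below) could be measured in PyTorch; the model, input size, device, warm-up, and number of runs are placeholders, not the authors' exact setup:

import time
import torch

@torch.no_grad()
def measure_fps(model, input_shape=(1, 3, 512, 1024), runs=100, device="cuda"):
    """Time repeated forward passes and return the average time per frame and FPS."""
    model = model.to(device).eval()
    x = torch.randn(*input_shape, device=device)
    for _ in range(10):            # warm-up runs, excluded from timing
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        model(x)
        if device == "cuda":
            torch.cuda.synchronize()
        times.append(time.perf_counter() - start)
    avg = sum(times) / len(times)
    return avg, 1.0 / avg          # average time per frame (s), average FPS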

The experimental data are as follows:



Total Time(s)    Average time(s)    Average FPS
0.019093752      0.02               52.37
0.014995337      0.01               66.69
0.018994331      0.02               52.65
0.018994331      0.02               52.65
0.017994165      0.02               55.57
0.019993782      0.02               50.02
0.019993544      0.02               50.02
0.019994259      0.02               50.01
0.019993782      0.02               50.02
0.018993616      0.02               52.65
0.019993544      0.02               50.02
0.018993855      0.02               52.65
0.019993544      0.02               50.02
0.018993855      0.02               52.65
0.018993378      0.02               52.65
0.022992611      0.02               43.49
0.017994165      0.02               55.57
0.011996746      0.01               83.36
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.015994787      0.02               62.52
0.018994093      0.02               52.65
0.018993855      0.02               52.65
0.019993067      0.02               50.02
0.019993782      0.02               50.02
0.018993616      0.02               52.65
0.018993855      0.02               52.65
0.018993616      0.02               52.65
0.018993616      0.02               52.65
0.018993616      0.02               52.65
0.018993378      0.02               52.65
0.018993855      0.02               52.65
0.018994093      0.02               52.65
0.01999402       0.02               50.01
0.013995647      0.01               71.45
0.018993855      0.02               52.65
0.017994165      0.02               55.57
0.019993782      0.02               50.02
0.016994238      0.02               58.84
0.019993305      0.02               50.02
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.019993305      0.02               50.02
0.017994165      0.02               55.57
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.018994093      0.02               52.65
0.018994093      0.02               52.65
0.011995792      0.01               83.36
0.011995792      0.01               83.36
0.011995792      0.01               83.36
0.016994238      0.02               58.84
0.018994093      0.02               52.65
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.018993616      0.02               52.65
0.018625259      0.02               53.69
0.033988953      0.03               29.42
0.018993855      0.02               52.65
0.019993305      0.02               50.02
0.019993782      0.02               50.02
0.017994642      0.02               55.57
0.018993855      0.02               52.65
0.018993616      0.02               52.65
0.018993616      0.02               52.65
0.018993855      0.02               52.65
0.018994093      0.02               52.65
0.024991989      0.02               40.01
0.020993471      0.02               47.63
0.019993544      0.02               50.02
0.019867182      0.02               50.33
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.018994093      0.02               52.65
0.019993305      0.02               50.02
0.018993378      0.02               52.65
0.020139456      0.02               49.65
0.019993305      0.02               50.02
0.019993544      0.02               50.02
0.018994093      0.02               52.65
0.018993616      0.02               52.65
0.019993067      0.02               50.02
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.018994331      0.02               52.65
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.01899457       0.02               52.65
0.018994093      0.02               52.65
0.018994331      0.02               52.65
0.019993544      0.02               50.02
0.018993616      0.02               52.65
0.025991678      0.03               38.47
0.018993855      0.02               52.65
0.018993855      0.02               52.65
0.02199316       0.02               45.47


Round 2

Reviewer 2 Report

The paper has been greatly improved. There is still some room for improvement (see line 21 and following). The content is technically sound.

Author Response

"Please see the attachment."

Author Response File: Author Response.docx

Reviewer 3 Report

All corrections have been made.

Author Response

"Please see the attachment."

Author Response File: Author Response.docx
