Article
Peer-Review Record

Position Detection of Doors and Windows Based on DSPP-YOLO

Appl. Sci. 2022, 12(21), 10770; https://doi.org/10.3390/app122110770
by Tong Zhang 1, Jiaqi Li 2, Yilei Jiang 2,3,*, Mengqi Zeng 1 and Minghui Pang 2
Submission received: 26 August 2022 / Revised: 12 October 2022 / Accepted: 21 October 2022 / Published: 24 October 2022

Round 1

Reviewer 1 Report

The paper presents DSPP-YOLO, an improved algorithm model for the detection of doors and windows by autonomous mobile robots in unknown environments.

The paper needs proof editing (examples: "88%" at line 26 should be "8.8%"; "Many" at line 104 should be "Number"; "network" at line 238 should be "internet"; ...).

Figure 3 is not clear!

I don't think the images shown in Figure 5b are from a monocular camera!

Author Response

Response to Reviewer 1

Comment 1: The paper presents DSPP-YOLO, an improved algorithm model for the detection of doors and windows by autonomous mobile robots in unknown environments.

Response: Thank you for your comment.

Comment 2: The paper needs proof editing (examples: "88%" at line 26 should be "8.8%"; "Many" at line 104 should be "Number"; "network" at line 238 should be "internet"; ...).

Response: Thank you for your suggestion. We have carefully checked the whole article and revised its content and formatting.

Comment 3: Figure 3 is not clear!

Response: Thank you for your suggestion. We have adjusted Figure 3 to make it clearer.

Comment 4: I don't think the images shown in Figure 5b are from a monocular camera!

Response: Thank you for your suggestion. The images in Figure 5 were taken with a mobile phone; we have also corrected (b) on page 8 of the article accordingly.

Author Response File: Author Response.docx

Reviewer 2 Report

Position detection of doors and windows based on DSPP-YOLO

1. Very interesting research entitled "Position detection of doors and windows based on DSPP-YOLO".

2. Correct the structure of the article (only suggestion). (See attached file).

** Check "Microsoft Word template" from Applied Sciences-MDPI.

3. I suggest that the green-colored sections be made part of the following sections:

2. Materials and Methods

3. Results

4. Discussion

4. On line 236 it says “Experimental data set and environmen” and you must say “Experimental data set and environment”.

5. On line one, define: “Type of the Paper (Article, Review, Communication, etc.)”.  You must choose only one.

6. Before making a reference, leave a blank space. Example: On line 34 it says “situation[1].” and you must say “situation [1].”. Review the entire document and correct.

7. Eliminate the paragraph of lines 93-98. It is not necessary to comment on what will be discussed later.

8. In line 122 the full stop at the end of the paragraph is missing.

9. It is not clear how the photos of doors and windows are taken. It is also not indicated whether there is pre-processing of the images of doors and windows. What adjustments were made to the images regarding brightness, high light, low light, shadows, etc.?

10. In what image format were they taken (RGB, XYZ, L*a*b*, L*u*v*, HSV, HLS, YCrCb, YUV, I1I2I3, TSL, etc.)? Was there any format conversion of the images?

11. How many images were used to train the algorithms and where were they taken from?

12. It is suggested to elaborate algorithms of the proposed architecture (DSPP-YOLO algorithm): a training algorithm and other data processing. Two example algorithms are shown. (See attached file.)

13. Explain in detail the process to extract features from images of doors and windows.

14. In line 245 the full stop at the end of the paragraph is missing.

15. Consider some future work, based on your research.

16. Very good bibliography. I hope you can consult more bibliography.

The article has good content and is very interesting.

Authors are requested to make all indicated corrections.

Comments for author File: Comments.pdf

Author Response


Response to Reviewer 2

Comment 1: Very interesting research entitled "Position detection of doors and windows based on DSPP-YOLO".

Response: Thank you for your comment.

Comment 2: Correct the structure of the article (only suggestion). (See attached file.) ** Check the "Microsoft Word template" from Applied Sciences-MDPI.

Response: Thank you for your suggestion. Regarding the red part of your suggestion: 1. There is no relevant patent for this article. 2. The datasets in this paper come partly from the Internet and partly from images we took ourselves. The pictures from the Internet were found on the Baidu website; we have not collated them into a public dataset, but we can provide the picture links as follows:

https://image.baidu.com/search/index?tn=baiduimage&ps=1&ct=201326592&lm=-1&cl=2&nc=1&ie=utf-8&dyTabStr=MCw4LDYsMSw1LDQsNywzLDIsOQ%3D%3D&word=%E7%AA%97

https://image.baidu.com/search/index?tn=baiduimage&ipn=r&ct=201326592&cl=2&lm=-1&st=-1&fm=result&fr=&sf=1&fmq=1658734206700_R&pv=&ic=&nc=1&z=&hd=&latest=&copyright=&se=1&showtab=0&fb=0&width=&height=&face=0&istype=2&dyTabStr=MCw4LDYsMSw1LDQsNywzLDIsOQ%3D%3D&ie=utf-8&sid=&word=%E9%97%A

Comment 3: I suggest that the green-colored sections be made part of the sections:

2. Materials and Methods

3. Results

4. Discussion

Response: Thank you for your suggestion. We have discussed this and believe that, after restructuring to the template, the content of Chapter 4, "Discussion", would be very short, which would unbalance the structure of the article as a whole. We are therefore sorry that we did not adopt your suggestion.

Comment 4: On line 236 it says "Experimental data set and environmen" and you must say "Experimental data set and environment".

Response: Thank you for the reminder. We have corrected this error in the article.

Comment 5: On line one, define: "Type of the Paper (Article, Review, Communication, etc.)". You must choose only one.

Response: Thank you for the reminder. We have corrected this error in the article.

Comment 6: Before making a reference, leave a blank space. Example: On line 34 it says "situation[1]." and you must say "situation [1].". Review the entire document and correct.

Response: Thank you for the reminder. We have reviewed the entire article and corrected these errors.

Comment 7: Eliminate the paragraph of lines 93-98. It is not necessary to comment on what will be discussed later.

Response: Thank you for your suggestion. We have deleted this paragraph.

Comment 8: In line 122 the full stop at the end of the paragraph is missing.

Response: Thank you for the reminder. We have corrected this error in the article.

Comment 9: It is not clear how the photos of doors and windows are taken. It is also not indicated whether there is pre-processing of the images of doors and windows. What adjustments were made to the images regarding brightness, high light, low light, shadows, etc.?

Response: Thank you for your suggestion. On lines 247-250 we address this point: ‘Figure 5 shows part of the dataset from different sources. (a) shows pictures we found on the Internet. (b) shows pictures we took with a mobile phone from different angles. (c) shows pictures we took from a car at different angles. None of the images were preprocessed, and all are in RGB format.’

Comment 10: In what image format were they taken (RGB, XYZ, L*a*b*, L*u*v*, HSV, HLS, YCrCb, YUV, I1I2I3, TSL, etc.)? Was there any format conversion of the images?

Response: Thank you for your suggestion. In line 250 we address this point: ‘None of the images were preprocessed, and all are in RGB format.’

Comment 11: How many images were used to train the algorithms and where were they taken from?

Response: Thank you for your suggestion. On lines 238-246 we describe the source of the dataset and the number of images used for training: ‘Dataset: the experimental data used in this experiment are divided into two parts. One part consists of public images collected on the Internet; the other consists of multi-angle, multi-state images collected by different devices in the experimental scenes, taken in classrooms and a laboratory. There are 1105 samples in total. The dataset covers rich picture information, with targets of different shapes, shooting angles, sizes and scenes, which increases the generalization ability of the network, although some pictures are not sufficiently clear. As shown in Table 2, the samples are divided into training and test sets according to a set proportion, and 25% of the training set is further split off as a cross-validation set, used to validate periodic learning results during training.’
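As a concrete illustration of this split, below is a minimal Python sketch. The train/test proportion is not stated in the quoted passage (it is given in Table 2 of the paper), so the test_fraction value here is a hypothetical placeholder; only the 1105-sample total and the 25% cross-validation carve-out come from the response above.

import random

def split_dataset(samples, test_fraction, val_fraction=0.25, seed=42):
    """Split samples into train/test, then carve a validation set
    (25% of the training portion, as described above) out of train."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    n_val = int(len(train) * val_fraction)
    val, train = train[:n_val], train[n_val:]
    return train, val, test

# 1105 samples total, as stated above; test_fraction=0.2 is a placeholder.
samples = [f"img_{i:04d}.jpg" for i in range(1105)]
train, val, test = split_dataset(samples, test_fraction=0.2)
print(len(train), len(val), len(test))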

Comment 12: It is suggested to elaborate algorithms of the proposed architecture (DSPP-YOLO algorithm): a training algorithm and other data processing. Two example algorithms are shown. (See attached file.)

Response: Thank you for your suggestion. This paper builds on improvements to other scholars' work, and the algorithm likewise borrows from their results, so we have not written pseudocode for the algorithm in this paper.

Comment 13: Explain in detail the process to extract features from images of doors and windows.

Response: Thank you for your suggestion. Chapter 3 explains in detail the process of extracting features from images of doors and windows.

Comment 14: In line 245 the full stop at the end of the paragraph is missing.

Response: Thank you for the reminder. We have corrected this error in the article.

Comment 15: Consider some future work, based on your research.

Response: Thank you for your suggestion. On lines 238-246 we consider future work based on this paper, covering both algorithm improvement and application: ‘The YOLOv3 algorithm is a typical one-stage object detection algorithm, which can be improved in two ways in the future: (1) we can build a backbone network with stronger representation ability to improve the accuracy of the algorithm; (2) we can propose a new loss function to solve the problem of sample imbalance encountered in object detection. In the future, we plan to apply door and window detection in the autonomous exploration algorithm of autonomous mobile robots in unknown environments.’

Comment 16: Very good bibliography. I hope you can consult more bibliography.

Response: Thank you for your suggestion. On lines 43-49 and 61-64 we have added the following content: ‘Target detection methods based on deep learning are mainly divided into two types: two-stage detection and one-stage detection. After entering the deep-learning era, candidate boxes based on prior knowledge were used for target detection, as in Selective Search [8] and CPMC [9]. In subsequent development, candidate regions came to be generated by the network itself and were given a new name, anchor boxes, forming a new direction in object detection.’; ‘In order to improve the detection performance on small targets, T. Kong et al. proposed HyperNet [15], which integrates information from the shallow, middle and deep layers. Dai et al. proposed the Region-based Fully Convolutional Network (R-FCN) [16] on the basis of Fully Convolutional Networks (FCN).’

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

The authors have addressed most of the comments.

It is suggested to elaborate algorithms (pseudocode) of the proposed architecture (DSPP-YOLO algorithm): a training algorithm and other data processing. Two example algorithms are shown (see attached file).

It is requested to elaborate the algorithms in pseudocode.

Comments for author File: Comments.pdf

Author Response

Comment 1: It is suggested to elaborate algorithms (pseudocode) of the proposed architecture (DSPP-YOLO algorithm): a training algorithm and other data processing. Two example algorithms are shown (see attached file).

Response: Thank you for your suggestion. In Section 3.5, we have added pseudocode for feature extraction in Table 2 and for the training model in Table 3.

Algorithm 1. Feature extraction

1. for X = 1 to (total number of training images) do
2.     Read a photo of doors and windows
3.     Divide the picture into n×n grid cells
4.     Search for the cells that may contain a target center
5.     Generate bounding boxes in the candidate cells
6.     Predict the target width and height
7.     Adjust the bounding-box size according to the anchor-box size and the object size
8.     Predict the target category
9.     Calculate the confidence score of the bounding boxes
10.    Output the center coordinates, width, height, and object category of the bounding box with the highest confidence
11. end for
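To make steps 6-10 concrete, here is a minimal NumPy sketch of the standard YOLOv3-style box decoding (sigmoid for center offsets and confidences, anchor-scaled exponentials for width and height). The function name, tensor layout (tx, ty, tw, th, objectness, class scores) and anchor arguments are illustrative assumptions, not the authors' actual implementation.

import numpy as np

def best_detection(pred, anchor_w, anchor_h):
    """Pick the highest-confidence box from YOLO-style raw predictions.

    pred: array of shape (num_boxes, 5 + num_classes) with columns
          (tx, ty, tw, th, objectness, class scores...), all raw logits.
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    obj = sigmoid(pred[:, 4])                        # objectness per box
    cls = sigmoid(pred[:, 5:])                       # per-class scores
    conf = obj * cls.max(axis=1)                     # confidence score (step 9)
    i = int(conf.argmax())                           # best box (step 10)
    x, y = sigmoid(pred[i, 0]), sigmoid(pred[i, 1])  # center offsets in the cell
    w = anchor_w * np.exp(pred[i, 2])                # width scaled from the anchor (step 7)
    h = anchor_h * np.exp(pred[i, 3])                # height scaled from the anchor
    return (x, y, w, h), int(cls[i].argmax()), float(conf[i])

# Hypothetical usage: 9 candidate boxes, 2 classes (doors, windows).
raw = np.random.randn(9, 7)
box, cls_id, score = best_detection(raw, anchor_w=0.3, anchor_h=0.6)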

 

Algorithm 2. Training model

Input:   I: set of n training images
         M: the width of the anchor boxes
         N: the height of the anchor boxes

Output:  P: the target category
         (X, Y): the center coordinates of the bounding boxes
         W: the width of the bounding boxes
         H: the height of the bounding boxes

1.  for each image in I do
2.      Extract P, X, Y, W, H from the training picture using the feature-extraction algorithm
3.      Calculate the error between the target center coordinates (X, Y) predicted by the model and those of the real training images (Xn, Yn)
4.      Calculate the error between the width and height (W, H) of the predicted detection box and those of the real detection box (Wn, Hn)
5.      Calculate the confidence error for objects present in the model's predicted detection boxes (Cn)
6.      Calculate the confidence error for objects absent from the model's predicted detection boxes
7.      Calculate the error between the predicted categories Pn and the true categories P
8.      Adjust learning adaptively according to the loss function
9.  end for, repeating until the loss function converges
10. Obtain the training model with the minimum loss, which is used for target detection
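For steps 3-7, below is a minimal NumPy sketch of a YOLO-style composite loss (squared error on coordinates, binary cross-entropy for confidence and class, with the usual lambda weightings). The array layout, masks, and weight values are illustrative assumptions for this sketch, not the paper's implementation.

import numpy as np

def bce(p, t, eps=1e-7):
    """Binary cross-entropy between predicted probabilities p and targets t."""
    p = np.clip(p, eps, 1 - eps)
    return -(t * np.log(p) + (1 - t) * np.log(1 - p)).sum()

def yolo_style_loss(pred, truth, obj_mask, lam_coord=5.0, lam_noobj=0.5):
    """Loss over one image's boxes. pred/truth: (num_boxes, 5 + num_classes)
    laid out as (x, y, w, h, confidence, class probs...).
    obj_mask: 1 where a real object is assigned to the box, else 0."""
    noobj_mask = 1 - obj_mask
    # Steps 3-4: squared error on center coordinates and on width/height,
    # only for boxes responsible for an object.
    coord = ((pred[:, :4] - truth[:, :4]) ** 2).sum(axis=1)
    coord_loss = lam_coord * (obj_mask * coord).sum()
    # Steps 5-6: confidence error where an object is / is not present.
    obj_loss = bce(pred[obj_mask == 1, 4], truth[obj_mask == 1, 4])
    noobj_loss = lam_noobj * bce(pred[noobj_mask == 1, 4], truth[noobj_mask == 1, 4])
    # Step 7: classification error for boxes with objects.
    cls_loss = bce(pred[obj_mask == 1, 5:], truth[obj_mask == 1, 5:])
    return coord_loss + obj_loss + noobj_loss + cls_loss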

 

Comment 2: It is requested to elaborate the algorithms in pseudocode.

Response: Thank you for your suggestion. Figure 2 presents the overall flow chart of the DSPP-YOLO algorithm, which reflects the whole framework directly. Feature extraction and model training are the two main parts of the algorithm, and the pseudocode we provide should help readers understand this article more deeply.

Author Response File: Author Response.docx
