Evaluation of a Low-Power Computer Vision-Based Positioning System for a Handheld Landmine Detector Using AprilTag Markers
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have proposed an interesting technique for the positioning system of landmines. The manuscript is well written, but some changes are needed to make it more complete and accurate.
- How does the deminer position the AprilTag? Manually?
- Considering Figure 3, it would be interesting to highlight the reading pattern that simulates the movement of the deminer to make this part more understandable.
- In section 5.2. Dynamic validation method, were the AprilTags moved manually by an operator? Again, it would be interesting to include the movement pattern.
- Figures 5 and 6 show tags that are found with extreme precision in their position, while other tags show a high residual standard deviation. In your opinion, why are these tags poorly positioned?
- The unit of measurement in the figure should be enclosed in round brackets, not square brackets.
- Figure 10 has no labels or units of measurement on its axes, and Figures 9 and 10 require a colour-coded axis to be understood correctly.
- Have you tried running some validation tests outdoors? Will changes in lighting and the non-planar position of the target (e.g. on grass) compromise accuracy?
Author Response
We thank the reviewer for their detailed review and the specific comments they have raised, which we address below.
- How does the deminer position the AprilTag? Manually?
We have added the following sentence to section 1.3 for clarity:
"The proposed operating method is that a marker will be placed by the system operator in a safe area near the region to be scanned."
- Considering Figure 3, it would be interesting to highlight the reading pattern that simulates the movement of the deminer to make this part more understandable.
We have added Figure 3(d) to address this.
- In section 5.2. Dynamic validation method, were the AprilTags moved manually by an operator? Again, it would be interesting to include the movement pattern.
We have modified the text in section 5.2 to make this clearer and added further information to the caption of Figure 4.
- Figures 5 and 6 show tags that are found with extreme precision in their position, while other tags show a high residual standard deviation. In your opinion, why are these tags poorly positioned?
We have added further discussion to section 6.2 to address this.
"Figure 5 and Figure 6 illustrate the static measurement of the system's precision and how it varies with the true position of the target. It can be observed that there is a spread in precision across the AprilTag array, with the general trend that greater precision is achieved when the AprilTag is closer to the centre of the field of view of the camera."
and later within section 6.2:
"Furthermore, the static validation array has only a sparse sampling of points within the measurement space which leads to apparently inconsistent results with respect to precision. This should be contrasted with the results from the dynamic tests where the spatial sampling is significantly denser and shows a much more consistent trend across the field of view."
- The unit of measurement in the figure should be enclosed in round brackets, not square brackets.
There is no MDPI style guidance on this, so no change is needed. We have modified Figure 7 to ensure consistency throughout.
- Figure 10 has no labels or units of measurement on its axes, and Figures 9 and 10 require a colour-coded axis to be understood correctly.
As indicated in the text and the figure caption, the colour scale is arbitrary and has no physical meaning. These figures are for illustrative purposes only, and only comparisons within each figure are meaningful. We do not feel that adding a colour scale would be helpful, as it may encourage quantitative interpretation of figures that are intended to be purely qualitative and could lead to incorrect conclusions regarding the system performance.
- Have you tried running some validation tests outdoors? Will changes in lighting and the non-planar position of the target (e.g. on grass) compromise accuracy?
This is likely to be the subject of future work, so we have modified the text in section 7 to address this:
"Further work is ongoing to improve both precision and accuracy and to assess the behaviour of the system under more representative conditions, as outdoor lighting and non-flat surfaces are likely to affect performance."
Reviewer 2 Report
Comments and Suggestions for Authors
The paper presents a very interesting study on vision-based positioning applied to handheld detector tracking. Overall, the paper is well written; it provides a comprehensive state-of-the-art overview and the presented methodology is clear and easy to follow.
I only have a few minor comments aimed at clarifying some technical aspects of the proposed approach from the end-application point of view.
- In relation to the static and dynamic validation shown in Section 5, have the authors attempted to validate the accuracy and resolution of the z-coordinate of the detector head, i.e. the lift-off, as well as the head's orientation, e.g. pitch and roll angles? In typical operating procedures, a handheld detector is rarely held at constant lift-off and perfectly parallel to the ground surface. Varying lift-off also strongly affects sensitivity, especially in the case of metal detectors, so having a system that can accurately measure the lift-off could be very beneficial from the practical side. Please comment on whether there are any limitations to the proposed methodology in the context of lift-off and sensor orientation estimation.
- When discussing the frame rates (30-50 fps) for a targeted application, the authors state that "the frame rate ensures that the system can capture the position at a sufficient spatial resolution to keep up with normal demining scan patterns and speeds." Could you please clarify what is meant by normal demining speeds (? m/s) and what would be the expected spatial resolution?
- Can you clarify how the 30-50 fps frame rates translate into position update rates, considering the fact that some level of data averaging is needed?
- Can you explain how the MD-GPR detector and positioning data streams are synchronized to produce the overlay images shown in Figures 9 and 10? How does data synchronization accuracy affect the overall resolution of the detection system for the targeted application?
Author Response
We thank the reviewer for their positive and detailed review and the specific points they have raised. These are addressed below.
1. In relation to the static and dynamic validation shown in Section 5, have the authors attempted to validate the accuracy and resolution of the z-coordinate of the detector head, i.e. the lift-off, as well as the head's orientation, e.g. pitch and roll angles? In typical operating procedures, a handheld detector is rarely held at constant lift-off and perfectly parallel to the ground surface. Varying lift-off also strongly affects sensitivity, especially in the case of metal detectors, so having a system that can accurately measure the lift-off could be very beneficial from the practical side. Please comment on whether there are any limitations to the proposed methodology in the context of lift-off and sensor orientation estimation.
We agree that variation in z-displacement and head orientation will have an important effect on the precision and accuracy of the system, but unfortunately this has not been studied as part of this work. We have added additional text to section 6.2 to address this and a further comment in section 7 to indicate that this is likely to form part of future work.
2. When discussing the frame rates (30-50 fps) for a targeted application, the authors state that "the frame rate ensures that the system can capture the position at a sufficient spatial resolution to keep up with normal demining scan patterns and speeds." Could you please clarify what is meant by normal demining speeds (? m/s) and what would be the expected spatial resolution?
We have added text and a reference in section 2.1 to address this.
3. Can you clarify how the 30-50 fps frame rates translate into position update rates, considering the fact that some level of data averaging is needed?
We do not use any averaging on the "live" reporting of position in the system discussed in this paper. We agree that this is a complex interaction requiring further study and so we have added additional text in section 2 (see also previous comment) and section 7 to address this.
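To make the relationship between frame rate and position update rate more concrete, a minimal arithmetic sketch is given below; it assumes one pose estimate per frame (no averaging, as stated above) and uses a hypothetical sweep speed for illustration only, not a figure taken from the paper.

```python
# Illustrative only: relates camera frame rate to the spatial sampling
# interval along a sweep, assuming one pose estimate per frame and no
# averaging. The 0.5 m/s sweep speed is a hypothetical assumption.

def spatial_sampling_interval(frame_rate_hz: float, sweep_speed_m_s: float) -> float:
    """Distance travelled by the detector head between consecutive position updates."""
    return sweep_speed_m_s / frame_rate_hz

for fps in (30, 50):
    interval_cm = spatial_sampling_interval(fps, sweep_speed_m_s=0.5) * 100
    print(f"{fps} fps at an assumed 0.5 m/s sweep -> {interval_cm:.1f} cm between position updates")
```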
4. Can you explain how the MD-GPR detector and positioning data streams are synchronized to produce the overlay images shown in Figures 9 and 10? How does data synchronization accuracy affect the overall resolution of the detection system for the targeted application?
We have added text to section 6.2 describing the timestamp-based synchronisation between the MD/GPR data and the positioning system.
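As a rough illustration of what such timestamp-based alignment can look like, a minimal sketch is given below; the function and data-structure names are hypothetical and are not taken from the system's actual implementation.

```python
import bisect

# Hypothetical sketch: nearest-timestamp matching between detector samples
# (MD/GPR readings) and camera-derived position fixes.

def align_by_timestamp(detector_samples, position_fixes, max_gap_s=0.05):
    """Pair each detector sample with the position fix closest in time.

    detector_samples: list of (timestamp_s, value)
    position_fixes:   list of (timestamp_s, x_m, y_m), sorted by timestamp
    max_gap_s:        reject pairs whose timestamps differ by more than this
    """
    fix_times = [t for t, _, _ in position_fixes]
    aligned = []
    for t, value in detector_samples:
        i = bisect.bisect_left(fix_times, t)
        # Consider the fixes immediately before and after the sample time.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(position_fixes)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(fix_times[k] - t))
        if abs(fix_times[j] - t) <= max_gap_s:
            _, x, y = position_fixes[j]
            aligned.append((x, y, value))
    return aligned
```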
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors' responses are sufficient.
However, the layout of the figures depends on both MDPI guidelines and metrological conventions, given that you are publishing in a metrology journal.
Furthermore, in Figure 10, I agree that the colour scale is not necessary in light of the explanation, but at least the units of measurement should be included to make it consistent with Figure 9.
Author Response
Thank you for the review and for pointing out the lack of labels in Figure 10. We have added axis labels and units to Figure 10 for consistency with Figure 9.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors provided clarifications to my previous comments, which improved the overall readability of the paper. I have no further comments.
Author Response
We thank the reviewer for their comments and thorough review.