Peer-Review Record

MultEYE: Monitoring System for Real-Time Vehicle Detection, Tracking and Speed Estimation from UAV Imagery on Edge-Computing Platforms

Remote Sens. 2021, 13(4), 573; https://doi.org/10.3390/rs13040573
by Navaneeth Balamuralidhar 1,2,*,†, Sofia Tilon 1,† and Francesco Nex 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 31 December 2020 / Revised: 29 January 2021 / Accepted: 2 February 2021 / Published: 5 February 2021
(This article belongs to the Special Issue Computer Vision and Deep Learning for Remote Sensing Applications)

Round 1

Reviewer 1 Report

The paper presents a multi-task learning architecture named MultEYE that adds a segmentation head to an object detector backbone, yielding fast inference. The proposed solution is used in a traffic monitoring system that can detect, track, and estimate the velocity of vehicles in a sequence of aerial images.
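As context for this summary, a minimal PyTorch-style sketch of the general multi-task idea follows: one shared backbone feeds both a detection head and a segmentation head, so the dense features are computed only once per frame. The layer sizes and head designs here are illustrative placeholders, not the actual MultEYE configuration (which builds on a modified YOLO backbone, per the paper).

```python
# Minimal sketch of a shared-backbone multi-task network. All layer sizes
# and head designs are illustrative placeholders, not MultEYE's actual ones.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, num_classes: int = 1, num_seg_classes: int = 2):
        super().__init__()
        # Shared feature extractor (stand-in for the modified YOLO backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Detection head: per-cell box offsets (4) + objectness (1) + classes.
        self.det_head = nn.Conv2d(64, 5 + num_classes, 1)
        # Segmentation head: per-class score maps, upsampled to input size.
        self.seg_head = nn.Sequential(
            nn.Conv2d(64, num_seg_classes, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )

    def forward(self, x):
        feats = self.backbone(x)  # shared computation, done once per frame
        return self.det_head(feats), self.seg_head(feats)

det_out, seg_out = MultiTaskNet()(torch.randn(1, 3, 256, 256))
# det_out: (1, 6, 64, 64); seg_out: (1, 2, 256, 256)
```

The speed gain mentioned above comes from this sharing: the segmentation head adds only a small amount of computation on top of features the detector produces anyway.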

In general, the paper is well organized and well written, making it easy to understand.

The objective and contributions are clear.

 

Subsection 3.1.1 is one of the most relevant parts of the paper. However, this subsection is not easy to understand, because some details of the proposal were not properly explained and defined. What functions are used in each block, for example, the loss functions? The mathematical relations for the speed estimation are presented; similar information is expected for MultEYE (the blocks introduced in Figure 1).
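To make the loss-function question concrete: multi-task networks of this kind are typically trained with a weighted sum of per-head losses. The sketch below shows one common recipe; the weights, the MSE stand-in for the detection term, and the cross-entropy segmentation term are assumptions for illustration only, which is exactly the kind of information this comment asks the authors to state for MultEYE.

```python
# Illustrative weighted multi-task loss -- an assumed recipe, not the paper's.
import torch.nn.functional as F

def multi_task_loss(det_pred, det_target, seg_pred, seg_target,
                    w_det=1.0, w_seg=0.5):
    # Placeholder detection term; a real YOLO-style head would combine box
    # regression, objectness, and classification losses instead.
    det_loss = F.mse_loss(det_pred, det_target)
    # Dense per-pixel term for the segmentation head.
    seg_loss = F.cross_entropy(seg_pred, seg_target)
    return w_det * det_loss + w_seg * seg_loss
```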

 

The motivation for and relevance of Table 7 should be better explained. In general, higher spatial resolution requires a higher frame rate, considering the quality of the video.

 

Some figures and tables are not properly presented (e.g., Figure 5, Table 6).

 

The Results section needs to be improved. Some results were not discussed. Also, some elements of the proposed solution (the tracking part) are compared with related works, but the others are not.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper summarizes an interesting application. The paper is well-written and the experiments support the proposed approach. I have some minor comments and suggestions that may improve the paper.
1. In some practical scenarios, environmental conditions may be quite different from the training conditions. It might be good to mention the robustness of the proposed system to poor or low lighting conditions.
2. The paper used a modified YOLO for object detection. In the Introduction, it might be good to mention some other YOLO applications in practical and low-lighting environments. Refs. [a] to [c] include applications of YOLO for various target detection using coded aperture measurements that can deal with high dynamic ranges of lighting conditions. Please comment on those papers in the revised paper.
[a] "Compact all-CMOS spatio-temporal compressive sensing video camera with pixel-wise coded exposure," Optics Express, vol. 24, no. 8, pp. 9013-9024, Apr. 2016.

[b] "Deep Learning-Based Target Tracking and Classification for Low Quality Videos Using Coded Aperture Cameras," Sensors, vol. 19, no. 17, 3702, 2019.

3. The term edge computing is a little unclear to me. Is it because the UAV does not have enough communication bandwidth, and therefore local computing is done onboard the UAV?
4. Table 3 is missing, and hence it is hard to see the performance of MULTEYE compared to others. Table 6 is cut off and some numbers are missing. Figure 5 is truncated.
5. What is the unit of error in Table 5?
6. The connection between MOSSE and MULTEYE is not clear. My understanding is that MULTEYE is for detection and MOSSE is for tracking. However, in Table 4, it is not clear what inputs are fed into the different trackers. Are they all using the same inputs coming out of MULTEYE? (A sketch of the presumed handoff follows this list.)
7. The captions of the tables should be placed on top of all the tables.
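To illustrate the handoff queried in comment 6, the sketch below shows the usual detect-then-track pattern: detector boxes from the first frame initialize one MOSSE correlation-filter tracker per vehicle, and later frames are handled by the cheap trackers alone. It uses OpenCV's MOSSE implementation (requires opencv-contrib-python; on some builds the factory is cv2.TrackerMOSSE_create rather than cv2.legacy.TrackerMOSSE_create). Whether MultEYE feeds all trackers identical detector outputs, as assumed here, is precisely the reviewer's question.

```python
# Hypothetical detection-to-tracker handoff; interfaces are assumptions, not
# the authors' code. Requires opencv-contrib-python for the MOSSE tracker.
import cv2

def track_from_detections(frames, first_frame_boxes):
    """frames: list of BGR images; first_frame_boxes: (x, y, w, h) int tuples
    produced by a detector on frames[0]."""
    trackers = []
    for box in first_frame_boxes:
        t = cv2.legacy.TrackerMOSSE_create()  # cv2.TrackerMOSSE_create on older builds
        t.init(frames[0], box)
        trackers.append(t)

    tracks = [[box] for box in first_frame_boxes]
    for frame in frames[1:]:
        for track, tracker in zip(tracks, trackers):
            ok, box = tracker.update(frame)
            if ok:  # lost targets simply stop extending their track
                track.append(tuple(int(v) for v in box))
    return tracks
```

Per-track pixel displacement between consecutive frames, scaled by the ground sampling distance and the frame rate, would then yield a speed estimate.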

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Approve.
