Moving object detection and tracking from image sequences has been extensively studied in a variety of fields. Nevertheless, observing geometric attributes and identifying the detected objects for further investigation of moving behavior has drawn less attention. The focus of this study is to determine moving trajectories, object heights, and object recognition using a monocular camera configuration. This paper presents a scheme to conduct moving object recognition with three-dimensional (3D) observation using faster region-based convolutional neural network (Faster R-CNN) with a stationary and rotating Pan Tilt Zoom (PTZ) camera and close-range photogrammetry. The camera motion effects are first eliminated to detect objects that contain actual movement, and a moving object recognition process is employed to recognize the object classes and to facilitate the estimation of their geometric attributes. Thus, this information can further contribute to the investigation of object moving behavior. To evaluate the effectiveness of the proposed scheme quantitatively, first, an experiment with indoor synthetic configuration is conducted, then, outdoor real-life data are used to verify the feasibility based on recall, precision, and F1 index. The experiments have shown promising results and have verified the effectiveness of the proposed method in both laboratory and real environments. The proposed approach calculates the height and speed estimates of the recognized moving objects, including pedestrians and vehicles, and shows promising results with acceptable errors and application potential through existing PTZ camera images at a very low cost.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited