Next Article in Journal
Proactive Caching at the Edge Leveraging Influential User Detection in Cellular D2D Networks
Previous Article in Journal
Intelligent Communication in Wireless Sensor Networks
Article Menu
Issue 10 (October) cover image

Export Article

Open AccessArticle
Future Internet 2018, 10(10), 92; https://doi.org/10.3390/fi10100092

Occlusion-Aware Unsupervised Learning of Monocular Depth, Optical Flow and Camera Pose with Geometric Constraints

1
School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
2
Shanghai Institute for Advanced Communication and Data Science, Shanghai 200444, China
*
Author to whom correspondence should be addressed.
Received: 13 August 2018 / Revised: 12 September 2018 / Accepted: 13 September 2018 / Published: 21 September 2018
Full-Text   |   PDF [7819 KB, uploaded 21 September 2018]   |  

Abstract

We present an occlusion-aware unsupervised neural network for jointly learning three low-level vision tasks from monocular videos: depth, optical flow, and camera motion. The system consists of three different predicting sub-networks simultaneously coupled by combined loss terms and is capable of computing each task independently on test samples. Geometric constraints extracted from scene geometry which have traditionally been used in bundle adjustment or pose-graph optimization are formed as various self-supervisory signals during our end-to-end learning approach. Different from prior works, our image reconstruction loss also takes account of optical flow. Moreover, we impose novel 3D flow consistency constraints over the predictions of all the three tasks. By explicitly modeling occlusion and taking utilization of both 2D and 3D geometry relationships, abundant geometric constraints are formed over estimated outputs, enabling the system to capture both low-level representations and high-level cues to infer thinner scene structures. Empirical evaluation on the KITTI dataset demonstrates the effectiveness and improvement of our approach: (1) monocular depth estimation outperforms state-of-the-art unsupervised methods and is comparable to stereo supervised ones; (2) optical flow prediction ranks top among prior works and even beats supervised and traditional ones especially in non-occluded regions; (3) pose estimation outperforms established SLAM systems under comparable input settings with a reasonable margin. View Full-Text
Keywords: monocular depth; camera pose; optical flow; joint learning; occlusion-aware; scene geometry monocular depth; camera pose; optical flow; joint learning; occlusion-aware; scene geometry
Figures

Figure 1

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
SciFeed

Share & Cite This Article

MDPI and ACS Style

Teng, Q.; Chen, Y.; Huang, C. Occlusion-Aware Unsupervised Learning of Monocular Depth, Optical Flow and Camera Pose with Geometric Constraints. Future Internet 2018, 10, 92.

Show more citation formats Show less citations formats

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers. See further details here.

Related Articles

Article Metrics

Article Access Statistics

1

Comments

[Return to top]
Future Internet EISSN 1999-5903 Published by MDPI AG, Basel, Switzerland RSS E-Mail Table of Contents Alert
Back to Top