Special Issue "Depth Sensors and 3D Vision"

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Physical Sensors".

Deadline for manuscript submissions: closed (20 December 2018)

Special Issue Editor

Guest Editor
Prof. Dr. Roberto Vezzani

AImageLab, Dipartimento di Ingegneria "Enzo Ferrari", University of Modena and Reggio Emilia, Modena, Italy
Interests: computer vision; image processing; machine vision; pattern recognition; surveillance; people behavior understanding; human-computer interaction; depth sensors; 3D vision

Special Issue Information

Dear Colleagues,

The recent diffusion of inexpensive RGB-D sensors has encouraged the computer vision community to explore new solutions based on depth images. Depth information contributes significantly to solving or simplifying several challenging tasks, such as shape analysis and classification, scene reconstruction, object segmentation, people detection, and body part recognition. The intrinsic metric information and the robustness to texture and illumination variations of objects and scenes are only two of the advantages over pure RGB images.
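To make the notion of intrinsic metric information concrete, the sketch below back-projects a depth map into a metric point cloud with a simple pinhole model; the intrinsic parameters and the tiny test map are placeholders, not values tied to any particular sensor.

```python
import numpy as np

def depth_to_point_cloud(depth_m, fx, fy, cx, cy):
    """Back-project a metric depth map (H x W, metres) into an N x 3 point cloud."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                  # drop invalid (zero-depth) pixels

# Toy example with a synthetic 2 x 2 depth map and made-up intrinsics
depth = np.array([[1.0, 1.2], [0.0, 2.0]])
print(depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=1.0, cy=1.0))
```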

For example, the hardware and software technologies included in the Microsoft Kinect framework allow easy estimation of the 3D positions of skeleton joints, providing a new, compact, and expressive representation of the human body.

Although the Kinect failed as a gaming-first device, it has been a launch pad for the spread of depth sensors and, with them, of 3D vision. From a hardware perspective, several stereo, structured IR light, and ToF sensors have appeared on the market and are being studied by the scientific community. At the same time, the computer vision and machine learning communities have proposed new solutions to process depth data, either on their own or fused with other information such as RGB images.

This Special Issue seeks innovative work that explores new hardware and software solutions for the generation and analysis of depth data, including representation models, machine learning approaches, datasets, and benchmarks.

The particular topics of interest include, but are not limited to:

  • Depth acquisition techniques
  • Depth data processing
  • Analysis of depth data
  • Fusion of depth data with other modalities
  • Domain translation from and to depth data
  • 3D scene reconstruction
  • 3D shape modeling and retrieval
  • 3D object recognition
  • 3D biometrics
  • 3D imaging for cultural heritage applications
  • Point cloud modeling and processing
  • Human action recognition on depth data
  • Biomedical applications of depth data
  • Other applications of depth data analysis
  • Depth datasets and benchmarks
  • Depth data visualization

Prof. Roberto Vezzani
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Depth sensors
  • 3D vision
  • Depth data generation
  • Depth data analysis
  • Depth datasets

Published Papers (63 papers)


Research


Open Access Article: Development of an Active High-Speed 3-D Vision System
Sensors 2019, 19(7), 1572; https://doi.org/10.3390/s19071572
Received: 1 February 2019 / Revised: 15 March 2019 / Accepted: 22 March 2019 / Published: 1 April 2019
Abstract
High-speed recognition of the shape of a target object is indispensable for robots to perform various kinds of dexterous tasks in real time. In this paper, we propose a high-speed 3-D sensing system with active target-tracking. The system consists of a high-speed camera head and a high-speed projector, which are mounted on a two-axis active vision system. By measuring a projected coded pattern, 3-D measurement at a rate of 500 fps was achieved. The measurement range was increased as a result of the active tracking, and the shape of the target was accurately observed even when it moved quickly. In addition, to obtain the position and orientation of the target, 500 fps real-time model matching was achieved. Full article

Open Access Article: Automatic Calibration of an Industrial RGB-D Camera Network Using Retroreflective Fiducial Markers
Sensors 2019, 19(7), 1561; https://doi.org/10.3390/s19071561
Received: 21 December 2018 / Revised: 23 February 2019 / Accepted: 26 March 2019 / Published: 31 March 2019
Abstract
This paper describes a non-invasive, automatic, and robust method for calibrating a scalable RGB-D sensor network based on retroreflective ArUco markers and the iterative closest point (ICP) scheme. We demonstrate the system by calibrating a sensor network comprising six sensor nodes positioned in a relatively large industrial robot cell with an approximate size of 10 m × 10 m × 4 m. Here, the automatic calibration achieved an average Euclidean error of 3 cm at distances up to 9.45 m. To achieve robustness, we apply several innovative techniques: Firstly, we mitigate the ambiguity problem that occurs when detecting a marker at long range or low resolution by comparing the camera projection with depth data. Secondly, we use retroreflective fiducial markers in the RGB-D calibration for improved accuracy and detectability. Finally, the repeating ICP refinement uses an exact region of interest such that we employ the precise depth measurements of the retroreflective surfaces only. The complete calibration software and a recorded dataset are publicly available and open source.
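As a rough sketch of the ICP refinement step mentioned above (not the authors' released software), the snippet below refines an initial marker-based pose between two sensor nodes, assuming the Open3D registration API; the correspondence threshold is an arbitrary placeholder.

```python
import numpy as np
import open3d as o3d

def refine_extrinsics(src_pts, dst_pts, init_T, max_corr_dist=0.05):
    """Refine a rough marker-based pose between two depth sensors with
    point-to-point ICP (src_pts, dst_pts: N x 3 arrays; init_T: 4 x 4 guess)."""
    src, dst = o3d.geometry.PointCloud(), o3d.geometry.PointCloud()
    src.points = o3d.utility.Vector3dVector(src_pts)
    dst.points = o3d.utility.Vector3dVector(dst_pts)
    result = o3d.pipelines.registration.registration_icp(
        src, dst, max_corr_dist, init_T,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation   # refined 4 x 4 homogeneous transform
```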

Open Access Article: Construction of All-in-Focus Images Assisted by Depth Sensing
Sensors 2019, 19(6), 1409; https://doi.org/10.3390/s19061409
Received: 13 February 2019 / Revised: 6 March 2019 / Accepted: 15 March 2019 / Published: 22 March 2019
Abstract
Multi-focus image fusion is a technique for obtaining an all-in-focus image in which all objects are in focus to extend the limited depth of field (DoF) of an imaging system. Different from traditional RGB-based methods, this paper presents a new multi-focus image fusion method assisted by depth sensing. In this work, a depth sensor is used together with a colour camera to capture images of a scene. A graph-based segmentation algorithm is used to segment the depth map from the depth sensor, and the segmented regions are used to guide a focus algorithm to locate in-focus image blocks from among multi-focus source images to construct the reference all-in-focus image. Five test scenes and six evaluation metrics were used to compare the proposed method and representative state-of-the-art algorithms. Experimental results quantitatively demonstrate that this method outperforms existing methods in both speed and quality (in terms of comprehensive fusion metrics). The generated images can potentially be used as reference all-in-focus images. Full article
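The focus-measure idea behind such fusion can be illustrated with a much simpler block-wise scheme: the sketch below picks, for each block, the image in the focal stack with the highest variance of the Laplacian. It uses fixed blocks rather than the depth-guided segmentation described in the abstract, so it is only a rough stand-in.

```python
import cv2
import numpy as np

def all_in_focus(stack, block=32):
    """Build an all-in-focus image from a list of aligned BGR images by taking,
    for every block, the pixels from the sharpest image in the focal stack."""
    h, w = stack[0].shape[:2]
    out = np.zeros_like(stack[0])
    for y in range(0, h, block):
        for x in range(0, w, block):
            scores = [cv2.Laplacian(cv2.cvtColor(img[y:y+block, x:x+block],
                                                 cv2.COLOR_BGR2GRAY),
                                    cv2.CV_64F).var()
                      for img in stack]
            best = int(np.argmax(scores))          # sharpest image for this block
            out[y:y+block, x:x+block] = stack[best][y:y+block, x:x+block]
    return out
```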

Open Access Article: A Novel Calibration Method of Articulated Laser Sensor for Trans-Scale 3D Measurement
Sensors 2019, 19(5), 1083; https://doi.org/10.3390/s19051083
Received: 15 December 2018 / Revised: 21 February 2019 / Accepted: 27 February 2019 / Published: 3 March 2019
Abstract
The articulated laser sensor is a new kind of trans-scale and non-contact measurement instrument in regular-size space and industrial applications. These sensors overcome many deficiencies and application limitations of traditional measurement methods. The articulated laser sensor consists of two articulated laser sensing modules, and each module is made up of two rotary tables and one collimated laser. The three axes represent a non-orthogonal shaft architecture. The calibration method of system parameters for traditional instruments is no longer suitable. A novel high-accuracy calibration method of an articulated laser sensor for trans-scale 3D measurement is proposed. Based on perspective projection models and image processing techniques, the calibration method of the laser beam is the key innovative aspect of this study and is introduced in detail. The experimental results show that a maximum distance error of 0.05 mm was detected with the articulated laser sensor. We demonstrate that the proposed high-accuracy calibration method is feasible and effective, particularly for the calibration of laser beams. Full article

Open Access Article: Recognition of Fingerspelling Sequences in Polish Sign Language Using Point Clouds Obtained from Depth Images
Sensors 2019, 19(5), 1078; https://doi.org/10.3390/s19051078
Received: 18 December 2018 / Revised: 14 February 2019 / Accepted: 25 February 2019 / Published: 3 March 2019
Abstract
The paper presents a method for recognizing sequences of static letters of the Polish finger alphabet using the point cloud descriptors: viewpoint feature histogram, eigenvalues-based descriptors, ensemble of shape functions, and global radius-based surface descriptor. Each sequence is understood as quick highly coarticulated motions, and the classification is performed by networks of hidden Markov models trained by transitions between postures corresponding to particular letters. Three kinds of the left-to-right Markov models of the transitions, two networks of the transition models—independent and dependent on a dictionary—as well as various combinations of point cloud descriptors are examined on a publicly available dataset of 4200 executions (registered as depth map sequences) prepared by the authors. The hand shape representation proposed in our method can also be applied for recognition of hand postures in single frames. We confirmed this using a known, challenging American finger alphabet dataset with about 60,000 depth images. Full article

Open Access Article: On-Line Laser Triangulation Scanner for Wood Logs Surface Geometry Measurement
Sensors 2019, 19(5), 1074; https://doi.org/10.3390/s19051074
Received: 13 January 2019 / Revised: 17 February 2019 / Accepted: 26 February 2019 / Published: 2 March 2019
Abstract
The paper presents an automated on-line system for scanning the 3D geometry of wood logs. The system consists of six laser triangulation scanners and is able to scan full logs with diameters ranging from 250 mm to 500 mm and lengths up to 4000 mm. The system was developed as part of the BIOSTRATEG project, which aims to optimize the cutting of logs in wood plank manufacturing through intelligent positioning in sawmill operation. The paper gives a detailed description of the scanner construction, the full measurement process, the system calibration, and the data processing schemes. The full 3D surface geometry of the products and of the usable portion of selected wood logs formed after cutting out the cant is also demonstrated.

Open Access Article: Multi-Channel Convolutional Neural Network Based 3D Object Detection for Indoor Robot Environmental Perception
Sensors 2019, 19(4), 893; https://doi.org/10.3390/s19040893
Received: 4 January 2019 / Revised: 14 February 2019 / Accepted: 14 February 2019 / Published: 21 February 2019
Abstract
Environmental perception is a vital feature for service robots when working in an indoor environment for a long time. The general 3D reconstruction is a low-level geometric information description that cannot convey semantics. In contrast, higher level perception similar to humans requires more abstract concepts, such as objects and scenes. Moreover, the 2D object detection based on images always fails to provide the actual position and size of an object, which is quite important for a robot’s operation. In this paper, we focus on the 3D object detection to regress the object’s category, 3D size, and spatial position through a convolutional neural network (CNN). We propose a multi-channel CNN for 3D object detection, which fuses three input channels including RGB, depth, and bird’s eye view (BEV) images. We also propose a method to generate 3D proposals based on 2D ones in the RGB image and semantic prior. Training and test are conducted on the modified NYU V2 dataset and SUN RGB-D dataset in order to verify the effectiveness of the algorithm. We also carry out the actual experiments in a service robot to utilize the proposed 3D object detection method to enhance the environmental perception of the robot. Full article
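A minimal sketch of multi-channel fusion, assuming PyTorch and purely illustrative layer sizes (this is not the authors' architecture): three small convolutional stems process the RGB, depth, and BEV inputs, and their features are concatenated before a shared head.

```python
import torch
import torch.nn as nn

class ThreeBranchFusion(nn.Module):
    """Toy three-channel fusion network: separate stems for RGB, depth and BEV,
    concatenated before a shared classification head."""
    def __init__(self, num_classes=10):
        super().__init__()
        def stem(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1))
        self.rgb, self.depth, self.bev = stem(3), stem(1), stem(1)
        self.head = nn.Linear(32 * 3, num_classes)

    def forward(self, rgb, depth, bev):
        feats = [branch(x).flatten(1)
                 for branch, x in ((self.rgb, rgb), (self.depth, depth), (self.bev, bev))]
        return self.head(torch.cat(feats, dim=1))

net = ThreeBranchFusion()
logits = net(torch.rand(2, 3, 64, 64), torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64))
print(logits.shape)   # torch.Size([2, 10])
```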

Open Access Article: Measurement of Human Gait Symmetry using Body Surface Normals Extracted from Depth Maps
Sensors 2019, 19(4), 891; https://doi.org/10.3390/s19040891
Received: 11 December 2018 / Revised: 16 February 2019 / Accepted: 18 February 2019 / Published: 21 February 2019
Abstract
In this paper, we introduce an approach for measuring human gait symmetry where the input is a sequence of depth maps of a subject walking on a treadmill. Body surface normals are used to describe the 3D information of the walking subject in each frame. Two different schemes for embedding the temporal factor into a symmetry index are proposed. Experiments on the whole body, as well as on the lower limbs, were also considered to assess the usefulness of upper body information in this task. The potential of our method was demonstrated with a dataset of 97,200 depth maps of nine different walking gaits. An ROC analysis for abnormal gait detection gave the best result (AUC = 0.958) compared with other related studies. The experimental results confirm the contribution of the upper body in gait analysis, as well as the reliability of approximating the average gait symmetry index without explicitly considering individual gait cycles for asymmetry detection.
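Surface normals of this kind can be approximated directly from a depth map by back-projecting to 3D and crossing neighbouring grid differences; the sketch below assumes a pinhole model and is not the authors' exact pipeline.

```python
import numpy as np

def normals_from_depth(depth, fx, fy, cx, cy):
    """Per-pixel surface normals from a metric depth map: back-project to 3D,
    then take the cross product of neighbouring grid differences."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pts = np.stack([(u - cx) * depth / fx, (v - cy) * depth / fy, depth], axis=-1)
    du = pts[:-1, 1:] - pts[:-1, :-1]     # difference along columns
    dv = pts[1:, :-1] - pts[:-1, :-1]     # difference along rows
    n = np.cross(dv, du)
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-9
    return n                              # shape (H-1, W-1, 3); sign may need flipping toward the camera
```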

Open Access Article: Exploring RGB+Depth Fusion for Real-Time Object Detection
Sensors 2019, 19(4), 866; https://doi.org/10.3390/s19040866
Received: 20 December 2018 / Revised: 1 February 2019 / Accepted: 16 February 2019 / Published: 19 February 2019
Abstract
In this paper, we investigate whether fusing depth information on top of normal RGB data for camera-based object detection can help to increase the performance of current state-of-the-art single-shot detection networks. Indeed, depth sensing is easily acquired using depth cameras such as a Kinect or stereo setups. We investigate the optimal manner to perform this sensor fusion with a special focus on lightweight single-pass convolutional neural network (CNN) architectures, enabling real-time processing on limited hardware. For this, we implement a network architecture allowing us to parameterize at which network layer both information sources are fused together. We performed exhaustive experiments to determine the optimal fusion point in the network, from which we can conclude that fusing towards the mid to late layers provides the best results. Our best fusion models significantly outperform the baseline RGB network in both accuracy and localization of the detections. Full article

Open Access Article: A Fast Calibration Method for Photonic Mixer Device Solid-State Array Lidars
Sensors 2019, 19(4), 822; https://doi.org/10.3390/s19040822
Received: 25 December 2018 / Revised: 5 February 2019 / Accepted: 14 February 2019 / Published: 17 February 2019
Abstract
The photonic mixer device (PMD) solid-state array lidar, as a three-dimensional imaging technology, has attracted research attention in recent years because of its low cost, high frame rate, and high reliability. To address the disadvantages of traditional PMD solid-state array lidar calibration methods, including low calibration efficiency and accuracy, and serious human error factors, this paper first proposes a calibration method for an array complementary metal–oxide–semiconductor photodetector using a black-box calibration device and an electrical analog delay method; it then proposes a modular lens distortion correction method based on checkerboard calibration and pixel point adaptive interpolation optimization. Specifically, the ranging error source is analyzed based on the PMD solid-state array lidar imaging mechanism; the black-box calibration device is specifically designed for the calibration requirements of anti-ambient light and an echo reflection route; a dynamic distance simulation system integrating the laser emission unit, laser receiving unit, and delay control unit is designed to calibrate the photodetector echo demodulation; the checkerboard calibration method is used to correct external lens distortion in grayscale mode; and the pixel adaptive interpolation strategy is used to reduce distortion of distance images. Through analysis of the calibration process and results, the proposed method effectively reduces the calibration scene requirements and human factors, meets the needs of different users of the lens, and improves both calibration efficiency and measurement accuracy. Full article
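The checkerboard-based lens correction mentioned above follows the standard OpenCV calibration flow; the sketch below shows that generic flow only. File names, board geometry, and square size are placeholders, and the paper's black-box and delay calibration steps are not reproduced.

```python
import cv2
import numpy as np

# Checkerboard with 9 x 6 inner corners and 25 mm squares (illustrative values)
pattern, square = (9, 6), 0.025
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_pts, img_pts = [], []
for fname in ["calib_00.png", "calib_01.png"]:        # hypothetical image files
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    ok, corners = cv2.findChessboardCorners(gray, pattern)
    if ok:
        obj_pts.append(objp)
        img_pts.append(corners)

# Estimate intrinsics and distortion, then undistort one of the views
rms, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, gray.shape[::-1], None, None)
undistorted = cv2.undistort(gray, K, dist)
```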

Open Access Article: Deep Attention Models for Human Tracking Using RGBD
Sensors 2019, 19(4), 750; https://doi.org/10.3390/s19040750
Received: 21 December 2018 / Revised: 31 January 2019 / Accepted: 3 February 2019 / Published: 13 February 2019
Abstract
Visual tracking performance has long been limited by the lack of better appearance models. These models fail either where they tend to change rapidly, like in motion-based tracking, or where accurate information of the object may not be available, like in color camouflage (where background and foreground colors are similar). This paper proposes a robust, adaptive appearance model which works accurately in situations of color camouflage, even in the presence of complex natural objects. The proposed model includes depth as an additional feature in a hierarchical modular neural framework for online object tracking. The model adapts to the confusing appearance by identifying the stable property of depth between the target and the surrounding object(s). The depth complements the existing RGB features in scenarios when RGB features fail to adapt, hence becoming unstable over a long duration of time. The parameters of the model are learned efficiently in the Deep network, which consists of three modules: (1) The spatial attention layer, which discards the majority of the background by selecting a region containing the object of interest; (2) the appearance attention layer, which extracts appearance and spatial information about the tracked object; and (3) the state estimation layer, which enables the framework to predict future object appearance and location. Three different models were trained and tested to analyze the effect of depth along with RGB information. Also, a model is proposed to utilize only depth as a standalone input for tracking purposes. The proposed models were also evaluated in real-time using KinectV2 and showed very promising results. The results of our proposed network structures and their comparison with the state-of-the-art RGB tracking model demonstrate that adding depth significantly improves the accuracy of tracking in a more challenging environment (i.e., cluttered and camouflaged environments). Furthermore, the results of depth-based models showed that depth data can provide enough information for accurate tracking, even without RGB information. Full article

Open Access Article: A High-Computational Efficiency Human Detection and Flow Estimation Method Based on TOF Measurements
Sensors 2019, 19(3), 729; https://doi.org/10.3390/s19030729
Received: 29 December 2018 / Revised: 30 January 2019 / Accepted: 8 February 2019 / Published: 11 February 2019
Abstract
State-of-the-art human detection methods focus on deep network architectures to achieve higher recognition performance, at the expense of huge computation. However, computational efficiency and real-time performance are also important evaluation indicators. This paper presents a fast real-time human detection and flow estimation method using depth images captured by a top-view TOF camera. The proposed algorithm mainly consists of head detection based on local pooling and searching, classification refinement based on human morphological features, and a tracking assignment filter based on dynamic multi-dimensional features. A depth image dataset recording more than 10k entry and departure events with detailed human location annotations is established. Taking full advantage of the distance information implied in the depth image, we achieve high-accuracy human detection and people counting with an accuracy of 97.73% and significantly reduce the running time. Experiments demonstrate that our algorithm can run at 23.10 ms per frame on a CPU platform. In addition, the proposed robust approach is effective in complex situations such as fast walking, occlusion, crowded scenes, etc.
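For a top-view depth camera, heads show up as local depth minima, which is the intuition behind the local pooling and searching step. The sketch below is a simplified stand-in using a plain minimum filter; the window size and depth gap are arbitrary placeholders.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def head_candidates(depth, window=31, min_gap=0.3):
    """Candidate head positions in a top-view metric depth map: keep pixels that
    are the minimum of their neighbourhood and clearly closer to the camera
    than the local background (min_gap in metres)."""
    local_min = minimum_filter(depth, size=window)
    background = np.median(depth)
    mask = (depth == local_min) & (depth < background - min_gap) & (depth > 0)
    return np.argwhere(mask)              # (row, col) candidates
```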

Open Access Article: Picking Towels in Point Clouds
Sensors 2019, 19(3), 713; https://doi.org/10.3390/s19030713
Received: 20 December 2018 / Revised: 31 January 2019 / Accepted: 4 February 2019 / Published: 10 February 2019
Abstract
Picking clothing has always been a great challenge in laundry and textile industry automation, especially when items share the same color and material and are entangled with each other. To solve this problem, we present a grasp pose determination method for picking towels placed in a laundry basket or on a table. Our method does not need to segment the towels into independent items, and the target towels need not be distinguishable by color. The proposed algorithm first segments the point cloud into several convex wrinkles and then selects an appropriate grasp point on the candidate convex wrinkle. Moreover, we plan the grasp orientation with respect to the wrinkle, which effectively reduces grasp failures caused by an inappropriate grasp direction. We evaluated our method on picking white towels and square towels, respectively, and achieved an average success rate of about 80%.

Open Access Article: Embedded Processing and Compression of 3D Sensor Data for Large Scale Industrial Environments
Sensors 2019, 19(3), 636; https://doi.org/10.3390/s19030636
Received: 20 December 2018 / Revised: 20 January 2019 / Accepted: 29 January 2019 / Published: 2 February 2019
Abstract
This paper presents a scalable embedded solution for processing and transferring 3D point cloud data. Sensors based on the time-of-flight principle generate data which are processed on a local embedded computer and compressed using an octree-based scheme. The compressed data is transferred to a central node where the individual point clouds from several nodes are decompressed and filtered based on a novel method for generating intensity values for sensors which do not natively produce such a value. The paper presents experimental results from a relatively large industrial robot cell with an approximate size of 10 m × 10 m × 4 m. The main advantage of processing point cloud data locally on the nodes is scalability. The proposed solution could, with a dedicated Gigabit Ethernet local network, be scaled up to approximately 440 sensor nodes, only limited by the processing power of the central node that is receiving the compressed data from the local nodes. A compression ratio of 40.5 was obtained when compressing a point cloud stream from a single Microsoft Kinect V2 sensor using an octree resolution of 4 cm. Full article
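The spatial quantization underlying octree-based compression can be illustrated with Open3D's voxel downsampling at the same 4 cm leaf size; this shows only the geometric reduction, not the serialization or streaming parts of the authors' pipeline.

```python
import numpy as np
import open3d as o3d

# Random cloud standing in for a single Kinect V2 frame
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(np.random.uniform(0, 10, size=(100000, 3)))

# 4 cm voxel grid: one representative point per occupied cell, analogous to the
# leaf resolution used in octree-based compression
down = pcd.voxel_down_sample(voxel_size=0.04)
print(len(pcd.points), "->", len(down.points))
```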

Open Access Article: Occluded-Object 3D Reconstruction Using Camera Array Synthetic Aperture Imaging
Sensors 2019, 19(3), 607; https://doi.org/10.3390/s19030607
Received: 28 December 2018 / Revised: 25 January 2019 / Accepted: 28 January 2019 / Published: 31 January 2019
Abstract
With the three-dimensional (3D) coordinates of objects captured by a sequence of images taken in different views, object reconstruction is a technique which aims to recover the shape and appearance information of objects. Although great progress in object reconstruction has been made over the past few years, object reconstruction in occlusion situations remains a challenging problem. In this paper, we propose a novel method to reconstruct occluded objects based on synthetic aperture imaging. Unlike most existing methods, which either assume that there is no occlusion in the scene or remove the occlusion from the reconstructed result, our method uses the characteristics of synthetic aperture imaging that can effectively reduce the influence of occlusion to reconstruct the scene with occlusion. The proposed method labels occlusion pixels according to variance and reconstructs the 3D point cloud based on synthetic aperture imaging. Accuracies of the point cloud are tested by calculating the spatial difference between occlusion and non-occlusion conditions. The experiment results show that the proposed method can handle the occluded situation well and demonstrates a promising performance. Full article

Open Access Article: A Novel Mobile Structured Light System in Food 3D Reconstruction and Volume Estimation
Sensors 2019, 19(3), 564; https://doi.org/10.3390/s19030564
Received: 20 December 2018 / Revised: 16 January 2019 / Accepted: 28 January 2019 / Published: 29 January 2019
Abstract
Over the past ten years, diabetes has rapidly become more prevalent in all age demographics and especially in children. Improved dietary assessment techniques are necessary for epidemiological studies that investigate the relationship between diet and disease. Current nutritional research is hindered by the low accuracy of traditional dietary intake estimation methods used for portion size assessment. This paper presents the development and validation of a novel instrumentation system for measuring accurate dietary intake for diabetic patients. This instrument uses a mobile Structured Light System (SLS), which measures the food volume and portion size of a patient’s diet in daily living conditions. The SLS allows for the accurate determination of the volume and portion size of a scanned food item. Once the volume of a food item is calculated, the nutritional content of the item can be estimated using existing nutritional databases. The system design includes a volume estimation algorithm and a hardware add-on that consists of a laser module and a diffraction lens. The experimental results demonstrate an improvement of around 40% in the accuracy of the volume or portion size measurement when compared to manual calculation. The limitations and shortcomings of the system are discussed in this manuscript. Full article
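A common way to turn a reconstructed food surface into a volume is to integrate per-pixel heights above the reference plane; the toy sketch below assumes a constant pixel footprint (roughly valid for a top-down view) and is not the authors' algorithm.

```python
import numpy as np

def volume_from_height_map(height_m, pixel_area_m2):
    """Approximate the volume of an object on a flat reference plane by summing
    per-pixel heights times the ground-plane area covered by one pixel."""
    return float(np.sum(np.clip(height_m, 0, None)) * pixel_area_m2)

# Toy example: a 10 x 10 cm patch (1 cm^2 per pixel) of uniform 2 cm height
h = np.full((10, 10), 0.02)
print(volume_from_height_map(h, pixel_area_m2=1e-4))   # 0.0002 m^3 = 200 ml
```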

Open Access Article: High Level 3D Structure Extraction from a Single Image Using a CNN-Based Approach
Sensors 2019, 19(3), 563; https://doi.org/10.3390/s19030563
Received: 21 December 2018 / Revised: 19 January 2019 / Accepted: 25 January 2019 / Published: 29 January 2019
Abstract
High-Level Structure (HLS) extraction in a set of images consists of recognizing 3D elements with useful information to the user or application. There are several approaches to HLS extraction. However, most of these approaches are based on processing two or more images captured from different camera views or on processing 3D data in the form of point clouds extracted from the camera images. In contrast and motivated by the extensive work developed for the problem of depth estimation in a single image, where parallax constraints are not required, in this work, we propose a novel methodology towards HLS extraction from a single image with promising results. For that, our method has four steps. First, we use a CNN to predict the depth for a single image. Second, we propose a region-wise analysis to refine depth estimates. Third, we introduce a graph analysis to segment the depth in semantic orientations aiming at identifying potential HLS. Finally, the depth sections are provided to a new CNN architecture that predicts HLS in the shape of cubes and rectangular parallelepipeds. Full article

Open Access Article: Robust Depth Estimation for Light Field Microscopy
Sensors 2019, 19(3), 500; https://doi.org/10.3390/s19030500
Received: 18 December 2018 / Revised: 21 January 2019 / Accepted: 22 January 2019 / Published: 25 January 2019
Abstract
Light field technologies have seen a rise in recent years and microscopy is a field where such technology has had a deep impact. The possibility to provide spatial and angular information at the same time and in a single shot brings several advantages and allows for new applications. A common goal in these applications is the calculation of a depth map to reconstruct the three-dimensional geometry of the scene. Many approaches are applicable, but most of them cannot achieve high accuracy because of the nature of such images: biological samples are usually poor in features and do not exhibit sharp colors like natural scene. Due to such conditions, standard approaches result in noisy depth maps. In this work, a robust approach is proposed where accurate depth maps can be produced exploiting the information recorded in the light field, in particular, images produced with Fourier integral Microscope. The proposed approach can be divided into three main parts. Initially, it creates two cost volumes using different focal cues, namely correspondences and defocus. Secondly, it applies filtering methods that exploit multi-scale and super-pixels cost aggregation to reduce noise and enhance the accuracy. Finally, it merges the two cost volumes and extracts a depth map through multi-label optimization. Full article

Open Access Article: Metrological and Critical Characterization of the Intel D415 Stereo Depth Camera
Sensors 2019, 19(3), 489; https://doi.org/10.3390/s19030489
Received: 21 December 2018 / Revised: 18 January 2019 / Accepted: 22 January 2019 / Published: 25 January 2019
Abstract
Low-cost RGB-D cameras are increasingly being used in several research fields, including human–machine interaction, safety, robotics, biomedical engineering and even reverse engineering applications. Among the plethora of commercial devices, the Intel RealSense cameras have proven to be among the most suitable devices, providing a good compromise between cost, ease of use, compactness and precision. Released on the market in January 2018, the new Intel model RealSense D415 has a wide acquisition range (i.e., ~160–10,000 mm) and a narrow field of view to capture objects in rapid motion. Given the unexplored potential of this new device, especially when used as a 3D scanner, the present work aims to characterize and to provide metrological considerations for the RealSense D415. In particular, tests are carried out to assess the device performance in the near range (i.e., 100–1000 mm). Characterization is performed by integrating the guidelines of the existing standard (i.e., the German VDI/VDE 2634 Part 2) with a number of literature-based strategies. Performance analysis is finally compared against the latest close-range sensors, thus providing a useful guidance for researchers and practitioners aiming to use RGB-D cameras in reverse engineering applications. Full article
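One typical ingredient of such metrological characterization is a plane-fit test: fit a plane to points measured on a flat target and report the residuals. The sketch below shows that generic test on synthetic data; it does not reproduce the exact VDI/VDE 2634 procedures used in the paper.

```python
import numpy as np

def plane_fit_error(points):
    """Fit a plane to an N x 3 point set (least squares via SVD) and return the
    RMS of the point-to-plane residuals, a common flatness indicator."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                                  # direction of least variance
    residuals = (points - centroid) @ normal
    return float(np.sqrt(np.mean(residuals ** 2)))

# Noisy synthetic plane at z = 0 with 1 mm Gaussian noise
pts = np.random.uniform(-0.5, 0.5, size=(5000, 3))
pts[:, 2] = np.random.normal(0, 0.001, size=5000)
print(plane_fit_error(pts))                          # close to 0.001 m
```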

Open Access Article: Three-Dimensional Face Reconstruction Using Multi-View-Based Bilinear Model
Sensors 2019, 19(3), 459; https://doi.org/10.3390/s19030459
Received: 4 December 2018 / Revised: 14 January 2019 / Accepted: 18 January 2019 / Published: 23 January 2019
Abstract
Face reconstruction is a popular topic in 3D vision system. However, traditional methods often depend on monocular cues, which contain few feature pixels and only use their location information while ignoring a lot of textural information. Furthermore, they are affected by the accuracy of the feature extraction method and occlusion. Here, we propose a novel facial reconstruction framework that accurately extracts the 3D shapes and poses of faces from images captured at multi-views. It extends the traditional method using the monocular bilinear model to the multi-view-based bilinear model by incorporating the feature prior constraint and the texture constraint, which are learned from multi-view images. The feature prior constraint is used as a shape prior to allowing us to estimate accurate 3D facial contours. Furthermore, the texture constraint extracts a high-precision 3D facial shape where traditional methods fail because of their limited number of feature points or the mostly texture-less and texture-repetitive nature of the input images. Meanwhile, it fully explores the implied 3D information of the multi-view images, which also enhances the robustness of the results. Additionally, the proposed method uses only two or more uncalibrated images with an arbitrary baseline, estimating calibration and shape simultaneously. A comparison with the state-of-the-art monocular bilinear model-based method shows that the proposed method has a significantly higher level of accuracy. Full article

Open Access Article: Combining Non-Uniform Time Slice and Finite Difference to Improve 3D Ghost Imaging
Sensors 2019, 19(2), 418; https://doi.org/10.3390/s19020418
Received: 29 November 2018 / Revised: 15 January 2019 / Accepted: 18 January 2019 / Published: 21 January 2019
Abstract
Three-dimensional ghost imaging (3DGI) using a detector is widely used in many applications. The performance of 3DGI based on a uniform time slice is difficult to improve because obtaining an accurate time-slice position remains a challenge. This paper reports a novel structure based on non-uniform time slice combined with finite difference. In this approach, finite difference is beneficial to improving sensitivity of zero crossing to accurately obtain the position of the target in the field of view. Simultaneously, non-uniform time slice is used to quickly obtain 3DGI on an interesting target. Results show that better performances of 3DGI are obtained by our proposed method compared to the traditional method. Moreover, the relation between time slice and the signal-noise-ratio of 3DGI is discussed, and the optimal differential distance is obtained, thus motivating the development of a high-performance 3DGI. Full article

Open Access Article: Graph Cut-Based Human Body Segmentation in Color Images Using Skeleton Information from the Depth Sensor
Sensors 2019, 19(2), 393; https://doi.org/10.3390/s19020393
Received: 30 November 2018 / Revised: 13 January 2019 / Accepted: 17 January 2019 / Published: 18 January 2019
Abstract
Segmentation of human bodies in images is useful for a variety of applications, including background substitution, human activity recognition, security, and video surveillance applications. However, human body segmentation has been a challenging problem, due to the complicated shape and motion of a non-rigid human body. Meanwhile, depth sensors with advanced pattern recognition algorithms provide human body skeletons in real time with reasonable accuracy. In this study, we propose an algorithm that projects the human body skeleton from a depth image to a color image, where the human body region is segmented in the color image by using the projected skeleton as a segmentation cue. Experimental results using the Kinect sensor demonstrate that the proposed method provides high quality segmentation results and outperforms the conventional methods. Full article
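A hedged illustration of skeleton-seeded graph-cut segmentation, using OpenCV's GrabCut as a stand-in for the paper's own formulation: projected joint positions are marked as probable foreground, and the colour image is segmented from that mask. Joint coordinates and the seeding radius are placeholders.

```python
import cv2
import numpy as np

def segment_with_skeleton(bgr, skeleton_px, radius=15):
    """Graph-cut style segmentation seeded by projected skeleton joints:
    pixels near the joints are marked as probable foreground and GrabCut
    refines the body region in the colour image."""
    mask = np.full(bgr.shape[:2], cv2.GC_PR_BGD, np.uint8)
    for (x, y) in skeleton_px:                        # 2D joint positions in the colour image
        cv2.circle(mask, (int(x), int(y)), radius, cv2.GC_PR_FGD, -1)
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(bgr, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
```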

Open Access Article: Pixelwise Phase Unwrapping Based on Ordered Periods Phase Shift
Sensors 2019, 19(2), 377; https://doi.org/10.3390/s19020377
Received: 20 December 2018 / Revised: 14 January 2019 / Accepted: 14 January 2019 / Published: 17 January 2019
Abstract
The existing phase-shift methods are effective in achieving high-speed, high-precision, high-resolution, real-time shape measurement of moving objects; however, a phase-unwrapping method that can handle the motion of target objects in a real environment and is robust against global illumination as well is yet to be established. Accordingly, a robust and highly accurate method for determining the absolute phase, using a minimum of three steps, is proposed in this study. In this proposed method, an order structure that rearranges the projection pattern for each period of the sine wave is introduced, so that solving the phase unwrapping problem comes down to calculating the pattern order. Using simulation experiments, it has been confirmed that the proposed method can be used in high-speed, high-precision, high-resolution, three-dimensional shape measurements even in situations with high-speed moving objects and presence of global illumination. In this study, an experimental measurement system was configured with a high-speed camera and projector, and real-time measurements were performed with a processing time of 1.05 ms and a throughput of 500 fps. Full article
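For context, the classical three-step phase shift recovers the wrapped phase as phi = atan2(sqrt(3)*(I1 - I3), 2*I2 - I1 - I3); the ordered-periods unwrapping that is the paper's contribution is not reproduced here.

```python
import numpy as np

def wrapped_phase_3step(i1, i2, i3):
    """Wrapped phase from standard three-step phase-shift patterns with shifts of
    -120, 0 and +120 degrees (i1, i2, i3: intensity images as float arrays)."""
    return np.arctan2(np.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)
```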

Open Access Article: A Novel Method for Extrinsic Calibration of Multiple RGB-D Cameras Using Descriptor-Based Patterns
Sensors 2019, 19(2), 349; https://doi.org/10.3390/s19020349
Received: 2 December 2018 / Revised: 21 December 2018 / Accepted: 14 January 2019 / Published: 16 January 2019
Abstract
This paper presents a novel method to estimate the relative poses between RGB-D cameras with minimal overlapping fields of view. This calibration problem is relevant to applications such as indoor 3D mapping and robot navigation that can benefit from a wider field of view using multiple RGB-D cameras. The proposed approach relies on descriptor-based patterns to provide well-matched 2D keypoints in the case of a minimal overlapping field of view between cameras. Integrating the matched 2D keypoints with corresponding depth values, a set of 3D matched keypoints are constructed to calibrate multiple RGB-D cameras. Experiments validated the accuracy and efficiency of the proposed calibration approach. Full article
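Given matched 3D keypoints from two depth cameras, the relative pose can be recovered in closed form with the standard Kabsch/Umeyama SVD solution; the sketch below shows only that step, not the descriptor-based matching itself.

```python
import numpy as np

def rigid_transform_3d(src, dst):
    """Least-squares rigid transform (R, t) mapping src to dst, where src and dst
    are N x 3 arrays of matched 3D keypoints (Kabsch / Umeyama SVD solution)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    h = (src - c_src).T @ (dst - c_dst)
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))        # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = c_dst - r @ c_src
    return r, t
```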

Open Access Article: 3D Affine: An Embedding of Local Image Features for Viewpoint Invariance Using RGB-D Sensor Data
Sensors 2019, 19(2), 291; https://doi.org/10.3390/s19020291
Received: 28 October 2018 / Revised: 20 December 2018 / Accepted: 7 January 2019 / Published: 12 January 2019
Abstract
Local image features are invariant to in-plane rotations and robust to minor viewpoint changes. However, the current detectors and descriptors for local image features fail to accommodate out-of-plane rotations larger than 25°–30°. Invariance to such viewpoint changes is essential for numerous applications, including wide baseline matching, 6D pose estimation, and object reconstruction. In this study, we present a general embedding that wraps a detector/descriptor pair in order to increase viewpoint invariance by exploiting input depth maps. The proposed embedding locates smooth surfaces within the input RGB-D images and projects them into a viewpoint invariant representation, enabling the detection and description of more viewpoint invariant features. Our embedding can be utilized with different combinations of descriptor/detector pairs, according to the desired application. Using synthetic and real-world objects, we evaluated the viewpoint invariance of various detectors and descriptors, for both standalone and embedded approaches. While standalone local image features fail to accommodate average viewpoint changes beyond 33.3°, our proposed embedding boosted the viewpoint invariance to different levels, depending on the scene geometry. Objects with distinct surface discontinuities were on average invariant up to 52.8°, and the overall average for all evaluated datasets was 45.4°. Similarly, out of a total of 140 combinations involving 20 local image features and various objects with distinct surface discontinuities, only a single standalone local image feature exceeded the goal of 60° viewpoint difference in just two combinations, as compared with 19 different local image features succeeding in 73 combinations when wrapped in the proposed embedding. Furthermore, the proposed approach operates robustly in the presence of input depth noise, even that of low-cost commodity depth sensors, and well beyond. Full article

Open Access Article: DeepMoCap: Deep Optical Motion Capture Using Multiple Depth Sensors and Retro-Reflectors
Sensors 2019, 19(2), 282; https://doi.org/10.3390/s19020282
Received: 13 December 2018 / Revised: 5 January 2019 / Accepted: 7 January 2019 / Published: 11 January 2019
Abstract
In this paper, a marker-based, single-person optical motion capture method (DeepMoCap) is proposed using multiple spatio-temporally aligned infrared-depth sensors and retro-reflective straps and patches (reflectors). DeepMoCap explores motion capture by automatically localizing and labeling reflectors on depth images and, subsequently, on 3D space. Introducing a non-parametric representation to encode the temporal correlation among pairs of colorized depthmaps and 3D optical flow frames, a multi-stage Fully Convolutional Network (FCN) architecture is proposed to jointly learn reflector locations and their temporal dependency among sequential frames. The extracted reflector 2D locations are spatially mapped in 3D space, resulting in robust 3D optical data extraction. The subject’s motion is efficiently captured by applying a template-based fitting technique on the extracted optical data. Two datasets have been created and made publicly available for evaluation purposes; one comprising multi-view depth and 3D optical flow annotated images (DMC2.5D), and a second, consisting of spatio-temporally aligned multi-view depth images along with skeleton, inertial and ground truth MoCap data (DMC3D). The FCN model outperforms its competitors on the DMC2.5D dataset using 2D Percentage of Correct Keypoints (PCK) metric, while the motion capture outcome is evaluated against RGB-D and inertial data fusion approaches on DMC3D, outperforming the next best method by 4.5 % in total 3D PCK accuracy. Full article

Open Access Article: Incremental 3D Cuboid Modeling with Drift Compensation
Sensors 2019, 19(1), 178; https://doi.org/10.3390/s19010178
Received: 3 December 2018 / Revised: 26 December 2018 / Accepted: 28 December 2018 / Published: 6 January 2019
Abstract
This paper presents a framework of incremental 3D cuboid modeling by using the mapping results of an RGB-D camera based simultaneous localization and mapping (SLAM) system. This framework is useful in accurately creating cuboid CAD models from a point cloud in an online manner. While performing the RGB-D SLAM, planes are incrementally reconstructed from a point cloud in each frame to create a plane map. Then, cuboids are detected in the plane map by analyzing the positional relationships between the planes, such as orthogonality, convexity, and proximity. Finally, the position, pose, and size of a cuboid are determined by computing the intersection of three perpendicular planes. To suppress the false detection of the cuboids, the cuboid shapes are incrementally updated with sequential measurements to check the uncertainty of the cuboids. In addition, the drift error of the SLAM is compensated by the registration of the cuboids. As an application of our framework, an augmented reality-based interactive cuboid modeling system was developed. In the evaluation at cluttered environments, the precision and recall of the cuboid detection were investigated, compared with a batch-based cuboid detection method, so that the advantages of our proposed method were clarified. Full article
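Computing a cuboid corner from three roughly perpendicular planes reduces to solving a 3 x 3 linear system n_i . x = d_i; a minimal sketch:

```python
import numpy as np

def corner_from_three_planes(planes):
    """Intersection of three planes, each given as (normal, d) with normal . x = d.
    For a cuboid corner the normals are (nearly) mutually perpendicular, so the
    3 x 3 system is well conditioned."""
    normals = np.array([n for n, _ in planes], dtype=float)
    ds = np.array([d for _, d in planes], dtype=float)
    return np.linalg.solve(normals, ds)

# Axis-aligned toy example: planes x = 1, y = 2, z = 0.5 meet at (1, 2, 0.5)
print(corner_from_three_planes([((1, 0, 0), 1.0), ((0, 1, 0), 2.0), ((0, 0, 1), 0.5)]))
```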
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
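The abstract notes that the position of a cuboid corner is fixed by intersecting three perpendicular planes. Purely as an illustration (not the authors' implementation), the intersection point of three planes n_i . x = d_i is the solution of a 3 x 3 linear system:

```python
import numpy as np

def intersect_three_planes(normals, offsets):
    """Solve n_i . x = d_i for the single point shared by three planes.
    `normals` is a (3, 3) array of plane normals (one per row) and
    `offsets` holds the corresponding offsets d_i. A LinAlgError is raised
    if the planes are (near-)parallel and no unique intersection exists."""
    return np.linalg.solve(np.asarray(normals, dtype=float),
                           np.asarray(offsets, dtype=float))

# Toy usage: the planes x = 1, y = 2 and z = 3 meet at the corner (1, 2, 3).
print(intersect_three_planes(np.eye(3), [1.0, 2.0, 3.0]))  # [1. 2. 3.]
```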
Open AccessArticle Depth from a Motion Algorithm and a Hardware Architecture for Smart Cameras
Sensors 2019, 19(1), 53; https://doi.org/10.3390/s19010053
Received: 3 November 2018 / Revised: 27 November 2018 / Accepted: 29 November 2018 / Published: 23 December 2018
Abstract
Applications such as autonomous navigation, robot vision, and autonomous flying require depth map information of a scene. Depth can be estimated by using a single moving camera (depth from motion). However, traditional depth from motion algorithms have low processing speeds and high hardware requirements that limit their embedded capabilities. In this work, we propose a hardware architecture for depth from motion that consists of a flow/depth transformation and a new optical flow algorithm. Our optical flow formulation is an extension of the stereo matching problem: a pixel-parallel/window-parallel approach in which a correlation function based on the sum of absolute differences (SAD) computes the optical flow. Further, to improve the SAD, the curl of the intensity gradient is used as a preprocessing step. Experimental results demonstrate that it is possible to reach higher accuracy (90% accuracy) compared with previous Field Programmable Gate Array (FPGA)-based optical flow algorithms. For depth estimation, our algorithm delivers dense maps with motion and depth information for all image pixels, with a processing speed up to 128 times faster than that of previous work, making it possible to achieve high performance in the context of embedded applications. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
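The SAD-based matching described above is designed for a pixel-parallel/window-parallel FPGA implementation; the NumPy sketch below only illustrates the underlying cost function for a single window, with arbitrary window and search sizes, and is not the authors' architecture.

```python
import numpy as np

def sad_flow(prev, curr, y, x, win=3, search=4):
    """Estimate the displacement of the window centred at (y, x) in `prev`
    by exhaustively minimising the sum of absolute differences (SAD)
    against `curr` over a (2*search+1)^2 neighbourhood of candidates."""
    ref = prev[y - win:y + win + 1, x - win:x + win + 1].astype(np.float32)
    best_cost, best = np.inf, (0, 0)
    for dv in range(-search, search + 1):
        for du in range(-search, search + 1):
            cand = curr[y + dv - win:y + dv + win + 1,
                        x + du - win:x + du + win + 1].astype(np.float32)
            cost = np.abs(ref - cand).sum()
            if cost < best_cost:
                best_cost, best = cost, (du, dv)
    return best  # (horizontal, vertical) flow in pixels

# Toy usage: a bright square shifted by (2, 1) pixels between two frames.
prev = np.zeros((32, 32)); prev[10:14, 10:14] = 255
curr = np.zeros((32, 32)); curr[11:15, 12:16] = 255
print(sad_flow(prev, curr, y=12, x=12))  # -> (2, 1)
```

In the proposed hardware, such costs are evaluated concurrently for many pixels and candidate windows rather than in nested loops.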
Open AccessArticle Have I Seen This Place Before? A Fast and Robust Loop Detection and Correction Method for 3D Lidar SLAM
Sensors 2019, 19(1), 23; https://doi.org/10.3390/s19010023
Received: 21 November 2018 / Revised: 11 December 2018 / Accepted: 19 December 2018 / Published: 21 December 2018
Abstract
In this paper, we present a complete loop detection and correction system developed for data originating from lidar scanners. For detection, we propose a combination of a global point cloud matcher with a novel registration algorithm to determine loop candidates in a highly effective way. The registration method can deal with point clouds that deviate largely in orientation while improving efficiency over existing techniques. In addition, we accelerated the computation of the global point cloud matcher by a factor of 2–4 by exploiting the GPU to its maximum. Experiments demonstrate that our combined approach detects loops in lidar data more reliably than other point cloud matchers, as it leads to better precision–recall trade-offs: at nearly 100% recall, we gain up to 7% in precision. Finally, we present a novel loop correction algorithm that improves the average and median pose error by a factor of 2, while requiring only a handful of seconds to complete. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle A Robot-Driven 3D Shape Measurement System for Automatic Quality Inspection of Thermal Objects on a Forging Production Line
Sensors 2018, 18(12), 4368; https://doi.org/10.3390/s18124368
Received: 1 November 2018 / Revised: 2 December 2018 / Accepted: 5 December 2018 / Published: 10 December 2018
Abstract
The three-dimensional (3D) geometric evaluation of large thermal forging parts online is critical to quality control and energy conservation. However, this online 3D measurement task is extremely challenging for commercially available 3D sensors because of the enormous amount of heat radiation and complexity of the online environment. To this end, an automatic and accurate 3D shape measurement system integrated with a fringe projection-based 3D scanner and an industrial robot is presented. To resist thermal radiation, a double filter set and an intelligent temperature control loop are employed in the system. In addition, a time-division-multiplexing trigger is implemented in the system to accelerate pattern projection and capture, and an improved multi-frequency phase-shifting method is proposed to reduce the number of patterns required for 3D reconstruction. Thus, the 3D measurement efficiency is drastically improved and the exposure to the thermal environment is reduced. To perform data alignment in a complex online environment, a view integration method is used in the system to align non-overlapping 3D data from different views based on the repeatability of the robot motion. Meanwhile, a robust 3D registration algorithm is used to align 3D data accurately in the presence of irrelevant background data. These components and algorithms were evaluated by experiments. The system was deployed in a forging factory on a production line and performed a stable online 3D quality inspection for thermal axles. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
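For background only: the improved multi-frequency phase-shifting method itself is not reproduced here, but the textbook N-step phase-shifting relation it builds on recovers the wrapped phase φ from fringe intensities I_n captured with phase shifts 2πn/N:

$$
I_n(x,y) = A(x,y) + B(x,y)\cos\!\Big(\phi(x,y) + \frac{2\pi n}{N}\Big),
\qquad
\phi(x,y) = -\arctan\frac{\sum_{n=0}^{N-1} I_n(x,y)\,\sin(2\pi n / N)}{\sum_{n=0}^{N-1} I_n(x,y)\,\cos(2\pi n / N)},
$$

where A is the background intensity and B the fringe modulation; evaluating the arctangent with its two-argument form gives the phase modulo 2π, which the multi-frequency scheme then unwraps. Reducing the number of such patterns is what shortens the part's exposure to the thermal environment.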
Open AccessArticle Simultaneous All-Parameters Calibration and Assessment of a Stereo Camera Pair Using a Scale Bar
Sensors 2018, 18(11), 3964; https://doi.org/10.3390/s18113964
Received: 26 September 2018 / Revised: 25 October 2018 / Accepted: 10 November 2018 / Published: 15 November 2018
Abstract
Highly accurate and easy-to-operate calibration (to determine the interior and distortion parameters) and orientation (to determine the exterior parameters) methods for cameras in large measurement volumes are an important topic for expanding the application scope of 3D vision and photogrammetry techniques. This paper proposes a method for simultaneously calibrating, orienting and assessing multi-camera 3D measurement systems in large measurement volume scenarios. The primary idea is to build 3D point and length arrays by moving a scale bar in the measurement volume and then to conduct a self-calibrating bundle adjustment that involves all the image points and lengths of both cameras. The relative exterior parameters between the camera pair are estimated by the five-point relative orientation method. The interior and distortion parameters of each camera and the relative exterior parameters are optimized through bundle adjustment of the network geometry, which is strengthened by applying the distance constraints. The method provides both an internal precision and an external accuracy assessment of the calibration performance. Simulations and real-data experiments are designed and conducted to validate the effectiveness of the method and analyze its performance under different network geometries. The RMSE of length measurement is less than 0.25 mm and the relative precision is higher than 1/25,000 for a two-camera system calibrated by the proposed method in a volume of 12 m × 8 m × 4 m. Compared with the state-of-the-art point array self-calibrating bundle adjustment method, the proposed method is easier to operate and can significantly reduce systematic errors caused by incorrect scaling. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
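The self-calibrating bundle adjustment with distance constraints can be summarized, in generic notation of our own rather than the paper's, as the minimization of reprojection residuals plus penalties on the deviation of each reconstructed bar placement from the known bar length L:

$$
\min\;\sum_{c}\sum_{j}\big\|\mathbf{x}_{cj}-\pi_c(\mathbf{X}_j)\big\|^{2}
\;+\;\lambda\sum_{k}\big(\|\mathbf{X}_{A_k}-\mathbf{X}_{B_k}\|-L\big)^{2},
$$

where x_cj is the observation of target point X_j in camera c, π_c(·) projects through camera c's interior, distortion and exterior parameters (all of which are optimized together with the points X_j), and X_{A_k}, X_{B_k} are the two bar targets at the k-th placement. The weighting λ and the exact residual form are illustrative assumptions.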
Open AccessArticle A FAST-BRISK Feature Detector with Depth Information
Sensors 2018, 18(11), 3908; https://doi.org/10.3390/s18113908
Received: 7 October 2018 / Revised: 3 November 2018 / Accepted: 7 November 2018 / Published: 13 November 2018
Abstract
RGB-D cameras offer both color and depth images of the surrounding environment, making them an attractive option for robotic and vision applications. This work introduces the BRISK_D algorithm, which efficiently combines the Features from Accelerated Segment Test (FAST) and Binary Robust Invariant Scalable Keypoints (BRISK) methods. In the BRISK_D algorithm, keypoints are detected by the FAST algorithm and their locations are refined in scale and space. The scale factor of each keypoint is computed directly from the depth information of the image. In experiments, we present a detailed comparative analysis of the SURF, BRISK and BRISK_D algorithms with respect to scaling, rotation, perspective change and blur. By incorporating depth information, the BRISK_D algorithm achieves good performance. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
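The abstract states that the keypoint scale is computed directly from the depth value. One simple, hedged reading of that idea, offered only as our illustration and not as the paper's formula, is to make the scale inversely proportional to the measured depth relative to a reference depth:

```python
def scale_from_depth(depth_mm, ref_depth_mm=1000.0, ref_scale=1.0):
    """Illustrative only: assign a keypoint scale that grows as the surface
    comes closer to the camera, since a patch at half the reference depth
    appears roughly twice as large in the image."""
    return ref_scale * ref_depth_mm / depth_mm

print(scale_from_depth(500.0))   # 2.0  (closer surface -> larger scale)
print(scale_from_depth(2000.0))  # 0.5  (farther surface -> smaller scale)
```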
Open AccessArticle Microscopic Three-Dimensional Measurement Based on Telecentric Stereo and Speckle Projection Methods
Sensors 2018, 18(11), 3882; https://doi.org/10.3390/s18113882
Received: 20 September 2018 / Revised: 18 October 2018 / Accepted: 9 November 2018 / Published: 11 November 2018
Abstract
Three-dimensional (3D) measurement of microstructures has become increasingly important, and many microscopic measurement methods have been developed. However, for samples a few millimeters in size that require sub-pixel or sub-micron accuracy, few effective measurement methods currently exist. Here we present a method that combines microscopic stereo measurement with digital speckle projection. A microscopy experimental setup mainly composed of two telecentric cameras and an industrial projection module is established, and a telecentric binocular stereo reconstruction procedure is carried out. The measurement accuracy was first verified by performing 3D measurements of grid arrays at different locations and of cylinder arrays with different height differences. Two Mitutoyo step masters were then used for further verification. The experimental results show that the proposed method can obtain 3D information of microstructures at the millimeter scale with sub-pixel and even sub-micron measuring accuracy. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle Temperature Compensation Method for Digital Cameras in 2D and 3D Measurement Applications
Sensors 2018, 18(11), 3685; https://doi.org/10.3390/s18113685
Received: 12 September 2018 / Revised: 26 October 2018 / Accepted: 27 October 2018 / Published: 30 October 2018
Abstract
This paper presents the results of several studies concerning the effect of temperature on digital cameras. Experiments were performed using three different camera models. The presented results conclusively demonstrate that the typical camera design does not adequately take into account the effect of temperature variation on the device’s performance. In this regard, a modified camera design is proposed that exhibits a highly predictable behavior under varying ambient temperature and facilitates thermal compensation. A novel temperature compensation method is also proposed. This compensation model can be applied in almost every existing camera application, as it is compatible with every camera calibration model. A two-dimensional (2D) and three-dimensional (3D) application of the proposed compensation model is also described. The results of the application of the proposed compensation approach are presented herein. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle Robust and Efficient CPU-Based RGB-D Scene Reconstruction
Sensors 2018, 18(11), 3652; https://doi.org/10.3390/s18113652
Received: 29 September 2018 / Revised: 25 October 2018 / Accepted: 25 October 2018 / Published: 28 October 2018
Abstract
3D scene reconstruction is an important topic in computer vision. A complete scene is reconstructed from views acquired along the camera trajectory, each view containing a small part of the scene. Textureless scenes are a well-known Gordian knot for camera tracking, and obtaining accurate 3D models quickly is a major challenge for existing systems. For robotics applications, we propose a robust CPU-based approach to reconstruct indoor scenes efficiently with a consumer RGB-D camera. The proposed approach bridges feature-based camera tracking and volumetric data integration, and achieves good reconstruction performance in terms of both robustness and efficiency. The key points of our approach are: (i) a robust and fast camera tracking method combining points and edges, which improves tracking stability in textureless scenes; (ii) an efficient data fusion strategy to select camera views and integrate RGB-D images on multiple scales, which enhances the efficiency of volumetric integration; (iii) a novel RGB-D scene reconstruction system, which can be quickly implemented on a standard CPU. Experimental results demonstrate that our approach reconstructs scenes with higher robustness and efficiency compared to state-of-the-art reconstruction systems. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle Assessment of Fringe Pattern Decomposition with a Cross-Correlation Index for Phase Retrieval in Fringe Projection 3D Measurements
Sensors 2018, 18(10), 3578; https://doi.org/10.3390/s18103578
Received: 8 September 2018 / Revised: 17 October 2018 / Accepted: 18 October 2018 / Published: 22 October 2018
Abstract
Phase retrieval from single-frame projection fringe patterns, a fundamental and challenging problem in fringe projection measurement, attracts wide attention, and various new methods have emerged to address this challenge. Many phase retrieval methods are based on decomposing the fringe pattern into a background part and a fringe part, and then obtaining the phase from the decomposed fringe part. However, the decomposition results depend on the selection of model parameters, which is usually performed manually by trial and error because no decomposition assessment rules exist in the absence of ground truth data. In this paper, we propose a cross-correlation index to assess the decomposition and phase retrieval results without the need for ground truth data. The feasibility of the proposed metric is verified on simulated and real fringe patterns with the well-known Fourier transform method and the recently proposed Shearlet transform method. This work contributes to automatic phase retrieval and three-dimensional (3D) measurement with less human intervention, and can potentially be employed in other fields such as phase retrieval in digital holography. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
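The operands of the proposed cross-correlation index are defined in the paper itself; as a generic illustration of the building block only, a zero-mean normalized cross-correlation between two equally sized signals can be computed as follows.

```python
import numpy as np

def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equally sized arrays;
    returns a value in [-1, 1], where 1 indicates perfect agreement."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy usage: a synthetic fringe signal correlates highly with a noisy copy.
x = np.linspace(0, 8 * np.pi, 512)
fringe = np.cos(x)
noisy = fringe + 0.1 * np.random.default_rng(0).standard_normal(x.size)
print(normalized_cross_correlation(fringe, noisy))  # close to 1.0
```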
Open AccessArticle Person Re-Identification with RGB-D Camera in Top-View Configuration through Multiple Nearest Neighbor Classifiers and Neighborhood Component Features Selection
Sensors 2018, 18(10), 3471; https://doi.org/10.3390/s18103471
Received: 30 August 2018 / Revised: 2 October 2018 / Accepted: 11 October 2018 / Published: 15 October 2018
Abstract
Person re-identification is an important topic in retail, scene monitoring, human-computer interaction, people counting, ambient assisted living and many other application fields. A dataset for person re-identification, TVPR (Top View Person Re-Identification), based on a number of significant features derived from both depth and color images, was previously built. This dataset uses an RGB-D camera in a top-view configuration to extract anthropometric features for the recognition of people in view of the camera, reducing the problem of occlusions while being privacy preserving. In this paper, we introduce a machine learning method for person re-identification using the TVPR dataset. In particular, we propose the combination of multiple k-nearest neighbor classifiers based on different distance functions and feature subsets derived from depth and color images. Moreover, neighborhood component feature selection is used to learn the depth features’ weighting vector by minimizing the leave-one-out regularized training error. The classification process is performed by selecting the first passage under the camera for training and using the others as the testing set. Experimental results show that the proposed methodology outperforms standard supervised classifiers widely used for the re-identification task. This improvement encourages the application of this approach in the retail context in order to improve retail analytics, customer service and shopping space management. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
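A minimal sketch of the classifier-combination idea, several k-nearest neighbor classifiers with different distance functions fused by majority vote, using scikit-learn. The anthropometric features, the NCA-learned weighting vector and the first-passage training protocol of the paper are omitted, and all numbers below are toy values.

```python
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier

# One k-NN classifier per distance function; a hard vote fuses their outputs.
ensemble = VotingClassifier(
    estimators=[
        ("euclidean", KNeighborsClassifier(n_neighbors=3, metric="euclidean")),
        ("manhattan", KNeighborsClassifier(n_neighbors=3, metric="manhattan")),
        ("chebyshev", KNeighborsClassifier(n_neighbors=3, metric="chebyshev")),
    ],
    voting="hard",
)

# Toy anthropometric-style features (e.g., height, shoulder width) per passage.
X = np.array([[1.80, 0.45], [1.82, 0.46], [1.60, 0.38], [1.62, 0.39]])
y = np.array([0, 0, 1, 1])  # person identities
ensemble.fit(X, y)
print(ensemble.predict([[1.81, 0.44]]))  # -> [0]
```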
Open AccessArticle Large Depth-of-Field Integral Microscopy by Use of a Liquid Lens
Sensors 2018, 18(10), 3383; https://doi.org/10.3390/s18103383
Received: 4 September 2018 / Revised: 28 September 2018 / Accepted: 5 October 2018 / Published: 10 October 2018
Abstract
Integral microscopy is a 3D imaging technique that permits the recording of spatial and angular information of microscopic samples. From this information it is possible to calculate a collection of orthographic views with full parallax and to refocus computationally, at will, through the 3D specimen. An important drawback of integral microscopy, especially when dealing with thick samples, is the limited depth of field (DOF) of the perspective views. This imposes a significant limitation on the depth range of computationally refocused images. To overcome this problem, we propose here a new method based on the insertion, at the pupil plane of the microscope objective, of an electrically controlled liquid lens (LL) whose optical power can be changed by simply tuning the voltage. This new apparatus has the advantage of controlling the axial position of the objective focal plane while keeping constant the essential parameters of the integral microscope, that is, the magnification, the numerical aperture and the amount of parallax. Thus, given a 3D sample, the new microscope can provide a stack of integral images with complementary depth ranges. The fusion of the set of refocused images enlarges the reconstruction range, yielding images in focus over the whole region. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle Direct Depth SLAM: Sparse Geometric Feature Enhanced Direct Depth SLAM System for Low-Texture Environments
Sensors 2018, 18(10), 3339; https://doi.org/10.3390/s18103339
Received: 28 July 2018 / Revised: 22 September 2018 / Accepted: 24 September 2018 / Published: 6 October 2018
Abstract
This paper presents a real-time, robust and low-drift depth-only SLAM (simultaneous localization and mapping) method for depth cameras that utilizes both dense range flow and sparse geometric features from sequential depth images. The proposed method is composed of three optimization layers, namely the Direct Depth layer, the ICP (iterative closest point) Refined layer and the Graph Optimization layer. The Direct Depth layer uses a range flow constraint equation to solve the fast 6-DOF (six degrees of freedom) frame-to-frame pose estimation problem. The ICP Refined layer then reduces the local drift by applying a local-map-based motion estimation strategy. After that, we propose a loop closure detection algorithm that extracts and matches sparse geometric features and constructs a pose graph for global pose optimization. We evaluate the performance of our method using benchmark datasets and real scene data. Experimental results show that our front-end algorithm clearly outperforms the classic methods and that our back-end algorithm robustly finds loop closures and reduces the global drift. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
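For reference, the range flow constraint equation exploited by the Direct Depth layer is the standard one from the range-flow literature (the notation here is ours): for a depth map Z(X, Y, t) and a surface point moving with 3D velocity (U, V, W),

$$
Z_X\,U + Z_Y\,V - W + Z_t = 0,
$$

where Z_X, Z_Y and Z_t are the spatial and temporal derivatives of the depth map. Stacking this linear constraint over many pixels and solving for the rigid camera motion is what enables the fast frame-to-frame 6-DOF estimation described above.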
Open AccessArticle GesID: 3D Gesture Authentication Based on Depth Camera and One-Class Classification
Sensors 2018, 18(10), 3265; https://doi.org/10.3390/s18103265
Received: 18 August 2018 / Revised: 20 September 2018 / Accepted: 26 September 2018 / Published: 28 September 2018
Abstract
Biometric authentication is popular in authentication systems, and gestures, as carriers of behavioral characteristics, have the advantages of being difficult to imitate and containing abundant information. This research aims to use three-dimensional (3D) depth information of gesture movement to perform authentication with less user effort. We propose an approach based on depth cameras that satisfies three requirements: it can authenticate from a single, customized gesture; it achieves high accuracy without an excessive number of training gestures; and it continues learning the gesture during use of the system. To satisfy these requirements, respectively, we use a sparse autoencoder to memorize the single gesture, data augmentation to address the problem of insufficient data, and incremental learning to allow the system to memorize the gesture incrementally over time. An experiment performed on different gestures in different user situations demonstrates the accuracy of one-class classification (OCC) and proves the effectiveness and reliability of the approach. Gesture authentication based on 3D depth cameras can thus be achieved with reduced user effort. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle RGB Colour Encoding Improvement for Three-Dimensional Shapes and Displacement Measurement Using the Integration of Fringe Projection and Digital Image Correlation
Sensors 2018, 18(9), 3130; https://doi.org/10.3390/s18093130
Received: 1 August 2018 / Revised: 10 September 2018 / Accepted: 13 September 2018 / Published: 17 September 2018
Abstract
Three-dimensional digital image correlation (3D-DIC) has become the most popular full-field optical technique for measuring 3D shapes and displacements in experimental mechanics. The integration of fringe projection (FP) and two-dimensional digital image correlation (FP + DIC) has recently been established as an intelligent low-cost alternative to 3D-DIC, overcoming the drawbacks of a stereoscopic system. Experimentally, it is based on colour encoding of the fringe and speckle patterns required for FP and DIC implementation, respectively. In the present work, experimental innovations in FP + DIC aimed at more accurate results are presented. Specifically, they are based on improving the colour pattern encoding, for which a multisensor camera and/or laser structural illumination were employed. Both alternatives are analysed and evaluated. Results show that the proposed alternatives improve both three-dimensional and in-plane displacement measurements. Nonetheless, multisensor high-speed cameras are uncommon, and laser structural illumination stands out as an important improvement when low uncertainty is required for 2D displacement measurement: the uncertainty is reduced by up to 50% compared with results obtained in previous experimental approaches to FP + DIC. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle A Versatile Method for Depth Data Error Estimation in RGB-D Sensors
Sensors 2018, 18(9), 3122; https://doi.org/10.3390/s18093122
Received: 8 August 2018 / Revised: 10 September 2018 / Accepted: 13 September 2018 / Published: 16 September 2018
Abstract
We propose a versatile method for estimating the RMS error of depth data provided by generic 3D sensors capable of generating RGB and depth (D) data of the scene, i.e., those based on techniques such as structured light, time of flight and stereo. A common checkerboard is used, the corners are detected and two point clouds are created: one with the real coordinates of the pattern corners and one with the corner coordinates given by the device. After registration of these two clouds, the RMS error is computed. Then, using curve fitting methods, an equation is obtained that generalizes the RMS error as a function of the distance between the sensor and the checkerboard pattern. The depth errors estimated by our method are compared to those estimated by state-of-the-art approaches, validating its accuracy and utility. This method can be used to rapidly estimate the quality of RGB-D sensors, facilitating robotics applications such as SLAM and object recognition. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
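A condensed sketch of the evaluation pipeline described above, assuming the checkerboard corners have already been detected and the two point clouds registered in a common frame; the quadratic model and all numbers are illustrative choices, not the paper's results.

```python
import numpy as np

def rms_error(measured, reference):
    """Root-mean-square distance between corresponding 3D corner points
    (both arrays of shape (N, 3), already registered in a common frame)."""
    return float(np.sqrt(np.mean(np.sum((measured - reference) ** 2, axis=1))))

# Illustrative RMS errors at several sensor-to-checkerboard distances (metres).
distances = np.array([0.8, 1.2, 1.6, 2.0, 2.4, 2.8])
rms = np.array([0.003, 0.006, 0.010, 0.016, 0.023, 0.032])

# Fit a curve that generalizes the RMS error as a function of distance.
model = np.poly1d(np.polyfit(distances, rms, deg=2))
print(model(1.8))  # predicted RMS error at 1.8 m
```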
Open AccessArticle Development and Experimental Evaluation of a 3D Vision System for Grinding Robot
Sensors 2018, 18(9), 3078; https://doi.org/10.3390/s18093078
Received: 3 July 2018 / Revised: 7 September 2018 / Accepted: 10 September 2018 / Published: 13 September 2018
Abstract
If a grinding robot can automatically position and measure the machining target on the workpiece, its machining efficiency and intelligence level will improve significantly. Unfortunately, current grinding robots cannot do this for economic and precision reasons. This paper proposes a 3D vision system mounted on the robot’s fourth joint, which is used to detect the machining target of the grinding robot. The hardware architecture and data processing method of the 3D vision system are also described in detail. During data processing, we first use a voxel grid filter to preprocess the point cloud and obtain the feature descriptors. The fast library for approximate nearest neighbors (FLANN) is then used to extract the difference point cloud from the precisely registered point cloud pair, and the point cloud segmentation method proposed in this paper is used to extract the machining path points. Finally, a detection error compensation model is used to accurately calibrate the 3D vision system and transform the machining information into the grinding robot base frame. Experimental results show that the absolute average error of repeated measurements at different locations is 0.154 mm, and the absolute measurement error of the vision system caused by compound error is usually less than 0.25 mm. The proposed 3D vision system can easily be integrated into an intelligent grinding system and may be suitable for industrial sites. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
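The difference point cloud step, i.e., keeping the scanned points that have no close counterpart in the registered reference cloud, can be sketched with a k-d tree nearest-neighbour query. The paper uses FLANN; SciPy's cKDTree is substituted here purely for illustration, and the rejection radius is an arbitrary toy value.

```python
import numpy as np
from scipy.spatial import cKDTree

def difference_cloud(scan, reference, radius=0.5):
    """Return the points of `scan` whose nearest neighbour in `reference`
    is farther than `radius` (same units as the clouds), i.e. the regions
    where the two registered clouds differ."""
    dists, _ = cKDTree(reference).query(scan, k=1)
    return scan[dists > radius]

# Toy usage: the scan contains one extra point far from the reference points.
reference = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
scan = np.vstack([reference, [[0.5, 0.5, 3.0]]])
print(difference_cloud(scan, reference, radius=0.5))  # [[0.5 0.5 3. ]]
```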
Open AccessArticle A System for In-Line 3D Inspection without Hidden Surfaces
Sensors 2018, 18(9), 2993; https://doi.org/10.3390/s18092993
Received: 24 July 2018 / Revised: 1 September 2018 / Accepted: 4 September 2018 / Published: 7 September 2018
Abstract
This work presents a 3D scanner able to reconstruct a complete object without occlusions, including its surface appearance. The technique differs from current scanners in several ways: it does not require mechanical handling such as robot arms or spinning plates, it is free of occlusions since the scanned part is not resting on any surface and, unlike stereo-based methods, the object does not need to have visual singularities on its surface. Among other applications, this system can be integrated into production lines that require the inspection of a large volume of parts or products, especially if there is substantial variability in the objects to be inspected, since there is no mechanical manipulation. The scanner consists of a variable number of industrial-quality cameras conveniently distributed so that they can capture all the surfaces of the object without any blind spot. The object is dropped through the common visual field of all the cameras, so no surface or tool occludes the views, which are captured simultaneously when the part is in the center of the visible volume. A carving procedure that uses the silhouettes segmented from each image gives rise to a volumetric representation and, by means of isosurface generation techniques, to a 3D model. These techniques have certain limitations in the reconstruction of object regions with particular geometric configurations, so estimating the inherent maximum error in each area is important to bound the precision of the reconstruction. A number of experiments are presented reporting the differences between ideal and reconstructed objects in the system. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
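The carving procedure can be summarized as follows: a voxel survives only if it projects inside the object silhouette in every camera. Below is a compact, idealized sketch assuming calibrated 3 x 4 projection matrices and binary silhouette masks; isosurface extraction and the per-region error bounding discussed above are omitted.

```python
import numpy as np

def carve(voxels, cameras, silhouettes):
    """Keep only the voxels whose projection falls inside every silhouette.
    `voxels`: (N, 3) voxel centres; `cameras`: list of 3x4 projection
    matrices; `silhouettes`: list of binary masks (H, W), one per camera."""
    keep = np.ones(len(voxels), dtype=bool)
    hom = np.hstack([voxels, np.ones((len(voxels), 1))])       # homogeneous
    for P, mask in zip(cameras, silhouettes):
        proj = hom @ P.T                                       # (N, 3)
        u = np.round(proj[:, 0] / proj[:, 2]).astype(int)      # pixel column
        v = np.round(proj[:, 1] / proj[:, 2]).astype(int)      # pixel row
        inside = (u >= 0) & (u < mask.shape[1]) & (v >= 0) & (v < mask.shape[0])
        keep &= inside                                         # off-image voxels are carved away
        keep[inside] &= mask[v[inside], u[inside]].astype(bool)
    return voxels[keep]
```

The surviving voxels form the volumetric representation from which the 3D model is then extracted with isosurface generation techniques, as described in the abstract.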
Open AccessArticle Structured-Light-Based System for Shape Measurement of the Human Body in Motion
Sensors 2018, 18(9), 2827; https://doi.org/10.3390/s18092827
Received: 20 July 2018 / Revised: 21 August 2018 / Accepted: 23 August 2018 / Published: 27 August 2018
Abstract
The existing methods for measuring the shape of the human body in motion are limited in their practical application owing to immaturity, complexity, and/or high price. Therefore, we propose a method based on structured light supported by multispectral separation to achieve multidirectional and parallel acquisition. Single-frame fringe projection is employed in this method for detailed geometry reconstruction. An extended phase unwrapping method adapted to measurement of the human body is also proposed; it utilizes local fringe parameter information to identify the optimal unwrapping path for reconstruction. Subsequently, we present a prototype 4DBODY system with a working volume of 2.0 × 1.5 × 1.5 m³, a measurement uncertainty of less than 0.5 mm and an average spatial resolution of 1.0 mm for three-dimensional (3D) points. The system consists of eight directional 3D scanners functioning synchronously with an acquisition frequency of 120 Hz. The efficacy of the proposed system is demonstrated by presenting the measurement results obtained for known geometrical objects moving at various speeds as well as for actual human movements. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle New Method of Microimages Generation for 3D Display
Sensors 2018, 18(9), 2805; https://doi.org/10.3390/s18092805
Received: 4 August 2018 / Revised: 22 August 2018 / Accepted: 23 August 2018 / Published: 25 August 2018
Abstract
In this paper, we propose a new method for the generation of microimages that processes real 3D scenes captured with any method that permits the extraction of their depth information. The depth map of the scene, together with its color information, is used to create a point cloud. A set of elemental images of this point cloud is captured synthetically, and from it the microimages are computed. The main feature of this method is that the reference plane of the displayed images can be set at will while empty pixels are avoided. Another advantage is that the center point of the displayed images, as well as their scale and field of view, can be set freely. To show the final results, a 3D InI display prototype is implemented with a tablet and a microlens array. We demonstrate that this new technique overcomes the drawbacks of previous similar ones and provides more flexibility in setting the characteristics of the final image. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle Robust Fusion of LiDAR and Wide-Angle Camera Data for Autonomous Mobile Robots
Sensors 2018, 18(8), 2730; https://doi.org/10.3390/s18082730
Received: 9 May 2018 / Revised: 6 August 2018 / Accepted: 15 August 2018 / Published: 20 August 2018
Abstract
Autonomous robots that assist humans in day-to-day living tasks are becoming increasingly popular. Autonomous mobile robots operate by sensing and perceiving their surrounding environment to make accurate driving decisions. A combination of several different sensors, such as LiDAR, radar, ultrasound sensors and cameras, is utilized to sense the surrounding environment of autonomous vehicles. These heterogeneous sensors simultaneously capture various physical attributes of the environment. Such multimodality and redundancy of sensing need to be positively utilized for reliable and consistent perception of the environment through sensor data fusion. However, these multimodal sensor data streams differ from each other in many ways, such as temporal and spatial resolution, data format, and geometric alignment. For the subsequent perception algorithms to utilize the diversity offered by multimodal sensing, the data streams need to be spatially, geometrically and temporally aligned with each other. In this paper, we address the problem of fusing the outputs of a Light Detection and Ranging (LiDAR) scanner and a wide-angle monocular image sensor for free space detection. The outputs of the LiDAR scanner and the image sensor are of different spatial resolutions and need to be aligned with each other. A geometrical model is used to spatially align the two sensor outputs, followed by a Gaussian Process (GP) regression-based resolution matching algorithm to interpolate the missing data with quantifiable uncertainty. The results indicate that the proposed sensor data fusion framework significantly aids the subsequent perception steps, as illustrated by the performance improvement of an uncertainty-aware free space detection algorithm. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
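A toy version of the GP-regression-based resolution matching idea (interpolating sparse range samples onto a denser grid together with an uncertainty estimate) using scikit-learn. The one-dimensional setting, the kernel and all parameter values are our simplifications, not those of the paper.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Sparse LiDAR-like range samples along a scan line (angle -> range).
angles = np.array([[-0.4], [-0.2], [0.0], [0.2], [0.4]])   # radians
ranges = np.array([4.1, 4.0, 3.9, 4.2, 4.6])               # metres

gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=0.2) + WhiteKernel(noise_level=1e-3))
gp.fit(angles, ranges)

# Query on the denser pixel grid of the wide-angle camera.
dense = np.linspace(-0.4, 0.4, 17).reshape(-1, 1)
mean, std = gp.predict(dense, return_std=True)   # interpolation + uncertainty
print(mean[8], std[8])                           # estimate and sigma at 0.0 rad
```

The per-point standard deviation is what makes the interpolated range data quantifiably uncertain, which the downstream free space detector can exploit.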
Open AccessArticle A Method for 6D Pose Estimation of Free-Form Rigid Objects Using Point Pair Features on Range Data
Sensors 2018, 18(8), 2678; https://doi.org/10.3390/s18082678
Received: 10 July 2018 / Revised: 12 August 2018 / Accepted: 13 August 2018 / Published: 15 August 2018
Abstract
Pose estimation of free-form objects is a crucial task towards flexible and reliable highly complex autonomous systems. Recently, methods based on range and RGB-D data have shown promising results with relatively high recognition rates and fast running times. Along this line, this paper presents a feature-based method for 6D pose estimation of rigid objects based on the Point Pair Features voting approach. The presented solution combines a novel preprocessing step, which takes into consideration the discriminative value of surface information, with an improved matching method for Point Pair Features. In addition, an improved clustering step and a novel view-dependent re-scoring process are proposed, alongside two scene consistency verification steps. The performance of the proposed method is evaluated against 15 state-of-the-art solutions on an extensive set of varied, publicly available datasets with real-world scenarios under clutter and occlusion. The presented results show that the proposed method outperforms all tested state-of-the-art methods for all datasets, with an overall 6.6% relative improvement over the second best method. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
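For reference, the Point Pair Feature at the core of this family of methods (Drost et al.) describes an oriented point pair (m1, n1), (m2, n2) by a four-dimensional vector of one distance and three angles; a minimal NumPy version follows.

```python
import numpy as np

def point_pair_feature(m1, n1, m2, n2):
    """F(m1, m2) = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)),
    where d = m2 - m1 and n1, n2 are the surface normals at m1 and m2."""
    d = m2 - m1
    def angle(a, b):
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return np.arccos(np.clip(cos, -1.0, 1.0))
    return np.array([np.linalg.norm(d), angle(n1, d), angle(n2, d), angle(n1, n2)])

# Toy usage: two points 10 cm apart with perpendicular surface normals.
f = point_pair_feature(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]),
                       np.array([0.1, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]))
print(f)  # [0.1, pi/2, 0.0, pi/2]
```

In the voting approach, these features are quantized and hashed so that matching scene pairs can vote for consistent object poses; the paper's contributions refine that preprocessing, matching and clustering pipeline.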
Open AccessArticle Towards a Meaningful 3D Map Using a 3D Lidar and a Camera
Sensors 2018, 18(8), 2571; https://doi.org/10.3390/s18082571
Received: 30 May 2018 / Revised: 27 July 2018 / Accepted: 1 August 2018 / Published: 6 August 2018
Abstract
Semantic 3D maps are required for various applications including robot navigation and surveying, and their importance has significantly increased. Generally, existing studies on semantic mapping were camera-based approaches that could not be operated in large-scale environments owing to their computational burden. Recently, a method of combining a 3D Lidar with a camera was introduced to address this problem, and a 3D Lidar and a camera were also utilized for semantic 3D mapping. In this study, our algorithm consists of semantic mapping and map refinement. In the semantic mapping, a GPS and an IMU are integrated to estimate the odometry of the system, and subsequently, the point clouds measured from a 3D Lidar are registered by using this information. Furthermore, we use the latest CNN-based semantic segmentation to obtain semantic information on the surrounding environment. To integrate the point cloud with semantic information, we developed incremental semantic labeling including coordinate alignment, error minimization, and semantic information fusion. Additionally, to improve the quality of the generated semantic map, the map refinement is processed in a batch. It enhances the spatial distribution of labels and removes traces produced by moving vehicles effectively. We conduct experiments on challenging sequences to demonstrate that our algorithm outperforms state-of-the-art methods in terms of accuracy and intersection over union. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
Open AccessArticle Relative Pose Based Redundancy Removal: Collaborative RGB-D Data Transmission in Mobile Visual Sensor Networks
Sensors 2018, 18(8), 2430; https://doi.org/10.3390/s18082430
Received: 13 June 2018 / Revised: 18 July 2018 / Accepted: 20 July 2018 / Published: 26 July 2018
Abstract
In this paper, the Relative Pose based Redundancy Removal (RPRR) scheme is presented, which has been designed for mobile RGB-D sensor networks operating under bandwidth-constrained operational scenarios. The scheme considers a multiview scenario in which pairs of sensors observe the same scene from different viewpoints and detect redundant visual and depth information to prevent its transmission, leading to a significant improvement in wireless channel usage efficiency and power savings. We envisage applications in which the environment is static and rapid 3D mapping of an enclosed area of interest is required, such as disaster recovery and support operations after earthquakes or industrial accidents. Experimental results show that wireless channel utilization is improved by 250% and battery consumption is halved when the RPRR scheme is used instead of sending the sensor images independently. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)