Special Issue "Depth Sensors and 3D Vision"

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Physical Sensors".

Deadline for manuscript submissions: closed (20 December 2018) | Viewed by 293,762

Special Issue Editor

Prof. Roberto Vezzani
Department of Engineering (DIEF), University of Modena and Reggio Emilia, 41125 Modena, Italy
Interests: computer vision; deep learning; vision-based HCI; IoT

Special Issue Information

Dear Colleagues,

The recent diffusion of inexpensive RGB-D sensors has encouraged the computer vision community to explore new solutions based on depth images. Depth information makes a significant contribution to solving or simplifying several challenging tasks, such as shape analysis and classification, scene reconstruction, object segmentation, people detection, and body part recognition. Intrinsic metric information and robustness to the texture and illumination variations of objects and scenes are only two of its advantages over pure RGB images.

For example, the hardware and software technologies included in the Microsoft Kinect framework allow easy estimation of the 3D positions of skeleton joints, providing a new compact and expressive representation of the human body.

Although the Kinect failed as a gaming-first device, it served as a launch pad for the spread of depth sensors and, with them, of 3D vision. From a hardware perspective, several stereo, structured-IR-light, and ToF sensors have appeared on the market and are being studied by the scientific community. At the same time, the computer vision and machine learning communities have proposed new solutions to process depth data, either on its own or fused with other information such as RGB images.

This Special Issue seeks innovative work exploring new hardware and software solutions for the generation and analysis of depth data, including representation models, machine learning approaches, datasets, and benchmarks.

Topics of interest include, but are not limited to:

  • Depth acquisition techniques
  • Depth data processing
  • Analysis of depth data
  • Fusion of depth data with other modalities
  • Translation to and from the depth domain
  • 3D scene reconstruction
  • 3D shape modeling and retrieval
  • 3D object recognition
  • 3D biometrics
  • 3D imaging for cultural heritage applications
  • Point cloud modelling and processing
  • Human action recognition on depth data
  • Biomedical applications of depth data
  • Other applications of depth data analysis
  • Depth datasets and benchmarks
  • Depth data visualization

Prof. Roberto Vezzani
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Depth sensors
  • 3D vision
  • Depth data generation
  • Depth data analysis
  • Depth datasets

Published Papers (63 papers)

Research

Article
Development of an Active High-Speed 3-D Vision System
Sensors 2019, 19(7), 1572; https://doi.org/10.3390/s19071572 - 01 Apr 2019
Cited by 8 | Viewed by 3302
Abstract
High-speed recognition of the shape of a target object is indispensable for robots to perform various kinds of dexterous tasks in real time. In this paper, we propose a high-speed 3-D sensing system with active target-tracking. The system consists of a high-speed camera head and a high-speed projector, which are mounted on a two-axis active vision system. By measuring a projected coded pattern, 3-D measurement at a rate of 500 fps was achieved. The measurement range was increased as a result of the active tracking, and the shape of the target was accurately observed even when it moved quickly. In addition, to obtain the position and orientation of the target, 500 fps real-time model matching was achieved.
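
Where such a projector–camera system gets depth from: each decoded pattern stripe defines a light plane, and the 3D point is the intersection of that plane with the camera ray through the pixel. The sketch below is a generic illustration of this triangulation step, not the authors' pipeline; the intrinsics and plane parameters are invented values.

```python
import numpy as np

def depth_from_plane_triangulation(u, v, K_cam, plane):
    """Intersect the camera ray through pixel (u, v) with a projector
    light plane (n, d), defined by n . X + d = 0 in camera coordinates."""
    n, d = plane
    ray = np.linalg.inv(K_cam) @ np.array([u, v, 1.0])  # ray direction
    t = -d / (n @ ray)              # ray parameter at the intersection
    return t * ray                  # 3D point; its z component is the depth

# Toy example: a 640x480 camera and a tilted light plane one metre away.
K = np.array([[600.0, 0, 320], [0, 600.0, 240], [0, 0, 1]])
plane = (np.array([0.0, 0.3, 1.0]), -1.0)
print(depth_from_plane_triangulation(320, 240, K, plane))  # -> [0. 0. 1.]
```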

Article
Automatic Calibration of an Industrial RGB-D Camera Network Using Retroreflective Fiducial Markers
Sensors 2019, 19(7), 1561; https://doi.org/10.3390/s19071561 - 31 Mar 2019
Cited by 10 | Viewed by 5047
Abstract
This paper describes a non-invasive, automatic, and robust method for calibrating a scalable RGB-D sensor network based on retroreflective ArUco markers and the iterative closest point (ICP) scheme. We demonstrate the system by calibrating a sensor network comprising six sensor nodes positioned in a relatively large industrial robot cell with an approximate size of 10 m × 10 m × 4 m. Here, the automatic calibration achieved an average Euclidean error of 3 cm at distances up to 9.45 m. To achieve robustness, we apply several innovative techniques: Firstly, we mitigate the ambiguity problem that occurs when detecting a marker at long range or low resolution by comparing the camera projection with depth data. Secondly, we use retroreflective fiducial markers in the RGB-D calibration for improved accuracy and detectability. Finally, the repeating ICP refinement uses an exact region of interest such that we employ the precise depth measurements of the retroreflective surfaces only. The complete calibration software and a recorded dataset are publicly available and open source.
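
The ICP refinement at the core of such a calibration has a standard closed-form update: given matched point pairs, the optimal rigid transform follows from an SVD of the cross-covariance (the Kabsch/Umeyama solution). A minimal sketch of that single step, assuming correspondences are already established (this is the textbook algorithm, not the authors' implementation):

```python
import numpy as np

def icp_step(src, dst):
    """One point-to-point ICP update for pre-matched (N, 3) arrays:
    the rigid transform (R, t) minimizing sum ||R @ src[i] + t - dst[i]||^2."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)          # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

A full ICP loop would re-estimate nearest-neighbour correspondences, apply this step, and repeat until the residual stops decreasing; here, the retroreflective regions of interest would supply the points.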

Article
Construction of All-in-Focus Images Assisted by Depth Sensing
Sensors 2019, 19(6), 1409; https://doi.org/10.3390/s19061409 - 22 Mar 2019
Cited by 5 | Viewed by 2903
Abstract
Multi-focus image fusion is a technique for obtaining an all-in-focus image in which all objects are in focus to extend the limited depth of field (DoF) of an imaging system. Different from traditional RGB-based methods, this paper presents a new multi-focus image fusion method assisted by depth sensing. In this work, a depth sensor is used together with a colour camera to capture images of a scene. A graph-based segmentation algorithm is used to segment the depth map from the depth sensor, and the segmented regions are used to guide a focus algorithm to locate in-focus image blocks from among multi-focus source images to construct the reference all-in-focus image. Five test scenes and six evaluation metrics were used to compare the proposed method and representative state-of-the-art algorithms. Experimental results quantitatively demonstrate that this method outperforms existing methods in both speed and quality (in terms of comprehensive fusion metrics). The generated images can potentially be used as reference all-in-focus images.
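
As a rough illustration of the underlying focus-selection idea (not the authors' algorithm), the sketch below scores a region of each multi-focus source with a common focus measure, the variance of a discrete Laplacian, and keeps the sharpest source; the region mask is assumed to come from the depth-map segmentation.

```python
import numpy as np

def sharpness(block):
    """Variance of a 4-neighbour discrete Laplacian: a common focus measure."""
    lap = (-4 * block[1:-1, 1:-1] + block[:-2, 1:-1] + block[2:, 1:-1]
           + block[1:-1, :-2] + block[1:-1, 2:])
    return lap.var()

def sharpest_source(sources, mask):
    """Return the index of the source image that is most in focus over the
    bounding box of a segmented region (boolean mask)."""
    ys, xs = np.where(mask)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    scores = [sharpness(img[y0:y1, x0:x1].astype(float)) for img in sources]
    return int(np.argmax(scores))
```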

Article
A Novel Calibration Method of Articulated Laser Sensor for Trans-Scale 3D Measurement
Sensors 2019, 19(5), 1083; https://doi.org/10.3390/s19051083 - 03 Mar 2019
Cited by 16 | Viewed by 3009
Abstract
The articulated laser sensor is a new kind of trans-scale, non-contact measurement instrument for regular-size spaces and industrial applications. These sensors overcome many deficiencies and application limitations of traditional measurement methods. The articulated laser sensor consists of two articulated laser sensing modules, and each module is made up of two rotary tables and one collimated laser. The three axes form a non-orthogonal shaft architecture, so calibration methods for the system parameters of traditional instruments are no longer suitable. A novel high-accuracy calibration method of an articulated laser sensor for trans-scale 3D measurement is proposed. Based on perspective projection models and image processing techniques, the calibration method of the laser beam is the key innovative aspect of this study and is introduced in detail. The experimental results show a maximum distance error of 0.05 mm with the articulated laser sensor. We demonstrate that the proposed high-accuracy calibration method is feasible and effective, particularly for the calibration of laser beams.

Article
Recognition of Fingerspelling Sequences in Polish Sign Language Using Point Clouds Obtained from Depth Images
Sensors 2019, 19(5), 1078; https://doi.org/10.3390/s19051078 - 03 Mar 2019
Cited by 17 | Viewed by 3225
Abstract
The paper presents a method for recognizing sequences of static letters of the Polish finger alphabet using point cloud descriptors: the viewpoint feature histogram, eigenvalue-based descriptors, the ensemble of shape functions, and the global radius-based surface descriptor. Each sequence consists of quick, highly coarticulated motions, and the classification is performed by networks of hidden Markov models trained on transitions between postures corresponding to particular letters. Three kinds of left-to-right Markov models of the transitions, two networks of the transition models (one independent of and one dependent on a dictionary), and various combinations of point cloud descriptors are examined on a publicly available dataset of 4200 executions (registered as depth map sequences) prepared by the authors. The hand shape representation proposed in our method can also be applied to the recognition of hand postures in single frames. We confirmed this using a known, challenging American finger alphabet dataset with about 60,000 depth images.
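
Of the descriptors listed, the eigenvalue-based features are the simplest to state: they summarize the shape of a point cloud through the sorted eigenvalues of its 3D covariance. A sketch of one common variant follows (the exact feature set used in the paper may differ):

```python
import numpy as np

def eigenvalue_features(points):
    """Shape features from the sorted eigenvalues l1 >= l2 >= l3 of the
    covariance of an (N, 3) point cloud, e.g., of a segmented hand."""
    l1, l2, l3 = np.sort(np.linalg.eigvalsh(np.cov(points.T)))[::-1]
    return {
        "linearity":    (l1 - l2) / l1,
        "planarity":    (l2 - l3) / l1,
        "scattering":   l3 / l1,
        "omnivariance": (l1 * l2 * l3) ** (1.0 / 3.0),
    }
```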

Article
On-Line Laser Triangulation Scanner for Wood Logs Surface Geometry Measurement
Sensors 2019, 19(5), 1074; https://doi.org/10.3390/s19051074 - 02 Mar 2019
Cited by 20 | Viewed by 4245
Abstract
The paper presents an automated on-line system for 3D scanning of wood log geometry. The system consists of six laser triangulation scanners and is able to scan full wood logs with diameters ranging from 250 mm to 500 mm and lengths up to 4000 mm. The system was developed as part of the BIOSTRATEG project, which aims to optimize the cutting of logs in the wood plank manufacturing process through intelligent positioning in sawmill operation. The paper gives a detailed description of the scanner construction, the full measurement process, the system calibration, and the data processing schemes. The full 3D surface geometry of selected wood logs and of the products formed after cutting out the cant is also demonstrated.

Article
Multi-Channel Convolutional Neural Network Based 3D Object Detection for Indoor Robot Environmental Perception
Sensors 2019, 19(4), 893; https://doi.org/10.3390/s19040893 - 21 Feb 2019
Cited by 18 | Viewed by 5082
Abstract
Environmental perception is a vital feature for service robots working in an indoor environment for a long time. General 3D reconstruction provides a low-level geometric description that cannot convey semantics. In contrast, higher-level perception similar to that of humans requires more abstract concepts, such as objects and scenes. Moreover, image-based 2D object detection fails to provide the actual position and size of an object, which are quite important for a robot's operation. In this paper, we focus on 3D object detection to regress an object's category, 3D size, and spatial position through a convolutional neural network (CNN). We propose a multi-channel CNN for 3D object detection, which fuses three input channels: RGB, depth, and bird's eye view (BEV) images. We also propose a method to generate 3D proposals based on 2D proposals in the RGB image and a semantic prior. Training and testing are conducted on the modified NYU V2 dataset and the SUN RGB-D dataset to verify the effectiveness of the algorithm. We also carry out experiments on a service robot, using the proposed 3D object detection method to enhance the robot's environmental perception.
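
A bird's eye view channel is typically rasterized by back-projecting the depth map to 3D and binning the points on the ground plane. A minimal sketch under assumed pinhole intrinsics (the axis conventions, ranges, and cell size are illustrative, not the paper's):

```python
import numpy as np

def depth_to_bev(depth, K, cell=0.05, x_range=5.0, z_range=5.0):
    """Rasterize a max-height bird's eye view image from a metric depth
    map, with x right, y down, and z forward in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]
    keep = (z > 0) & (np.abs(x) < x_range) & (z < z_range)
    rows = ((z_range - z[keep]) / cell).astype(int)   # forward = image up
    cols = ((x[keep] + x_range) / cell).astype(int)
    bev = np.zeros((int(z_range / cell), int(2 * x_range / cell)), np.float32)
    np.maximum.at(bev, (rows, cols), -y[keep])        # tallest point per cell
    return bev
```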

Article
Measurement of Human Gait Symmetry using Body Surface Normals Extracted from Depth Maps
Sensors 2019, 19(4), 891; https://doi.org/10.3390/s19040891 - 21 Feb 2019
Cited by 1 | Viewed by 2880
Abstract
In this paper, we introduce an approach for measuring human gait symmetry where the input is a sequence of depth maps of a subject walking on a treadmill. Body surface normals are used to describe the 3D information of the walking subject in each frame. Two different schemes for embedding the temporal factor into a symmetry index are proposed. Experiments on the whole body, as well as on the lower limbs alone, were conducted to assess the usefulness of upper body information in this task. The potential of our method was demonstrated on a dataset of 97,200 depth maps of nine different walking gaits. An ROC analysis for abnormal gait detection gave the best result (AUC = 0.958) compared with other related studies. The experimental results confirm the contribution of the upper body in gait analysis, as well as the reliability of approximating an average gait symmetry index without explicitly considering individual gait cycles for asymmetry detection.
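
Surface normals can be recovered from a depth map by back-projecting every pixel to 3D and crossing the finite-difference tangents along the image axes; the paper's exact estimator is not given here, so the following is only a generic sketch with assumed intrinsics.

```python
import numpy as np

def normals_from_depth(depth, K):
    """Per-pixel unit surface normals from a metric depth map."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    X = np.dstack([(u - K[0, 2]) * depth / K[0, 0],
                   (v - K[1, 2]) * depth / K[1, 1],
                   depth])                      # (h, w, 3) points
    du = np.gradient(X, axis=1)                 # tangent along columns
    dv = np.gradient(X, axis=0)                 # tangent along rows
    n = np.cross(du, dv)
    n /= np.linalg.norm(n, axis=2, keepdims=True) + 1e-12
    return n   # flip the sign to orient the normals toward the camera if needed
```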

Article
Exploring RGB+Depth Fusion for Real-Time Object Detection
Sensors 2019, 19(4), 866; https://doi.org/10.3390/s19040866 - 19 Feb 2019
Cited by 49 | Viewed by 7588
Abstract
In this paper, we investigate whether fusing depth information on top of normal RGB data for camera-based object detection can help to increase the performance of current state-of-the-art single-shot detection networks. Indeed, depth information is easily acquired using depth cameras such as a Kinect or stereo setups. We investigate the optimal manner to perform this sensor fusion with a special focus on lightweight single-pass convolutional neural network (CNN) architectures, enabling real-time processing on limited hardware. For this, we implement a network architecture allowing us to parameterize at which network layer both information sources are fused together. We performed exhaustive experiments to determine the optimal fusion point in the network, from which we can conclude that fusing towards the mid to late layers provides the best results. Our best fusion models significantly outperform the baseline RGB network in both accuracy and localization of the detections.

Article
A Fast Calibration Method for Photonic Mixer Device Solid-State Array Lidars
Sensors 2019, 19(4), 822; https://doi.org/10.3390/s19040822 - 17 Feb 2019
Cited by 2 | Viewed by 3175
Abstract
The photonic mixer device (PMD) solid-state array lidar, as a three-dimensional imaging technology, has attracted research attention in recent years because of its low cost, high frame rate, and high reliability. To address the disadvantages of traditional PMD solid-state array lidar calibration methods, including low calibration efficiency and accuracy, and serious human error factors, this paper first proposes a calibration method for an array complementary metal–oxide–semiconductor photodetector using a black-box calibration device and an electrical analog delay method; it then proposes a modular lens distortion correction method based on checkerboard calibration and pixel point adaptive interpolation optimization. Specifically, the ranging error source is analyzed based on the PMD solid-state array lidar imaging mechanism; the black-box calibration device is specifically designed for the calibration requirements of anti-ambient light and an echo reflection route; a dynamic distance simulation system integrating the laser emission unit, laser receiving unit, and delay control unit is designed to calibrate the photodetector echo demodulation; the checkerboard calibration method is used to correct external lens distortion in grayscale mode; and the pixel adaptive interpolation strategy is used to reduce distortion of distance images. Through analysis of the calibration process and results, the proposed method effectively reduces the calibration scene requirements and human factors, meets the needs of different users of the lens, and improves both calibration efficiency and measurement accuracy.
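
The lens correction step rests on the classic Brown–Conrady model; once the radial coefficients are known from the checkerboard calibration, undistortion is commonly solved by fixed-point iteration. A generic sketch (the coefficients and the use of normalized coordinates are assumptions, not the authors' code):

```python
import numpy as np

def undistort_normalized(xd, yd, k1, k2, iters=10):
    """Invert the radial Brown-Conrady model x_d = x_u (1 + k1 r^2 + k2 r^4)
    for normalized image coordinates, by fixed-point iteration."""
    xd = np.asarray(xd, dtype=float)
    yd = np.asarray(yd, dtype=float)
    x, y = xd.copy(), yd.copy()
    for _ in range(iters):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale   # refine the undistorted estimate
    return x, y
```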

Article
Deep Attention Models for Human Tracking Using RGBD
Sensors 2019, 19(4), 750; https://doi.org/10.3390/s19040750 - 13 Feb 2019
Cited by 10 | Viewed by 2965
Abstract
Visual tracking performance has long been limited by the lack of better appearance models. These models fail either where they tend to change rapidly, like in motion-based tracking, or where accurate information of the object may not be available, like in color camouflage (where background and foreground colors are similar). This paper proposes a robust, adaptive appearance model which works accurately in situations of color camouflage, even in the presence of complex natural objects. The proposed model includes depth as an additional feature in a hierarchical modular neural framework for online object tracking. The model adapts to the confusing appearance by identifying the stable property of depth between the target and the surrounding object(s). The depth complements the existing RGB features in scenarios when RGB features fail to adapt, hence becoming unstable over a long duration of time. The parameters of the model are learned efficiently in the Deep network, which consists of three modules: (1) The spatial attention layer, which discards the majority of the background by selecting a region containing the object of interest; (2) the appearance attention layer, which extracts appearance and spatial information about the tracked object; and (3) the state estimation layer, which enables the framework to predict future object appearance and location. Three different models were trained and tested to analyze the effect of depth along with RGB information. Also, a model is proposed to utilize only depth as a standalone input for tracking purposes. The proposed models were also evaluated in real-time using KinectV2 and showed very promising results. The results of our proposed network structures and their comparison with the state-of-the-art RGB tracking model demonstrate that adding depth significantly improves the accuracy of tracking in a more challenging environment (i.e., cluttered and camouflaged environments). Furthermore, the results of depth-based models showed that depth data can provide enough information for accurate tracking, even without RGB information.

Article
A High-Computational Efficiency Human Detection and Flow Estimation Method Based on TOF Measurements
Sensors 2019, 19(3), 729; https://doi.org/10.3390/s19030729 - 11 Feb 2019
Cited by 9 | Viewed by 3361
Abstract
State-of-the-art human detection methods focus on deep network architectures to achieve higher recognition performance, at the expense of huge computation. However, computational efficiency and real-time performance are also important evaluation indicators. This paper presents a fast real-time human detection and flow estimation method using depth images captured by a top-view TOF camera. The proposed algorithm mainly consists of head detection based on local pooling and searching, classification refinement based on human morphological features, and a tracking assignment filter based on dynamic multi-dimensional features. A depth image dataset recording more than 10k entry and departure events, with detailed human location annotations, is established. Taking full advantage of the distance information implied in the depth image, we achieve high-accuracy human detection and people counting with an accuracy of 97.73% and significantly reduce the running time. Experiments demonstrate that our algorithm can run at 23.10 ms per frame on a CPU platform. In addition, the proposed robust approach is effective in complex situations such as fast walking, occlusion, crowded scenes, etc.
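
One plausible reading of "local pooling and searching" on a top-view depth image is that heads, being the points closest to the ceiling-mounted sensor, appear as local depth minima. The sketch below illustrates only that generic idea; the window size and head-depth band are invented parameters.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def head_candidates(depth, window=31, d_min=1.0, d_max=2.5):
    """Candidate head pixels in a top-view depth image (metres): local
    minima of the depth that fall inside a plausible head-height band."""
    local_min = minimum_filter(depth, size=window)
    mask = (depth == local_min) & (depth > d_min) & (depth < d_max)
    return np.argwhere(mask)    # (row, col) candidates for refinement
```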

Article
Picking Towels in Point Clouds
Sensors 2019, 19(3), 713; https://doi.org/10.3390/s19030713 - 10 Feb 2019
Cited by 2 | Viewed by 2930
Abstract
Picking clothing has always been a great challenge in laundry and textile industry automation, especially when items are of the same color and material and are entangled with each other. To address this problem, we present a grasp pose determination method to pick towels placed in a laundry basket or on a table. Our method does not need to segment towels into independent items, and the target towels need not be distinguishable in color. The proposed algorithm first segments point clouds into several convex wrinkles and then selects an appropriate grasp point on the candidate convex wrinkle. Moreover, we plan the grasp orientation with respect to the wrinkle, which effectively reduces grasp failures caused by an inappropriate grasp direction. We evaluated our method on picking white towels and square towels and achieved an average success rate of about 80%.

Article
Embedded Processing and Compression of 3D Sensor Data for Large Scale Industrial Environments
Sensors 2019, 19(3), 636; https://doi.org/10.3390/s19030636 - 02 Feb 2019
Cited by 9 | Viewed by 3842
Abstract
This paper presents a scalable embedded solution for processing and transferring 3D point cloud data. Sensors based on the time-of-flight principle generate data which are processed on a local embedded computer and compressed using an octree-based scheme. The compressed data is transferred to a central node where the individual point clouds from several nodes are decompressed and filtered based on a novel method for generating intensity values for sensors which do not natively produce such a value. The paper presents experimental results from a relatively large industrial robot cell with an approximate size of 10 m × 10 m × 4 m. The main advantage of processing point cloud data locally on the nodes is scalability. The proposed solution could, with a dedicated Gigabit Ethernet local network, be scaled up to approximately 440 sensor nodes, only limited by the processing power of the central node that is receiving the compressed data from the local nodes. A compression ratio of 40.5 was obtained when compressing a point cloud stream from a single Microsoft Kinect V2 sensor using an octree resolution of 4 cm.
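
The essence of the octree scheme is that quantizing points to coarse cells and transmitting occupancy is far cheaper than transmitting raw coordinates. A toy sketch of that quantization with a crude ratio estimate (a real octree codec, such as PCL's, additionally encodes the tree structure and is considerably more compact):

```python
import numpy as np

def voxelize(points, resolution=0.04):
    """Quantize an (N, 3) float32 point cloud to occupied cells at the
    given resolution (metres) and estimate the size reduction."""
    keys = np.floor(points / resolution).astype(np.int32)
    occupied = np.unique(keys, axis=0)           # one entry per occupied cell
    raw_bytes = points.size * 4                  # float32 x, y, z
    cell_bytes = occupied.size * 4               # int32 indices (upper bound)
    return occupied, raw_bytes / cell_bytes
```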

Article
Occluded-Object 3D Reconstruction Using Camera Array Synthetic Aperture Imaging
Sensors 2019, 19(3), 607; https://doi.org/10.3390/s19030607 - 31 Jan 2019
Cited by 11 | Viewed by 4839
Abstract
Object reconstruction aims to recover the shape and appearance of objects from the 3D coordinates captured by a sequence of images taken from different views. Although great progress in object reconstruction has been made over the past few years, reconstruction under occlusion remains a challenging problem. In this paper, we propose a novel method to reconstruct occluded objects based on synthetic aperture imaging. Unlike most existing methods, which either assume that there is no occlusion in the scene or remove the occlusion from the reconstructed result, our method uses the characteristics of synthetic aperture imaging, which can effectively reduce the influence of occlusion, to reconstruct the scene with occlusion. The proposed method labels occlusion pixels according to variance and reconstructs the 3D point cloud based on synthetic aperture imaging. The accuracy of the point cloud is tested by calculating the spatial difference between occlusion and non-occlusion conditions. The experimental results show that the proposed method handles occluded situations well and demonstrates promising performance.
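
Synthetic aperture imaging refocuses a camera array on a chosen depth plane by shifting each view according to its baseline and averaging; occluders off the focal plane land at different positions in each view and smear out, and their high variance across views is what occlusion labeling can exploit. A fronto-parallel toy sketch (the per-view integer pixel offsets are assumed precomputed from the camera baselines and the focal depth):

```python
import numpy as np

def synthetic_aperture(images, offsets_px):
    """Shift-and-average refocusing of a list of equally sized views;
    offsets_px holds one (dy, dx) disparity per view for the focal plane."""
    stack = np.stack([np.roll(img, (dy, dx), axis=(0, 1))
                      for img, (dy, dx) in zip(images, offsets_px)])
    return stack.mean(axis=0), stack.var(axis=0)   # image, occlusion cue
```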

Article
A Novel Mobile Structured Light System in Food 3D Reconstruction and Volume Estimation
Sensors 2019, 19(3), 564; https://doi.org/10.3390/s19030564 - 29 Jan 2019
Cited by 18 | Viewed by 4447
Abstract
Over the past ten years, diabetes has rapidly become more prevalent in all age demographics and especially in children. Improved dietary assessment techniques are necessary for epidemiological studies that investigate the relationship between diet and disease. Current nutritional research is hindered by the low accuracy of traditional dietary intake estimation methods used for portion size assessment. This paper presents the development and validation of a novel instrumentation system for measuring accurate dietary intake for diabetic patients. This instrument uses a mobile Structured Light System (SLS), which measures the food volume and portion size of a patient's diet in daily living conditions. The SLS allows for the accurate determination of the volume and portion size of a scanned food item. Once the volume of a food item is calculated, the nutritional content of the item can be estimated using existing nutritional databases. The system design includes a volume estimation algorithm and a hardware add-on that consists of a laser module and a diffraction lens. The experimental results demonstrate an improvement of around 40% in the accuracy of the volume or portion size measurement when compared to manual calculation. The limitations and shortcomings of the system are discussed in this manuscript.
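
Once a scanned food item is available as a height field over the table, its volume is essentially the integral of height times the metric footprint of each pixel. A simplified sketch assuming a top-down depth map, known intrinsics, and a known table depth (the paper's SLS reconstruction is more involved):

```python
import numpy as np

def volume_from_height_map(depth, K, table_depth):
    """Crude object volume from a top-down metric depth map: height above
    the table integrated over the per-pixel footprint, approximated at
    the table plane."""
    height = np.clip(table_depth - depth, 0.0, None)              # metres
    px_area = (table_depth / K[0, 0]) * (table_depth / K[1, 1])   # m^2/pixel
    return float((height * px_area).sum())                        # m^3
```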

Article
High Level 3D Structure Extraction from a Single Image Using a CNN-Based Approach
Sensors 2019, 19(3), 563; https://doi.org/10.3390/s19030563 - 29 Jan 2019
Cited by 7 | Viewed by 3450
Abstract
High-Level Structure (HLS) extraction in a set of images consists of recognizing 3D elements with information useful to the user or application. There are several approaches to HLS extraction. However, most of these approaches are based on processing two or more images captured from different camera views or on processing 3D data in the form of point clouds extracted from the camera images. In contrast, motivated by the extensive work on depth estimation from a single image, where parallax constraints are not required, we propose a novel methodology for HLS extraction from a single image, with promising results. Our method has four steps. First, we use a CNN to predict the depth for a single image. Second, we propose a region-wise analysis to refine depth estimates. Third, we introduce a graph analysis to segment the depth into semantic orientations, aiming at identifying potential HLS. Finally, the depth sections are provided to a new CNN architecture that predicts HLS in the shape of cubes and rectangular parallelepipeds.

Article
Robust Depth Estimation for Light Field Microscopy
Sensors 2019, 19(3), 500; https://doi.org/10.3390/s19030500 - 25 Jan 2019
Cited by 20 | Viewed by 5751
Abstract
Light field technologies have seen a rise in recent years, and microscopy is a field where such technology has had a deep impact. The possibility to provide spatial and angular information at the same time and in a single shot brings several advantages and allows for new applications. A common goal in these applications is the calculation of a depth map to reconstruct the three-dimensional geometry of the scene. Many approaches are applicable, but most of them cannot achieve high accuracy because of the nature of such images: biological samples are usually poor in features and do not exhibit sharp colors like natural scenes. Under such conditions, standard approaches produce noisy depth maps. In this work, a robust approach is proposed that produces accurate depth maps by exploiting the information recorded in the light field, in particular in images produced with a Fourier integral microscope. The proposed approach can be divided into three main parts. Initially, it creates two cost volumes using different focal cues, namely correspondences and defocus. Secondly, it applies filtering methods that exploit multi-scale and super-pixel cost aggregation to reduce noise and enhance accuracy. Finally, it merges the two cost volumes and extracts a depth map through multi-label optimization.

Article
Metrological and Critical Characterization of the Intel D415 Stereo Depth Camera
Sensors 2019, 19(3), 489; https://doi.org/10.3390/s19030489 - 25 Jan 2019
Cited by 60 | Viewed by 7900
Abstract
Low-cost RGB-D cameras are increasingly being used in several research fields, including human–machine interaction, safety, robotics, biomedical engineering and even reverse engineering applications. Among the plethora of commercial devices, the Intel RealSense cameras have proven to be among the most suitable devices, providing a good compromise between cost, ease of use, compactness and precision. Released on the market in January 2018, the new Intel model RealSense D415 has a wide acquisition range (i.e., ~160–10,000 mm) and a narrow field of view to capture objects in rapid motion. Given the unexplored potential of this new device, especially when used as a 3D scanner, the present work aims to characterize and to provide metrological considerations for the RealSense D415. In particular, tests are carried out to assess the device performance in the near range (i.e., 100–1000 mm). Characterization is performed by integrating the guidelines of the existing standard (i.e., the German VDI/VDE 2634 Part 2) with a number of literature-based strategies. Performance analysis is finally compared against the latest close-range sensors, thus providing a useful guidance for researchers and practitioners aiming to use RGB-D cameras in reverse engineering applications.
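
Characterizations in the spirit of VDI/VDE 2634 typically include scanning a flat target and reporting the plane-fit residuals. A minimal sketch of that flatness figure (illustrative only, not the paper's full protocol):

```python
import numpy as np

def plane_fit_rmse(points):
    """RMS point-to-plane residual of a least-squares plane fitted to an
    (N, 3) scan of a flat target, via SVD of the centered points."""
    centered = points - points.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    normal = Vt[-1]                        # direction of least variance
    return float(np.sqrt(((centered @ normal) ** 2).mean()))
```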

Article
Three-Dimensional Face Reconstruction Using Multi-View-Based Bilinear Model
Sensors 2019, 19(3), 459; https://doi.org/10.3390/s19030459 - 23 Jan 2019
Cited by 3 | Viewed by 6358
Abstract
Face reconstruction is a popular topic in 3D vision systems. However, traditional methods often depend on monocular cues, which contain few feature pixels and use only their location information while ignoring a lot of textural information. Furthermore, they are affected by the accuracy of the feature extraction method and by occlusion. Here, we propose a novel facial reconstruction framework that accurately extracts the 3D shapes and poses of faces from images captured at multiple views. It extends the traditional monocular bilinear model to a multi-view-based bilinear model by incorporating a feature prior constraint and a texture constraint, which are learned from multi-view images. The feature prior constraint is used as a shape prior, allowing us to estimate accurate 3D facial contours. Furthermore, the texture constraint extracts a high-precision 3D facial shape where traditional methods fail because of their limited number of feature points or the mostly texture-less and texture-repetitive nature of the input images. Meanwhile, it fully exploits the implied 3D information of the multi-view images, which also enhances the robustness of the results. Additionally, the proposed method uses only two or more uncalibrated images with an arbitrary baseline, estimating calibration and shape simultaneously. A comparison with the state-of-the-art monocular bilinear model-based method shows that the proposed method has a significantly higher level of accuracy.

Article
Combining Non-Uniform Time Slice and Finite Difference to Improve 3D Ghost Imaging
Sensors 2019, 19(2), 418; https://doi.org/10.3390/s19020418 - 21 Jan 2019
Cited by 3 | Viewed by 3086
Abstract
Three-dimensional ghost imaging (3DGI) using a detector is widely used in many applications. The performance of 3DGI based on a uniform time slice is difficult to improve because obtaining an accurate time-slice position remains a challenge. This paper reports a novel structure based on a non-uniform time slice combined with finite differences. In this approach, the finite difference improves the sensitivity of zero crossing so as to accurately obtain the position of the target in the field of view. Simultaneously, the non-uniform time slice is used to quickly obtain 3DGI of a target of interest. Results show that better 3DGI performance is obtained by the proposed method compared to the traditional method. Moreover, the relation between the time slice and the signal-to-noise ratio of 3DGI is discussed, and the optimal differential distance is obtained, motivating the development of high-performance 3DGI.

Article
Graph Cut-Based Human Body Segmentation in Color Images Using Skeleton Information from the Depth Sensor
Sensors 2019, 19(2), 393; https://doi.org/10.3390/s19020393 - 18 Jan 2019
Cited by 4 | Viewed by 3032
Abstract
Segmentation of human bodies in images is useful for a variety of applications, including background substitution, human activity recognition, security, and video surveillance applications. However, human body segmentation has been a challenging problem, due to the complicated shape and motion of a non-rigid human body. Meanwhile, depth sensors with advanced pattern recognition algorithms provide human body skeletons in real time with reasonable accuracy. In this study, we propose an algorithm that projects the human body skeleton from a depth image to a color image, where the human body region is segmented in the color image by using the projected skeleton as a segmentation cue. Experimental results using the Kinect sensor demonstrate that the proposed method provides high quality segmentation results and outperforms the conventional methods.
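
The projection step this method relies on maps each 3D joint from depth-camera coordinates into the color image through the depth-to-color extrinsics and the color intrinsics. A hedged sketch (R, t, and K_color are assumed to come from the sensor's factory or checkerboard calibration):

```python
import numpy as np

def project_joints(joints, R, t, K_color):
    """Map (N, 3) skeleton joints from the depth camera frame to (N, 2)
    color image pixels: rigid transform, then pinhole projection."""
    X = joints @ R.T + t          # joints in the color camera frame
    p = X @ K_color.T             # homogeneous pixel coordinates
    return p[:, :2] / p[:, 2:3]   # divide by depth to get (u, v)
```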

Article
Pixelwise Phase Unwrapping Based on Ordered Periods Phase Shift
Sensors 2019, 19(2), 377; https://doi.org/10.3390/s19020377 - 17 Jan 2019
Cited by 5 | Viewed by 3544
Abstract
The existing phase-shift methods are effective in achieving high-speed, high-precision, high-resolution, real-time shape measurement of moving objects; however, a phase-unwrapping method that can handle the motion of target objects in a real environment and is also robust against global illumination is yet to be established. Accordingly, a robust and highly accurate method for determining the absolute phase, using a minimum of three steps, is proposed in this study. In this proposed method, an order structure that rearranges the projection pattern for each period of the sine wave is introduced, so that solving the phase unwrapping problem comes down to calculating the pattern order. Using simulation experiments, it has been confirmed that the proposed method can be used in high-speed, high-precision, high-resolution, three-dimensional shape measurements even in situations with high-speed moving objects and the presence of global illumination. In this study, an experimental measurement system was configured with a high-speed camera and projector, and real-time measurements were performed with a processing time of 1.05 ms and a throughput of 500 fps.
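
The N-step phase-shifting formula that such methods build on recovers the wrapped phase per pixel in closed form; the ordered-period scheme that then assigns each pixel its period is the paper's contribution and is not reproduced here.

```python
import numpy as np

def wrapped_phase(images):
    """Wrapped phase from N >= 3 fringe images I_n = A + B cos(phi - 2*pi*n/N):
    phi = atan2(sum_n I_n sin(2*pi*n/N), sum_n I_n cos(2*pi*n/N))."""
    N = len(images)
    deltas = 2 * np.pi * np.arange(N) / N
    s = sum(I * np.sin(d) for I, d in zip(images, deltas))
    c = sum(I * np.cos(d) for I, d in zip(images, deltas))
    return np.arctan2(s, c)   # in (-pi, pi]; unwrapping selects the period
```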

Article
A Novel Method for Extrinsic Calibration of Multiple RGB-D Cameras Using Descriptor-Based Patterns
Sensors 2019, 19(2), 349; https://doi.org/10.3390/s19020349 - 16 Jan 2019
Cited by 10 | Viewed by 3170
Abstract
This paper presents a novel method to estimate the relative poses between RGB-D cameras with minimal overlapping fields of view. This calibration problem is relevant to applications such as indoor 3D mapping and robot navigation that can benefit from a wider field of view using multiple RGB-D cameras. The proposed approach relies on descriptor-based patterns to provide well-matched 2D keypoints in the case of a minimal overlapping field of view between cameras. Integrating the matched 2D keypoints with corresponding depth values, a set of 3D matched keypoints are constructed to calibrate multiple RGB-D cameras. Experiments validated the accuracy and efficiency of the proposed calibration approach.

Article
3D Affine: An Embedding of Local Image Features for Viewpoint Invariance Using RGB-D Sensor Data
Sensors 2019, 19(2), 291; https://doi.org/10.3390/s19020291 - 12 Jan 2019
Cited by 4 | Viewed by 3868
Abstract
Local image features are invariant to in-plane rotations and robust to minor viewpoint changes. However, the current detectors and descriptors for local image features fail to accommodate out-of-plane rotations larger than 25°–30°. Invariance to such viewpoint changes is essential for numerous applications, including wide baseline matching, 6D pose estimation, and object reconstruction. In this study, we present a general embedding that wraps a detector/descriptor pair in order to increase viewpoint invariance by exploiting input depth maps. The proposed embedding locates smooth surfaces within the input RGB-D images and projects them into a viewpoint invariant representation, enabling the detection and description of more viewpoint invariant features. Our embedding can be utilized with different combinations of descriptor/detector pairs, according to the desired application. Using synthetic and real-world objects, we evaluated the viewpoint invariance of various detectors and descriptors, for both standalone and embedded approaches. While standalone local image features fail to accommodate average viewpoint changes beyond 33.3°, our proposed embedding boosted the viewpoint invariance to different levels, depending on the scene geometry. Objects with distinct surface discontinuities were on average invariant up to 52.8°, and the overall average for all evaluated datasets was 45.4°. Similarly, out of a total of 140 combinations involving 20 local image features and various objects with distinct surface discontinuities, only a single standalone local image feature exceeded the goal of 60° viewpoint difference in just two combinations, as compared with 19 different local image features succeeding in 73 combinations when wrapped in the proposed embedding. Furthermore, the proposed approach operates robustly in the presence of input depth noise, even that of low-cost commodity depth sensors, and well beyond.

Article
DeepMoCap: Deep Optical Motion Capture Using Multiple Depth Sensors and Retro-Reflectors
Sensors 2019, 19(2), 282; https://doi.org/10.3390/s19020282 - 11 Jan 2019
Cited by 10 | Viewed by 8877
Abstract
In this paper, a marker-based, single-person optical motion capture method (DeepMoCap) is proposed using multiple spatio-temporally aligned infrared-depth sensors and retro-reflective straps and patches (reflectors). DeepMoCap explores motion capture by automatically localizing and labeling reflectors on depth images and, subsequently, in 3D space. Introducing a non-parametric representation to encode the temporal correlation among pairs of colorized depth maps and 3D optical flow frames, a multi-stage Fully Convolutional Network (FCN) architecture is proposed to jointly learn reflector locations and their temporal dependency among sequential frames. The extracted reflector 2D locations are spatially mapped in 3D space, resulting in robust 3D optical data extraction. The subject's motion is efficiently captured by applying a template-based fitting technique on the extracted optical data. Two datasets have been created and made publicly available for evaluation purposes; one comprising multi-view depth and 3D optical flow annotated images (DMC2.5D), and a second consisting of spatio-temporally aligned multi-view depth images along with skeleton, inertial and ground truth MoCap data (DMC3D). The FCN model outperforms its competitors on the DMC2.5D dataset using the 2D Percentage of Correct Keypoints (PCK) metric, while the motion capture outcome is evaluated against RGB-D and inertial data fusion approaches on DMC3D, outperforming the next best method by 4.5% in total 3D PCK accuracy.

Article
Incremental 3D Cuboid Modeling with Drift Compensation
Sensors 2019, 19(1), 178; https://doi.org/10.3390/s19010178 - 06 Jan 2019
Cited by 2 | Viewed by 4894
Abstract
This paper presents a framework for incremental 3D cuboid modeling that uses the mapping results of an RGB-D camera-based simultaneous localization and mapping (SLAM) system. This framework is useful for accurately creating cuboid CAD models from a point cloud in an online manner. While performing the RGB-D SLAM, planes are incrementally reconstructed from the point cloud in each frame to create a plane map. Then, cuboids are detected in the plane map by analyzing the positional relationships between the planes, such as orthogonality, convexity, and proximity. Finally, the position, pose, and size of a cuboid are determined by computing the intersection of three perpendicular planes. To suppress false detections, the cuboid shapes are incrementally updated with sequential measurements to check the uncertainty of the cuboids. In addition, the drift error of the SLAM is compensated by the registration of the cuboids. As an application of the framework, an augmented reality-based interactive cuboid modeling system was developed. In an evaluation in cluttered environments, the precision and recall of the cuboid detection were investigated and compared with a batch-based cuboid detection method, clarifying the advantages of the proposed method.
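
The corner computation mentioned above, intersecting three perpendicular planes, is a 3 × 3 linear solve. A minimal sketch:

```python
import numpy as np

def corner_from_planes(planes):
    """Intersect three (near-)perpendicular planes n_i . x = d_i by
    solving the 3x3 system N x = d."""
    N = np.array([n for n, _ in planes])
    d = np.array([d for _, d in planes])
    return np.linalg.solve(N, d)

# Example: the three far faces of the unit cube meet at (1, 1, 1).
planes = [(np.array([1.0, 0.0, 0.0]), 1.0),
          (np.array([0.0, 1.0, 0.0]), 1.0),
          (np.array([0.0, 0.0, 1.0]), 1.0)]
print(corner_from_planes(planes))   # -> [1. 1. 1.]
```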

Article
Depth from a Motion Algorithm and a Hardware Architecture for Smart Cameras
Sensors 2019, 19(1), 53; https://doi.org/10.3390/s19010053 - 23 Dec 2018
Cited by 8 | Viewed by 3749
Abstract
Applications such as autonomous navigation, robot vision, and autonomous flying require depth map information of a scene. Depth can be estimated by using a single moving camera (depth from motion). However, traditional depth from motion algorithms have low processing speeds and high hardware requirements that limit their embedded capabilities. In this work, we propose a hardware architecture for depth from motion that consists of a flow/depth transformation and a new optical flow algorithm. Our formulation treats optical flow as an extension of the stereo matching problem. We propose a pixel-parallel/window-parallel approach in which a correlation function based on the sum of absolute differences (SAD) computes the optical flow. Further, to improve the SAD, we propose using the curl of the intensity gradient as a preprocessing step. Experimental results demonstrate that it is possible to reach higher accuracy (90% accuracy) compared with previous Field Programmable Gate Array (FPGA)-based optical flow algorithms. For the depth estimation, our algorithm delivers dense maps with motion and depth information for all image pixels, with a processing speed up to 128 times faster than that of previous work, making it possible to achieve high performance in the context of embedded applications.
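
The correlation at the heart of the flow computation is plain SAD block matching, which the hardware evaluates pixel- and window-parallel; in software it reads as below. A scalar, one-pixel sketch without the curl preprocessing (border handling omitted; all parameter values are illustrative):

```python
import numpy as np

def sad_match(prev, curr, y, x, win=3, search=4):
    """Displacement of the (2*win+1)^2 patch around (y, x) from frame
    `prev` to frame `curr`, by minimum sum of absolute differences."""
    ref = prev[y - win:y + win + 1, x - win:x + win + 1].astype(np.int32)
    best_cost, best = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = curr[y + dy - win:y + dy + win + 1,
                        x + dx - win:x + dx + win + 1].astype(np.int32)
            cost = int(np.abs(ref - cand).sum())
            if best_cost is None or cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best, best_cost   # flow vector (dy, dx) and its SAD cost
```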

Article
Have I Seen This Place Before? A Fast and Robust Loop Detection and Correction Method for 3D Lidar SLAM
Sensors 2019, 19(1), 23; https://doi.org/10.3390/s19010023 - 21 Dec 2018
Cited by 13 | Viewed by 4152
Abstract
In this paper, we present a complete loop detection and correction system developed for data originating from lidar scanners. Regarding detection, we propose a combination of a global point cloud matcher with a novel registration algorithm to determine loop candidates in a highly effective way. The registration method can deal with point clouds that are largely deviating in orientation while improving the efficiency over existing techniques. In addition, we accelerated the computation of the global point cloud matcher by a factor of 2–4, exploiting the GPU to its maximum. Experiments demonstrated that our combined approach more reliably detects loops in lidar data compared to other point cloud matchers as it leads to better precision–recall trade-offs: for nearly 100% recall, we gain up to 7% in precision. Finally, we present a novel loop correction algorithm that leads to an improvement by a factor of 2 on the average and median pose error, while at the same time only requires a handful of seconds to complete.

Article
A Robot-Driven 3D Shape Measurement System for Automatic Quality Inspection of Thermal Objects on a Forging Production Line
Sensors 2018, 18(12), 4368; https://doi.org/10.3390/s18124368 - 10 Dec 2018
Cited by 23 | Viewed by 3836
Abstract
The three-dimensional (3D) geometric evaluation of large thermal forging parts online is critical to quality control and energy conservation. However, this online 3D measurement task is extremely challenging for commercially available 3D sensors because of the enormous amount of heat radiation and the complexity of the online environment. To this end, an automatic and accurate 3D shape measurement system that integrates a fringe-projection-based 3D scanner with an industrial robot is presented. To resist thermal radiation, a double filter set and an intelligent temperature control loop are employed in the system. In addition, a time-division-multiplexing trigger is implemented to accelerate pattern projection and capture, and an improved multi-frequency phase-shifting method is proposed to reduce the number of patterns required for 3D reconstruction; together these drastically improve the 3D measurement efficiency and reduce the exposure to the thermal environment. To align data in a complex online environment, a view integration method aligns non-overlapping 3D data from different views based on the repeatability of the robot motion, while a robust 3D registration algorithm accurately aligns 3D data in the presence of irrelevant background data. These components and algorithms were evaluated in experiments. The system was deployed on a production line in a forging factory, where it performed stable online 3D quality inspection of thermal axles.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
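For context, a standard N-step phase-shifting computation with two-frequency temporal unwrapping can be sketched as follows. This is the textbook scheme only; the paper's improved multi-frequency method reduces the pattern count further, which this generic sketch does not capture.

```python
import numpy as np

def wrapped_phase(images):
    """N-step phase shifting: images[n] is captured with a phase shift
    of 2*pi*n/N; returns the phase wrapped to (-pi, pi]."""
    n = len(images)
    shifts = 2 * np.pi * np.arange(n) / n
    s = sum(im * np.sin(d) for im, d in zip(images, shifts))
    c = sum(im * np.cos(d) for im, d in zip(images, shifts))
    return np.arctan2(-s, c)

def unwrap_two_freq(phi_low, phi_high, ratio):
    """Temporal unwrapping: use the unambiguous low-frequency phase to
    resolve the 2*pi fringe order of the high-frequency phase.
    `ratio` is f_high / f_low."""
    k = np.round((ratio * phi_low - phi_high) / (2 * np.pi))
    return phi_high + 2 * np.pi * k
```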

Article
Simultaneous All-Parameters Calibration and Assessment of a Stereo Camera Pair Using a Scale Bar
Sensors 2018, 18(11), 3964; https://doi.org/10.3390/s18113964 - 15 Nov 2018
Cited by 12 | Viewed by 2665
Abstract
Highly accurate and easy-to-operate calibration (to determine the interior and distortion parameters) and orientation (to determine the exterior parameters) methods for cameras in large measurement volumes are an important topic for expanding the application scope of 3D vision and photogrammetry techniques. This paper proposes a method for simultaneously calibrating, orienting and assessing multi-camera 3D measurement systems in large-volume scenarios. The primary idea is to build 3D point and length arrays by moving a scale bar through the measurement volume and then to conduct a self-calibrating bundle adjustment that involves all the image points and lengths from both cameras. The relative exterior parameters of the camera pair are estimated by the five-point relative orientation method. The interior and distortion parameters of each camera and the relative exterior parameters are then optimized through bundle adjustment of a network geometry strengthened by the distance constraints. The method provides both an internal precision and an external accuracy assessment of the calibration performance. Simulations and real-data experiments were designed and conducted to validate the effectiveness of the method and to analyze its performance under different network geometries. For a two-camera system calibrated by the proposed method in a volume of 12 m × 8 m × 4 m, the RMSE of length measurement is less than 0.25 mm and the relative precision is higher than 1/25,000. Compared with the state-of-the-art point-array self-calibrating bundle adjustment method, the proposed method is easier to operate and significantly reduces the systematic errors caused by incorrect scaling.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
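The distance-constraint idea can be illustrated with a toy least-squares problem: point estimates are refined so that, at every scale-bar pose, the distance between the two endpoint targets matches the calibrated bar length. This sketch uses assumed names, an assumed weight, and synthetic data, and omits the reprojection, interior and distortion parameters of the full self-calibrating bundle adjustment.

```python
import numpy as np
from scipy.optimize import least_squares

BAR_LENGTH = 0.9500  # nominal scale-bar length in metres (assumed value)

def residuals(params, n_poses, observed_pts):
    """Toy residual: a data term pulling the endpoints toward their noisy
    triangulated positions, plus a weighted length constraint per pose."""
    pts = params.reshape(n_poses, 2, 3)
    r_obs = (pts - observed_pts).ravel()              # data term
    lengths = np.linalg.norm(pts[:, 0] - pts[:, 1], axis=1)
    r_len = (lengths - BAR_LENGTH) * 100.0            # distance constraint
    return np.concatenate([r_obs, r_len])

# usage with synthetic endpoints: (n_poses, 2 endpoints, xyz)
observed_pts = np.random.randn(20, 2, 3)
sol = least_squares(residuals, observed_pts.ravel(), args=(20, observed_pts))
```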

Article
A FAST-BRISK Feature Detector with Depth Information
Sensors 2018, 18(11), 3908; https://doi.org/10.3390/s18113908 - 13 Nov 2018
Cited by 17 | Viewed by 3733
Abstract
RGB-D cameras offer both color and depth images of the surrounding environment, making them an attractive option for robotic and vision applications. This work introduces the BRISK_D algorithm, which efficiently combines the Features from Accelerated Segment Test (FAST) and Binary Robust Invariant Scalable Keypoints (BRISK) methods. In BRISK_D, keypoints are detected by the FAST algorithm and their locations are refined in scale and space, with the scale factor of each keypoint computed directly from the depth information of the image. In our experiments, we compare the three algorithms SURF, BRISK and BRISK_D in detail with respect to scaling, rotation, perspective and blur. By exploiting depth information, BRISK_D achieves good performance.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
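A rough OpenCV sketch of this detector's flow, under our own assumptions about the depth-to-scale mapping (the paper computes the scale factor directly from depth; the inverse-depth rule and the constants below are illustrative only):

```python
import cv2

def brisk_d_like(gray, depth, base_size=31.0, ref_depth=1.0):
    """Sketch of the FAST + BRISK + depth idea: detect FAST corners, set
    each keypoint's scale from the depth map (closer surface -> larger
    patch), then compute BRISK descriptors at that scale."""
    fast = cv2.FastFeatureDetector_create(threshold=25)
    kps = fast.detect(gray, None)
    for kp in kps:
        x, y = int(kp.pt[0]), int(kp.pt[1])
        z = depth[y, x]
        if z > 0:                        # valid depth reading
            kp.size = base_size * ref_depth / z
    brisk = cv2.BRISK_create()
    kps, desc = brisk.compute(gray, kps)
    return kps, desc
```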

Article
Microscopic Three-Dimensional Measurement Based on Telecentric Stereo and Speckle Projection Methods
Sensors 2018, 18(11), 3882; https://doi.org/10.3390/s18113882 - 11 Nov 2018
Cited by 12 | Viewed by 3081
Abstract
Three-dimensional (3D) measurement of microstructures has become increasingly important, and many microscopic measurement methods have been developed. However, for samples several millimeters in size that require sub-pixel or sub-micron accuracy, hardly any effective measurement method exists at present. Here we present a method that combines microscopic stereo measurement with digital speckle projection. An experimental microscopy setup composed mainly of two telecentric cameras and an industrial projection module is established, and a telecentric binocular stereo reconstruction procedure is carried out. The measurement accuracy was first verified by performing 3D measurements of grid arrays at different locations and of cylinder arrays with different height differences; two Mitutoyo step masters were then used for further verification. The experimental results show that the proposed method can obtain 3D information of microstructures with sub-pixel and even sub-micron accuracy at the millimeter scale.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
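One detail worth illustrating: with telecentric lenses the projection is affine rather than perspective, so stereo triangulation reduces to a small linear least-squares problem. A minimal sketch, assuming each view is modeled as x = A·X + b with a 2×3 matrix A and 2-vector b:

```python
import numpy as np

def triangulate_affine(A1, b1, A2, b2, x1, x2):
    """Triangulation for telecentric (affine) cameras: stacking the two
    view equations x = A @ X + b gives a 4x3 linear system in the 3D
    point X, solved in the least-squares sense."""
    M = np.vstack([A1, A2])                     # (4, 3)
    y = np.concatenate([x1 - b1, x2 - b2])      # (4,)
    X, *_ = np.linalg.lstsq(M, y, rcond=None)
    return X
```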

Article
Temperature Compensation Method for Digital Cameras in 2D and 3D Measurement Applications
Sensors 2018, 18(11), 3685; https://doi.org/10.3390/s18113685 - 30 Oct 2018
Cited by 11 | Viewed by 3994
Abstract
This paper presents the results of several studies on the effect of temperature on digital cameras. Experiments were performed using three different camera models. The results conclusively demonstrate that typical camera designs do not adequately account for the effect of temperature variation on the device's performance. We therefore propose a modified camera design that exhibits highly predictable behavior under varying ambient temperature and facilitates thermal compensation, along with a novel temperature compensation method. This compensation model can be applied in almost every existing camera application, as it is compatible with every camera calibration model. Two-dimensional (2D) and three-dimensional (3D) applications of the proposed compensation model are also described, and the results of applying the compensation approach are presented.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
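The general idea of such compensation can be sketched as a temperature-indexed correction of a calibration parameter. The linear model, the example values and the choice of the principal point below are our own illustrative assumptions, not the paper's model:

```python
import numpy as np

# Calibration-time data: temperature (deg C) vs. measured principal-point
# x-coordinate (px). The numbers are made up for illustration.
temps  = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
cx_obs = np.array([640.1, 640.5, 641.0, 641.4, 641.9])

# Assume, as a simple model, that the drift is linear in temperature.
slope, intercept = np.polyfit(temps, cx_obs, deg=1)

def cx_correction(temp_now, ref_temp=20.0):
    """Predicted principal-point shift at the current temperature,
    relative to the reference temperature (px to subtract from x)."""
    return slope * (temp_now - ref_temp)
```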

Article
Robust and Efficient CPU-Based RGB-D Scene Reconstruction
Sensors 2018, 18(11), 3652; https://doi.org/10.3390/s18113652 - 28 Oct 2018
Cited by 6 | Viewed by 3177
Abstract
3D scene reconstruction is an important topic in computer vision. A complete scene is reconstructed from views acquired along the camera trajectory, each view containing a small part of the scene. Camera tracking in textureless scenes is a well-known Gordian knot, and obtaining accurate 3D models quickly is a major challenge for existing systems. Targeting robotics applications, we propose a robust CPU-based approach to reconstruct indoor scenes efficiently with a consumer RGB-D camera. The proposed approach bridges feature-based camera tracking and volumetric data integration, and performs well in terms of both robustness and efficiency. Its key components are: (i) a robust and fast camera tracking method combining points and edges, which improves tracking stability in textureless scenes; (ii) an efficient data fusion strategy that selects camera views and integrates RGB-D images on multiple scales, which enhances the efficiency of volumetric integration; (iii) a novel RGB-D scene reconstruction system that can be quickly implemented on a standard CPU. Experimental results demonstrate that our approach reconstructs scenes with higher robustness and efficiency than state-of-the-art reconstruction systems.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
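To clarify point (ii), here is a minimal NumPy sketch of one volumetric (TSDF) integration step, the standard fusion mechanism such systems build on. The multi-scale view selection of the paper is not reproduced, and all names and the truncation distance are assumptions:

```python
import numpy as np

def integrate_tsdf(tsdf, weight, voxels, K, T_wc, depth, trunc=0.05):
    """One TSDF integration step: project every voxel centre into the
    depth image and update a truncated signed distance with a running
    weighted average. `voxels` is (N, 3) in world coordinates, K is the
    3x3 intrinsics, T_wc is the 4x4 world-to-camera transform."""
    pts_c = (T_wc[:3, :3] @ voxels.T + T_wc[:3, 3:4]).T   # camera frame
    z = pts_c[:, 2]
    uv = (K @ pts_c.T).T
    u = np.round(uv[:, 0] / z).astype(int)
    v = np.round(uv[:, 1] / z).astype(int)
    h, w = depth.shape
    ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = np.where(ok, depth[v.clip(0, h - 1), u.clip(0, w - 1)], 0.0)
    sdf = d - z                                           # signed distance
    ok &= (d > 0) & (sdf > -trunc)
    tsdf_new = np.clip(sdf / trunc, -1.0, 1.0)
    # running weighted average per voxel
    tsdf[ok] = (tsdf[ok] * weight[ok] + tsdf_new[ok]) / (weight[ok] + 1)
    weight[ok] += 1
    return tsdf, weight
```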

Article
Assessment of Fringe Pattern Decomposition with a Cross-Correlation Index for Phase Retrieval in Fringe Projection 3D Measurements
Sensors 2018, 18(10), 3578; https://doi.org/10.3390/s18103578 - 22 Oct 2018
Cited by 1 | Viewed by 3471
Abstract
Phase retrieval from single-frame projection fringe patterns, a fundamental and challenging problem in fringe projection measurement, has attracted wide attention, and various new methods have emerged to address it. Many phase retrieval methods decompose the fringe pattern into a background part and a fringe part and then obtain the phase from the decomposed fringe part. However, the decomposition results depend on the selection of model parameters, which is usually done manually by trial and error because no decomposition assessment rule exists when ground truth data are unavailable. In this paper, we propose a cross-correlation index to assess decomposition and phase retrieval results without ground truth data. The feasibility of the proposed metric is verified on simulated and real fringe patterns with the well-known Fourier transform method and the recently proposed shearlet transform method. This work contributes to automatic phase retrieval and three-dimensional (3D) measurement with less human intervention, and it can potentially be employed in other fields, such as phase retrieval in digital holography.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
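The flavour of such a no-ground-truth assessment can be sketched with plain normalized cross-correlation. Note that this score is our own illustration, not the paper's exact index:

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation between two images."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b)))

def decomposition_score(pattern, background, fringe):
    """Score a decomposition pattern ~ background + fringe without ground
    truth: the fringe part should correlate strongly with the pattern once
    the background is removed, and weakly with the background itself."""
    return ncc(pattern - background, fringe) - abs(ncc(fringe, background))
```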

Article
Person Re-Identification with RGB-D Camera in Top-View Configuration through Multiple Nearest Neighbor Classifiers and Neighborhood Component Features Selection
Sensors 2018, 18(10), 3471; https://doi.org/10.3390/s18103471 - 15 Oct 2018
Cited by 26 | Viewed by 3496
Abstract
Person re-identification is an important topic in retail, scene monitoring, human-computer interaction, people counting, ambient assisted living and many other application fields. TVPR (Top View Person Re-identification), a dataset for person re-identification based on a number of significant features derived from both depth and color images, was built previously. It uses an RGB-D camera in a top-view configuration to extract anthropometric features for recognizing people in the camera's view, reducing the problem of occlusions while preserving privacy. In this paper, we introduce a machine learning method for person re-identification on the TVPR dataset. In particular, we combine multiple k-nearest neighbor classifiers based on different distance functions and feature subsets derived from the depth and color images. Moreover, neighborhood component feature selection is used to learn the depth features' weighting vector by minimizing the leave-one-out regularized training error. For classification, each person's first passage under the camera is used for training and the remaining passages form the testing set. Experimental results show that the proposed methodology outperforms standard supervised classifiers widely used for the re-identification task. This improvement encourages applying the approach in the retail context to improve retail analytics, customer service and shopping space management.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
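A minimal scikit-learn sketch of the ensemble idea: k-NN classifiers that differ only in their distance function, combined by majority vote. The feature-subset selection and the NCFS weighting are omitted, and k and the metric list are assumptions:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_ensemble_predict(X_train, y_train, X_test,
                         metrics=("euclidean", "manhattan", "chebyshev")):
    """Majority vote over k-NN classifiers with different distance
    functions; assumes non-negative integer class labels."""
    votes = []
    for m in metrics:
        clf = KNeighborsClassifier(n_neighbors=3, metric=m)
        clf.fit(X_train, y_train)
        votes.append(clf.predict(X_test))
    votes = np.stack(votes)                  # (n_metrics, n_samples)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```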

Article
Large Depth-of-Field Integral Microscopy by Use of a Liquid Lens
Sensors 2018, 18(10), 3383; https://doi.org/10.3390/s18103383 - 10 Oct 2018
Cited by 8 | Viewed by 2718
Abstract
Integral microscopy is a 3D imaging technique that records the spatial and angular information of microscopic samples. From this information it is possible to calculate a collection of orthographic views with full parallax and to refocus computationally, at will, through the 3D specimen. An important drawback of integral microscopy, especially when dealing with thick samples, is the limited depth of field (DOF) of the perspective views, which significantly limits the depth range of computationally refocused images. To overcome this problem, we propose a new method based on inserting, at the pupil plane of the microscope objective, an electrically controlled liquid lens (LL) whose optical power can be changed by simply tuning the voltage. This apparatus has the advantage of controlling the axial position of the objective focal plane while keeping the essential parameters of the integral microscope constant, namely the magnification, the numerical aperture and the amount of parallax. Thus, given a 3D sample, the new microscope can provide a stack of integral images with complementary depth ranges. Fusing the set of refocused images enlarges the reconstruction range, yielding images in focus over the whole region.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
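The final fusion step can be illustrated with a generic focus-stacking sketch: per pixel, keep the slice of the refocused stack with the highest local sharpness. The sharpness measure (local energy of the Laplacian) and the window size are our own assumptions:

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def fuse_focal_stack(stack):
    """Fuse a list of refocused images into one all-in-focus image by
    selecting, per pixel, the sharpest slice of the stack."""
    sharp = np.stack([uniform_filter(laplace(im.astype(float)) ** 2, size=9)
                      for im in stack])
    best = sharp.argmax(axis=0)              # (H, W) index of sharpest slice
    rows, cols = np.indices(best.shape)
    return np.stack(stack)[best, rows, cols]
```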

Article
Direct Depth SLAM: Sparse Geometric Feature Enhanced Direct Depth SLAM System for Low-Texture Environments
Sensors 2018, 18(10), 3339; https://doi.org/10.3390/s18103339 - 06 Oct 2018
Cited by 14 | Viewed by 6650
Abstract
This paper presents a real-time, robust and low-drift depth-only SLAM (simultaneous localization and mapping) method for depth cameras that utilizes both dense range flow and sparse geometric features from sequential depth images. The proposed method is composed of three optimization layers, namely the Direct Depth layer, the ICP (iterative closest point) Refined layer and the Graph Optimization layer. The Direct Depth layer uses a range flow constraint equation to solve the fast 6-DOF (six degrees of freedom) frame-to-frame pose estimation problem. The ICP Refined layer then reduces the local drift by applying a local-map-based motion estimation strategy. After that, we propose a loop closure detection algorithm that extracts and matches sparse geometric features, and we construct a pose graph for global pose optimization. We evaluate the performance of our method on benchmark datasets and real scene data. Experimental results show that our front-end algorithm clearly outperforms the classic methods and that our back-end algorithm robustly finds loop closures and reduces the global drift.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
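For readers unfamiliar with the range flow constraint, a Lucas-Kanade-style local solve can be sketched as follows. The constraint is Zx·u + Zy·v − w = −Zt; the windowed least-squares formulation below is a textbook illustration, not the paper's frame-to-frame solver:

```python
import numpy as np

def range_flow_lk(Z_prev, Z_curr, y, x, half=7):
    """Solve Zx*u + Zy*v - w = -Zt in the least-squares sense over a
    window around (y, x), yielding the local 3D motion (u, v, w) of the
    surface seen in a depth image. (y, x) must lie at least `half`
    pixels from the image border."""
    Zy, Zx = np.gradient(Z_prev)          # gradients along rows, cols
    Zt = Z_curr - Z_prev
    sl = np.s_[y - half:y + half + 1, x - half:x + half + 1]
    A = np.column_stack([Zx[sl].ravel(), Zy[sl].ravel(),
                         -np.ones((2 * half + 1) ** 2)])
    b = -Zt[sl].ravel()
    (u, v, w), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v, w
```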

Article
GesID: 3D Gesture Authentication Based on Depth Camera and One-Class Classification
Sensors 2018, 18(10), 3265; https://doi.org/10.3390/s18103265 - 28 Sep 2018
Cited by 13 | Viewed by 2887
Abstract
Biometric authentication is popular in authentication systems, and gesture, as a carrier of behavioral characteristics, has the advantages of being difficult to imitate and containing abundant information. This research aims to use the three-dimensional (3D) depth information of gesture movement for authentication with little user effort. We propose an approach based on depth cameras that satisfies three requirements: it can authenticate from a single, customized gesture; it achieves high accuracy without requiring an excessive number of training gestures; and it continues learning the gesture while the system is in use. To satisfy these requirements, respectively, we use a sparse autoencoder to memorize the single gesture; we employ data augmentation to address the problem of insufficient data; and we use incremental learning to let the system refine its memory of the gesture over time. Experiments on different gestures in different user situations demonstrate the accuracy of the one-class classification (OCC) and prove the effectiveness and reliability of the approach. Gesture authentication based on 3D depth cameras can thus be achieved with reduced user effort.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
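A compact sketch of reconstruction-error-based one-class classification, using a scikit-learn MLP autoencoder as a stand-in for the paper's sparse autoencoder (the sparsity penalty, data augmentation and incremental learning are omitted; the 3-sigma threshold is an assumption):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def train_occ_autoencoder(X_genuine, bottleneck=16):
    """Train an autoencoder only on the genuine user's gesture samples;
    inputs it cannot reconstruct well are later rejected as impostors."""
    ae = MLPRegressor(hidden_layer_sizes=(64, bottleneck, 64),
                      max_iter=2000, random_state=0)
    ae.fit(X_genuine, X_genuine)                # learn the identity mapping
    err = ((ae.predict(X_genuine) - X_genuine) ** 2).mean(axis=1)
    threshold = err.mean() + 3 * err.std()      # accept within 3 sigma
    return ae, threshold

def is_genuine(ae, threshold, x):
    return ((ae.predict(x[None]) - x) ** 2).mean() < threshold
```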

Article
RGB Colour Encoding Improvement for Three-Dimensional Shapes and Displacement Measurement Using the Integration of Fringe Projection and Digital Image Correlation
Sensors 2018, 18(9), 3130; https://doi.org/10.3390/s18093130 - 17 Sep 2018
Cited by 8 | Viewed by 3632
Abstract
Three-dimensional digital image correlation (3D-DIC) has become the most popular full-field optical technique for measuring 3D shapes and displacements in experimental mechanics. The integration of fringe projection (FP) and two-dimensional digital image correlation (FP + DIC) has recently been established as an intelligent low-cost alternative to 3D-DIC, overcoming the drawbacks of a stereoscopic system. The technique relies on colour encoding of the fringe and speckle patterns required for FP and DIC, respectively. In the present work, we introduce experimental innovations in FP + DIC, based on improving the colour pattern encoding, that yield more accurate results. To achieve this, a multisensor camera and/or laser structural illumination were employed; both alternatives are analysed and evaluated. Results show that both alternatives improve the three-dimensional and in-plane displacement measurements. Nonetheless, multisensor high-speed cameras are uncommon, so laser structural illumination stands out as an important improvement when low uncertainty is required for 2D displacement measurement. The uncertainty is shown to be reduced by up to 50% compared with results obtained with previous experimental FP + DIC approaches.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
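The colour-encoding step can be sketched as simple channel separation with optional crosstalk compensation. The channel assignment and the linear crosstalk model are our own assumptions, not the paper's encoding:

```python
import numpy as np

def split_encoded_patterns(rgb, crosstalk=None):
    """Recover the fringe and speckle patterns from a colour-encoded
    image, with fringes projected in one channel and speckle in another.
    An optional 3x3 crosstalk matrix compensates channel bleed-through."""
    img = rgb.astype(np.float64)
    if crosstalk is not None:
        # undo channel mixing, assuming observed = M @ true per pixel
        img = np.einsum('ij,hwj->hwi', np.linalg.inv(crosstalk), img)
    fringe = img[..., 2]      # assume fringes encoded in the blue channel
    speckle = img[..., 0]     # assume speckle encoded in the red channel
    return fringe, speckle
```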

Article
A Versatile Method for Depth Data Error Estimation in RGB-D Sensors
Sensors 2018, 18(9), 3122; https://doi.org/10.3390/s18093122 - 16 Sep 2018
Cited by 14 | Viewed by 4529
Abstract
We propose a versatile method for estimating the RMS error of depth data provided by generic 3D sensors capable of producing RGB and depth (D) data of the scene, i.e., those based on techniques such as structured light, time of flight and stereo. A common checkerboard is used: the corners are detected and two point clouds are created, one with the real coordinates of the pattern corners and one with the corner coordinates reported by the device. After registering these two clouds, the RMS error is computed. Then, using curve fitting, an equation is obtained that generalizes the RMS error as a function of the distance between the sensor and the checkerboard pattern. The depth errors estimated by our method are compared to those estimated by state-of-the-art approaches, validating its accuracy and utility. The method can be used to rapidly assess the quality of RGB-D sensors, facilitating robotics applications such as SLAM and object recognition.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
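The registration-plus-fitting pipeline can be sketched in a few lines: a Kabsch alignment of the measured corner cloud to the reference cloud, an RMS error, and a curve fit of RMSE against distance. The quadratic model and the example numbers are assumptions, not the paper's data:

```python
import numpy as np

def kabsch_rmse(P, Q):
    """Rigidly align point cloud P (N, 3) to Q (N, 3) via the Kabsch
    algorithm and return the post-alignment RMS error."""
    Pc, Qc = P - P.mean(0), Q - Q.mean(0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T     # proper rotation
    resid = (R @ Pc.T).T - Qc
    return float(np.sqrt((resid ** 2).sum(axis=1).mean()))

# fit RMSE as a function of sensor-to-pattern distance
distances = np.array([0.5, 1.0, 1.5, 2.0, 2.5])        # metres (example)
rmses = np.array([0.002, 0.004, 0.009, 0.016, 0.025])  # metres (example)
predict_rmse = np.poly1d(np.polyfit(distances, rmses, deg=2))
```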

Article
Development and Experimental Evaluation of a 3D Vision System for Grinding Robot
Sensors 2018, 18(9), 3078; https://doi.org/10.3390/s18093078 - 13 Sep 2018
Cited by 14 | Viewed by 4526
Abstract
If a grinding robot can automatically locate and measure the machining target on the workpiece, its machining efficiency and intelligence level improve significantly. Unfortunately, current grinding robots cannot do this, for reasons of cost and precision. This paper proposes a 3D vision system, mounted on the robot's fourth joint, that detects the machining target of the grinding robot; the hardware architecture and data processing method of the system are described in detail. In the data processing pipeline, we first preprocess the point cloud with a voxel grid filter and obtain the feature descriptors. We then use the fast library for approximate nearest neighbors (FLANN) to extract the difference point cloud from the precisely registered point cloud pair, and we apply the point cloud segmentation method proposed in this paper to extract the machining path points. Finally, a detection error compensation model is used to calibrate the 3D vision system accurately, transforming the machining information into the grinding robot base frame. Experimental results show that the absolute average error of repeated measurements at different locations is 0.154 mm and that the absolute measurement error of the vision system caused by compound error is usually below 0.25 mm. The proposed 3D vision system can easily be integrated into an intelligent grinding system and may be suitable for industrial sites.
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
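Two of the named processing steps can be sketched compactly: the voxel grid filter and the extraction of a difference cloud between a registered scan pair. SciPy's kd-tree stands in for FLANN here, and the voxel size and distance threshold are assumed values:

```python
import numpy as np
from scipy.spatial import cKDTree

def voxel_grid_filter(points, voxel=0.005):
    """Downsample a point cloud by averaging all points that fall into
    the same voxel."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    counts = np.bincount(inv).astype(float)
    out = np.zeros((inv.max() + 1, 3))
    for dim in range(3):
        out[:, dim] = np.bincount(inv, weights=points[:, dim]) / counts
    return out

def difference_cloud(scan, reference, dist=0.002):
    """Keep the scan points with no nearby neighbour in the registered
    reference cloud, i.e., the geometry present only in the scan."""
    d, _ = cKDTree(reference).query(scan)
    return scan[d > dist]
```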