Search Results (27)

Search Parameters:
Keywords = Kinect sensor fusion

23 pages, 2042 KB  
Article
StructScan3D v1: A First RGB-D Dataset for Indoor Building Elements Segmentation and BIM Modeling
by Ishraq Rached, Rafika Hajji, Tania Landes and Rashid Haffadi
Sensors 2025, 25(11), 3461; https://doi.org/10.3390/s25113461 - 30 May 2025
Viewed by 2265
Abstract
The integration of computer vision and deep learning into Building Information Modeling (BIM) workflows has created a growing need for structured datasets that enable the semantic segmentation of indoor building elements. This paper presents StructScan3D v1, the first version of an RGB-D dataset specifically designed to facilitate the automated segmentation and modeling of architectural and structural components. Captured using the Kinect Azure sensor, StructScan3D v1 comprises 2594 annotated frames from diverse indoor environments, including residential and office spaces. The dataset focuses on six key building elements: walls, floors, ceilings, windows, doors, and miscellaneous objects. To establish a benchmark for indoor RGB-D semantic segmentation, we evaluate D-Former, a transformer-based model that leverages self-attention mechanisms for enhanced spatial understanding. Additionally, we compare its performance against state-of-the-art models such as Gemini and TokenFusion, providing a comprehensive analysis of segmentation accuracy. Experimental results show that D-Former achieves a mean Intersection over Union (mIoU) of 67.5%, demonstrating strong segmentation capabilities despite challenges like occlusions and depth variations. As an evolving dataset, StructScan3D v1 lays the foundation for future expansions, including increased scene diversity and refined annotations. By bridging the gap between deep learning-driven segmentation and real-world BIM applications, this dataset provides researchers and practitioners with a valuable resource for advancing indoor scene reconstruction, robotics, and augmented reality. Full article
(This article belongs to the Section Sensing and Imaging)
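The headline metric above, mean Intersection over Union, is straightforward to reproduce from predicted and ground-truth label maps. A minimal NumPy sketch (the six-class layout mirrors the dataset description; the label maps here are random placeholders, not D-Former output):

```python
import numpy as np

def mean_iou(pred, gt, num_classes=6):
    """Mean Intersection over Union over integer label maps (sketch)."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy usage with random label maps standing in for prediction and ground truth
rng = np.random.default_rng(0)
pred = rng.integers(0, 6, size=(480, 640))
gt = rng.integers(0, 6, size=(480, 640))
print(f"mIoU: {mean_iou(pred, gt):.3f}")
```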

17 pages, 15387 KB  
Article
Improving 3D Reconstruction Through RGB-D Sensor Noise Modeling
by Fahira Afzal Maken, Sundaram Muthu, Chuong Nguyen, Changming Sun, Jinguang Tong, Shan Wang, Russell Tsuchida, David Howard, Simon Dunstall and Lars Petersson
Sensors 2025, 25(3), 950; https://doi.org/10.3390/s25030950 - 5 Feb 2025
Cited by 3 | Viewed by 2739
Abstract
High-resolution RGB-D sensors are widely used in computer vision, manufacturing, and robotics. The depth maps from these sensors have inherently high measurement uncertainty that includes both systematic and non-systematic noise. These noisy depth estimates degrade the quality of scans, resulting in less accurate 3D reconstruction, making them unsuitable for some high-precision applications. In this paper, we focus on quantifying the uncertainty in the depth maps of high-resolution RGB-D sensors for the purpose of improving 3D reconstruction accuracy. To this end, we estimate the noise model for a recent high-precision RGB-D structured light sensor called Zivid when mounted on a robot arm. Our proposed noise model takes into account the measurement distance and angle between the sensor and the measured surface. We additionally analyze the effect of background light, exposure time, and the number of captures on the quality of the depth maps obtained. Our noise model seamlessly integrates with well-known classical and modern neural rendering-based algorithms, from KinectFusion to Point-SLAM methods using bilinear interpolation as well as 3D analytical functions. We collect a high-resolution RGB-D dataset and apply our noise model to improve tracking and produce higher-resolution 3D models. Full article
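The abstract does not give the functional form of the Zivid noise model, but the general shape of such models, a depth standard deviation that grows with range and with incidence angle and is then used as an inverse-variance weight during fusion, can be sketched as follows. The quadratic-in-distance, angle-divergent form is borrowed from the widely used Kinect noise model of Nguyen et al.; the coefficients are placeholders, not the paper's fit:

```python
import numpy as np

def depth_noise_std(distance_m, incidence_angle_rad,
                    a=0.0012, b=0.0019, c=0.0001):
    """Axial depth noise (m) vs. range and surface incidence angle (placeholder fit)."""
    angle_term = incidence_angle_rad ** 2 / (np.pi / 2 - incidence_angle_rad) ** 2
    return a + b * (distance_m - 0.4) ** 2 + c * angle_term

def fusion_weight(distance_m, incidence_angle_rad):
    """Inverse-variance weight for TSDF- or ICP-style integration."""
    return 1.0 / depth_noise_std(distance_m, incidence_angle_rad) ** 2

print(depth_noise_std(1.0, np.deg2rad(30)), fusion_weight(1.0, np.deg2rad(30)))
```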

23 pages, 11804 KB  
Article
Therapeutic Exercise Recognition Using a Single UWB Radar with AI-Driven Feature Fusion and ML Techniques in a Real Environment
by Shahzad Hussain, Hafeez Ur Rehman Siddiqui, Adil Ali Saleem, Muhammad Amjad Raza, Josep Alemany Iturriaga, Alvaro Velarde-Sotres and Isabel De la Torre Díez
Sensors 2024, 24(17), 5533; https://doi.org/10.3390/s24175533 - 27 Aug 2024
Cited by 1 | Viewed by 1777
Abstract
Physiotherapy plays a crucial role in the rehabilitation of damaged or defective organs due to injuries or illnesses, often requiring long-term supervision by a physiotherapist in clinical settings or at home. AI-based support systems have been developed to enhance the precision and effectiveness of physiotherapy, particularly during the COVID-19 pandemic. These systems, which include game-based or tele-rehabilitation monitoring using camera-based optical systems like Vicon and Microsoft Kinect, face challenges such as privacy concerns, occlusion, and sensitivity to environmental light. Non-optical sensor alternatives, such as Inertial Movement Units (IMUs), Wi-Fi, ultrasound sensors, and ultrawide band (UWB) radar, have emerged to address these issues. Although IMUs are portable and cost-effective, they suffer from disadvantages like drift over time, limited range, and susceptibility to magnetic interference. In this study, a single UWB radar was utilized to recognize five therapeutic exercises related to the upper limb, performed by 34 male volunteers in a real environment. A novel feature fusion approach was developed to extract distinguishing features for these exercises. Various machine learning methods were applied, with the EnsembleRRGraBoost ensemble method achieving the highest recognition accuracy of 99.45%. The performance of the EnsembleRRGraBoost model was further validated using five-fold cross-validation, maintaining its high accuracy. Full article
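The abstract does not spell out the EnsembleRRGraBoost construction, but the overall pipeline, fused feature vectors per exercise repetition, an ensemble classifier, and five-fold cross-validation, can be sketched with scikit-learn. The feature count, class labels, and the generic gradient-boosting stand-in are all assumptions:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Placeholder fused features: one row per exercise repetition, e.g. statistical
# and spectral descriptors extracted from UWB range-time frames (shapes assumed).
rng = np.random.default_rng(42)
X = rng.normal(size=(340, 64))       # 34 volunteers x 10 repetitions, 64 fused features
y = rng.integers(0, 5, size=340)     # 5 therapeutic upper-limb exercises

clf = GradientBoostingClassifier()   # generic ensemble stand-in, not EnsembleRRGraBoost
scores = cross_val_score(clf, X, y, cv=5)
print("5-fold accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```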

19 pages, 2949 KB  
Article
Sensor Fusion-Based Anthropomorphic Control of a Robotic Arm
by Furong Chen, Feilong Wang, Yanling Dong, Qi Yong, Xiaolong Yang, Long Zheng, Yi Gao and Hang Su
Bioengineering 2023, 10(11), 1243; https://doi.org/10.3390/bioengineering10111243 - 24 Oct 2023
Cited by 7 | Viewed by 4616
Abstract
The main goal of this research is to develop a highly advanced anthropomorphic control system utilizing multiple sensor technologies to achieve precise control of a robotic arm. Combining Kinect and IMU sensors, together with a data glove, we aim to create a multimodal sensor system for capturing rich information of human upper body movements. Specifically, the four angles of upper limb joints are collected using the Kinect sensor and IMU sensor. In order to improve the accuracy and stability of motion tracking, we use the Kalman filter method to fuse the Kinect and IMU data. In addition, we introduce data glove technology to collect the angle information of the wrist and fingers in seven different directions. The integration and fusion of multiple sensors provides us with full control over the robotic arm, giving it flexibility with 11 degrees of freedom. We successfully achieved a variety of anthropomorphic movements, including shoulder flexion, abduction, rotation, elbow flexion, and fine movements of the wrist and fingers. Most importantly, our experimental results demonstrate that the anthropomorphic control system we developed is highly accurate, real-time, and operable. In summary, the contribution of this study lies in the creation of a multimodal sensor system capable of capturing and precisely controlling human upper limb movements, which provides a solid foundation for the future development of anthropomorphic control technologies. This technology has a wide range of application prospects and can be used for rehabilitation in the medical field, robot collaboration in industrial automation, and immersive experience in virtual reality environments. Full article
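For a single joint angle, the Kinect/IMU fusion described above can be prototyped as a small Kalman filter over an angle/angular-rate state in which both sensors observe the angle. The noise values, sampling rate, and measurement layout are assumptions, not the authors' tuning:

```python
import numpy as np

def kf_fuse(kinect_angles, imu_angles, dt=1 / 30, q=1e-3, r_kinect=4.0, r_imu=1.0):
    """Fuse two noisy angle streams (deg) for one joint; returns the fused angles."""
    x = np.array([kinect_angles[0], 0.0])        # state: [angle, angular rate]
    P = np.eye(2)
    F = np.array([[1.0, dt], [0.0, 1.0]])        # constant-rate motion model
    Q = q * np.eye(2)
    H = np.array([[1.0, 0.0], [1.0, 0.0]])       # both sensors observe the angle
    R = np.diag([r_kinect, r_imu])
    fused = []
    for zk, zi in zip(kinect_angles, imu_angles):
        x, P = F @ x, F @ P @ F.T + Q            # predict
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.array([zk, zi]) - H @ x) # update with both measurements
        P = (np.eye(2) - K @ H) @ P
        fused.append(x[0])
    return np.array(fused)

t = np.arange(0, 3, 1 / 30)
truth = 45 * np.sin(t)                           # synthetic elbow-flexion trace (deg)
fused = kf_fuse(truth + np.random.normal(0, 2, t.size),
                truth + np.random.normal(0, 1, t.size))
```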

18 pages, 13035 KB  
Article
Reflectance Measurement Method Based on Sensor Fusion of Frame-Based Hyperspectral Imager and Time-of-Flight Depth Camera
by Samuli Rahkonen, Leevi Lind, Anna-Maria Raita-Hakola, Sampsa Kiiskinen and Ilkka Pölönen
Sensors 2022, 22(22), 8668; https://doi.org/10.3390/s22228668 - 10 Nov 2022
Cited by 3 | Viewed by 3696
Abstract
Hyperspectral imaging and distance data have previously been used in aerial, forestry, agricultural, and medical imaging applications. Extracting meaningful information from a combination of different imaging modalities is difficult, as the image sensor fusion requires knowing the optical properties of the sensors, selecting the right optics and finding the sensors' mutual reference frame through calibration. In this research we demonstrate a method for fusing data from a Fabry–Perot interferometer hyperspectral camera and a Kinect V2 time-of-flight depth sensing camera. We created an experimental application to demonstrate utilizing the depth augmented hyperspectral data to measure emission angle dependent reflectance from a multi-view inferred point cloud. We determined the intrinsic and extrinsic camera parameters through calibration, used global and local registration algorithms to combine point clouds from different viewpoints, created a dense point cloud and determined the angle dependent reflectances from it. The method could successfully combine the 3D point cloud data and hyperspectral data from different viewpoints of a reference colorchecker board. The point cloud registrations reached a fitness of 0.29–0.36 for inlier point correspondences and an RMSE of approx. 2, which indicates a fairly reliable registration result. The RMSE of the measured reflectances between the front view and side views of the targets varied between 0.01 and 0.05 on average, and the spectral angle between 1.5 and 3.2 degrees. The results suggest that changing the emission angle has a very small effect on the surface reflectance intensity and spectrum shapes, which was expected for the colorchecker used. Full article
(This article belongs to the Special Issue Kinect Sensor and Its Application)
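The two view-to-view comparison figures quoted at the end, reflectance RMSE and spectral angle between front and side views, reduce to short formulas. A NumPy sketch with random spectra standing in for the colorchecker patches:

```python
import numpy as np

def spectral_angle_deg(s1, s2):
    """Spectral angle between two reflectance spectra, in degrees."""
    cos = np.dot(s1, s2) / (np.linalg.norm(s1) * np.linalg.norm(s2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def reflectance_rmse(s1, s2):
    return float(np.sqrt(np.mean((s1 - s2) ** 2)))

# Placeholder spectra for one patch seen from the front and from the side
bands = np.linspace(0.2, 0.6, 120)
front = bands + np.random.normal(0, 0.01, bands.size)
side = bands + np.random.normal(0, 0.01, bands.size)
print(spectral_angle_deg(front, side), reflectance_rmse(front, side))
```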

23 pages, 9847 KB  
Article
CNN Deep Learning with Wavelet Image Fusion of CCD RGB-IR and Depth-Grayscale Sensor Data for Hand Gesture Intention Recognition
by Ing-Jr Ding and Nai-Wei Zheng
Sensors 2022, 22(3), 803; https://doi.org/10.3390/s22030803 - 21 Jan 2022
Cited by 21 | Viewed by 4324
Abstract
Pixel-based images captured by a charge-coupled device (CCD) with infrared (IR) LEDs around the image sensor are the well-known CCD Red–Green–Blue IR (the so-called CCD RGB-IR) data. The CCD RGB-IR data are generally acquired for video surveillance applications. Currently, CCD RGB-IR information has been further used to perform human gesture recognition in surveillance settings. Gesture recognition, including hand gesture intention recognition, is attracting great attention in the field of deep neural network (DNN) calculations. For further enhancing conventional CCD RGB-IR gesture recognition by DNN, this work proposes a deep learning framework for gesture recognition where a convolutional neural network (CNN) incorporated with wavelet image fusion of CCD RGB-IR and additional depth-based depth-grayscale images (captured from the depth sensor of the Microsoft Kinect device) is constructed for gesture intention recognition. In the proposed CNN with wavelet image fusion, a five-level discrete wavelet transformation (DWT) with three different wavelet decomposition merge strategies, namely, max-min, min-max and mean-mean, is employed; the visual geometry group (VGG)-16 CNN is used for deep learning and recognition of the wavelet fused gesture images. Experiments on the classification of ten hand gesture intention actions (specified in a scenario of laboratory interactions) show that additionally incorporating depth-grayscale data into CCD RGB-IR gesture recognition increases the average recognition accuracy to 83.88% for the VGG-16 CNN with min-max wavelet image fusion of the CCD RGB-IR and depth-grayscale data, which is clearly superior to the 75.33% of the VGG-16 CNN with CCD RGB-IR only. Full article
(This article belongs to the Special Issue Electronic Materials and Sensors Innovation and Application)
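A hedged sketch of the wavelet fusion step with PyWavelets, reading "min-max" as taking the minimum of the approximation coefficients and the maximum of the detail coefficients (that reading of the merge names is an assumption, as is the Haar wavelet):

```python
import numpy as np
import pywt

def fuse_min_max(img_rgbir, img_depth, wavelet="haar", level=5):
    """Fuse two single-channel images in the wavelet domain (sketch):
    min over approximation bands, max over detail bands, then reconstruct."""
    ca = pywt.wavedec2(img_rgbir, wavelet, level=level)
    cb = pywt.wavedec2(img_depth, wavelet, level=level)
    fused = [np.minimum(ca[0], cb[0])]                     # approximation coefficients
    for (ha, va, da), (hb, vb, db) in zip(ca[1:], cb[1:]):
        fused.append((np.maximum(ha, hb),
                      np.maximum(va, vb),
                      np.maximum(da, db)))                 # detail coefficients
    return pywt.waverec2(fused, wavelet)

rgbir = np.random.rand(224, 224)    # placeholder grayscale CCD RGB-IR frame
depth = np.random.rand(224, 224)    # placeholder depth-grayscale frame
fused = fuse_min_max(rgbir, depth)  # the fused image would then feed a VGG-16 classifier
```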

16 pages, 3681 KB  
Article
Towards Hybrid Multimodal Manual and Non-Manual Arabic Sign Language Recognition: mArSL Database and Pilot Study
by Hamzah Luqman and El-Sayed M. El-Alfy
Electronics 2021, 10(14), 1739; https://doi.org/10.3390/electronics10141739 - 20 Jul 2021
Cited by 42 | Viewed by 4693
Abstract
Sign languages are the main visual communication medium between hard-of-hearing people and their societies. Similar to spoken languages, they are not universal and vary from region to region, but they are relatively under-resourced. Arabic sign language (ArSL) is one of these languages that has attracted increasing attention in the research community. However, most of the existing and available works on sign language recognition systems focus on manual gestures, ignoring non-manual signals, such as facial expressions, that also carry linguistic information. One of the main challenges of not considering these modalities is the lack of suitable datasets. In this paper, we propose a new multi-modality ArSL dataset that integrates various types of modalities. It consists of 6748 video samples of fifty signs performed by four signers and collected using Kinect V2 sensors. This dataset will be freely available for researchers to develop and benchmark their techniques for further advancement of the field. In addition, we evaluated the fusion of spatial and temporal features of different modalities, manual and non-manual, for sign language recognition using state-of-the-art deep learning techniques. This fusion boosted the accuracy of the recognition system in the signer-independent mode by 3.6% compared with manual gestures alone. Full article
(This article belongs to the Special Issue Deep Learning for Computer Vision and Pattern Recognition)
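The manual/non-manual fusion evaluated above can be prototyped as a simple late fusion that concatenates per-modality feature vectors before a classification head. The feature dimensions and the tiny PyTorch head below are assumptions, not the authors' architecture:

```python
import torch
import torch.nn as nn

class LateFusionSignClassifier(nn.Module):
    """Concatenate manual (hands) and non-manual (face/body) spatio-temporal
    features and classify into 50 signs; all dimensions are placeholders."""
    def __init__(self, d_manual=512, d_nonmanual=256, num_signs=50):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(d_manual + d_nonmanual, 256),
            nn.ReLU(),
            nn.Linear(256, num_signs),
        )

    def forward(self, f_manual, f_nonmanual):
        return self.head(torch.cat([f_manual, f_nonmanual], dim=1))

model = LateFusionSignClassifier()
logits = model(torch.randn(8, 512), torch.randn(8, 256))   # a batch of 8 sign clips
```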

26 pages, 5422 KB  
Article
Validation of Marker-Less System for the Assessment of Upper Joints Reaction Forces in Exoskeleton Users
by Simone Pasinetti, Cristina Nuzzi, Nicola Covre, Alessandro Luchetti, Luca Maule, Mauro Serpelloni and Matteo Lancini
Sensors 2020, 20(14), 3899; https://doi.org/10.3390/s20143899 - 13 Jul 2020
Cited by 18 | Viewed by 5553
Abstract
This paper presents the validation of a marker-less motion capture system used to evaluate the upper limb stress of subjects using exoskeletons for locomotion. The system fuses the human skeletonization provided by commercial 3D cameras with forces exchanged by the user to the ground through upper limbs utilizing instrumented crutches. The aim is to provide a low cost, accurate, and reliable technology useful to provide the trainer a quantitative evaluation of the impact of assisted gait on the subject without the need to use an instrumented gait lab. The reaction forces at the upper limbs’ joints are measured to provide a validation focused on clinically relevant quantities for this application. The system was used simultaneously with a reference motion capture system inside a clinical gait analysis lab. An expert user performed 20 walking tests using instrumented crutches and force platforms inside the observed volume. The mechanical model was applied to data from the system and the reference motion capture, and numerical simulations were performed to assess the internal joint reaction of the subject’s upper limbs. A comparison between the two results shows a root mean square error of less than 2% of the subject’s body weight. Full article
(This article belongs to the Special Issue Sensor-Based Systems for Kinematics and Kinetics)
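The headline validation figure, an RMSE below 2% of body weight, is simply the RMSE between the two joint-reaction-force traces normalized by the subject's weight; a trivial sketch with synthetic traces:

```python
import numpy as np

def rmse_percent_body_weight(force_estimate_N, force_reference_N, body_mass_kg):
    """RMSE between two joint-reaction-force traces as a percentage of body weight."""
    rmse = np.sqrt(np.mean((force_estimate_N - force_reference_N) ** 2))
    return 100.0 * rmse / (body_mass_kg * 9.81)

t = np.linspace(0, 10, 1000)
reference = 150 + 40 * np.sin(2 * np.pi * t)            # placeholder shoulder load (N)
estimate = reference + np.random.normal(0, 10, t.size)  # placeholder marker-less estimate
print(rmse_percent_body_weight(estimate, reference, body_mass_kg=75))
```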

12 pages, 4710 KB  
Article
Three-Dimensional Morphological Measurement Method for a Fruit Tree Canopy Based on Kinect Sensor Self-Calibration
by Haihui Yang, Xiaochan Wang and Guoxiang Sun
Agronomy 2019, 9(11), 741; https://doi.org/10.3390/agronomy9110741 - 11 Nov 2019
Cited by 25 | Viewed by 4797
Abstract
Perception of the fruit tree canopy is a vital technology for the intelligent control of a modern standardized orchard. Due to the complex three-dimensional (3D) structure of the fruit tree canopy, morphological parameters extracted from two-dimensional (2D) or single-perspective 3D images are not comprehensive enough. Three-dimensional information from different perspectives must be combined in order to perceive the canopy information efficiently and accurately in complex orchard field environments. The algorithms used for the registration and fusion of data from different perspectives and the subsequent extraction of fruit tree canopy related parameters are the keys to the problem. This study proposed a 3D morphological measurement method for a fruit tree canopy based on Kinect sensor self-calibration, including 3D point cloud generation, point cloud registration, and canopy information extraction for apple tree canopies. Using 32 apple trees (Yanfu 3 variety), the morphological parameters of tree height (H), maximum canopy width (W), and canopy thickness (D) were calculated. The accuracy and applicability of this method for extraction of morphological parameters were statistically analyzed. The results showed that, on both sides of the fruit trees, the average relative error (ARE) values of the morphological parameters including the fruit tree height (H), maximum tree width (W) and canopy thickness (D) between the calculated values and measured values were 3.8%, 12.7% and 5.0%, respectively, under the V1 mode; the ARE values under the V2 mode were 3.3%, 9.5% and 4.9%, respectively; and the ARE values under the V1 and V2 merged mode were 2.5%, 3.6% and 3.2%, respectively. The measurement accuracy of the tree width (W) under the double visual angle mode had a significant advantage over that under the single visual angle mode. The 3D point cloud reconstruction method based on Kinect self-calibration proposed in this study has high precision and stable performance, and the auxiliary calibration objects are readily portable and easy to install. It can be applied to different experimental scenes to extract 3D information of fruit tree canopies and has important implications for achieving the intelligent control of standardized orchards. Full article
(This article belongs to the Special Issue Application of Remote Sensing in Orchard Management)
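Once the two-view point clouds are registered into a single cloud, the three canopy parameters reduce to axis-aligned extents, and the reported ARE is an average relative error. A NumPy sketch in which the axis convention and the toy cloud are assumptions:

```python
import numpy as np

def canopy_dimensions(points_xyz):
    """Height (H), maximum width (W) and thickness (D) of a merged canopy cloud,
    taken as axis-aligned extents (x = row direction, y = depth, z = up; assumed)."""
    extents = points_xyz.max(axis=0) - points_xyz.min(axis=0)
    W, D, H = extents
    return H, W, D

def average_relative_error(measured, calculated):
    measured, calculated = np.asarray(measured), np.asarray(calculated)
    return 100.0 * np.mean(np.abs(calculated - measured) / measured)

cloud = np.random.uniform([0, 0, 0], [2.1, 1.6, 3.4], size=(5000, 3))  # toy canopy cloud
print(canopy_dimensions(cloud))
print(average_relative_error([3.4, 2.1, 1.6], canopy_dimensions(cloud)))  # vs. "measured" H, W, D
```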

17 pages, 2085 KB  
Article
An Augmented Reality Based Human-Robot Interaction Interface Using Kalman Filter Sensor Fusion
by Chunxu Li, Ashraf Fahmy and Johann Sienz
Sensors 2019, 19(20), 4586; https://doi.org/10.3390/s19204586 - 22 Oct 2019
Cited by 43 | Viewed by 8317
Abstract
In this paper, the application of Augmented Reality (AR) for the control and adjustment of robots has been developed, with the aim of making interaction and adjustment of robots easier and more accurate from a remote location. A LeapMotion sensor based controller has been investigated to track the movement of the operator's hands. The data from the controller allow gestures and the position of the hand palm's central point to be detected and tracked. A Kinect V2 camera is able to measure the corresponding motion velocities in the x, y, z directions after our post-processing algorithm is applied. Unreal Engine 4 is used to create an AR environment for the user to monitor the control process immersively. A Kalman filtering (KF) algorithm is employed to fuse the position signals from the LeapMotion sensor with the velocity signals from the Kinect camera sensor. The fused/optimal data are sent to teleoperate a Baxter robot in real time over the User Datagram Protocol (UDP). Several experiments have been conducted to validate the proposed method. Full article
(This article belongs to the Section Intelligent Sensors)
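One way to read the fusion step, the LeapMotion supplying palm position, the Kinect supplying velocity, and a constant-velocity Kalman filter combining them before the command is streamed over UDP, is sketched below. The noise settings, the endpoint, and the JSON message format are assumptions:

```python
import json
import socket
import numpy as np

dt = 1 / 30.0
F = np.block([[np.eye(3), dt * np.eye(3)], [np.zeros((3, 3)), np.eye(3)]])
H = np.eye(6)                               # measured: position (LeapMotion) + velocity (Kinect)
Q = 1e-4 * np.eye(6)
R = np.diag([1e-4] * 3 + [1e-2] * 3)        # Leap position trusted more than Kinect velocity
x, P = np.zeros(6), np.eye(6)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def fuse_and_send(leap_pos, kinect_vel, target=("127.0.0.1", 9000)):
    """One predict/update cycle; streams the fused state to the robot side over UDP."""
    global x, P
    x, P = F @ x, F @ P @ F.T + Q           # predict
    z = np.concatenate([leap_pos, kinect_vel])
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x, P = x + K @ (z - H @ x), (np.eye(6) - K @ H) @ P
    sock.sendto(json.dumps(x.tolist()).encode(), target)   # placeholder endpoint and format
    return x

fuse_and_send(np.array([0.10, 0.00, 0.30]), np.array([0.00, 0.00, 0.01]))
```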

16 pages, 2147 KB  
Article
A Differential Evolution Approach to Optimize Weights of Dynamic Time Warping for Multi-Sensor Based Gesture Recognition
by James Rwigema, Hyo-Rim Choi and TaeYong Kim
Sensors 2019, 19(5), 1007; https://doi.org/10.3390/s19051007 - 27 Feb 2019
Cited by 18 | Viewed by 4173
Abstract
In this research, we present a differential evolution approach to optimize the weights of dynamic time warping for multi-sensor-based gesture recognition. Mainly, we aimed to develop a robust gesture recognition method that can be used in various environments. Both a wearable inertial sensor and a depth camera (Kinect sensor) were used as heterogeneous sensors to verify and collect the data. The proposed approach was used to calculate optimal weight values for the different characteristic features of the heterogeneous sensor data, which have different effects during gesture recognition. In this research, we studied 27 different actions to analyze the data. As finding the optimal value of the data from numerous sensors became more complex, a differential evolution approach was used during the fusion and optimization of the data. To verify the performance accuracy of the presented method, the University of Texas at Dallas Multimodal Human Action Dataset (UTD-MHAD) from previous research was used. However, the average recognition rates reported by previous research using the respective methods were still low, due to the complexity of calculating the optimal values of the data acquired from the sensors, as well as the installation environment. Our contribution is a method that enables us to adjust the number of depth cameras and combine their data with inertial sensors (multi-sensors in this study). We applied a differential evolution approach to calculate the optimal values of the added weights. The proposed method achieved an accuracy 10% higher than previous results on the same database, indicating a much improved motion recognition rate. Full article
(This article belongs to the Section Physical Sensors)
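The core idea, a DTW distance in which each feature dimension carries a weight and differential evolution searches for the weight vector that maximizes recognition accuracy, can be sketched with NumPy and SciPy. The toy templates, the nearest-template classifier, and the optimizer settings are assumptions, not the paper's pipeline:

```python
import numpy as np
from scipy.optimize import differential_evolution

def weighted_dtw(a, b, w):
    """DTW distance between sequences of shape (T, D) with per-feature weights w."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.sqrt(np.sum(w * (a[i - 1] - b[j - 1]) ** 2))
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

rng = np.random.default_rng(1)
templates = [rng.normal(size=(15, 6)) for _ in range(4)]      # one template per gesture class
queries = [(t + rng.normal(0, 0.3, t.shape), k) for k, t in enumerate(templates)]

def neg_accuracy(w):
    hits = sum(int(np.argmin([weighted_dtw(q, t, w) for t in templates]) == k)
               for q, k in queries)
    return -hits / len(queries)

result = differential_evolution(neg_accuracy, bounds=[(0.0, 1.0)] * 6,
                                maxiter=3, popsize=6, seed=0)
print("optimized weights:", result.x, "accuracy:", -result.fun)
```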

13 pages, 5191 KB  
Article
Exploring RGB+Depth Fusion for Real-Time Object Detection
by Tanguy Ophoff, Kristof Van Beeck and Toon Goedemé
Sensors 2019, 19(4), 866; https://doi.org/10.3390/s19040866 - 19 Feb 2019
Cited by 66 | Viewed by 12173
Abstract
In this paper, we investigate whether fusing depth information on top of normal RGB data for camera-based object detection can help to increase the performance of current state-of-the-art single-shot detection networks. Indeed, depth sensing is easily acquired using depth cameras such as a Kinect or stereo setups. We investigate the optimal manner to perform this sensor fusion with a special focus on lightweight single-pass convolutional neural network (CNN) architectures, enabling real-time processing on limited hardware. For this, we implement a network architecture allowing us to parameterize at which network layer both information sources are fused together. We performed exhaustive experiments to determine the optimal fusion point in the network, from which we can conclude that fusing towards the mid to late layers provides the best results. Our best fusion models significantly outperform the baseline RGB network in both accuracy and localization of the detections. Full article
(This article belongs to the Special Issue Depth Sensors and 3D Vision)
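The "parameterize the fusion layer" idea can be illustrated with a small two-stream network in which an index chooses after how many convolutional stages the RGB and depth feature maps are concatenated. This PyTorch toy is a stand-in for the idea only, not the authors' single-shot detector:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU())

class TwoStreamFusionNet(nn.Module):
    """RGB and depth streams fused by concatenation after `fuse_at` blocks (fuse_at >= 1)."""
    def __init__(self, fuse_at=2, widths=(16, 32, 64, 128)):
        super().__init__()
        self.rgb, self.depth, self.shared = nn.ModuleList(), nn.ModuleList(), nn.ModuleList()
        c_rgb, c_d = 3, 1
        for i, w in enumerate(widths):
            if i < fuse_at:
                self.rgb.append(conv_block(c_rgb, w))
                self.depth.append(conv_block(c_d, w))
                c_rgb = c_d = w
            else:
                c_in = 2 * w_prev if i == fuse_at else w_prev   # doubled channels at the fusion point
                self.shared.append(conv_block(c_in, w))
            w_prev = w
        self.head = nn.Conv2d(widths[-1], 30, 1)                # toy detection head

    def forward(self, rgb, depth):
        for r, d in zip(self.rgb, self.depth):
            rgb, depth = r(rgb), d(depth)
        x = torch.cat([rgb, depth], dim=1)
        for block in self.shared:
            x = block(x)
        return self.head(x)

net = TwoStreamFusionNet(fuse_at=2)                             # mid-level fusion
out = net(torch.randn(2, 3, 128, 128), torch.randn(2, 1, 128, 128))
```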

17 pages, 38386 KB  
Article
Real-Time Underwater StereoFusion
by Matija Rossi, Petar Trslić, Satja Sivčev, James Riordan, Daniel Toal and Gerard Dooly
Sensors 2018, 18(11), 3936; https://doi.org/10.3390/s18113936 - 14 Nov 2018
Cited by 22 | Viewed by 6869
Abstract
Many current and future applications of underwater robotics require real-time sensing and interpretation of the environment. As the vast majority of robots are equipped with cameras, computer vision is playing an increasingly important role in this field. This paper presents the implementation and experimental results of underwater StereoFusion, an algorithm for real-time 3D dense reconstruction and camera tracking. Unlike KinectFusion, on which it is based, StereoFusion relies on a stereo camera as its main sensor. The algorithm uses the depth map obtained from the stereo camera to incrementally build a volumetric 3D model of the environment, while simultaneously using the model for camera tracking. It has been successfully tested both in a lake and in the ocean, using two different state-of-the-art underwater Remotely Operated Vehicles (ROVs). Ongoing work focuses on applying the same algorithm to acoustic sensors, and on the implementation of a vision-based monocular system with the same capabilities. Full article
(This article belongs to the Section Physical Sensors)
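The volumetric integration at the heart of KinectFusion-style pipelines (StereoFusion swaps the Kinect depth map for a stereo one) is a running weighted average of truncated signed distances per voxel. A minimal NumPy sketch with a fixed camera at the origin; the intrinsics, volume placement, and truncation distance are placeholders:

```python
import numpy as np

def integrate_tsdf(tsdf, weight, depth, voxel_size=0.02, trunc=0.06,
                   fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """Fuse one depth map (m) into a TSDF volume; camera at the origin looking down +z."""
    nx, ny, nz = tsdf.shape
    i, j, k = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz), indexing="ij")
    X = (i + 0.5) * voxel_size - nx * voxel_size / 2
    Y = (j + 0.5) * voxel_size - ny * voxel_size / 2
    Z = (k + 0.5) * voxel_size + 0.3                      # place the volume in front of the camera
    u = np.round(fx * X / Z + cx).astype(int)             # project voxel centres into the image
    v = np.round(fy * Y / Z + cy).astype(int)
    valid = (u >= 0) & (u < depth.shape[1]) & (v >= 0) & (v < depth.shape[0])
    d = np.where(valid, depth[np.clip(v, 0, depth.shape[0] - 1),
                               np.clip(u, 0, depth.shape[1] - 1)], 0.0)
    sdf = d - Z                                           # signed distance along the optical axis (approx.)
    update = valid & (d > 0) & (sdf > -trunc)
    tsdf_new = np.clip(sdf / trunc, -1.0, 1.0)
    tsdf[update] = (tsdf[update] * weight[update] + tsdf_new[update]) / (weight[update] + 1)
    weight[update] += 1                                   # running-average weights
    return tsdf, weight

volume, weights = np.zeros((64, 64, 64)), np.zeros((64, 64, 64))
depth_map = np.full((480, 640), 0.8)                      # toy input: a flat wall 0.8 m away
volume, weights = integrate_tsdf(volume, weights, depth_map)
```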

23 pages, 6781 KB  
Article
Body Weight Estimation for Dose-Finding and Health Monitoring of Lying, Standing and Walking Patients Based on RGB-D Data
by Christian Pfitzner, Stefan May and Andreas Nüchter
Sensors 2018, 18(5), 1311; https://doi.org/10.3390/s18051311 - 24 Apr 2018
Cited by 24 | Viewed by 8286
Abstract
This paper describes the estimation of the body weight of a person in front of an RGB-D camera. A survey of different methods for body weight estimation based on depth sensors is given. First, an estimation of people standing in front of a camera is presented. Second, an approach based on a stream of depth images is used to obtain the body weight of a person walking towards a sensor. The algorithm first extracts features from a point cloud and forwards them to an artificial neural network (ANN) to obtain an estimation of body weight. Besides the algorithm for the estimation, this paper further presents an open-access dataset based on measurements from a trauma room in a hospital as well as data from visitors of a public event. In total, the dataset contains 439 measurements. The article illustrates the efficiency of the approach with experiments with persons lying down in a hospital, standing persons, and walking persons. Applicable scenarios for the presented algorithm are body weight-related dosing of emergency patients. Full article
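The feature-to-ANN regression step can be prototyped with a handful of geometric descriptors per segmented point cloud and a small multilayer perceptron. The feature list, the network size, and the synthetic data below are assumptions standing in for the paper's 439 measurements:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

def cloud_features(points_xyz):
    """A few simple body descriptors: axis extents, a volume proxy, point count."""
    ext = points_xyz.max(axis=0) - points_xyz.min(axis=0)
    return np.array([*ext, np.prod(ext), len(points_xyz)])

rng = np.random.default_rng(3)
samples, targets = [], []
for _ in range(439):                               # synthetic stand-in for the dataset
    height = rng.uniform(1.5, 2.0)                 # body height (m)
    girth = rng.uniform(0.20, 0.35)                # torso half-width (m)
    cloud = rng.normal(scale=[girth, 0.6 * girth, height / 4], size=(2000, 3))
    samples.append(cloud_features(cloud))
    targets.append(55 + 120 * girth + 30 * (height - 1.5) + rng.normal(0, 3))  # toy weight law (kg)
X, y = np.stack(samples), np.array(targets)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
ann = make_pipeline(StandardScaler(), MLPRegressor(hidden_layer_sizes=(32, 16),
                                                   max_iter=2000, random_state=0))
ann.fit(X_tr, y_tr)
print("MAE [kg]:", np.mean(np.abs(ann.predict(X_te) - y_te)))
```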

23 pages, 14902 KB  
Article
An Improved Indoor Robot Human-Following Navigation Model Using Depth Camera, Active IR Marker and Proximity Sensors Fusion
by Mark Tee Kit Tsun, Bee Theng Lau and Hudyjaya Siswoyo Jo
Robotics 2018, 7(1), 4; https://doi.org/10.3390/robotics7010004 - 6 Jan 2018
Cited by 24 | Viewed by 10067
Abstract
Creating a navigation system for autonomous companion robots has always been a difficult process: it must contend with a dynamically changing environment populated by a myriad of obstructions and an unspecified number of people other than the intended person to follow. This study documents the implementation of an indoor autonomous robot navigation model, based on multi-sensor fusion, using Microsoft Robotics Developer Studio 4 (MRDS). The model relies on a depth camera, a limited array of proximity sensors and an active IR marker tracking system. This allows the robot to lock onto the correct target for human-following, while approximating the best starting direction to begin maneuvering around obstacles for minimum required motion. The system is implemented according to a navigation algorithm that transforms the data from all three types of sensors into tendency arrays and fuses them to determine whether to take a leftward or rightward route around an encountered obstacle. The decision process considers visible short, medium and long-range obstructions and the current position of the target person. The system is implemented using MRDS and its functional test performance is presented over a series of Virtual Simulation Environment scenarios, greenlighting further extensive benchmark simulations. Full article
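The decision step, transforming each sensor's readings into a "tendency" array over candidate headings and fusing them to pick a leftward or rightward detour, can be sketched as a weighted sum of per-sensor scores. The sector layout, weights, and scoring are assumptions, not the MRDS implementation:

```python
import numpy as np

def fuse_tendencies(depth_obstacles, proximity, marker_bearing_deg,
                    weights=(0.5, 0.3, 0.2), sectors=np.linspace(-90, 90, 9)):
    """Score candidate headings (deg, negative = left) from three sensors and pick a detour.
    Inputs are assumed already mapped onto the same 9 heading sectors: 0 = clear, 1 = blocked."""
    depth_score = 1.0 - np.asarray(depth_obstacles, dtype=float)
    prox_score = 1.0 - np.asarray(proximity, dtype=float)
    target_score = np.exp(-0.5 * ((sectors - marker_bearing_deg) / 30.0) ** 2)  # favour the IR marker
    fused = weights[0] * depth_score + weights[1] * prox_score + weights[2] * target_score
    best = sectors[int(np.argmax(fused))]
    return ("left" if best < 0 else "right"), float(best)

# Obstacle ahead and slightly to the right, target marker 20 degrees to the right
print(fuse_tendencies(depth_obstacles=[0, 0, 0, .2, .9, .8, .1, 0, 0],
                      proximity=[0, 0, 0, 0, .6, .4, 0, 0, 0],
                      marker_bearing_deg=20))
```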
