Sensors and Computer Vision Techniques for 3D Object Modeling

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Physical Sensors".

Deadline for manuscript submissions: closed (15 December 2020) | Viewed by 68100

Special Issue Editor

Assoc. Prof. Marius Leordeanu
Guest Editor
Computer Science & Engineering Department, Polytechnic University of Bucharest, and Senior Researcher, Institute of Mathematics of the Romanian Academy (IMAR), Bucharest, Romania
Interests: computer vision; machine learning; robotics; artificial intelligence

Special Issue Information

Dear Colleagues,

Perceiving and predicting the complex 3D structure of the dynamic world requires a combination of sophisticated geometric modeling, efficient learning, and the right 3D sensors and imaging devices. Today’s research in robotics, computer vision, 3D sensing, and machine learning offers a plethora of fast algorithms, accurate 3D sensing capabilities, and powerful deep neural networks that leverage the statistical properties of the moving 3D world. This Special Issue aims to bring together state-of-the-art research in all of these directions for 3D learning, modeling, and prediction, in order to establish a strong common foundation and create bridges towards next-generation models and methods.

Topics of interest include but are not limited to the following:

  1. Advanced sensors and analytical techniques for 3D object modeling and depth estimation;
  2. Deep learning and computer vision approaches to 3D object recognition, depth, pose and trajectory estimation in static or dynamic scenes with multiple objects;
  3. Efficient algorithms and computational models for 3D robot perception, mapping, localization, obstacle avoidance, navigation, and semantic reasoning;
  4. 3D scene estimation for self-driving cars and autonomous aerial vehicles;
  5. Sensor fusion and multitask learning for 3D modeling and estimation;
  6. 3D object modeling and prediction on embedded platforms;
  7. Unsupervised learning of depth and 3D structures in space and time;
  8. Active reinforcement learning approaches to 3D perception and prediction;
  9. Sensors, techniques, and advanced machine learning models for 3D learning, modeling, and estimation, with applications to medicine, the environment, agriculture, transportation, aerospace, civil structures, and other industries;
  10. Machine learning techniques for augmented and virtual reality;
  11. Efficient learning and computation for embedded vision systems.

Assoc. Prof. Marius Leordeanu
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (17 papers)

Research


29 pages, 9856 KiB  
Article
3D Object Detection and Instance Segmentation from 3D Range and 2D Color Images
by Xiaoke Shen and Ioannis Stamos
Sensors 2021, 21(4), 1213; https://doi.org/10.3390/s21041213 - 9 Feb 2021
Cited by 8 | Viewed by 4842
Abstract
Instance segmentation and object detection are significant problems in the fields of computer vision and robotics. We address those problems by proposing a novel object segmentation and detection system. First, we detect 2D objects based on RGB, depth-only, or RGB-D images. A 3D convolutional-based system, named Frustum VoxNet, is proposed. This system generates frustums from 2D detection results, proposes 3D candidate voxelized images for each frustum, and uses a 3D convolutional neural network (CNN) based on these candidate voxelized images to perform 3D instance segmentation and object detection. Results on the SUN RGB-D dataset show that our RGB-D-based system’s 3D inference is much faster than state-of-the-art methods, without a significant loss of accuracy. At the same time, we can provide segmentation and detection results using depth-only images, with accuracy comparable to RGB-D-based systems. This is important since our methods can also work well in low lighting conditions, or with sensors that do not acquire RGB images. Finally, the use of segmentation as part of our pipeline increases detection accuracy, while providing at the same time 3D instance segmentation.
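
For readers unfamiliar with the frustum step, the following is a minimal sketch (Python/NumPy) of how a 2D detection can be lifted to a 3D candidate: points whose projection falls inside the detection box are gathered and voxelized. This is an illustration under stated assumptions, not the authors' Frustum VoxNet code; function names, grid size, and voxel size are made up.

```python
import numpy as np

def points_in_frustum(points_xyz, box_2d, K):
    """Select 3D points (camera frame) whose projection lies inside a 2D box.

    points_xyz: (N, 3) points in the camera coordinate frame.
    box_2d: (u_min, v_min, u_max, v_max) detection box in pixels.
    K: (3, 3) camera intrinsic matrix.
    """
    z = points_xyz[:, 2]
    pts = points_xyz[z > 1e-6]            # keep only points in front of the camera
    uv = (K @ pts.T).T
    uv = uv[:, :2] / uv[:, 2:3]           # perspective projection to pixels
    u_min, v_min, u_max, v_max = box_2d
    inside = ((uv[:, 0] >= u_min) & (uv[:, 0] <= u_max) &
              (uv[:, 1] >= v_min) & (uv[:, 1] <= v_max))
    return pts[inside]

def voxelize(points_xyz, voxel_size=0.05, grid_shape=(64, 64, 64)):
    """Quantize the frustum points into a fixed-size binary occupancy grid."""
    grid = np.zeros(grid_shape, dtype=np.float32)
    if len(points_xyz) == 0:
        return grid
    origin = points_xyz.min(axis=0)
    idx = np.floor((points_xyz - origin) / voxel_size).astype(int)
    idx = np.clip(idx, 0, np.array(grid_shape) - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid
```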

16 pages, 8505 KiB  
Article
An Anatomical Thermal 3D Model in Preclinical Research: Combining CT and Thermal Images
by Franziska Schollemann, Carina Barbosa Pereira, Stefanie Rosenhain, Andreas Follmann, Felix Gremse, Fabian Kiessling, Michael Czaplik and Mauren Abreu de Souza
Sensors 2021, 21(4), 1200; https://doi.org/10.3390/s21041200 - 9 Feb 2021
Cited by 5 | Viewed by 2871
Abstract
Even though animal trials are a controversial topic, they provide knowledge about diseases and the course of infections in a medical context. To refine the detection of abnormalities that can cause pain and stress to the animal as early as possible, new processes must be developed. Due to its noninvasive nature, thermal imaging is increasingly used for severity assessment in animal-based research. Within a multimodal approach, thermal images combined with anatomical information could be used to simulate the inner temperature profile, thereby allowing the detection of deep-seated infections. This paper presents the generation of anatomical thermal 3D models, forming the underlying multimodal model in this simulation. These models combine anatomical 3D information based on computed tomography (CT) data with a registered thermal shell measured with infrared thermography. The process of generating these models consists of data acquisition (both thermal images and CT), camera calibration, image processing methods, and structure from motion (SfM), among others. Anatomical thermal 3D models were successfully generated using three anesthetized mice. Owing to the improved image processing, the process could also be applied to areas with few features, which increases its transferability. The result of this multimodal registration in 3D space can be viewed and analyzed within a visualization tool. Individual CT slices can be analyzed axially, sagittally, and coronally, together with the corresponding superficial skin temperature distribution. This is an important and successfully implemented milestone on the way to simulating the internal temperature profile. Using this temperature profile, deep-seated infections and inflammation can be detected in order to reduce animal suffering.
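
As a hedged illustration of the final mapping step, the sketch below assigns a temperature to each CT-derived mesh vertex by projecting it into a registered thermal image. It assumes the thermal-camera intrinsics and extrinsics are already known, omits occlusion handling, and all names are hypothetical rather than taken from the paper.

```python
import numpy as np

def map_temperature_to_vertices(vertices, thermal_img, K, R, t):
    """Assign a surface temperature to each 3D vertex by nearest-pixel lookup
    in a registered thermal image.

    vertices: (N, 3) mesh vertices in the CT/world frame.
    thermal_img: (H, W) temperature map (degrees Celsius).
    K, R, t: thermal-camera intrinsics and extrinsics (world -> camera).
    """
    cam = (R @ vertices.T).T + t                # transform into the camera frame
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                 # perspective division
    h, w = thermal_img.shape
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    temps = thermal_img[v, u].astype(float)
    temps[cam[:, 2] <= 0] = np.nan              # vertices behind the camera get no value
    return temps
```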

13 pages, 24161 KiB  
Article
Automatic Annotation of Change Detection Images
by Nathalie Neptune and Josiane Mothe
Sensors 2021, 21(4), 1110; https://doi.org/10.3390/s21041110 - 5 Feb 2021
Cited by 6 | Viewed by 2338
Abstract
Earth observation satellites have been capturing a variety of data about our planet for several decades, making many environmental applications possible, such as change detection. Recently, deep learning methods have been proposed for urban change detection. However, there has been limited work on the application of such methods to the annotation of unlabeled images in the case of change detection in forests. This annotation task consists of predicting semantic labels for a given image of a forested area where change has been detected. Currently proposed methods typically do not provide other semantic information beyond the change that is detected. To address these limitations, we first demonstrate that deep learning methods can be effectively used to detect changes in a forested area with a pair of pre- and post-change satellite images. We show that, by using visual semantic embeddings, we can automatically annotate the change images with labels extracted from scientific documents related to the study area. We investigated the effect of different corpora and found that the best performance in the annotation prediction task is reached with a corpus that is related to the type of change of interest and is of medium size (over ten thousand documents).
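
A minimal sketch of the label-ranking idea, assuming image and label embeddings already live in a shared visual-semantic space (the embedding models themselves are out of scope here); the function and its arguments are illustrative, not the authors' code.

```python
import numpy as np

def annotate_change_image(image_embedding, label_embeddings, labels, top_k=5):
    """Rank candidate labels (mined from a document corpus) for one change image
    by cosine similarity in a shared visual-semantic embedding space."""
    img = image_embedding / np.linalg.norm(image_embedding)
    lab = label_embeddings / np.linalg.norm(label_embeddings, axis=1, keepdims=True)
    scores = lab @ img                          # cosine similarity per label
    order = np.argsort(-scores)[:top_k]
    return [(labels[i], float(scores[i])) for i in order]
```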

15 pages, 1838 KiB  
Article
Extrinsic Camera Calibration with Line-Laser Projection
by Izaak Van Crombrugge, Rudi Penne and Steve Vanlanduit
Sensors 2021, 21(4), 1091; https://doi.org/10.3390/s21041091 - 5 Feb 2021
Cited by 8 | Viewed by 3056
Abstract
Knowledge of precise camera poses is vital for multi-camera setups. Camera intrinsics can be obtained for each camera separately in lab conditions. For fixed multi-camera setups, the extrinsic calibration can only be done in situ. Usually, some markers are used, like checkerboards, requiring some level of overlap between cameras. In this work, we propose a method for cases with little or no overlap. Laser lines are projected on a plane (e.g., floor or wall) using a laser line projector. The pose of the plane and cameras is then optimized using bundle adjustment to match the lines seen by the cameras. To find the extrinsic calibration, only a partial overlap between the laser lines and the field of view of the cameras is needed. Real-world experiments were conducted both with and without overlapping fields of view, resulting in rotation errors below 0.5°. We show that the accuracy is comparable to other state-of-the-art methods while offering a more practical procedure. The method can also be used in large-scale applications and can be fully automated.
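
A deliberately simplified sketch of the pose refinement: it only enforces that each camera's observed laser-line points lie on a common world plane (z = 0), and it omits the inter-camera line correspondences and gauge fixing that the full bundle adjustment in the paper relies on. All names and the parameterization below are assumptions.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def residuals(params, points_per_cam):
    """Point-to-plane residuals: per camera, an axis-angle rotation and a
    translation map camera-frame laser points into a world frame whose
    projection plane is z = 0."""
    res = []
    for i, pts in enumerate(points_per_cam):
        rvec = params[6 * i: 6 * i + 3]
        tvec = params[6 * i + 3: 6 * i + 6]
        world = Rotation.from_rotvec(rvec).apply(pts) + tvec
        res.append(world[:, 2])                 # signed distance to the plane z = 0
    return np.concatenate(res)

def calibrate_extrinsics(points_per_cam):
    """Jointly refine each camera's pose from the laser-line points it observes."""
    x0 = np.zeros(6 * len(points_per_cam))      # start from identity poses
    sol = least_squares(residuals, x0, args=(points_per_cam,))
    return sol.x.reshape(-1, 6)                 # one (rotvec, tvec) row per camera
```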

17 pages, 31154 KiB  
Article
Realworld 3D Object Recognition Using a 3D Extension of the HOG Descriptor and a Depth Camera
by Cristian Vilar, Silvia Krug and Mattias O’Nils
Sensors 2021, 21(3), 910; https://doi.org/10.3390/s21030910 - 29 Jan 2021
Cited by 8 | Viewed by 2876
Abstract
3D object recognition is a generic task in robotics and autonomous vehicles. In this paper, we propose a 3D object recognition approach using a 3D extension of the histogram-of-gradients object descriptor with data captured with a depth camera. The presented method makes use of synthetic objects for training the object classifier and classifies real objects captured by the depth camera. The preprocessing methods include operations to achieve rotational invariance as well as to maximize the recognition accuracy while reducing the feature dimensionality at the same time. By studying different preprocessing options, we show challenges that need to be addressed when moving from synthetic to real data. The recognition performance was evaluated with a real dataset captured by a depth camera, and the results show a maximum recognition accuracy of 81.5%.
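
One simplified way to extend HOG to a voxel grid, offered only as an illustration: per-cell, magnitude-weighted histograms of gradient azimuth (a full 3D variant would also bin elevation). This is not necessarily the descriptor variant used in the paper; the bin and cell counts are arbitrary assumptions.

```python
import numpy as np

def hog3d(voxels, bins=12, cells=(4, 4, 4)):
    """Simplified 3D histogram-of-gradients descriptor over a voxel grid."""
    gz, gy, gx = np.gradient(voxels.astype(np.float32))
    mag = np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)
    azim = np.arctan2(gy, gx)                                  # in [-pi, pi]
    azim_bin = ((azim + np.pi) / (2 * np.pi) * bins).astype(int) % bins
    descriptor = []
    # Split each axis of the grid into the requested number of cells.
    splits = [np.array_split(np.arange(s), c) for s, c in zip(voxels.shape, cells)]
    for zs in splits[0]:
        for ys in splits[1]:
            for xs in splits[2]:
                cell = np.ix_(zs, ys, xs)
                hist = np.zeros(bins, dtype=np.float32)
                np.add.at(hist, azim_bin[cell].ravel(), mag[cell].ravel())
                descriptor.append(hist / (hist.sum() + 1e-8))  # per-cell normalization
    return np.concatenate(descriptor)
```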

22 pages, 5810 KiB  
Article
Driven by Vision: Learning Navigation by Visual Localization and Trajectory Prediction
by Marius Leordeanu and Iulia Paraicu
Sensors 2021, 21(3), 852; https://doi.org/10.3390/s21030852 - 27 Jan 2021
Cited by 4 | Viewed by 2439
Abstract
When driving, people make decisions based on current traffic as well as their desired route. They have a mental map of known routes and are often able to navigate without needing directions. Current published self-driving models improve their performance when using additional GPS information. Here we aim to push forward self-driving research and perform route planning even in the complete absence of GPS at inference time. Our system learns to predict, in real time, the vehicle’s current location and future trajectory on a known map, given only the raw video stream and the final destination. Trajectories consist of instant steering commands that depend on present traffic, as well as longer-term navigation decisions towards a specific destination. Along with our novel approach to localization and navigation from visual data, we also introduce a novel large dataset in an urban environment, which consists of video and GPS streams collected with a smartphone while driving. The GPS is automatically processed to obtain supervision labels and to create an analytical representation of the traversed map. In tests, our solution outperforms published state-of-the-art methods on visual localization and steering and provides reliable navigation assistance between any two known locations. We also show that our system can adapt to short- and long-term changes in weather conditions or the structure of the urban environment. We make the entire dataset and the code publicly available.

15 pages, 3092 KiB  
Article
Transfer of Learning from Vision to Touch: A Hybrid Deep Convolutional Neural Network for Visuo-Tactile 3D Object Recognition
by Ghazal Rouhafzay, Ana-Maria Cretu and Pierre Payeur
Sensors 2021, 21(1), 113; https://doi.org/10.3390/s21010113 - 27 Dec 2020
Cited by 11 | Viewed by 3448
Abstract
Transfer of learning, or leveraging a pre-trained network and fine-tuning it to perform new tasks, has been successfully applied in a variety of machine intelligence fields, including computer vision, natural language processing and audio/speech recognition. Drawing inspiration from neuroscience research that suggests that both visual and tactile stimuli rouse similar neural networks in the human brain, in this work we explore the idea of transferring learning from vision to touch in the context of 3D object recognition. In particular, deep convolutional neural networks (CNN) pre-trained on visual images are adapted and evaluated for the classification of tactile data sets. To do so, we ran experiments with five different pre-trained CNN architectures and on five different datasets acquired with different technologies of tactile sensors, including BathTip, Gelsight, force-sensing resistor (FSR) array, a high-resolution virtual FSR sensor, and tactile sensors on the Barrett robotic hand. The results obtained confirm the transferability of learning from vision to touch to interpret 3D models. Due to its higher resolution, tactile data from optical tactile sensors was demonstrated to achieve higher classification rates based on visual features compared to other technologies relying on pressure measurements. Further analysis of the weight updates in the convolutional layers is performed to measure the similarity between visual and tactile features for each technology of tactile sensing. Comparing the weight updates in different convolutional layers suggests that, by updating a few convolutional layers of a CNN pre-trained on visual data, it can be efficiently used to classify tactile data. Accordingly, we propose a hybrid architecture performing both visual and tactile 3D object recognition with a MobileNetV2 backbone. MobileNetV2 is chosen due to its smaller size and thus its capability to be implemented on mobile devices, such that the network can classify both visual and tactile data. Accuracies of 100% for visual data and 77.63% for tactile data are achieved by the proposed architecture.
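
To make the transfer-learning recipe concrete, here is a hedged sketch of adapting an ImageNet-pretrained MobileNetV2 to tactile images by replacing the classifier head and unfreezing only the last blocks. The number of classes, learning rate, and number of unfrozen blocks are assumptions, and the weights argument requires a recent torchvision; this is not the authors' hybrid architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_tactile_classifier(num_classes, train_last_n_blocks=2):
    """Adapt an ImageNet-pretrained MobileNetV2 to tactile-image classification."""
    net = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
    for p in net.parameters():
        p.requires_grad = False                 # freeze the visual backbone
    # Unfreeze only the last few inverted-residual blocks so tactile-specific
    # features can be learned while reusing the early visual filters.
    for block in net.features[-train_last_n_blocks:]:
        for p in block.parameters():
            p.requires_grad = True
    net.classifier[1] = nn.Linear(net.last_channel, num_classes)
    return net

model = build_tactile_classifier(num_classes=10)   # 10 object classes, hypothetical
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```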

31 pages, 5864 KiB  
Article
Integrating Biosignals Measurement in Virtual Reality Environments for Anxiety Detection
by Livia Petrescu, Cătălin Petrescu, Oana Mitruț, Gabriela Moise, Alin Moldoveanu, Florica Moldoveanu and Marius Leordeanu
Sensors 2020, 20(24), 7088; https://doi.org/10.3390/s20247088 - 10 Dec 2020
Cited by 27 | Viewed by 4696
Abstract
This paper proposes a protocol for the acquisition and processing of biophysical signals in virtual reality applications, particularly in phobia therapy experiments. This protocol aims to ensure that the measurement and processing phases are performed effectively, to obtain clean data that can be used to estimate the users’ anxiety levels. The protocol has been designed after analyzing the experimental data of seven subjects who have been exposed to heights in a virtual reality environment. The subjects’ level of anxiety has been estimated based on the real-time evaluation of a nonlinear function that has as parameters various features extracted from the biophysical signals. The highest classification accuracy was obtained using a combination of seven heart rate and electrodermal activity features in the time domain and frequency domain.
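
The snippet below shows the kind of time- and frequency-domain heart-rate and electrodermal-activity features such a protocol might feed to an anxiety classifier; the specific features, peak-detection threshold, and sampling rate are illustrative guesses, not the authors' feature set.

```python
import numpy as np
from scipy.signal import welch, find_peaks

def anxiety_features(rr_intervals_s, eda_uS, eda_fs=4.0):
    """Toy heart-rate-variability and electrodermal-activity features."""
    feats = {}
    rr = np.asarray(rr_intervals_s, dtype=float)        # RR intervals in seconds
    feats["mean_hr_bpm"] = 60.0 / rr.mean()
    feats["sdnn_ms"] = rr.std() * 1000.0
    feats["rmssd_ms"] = np.sqrt(np.mean(np.diff(rr) ** 2)) * 1000.0

    eda = np.asarray(eda_uS, dtype=float)                # skin conductance in microsiemens
    feats["eda_mean_uS"] = eda.mean()
    peaks, _ = find_peaks(eda, prominence=0.05)          # skin-conductance responses
    feats["scr_rate_per_min"] = len(peaks) / (len(eda) / eda_fs / 60.0)
    f, pxx = welch(eda - eda.mean(), fs=eda_fs, nperseg=min(256, len(eda)))
    feats["eda_low_freq_power"] = float(pxx[f <= 0.5].sum())
    return feats
```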

16 pages, 6678 KiB  
Article
A Single Image 3D Reconstruction Method Based on a Novel Monocular Vision System
by Fupei Wu, Shukai Zhu and Weilin Ye
Sensors 2020, 20(24), 7045; https://doi.org/10.3390/s20247045 - 9 Dec 2020
Cited by 6 | Viewed by 2216
Abstract
Three-dimensional (3D) reconstruction and measurement are popular techniques in precision manufacturing processes. In this manuscript, a single image 3D reconstruction method is proposed based on a novel monocular vision system, which includes a three-level charge coupled device (3-CCD) camera and ring-structured multi-color light emitting diode (LED) illumination. Firstly, a procedure for the calibration of the illumination’s parameters, including the LEDs’ mounting angles, distribution density and incident angles, is proposed. Secondly, the incident light information, the color distribution information and the gray level information are extracted from the acquired image, and the 3D reconstruction model is built based on the camera imaging model. Thirdly, the surface height information of the detected object within the field of view is computed based on the built model. The proposed method aims at solving the uncertainty and slow convergence issues arising in 3D surface topography reconstruction using current shape-from-shading (SFS) methods. Three-dimensional reconstruction experimental tests are carried out on convex, concave and angular surfaces and on a mobile subscriber identification module (SIM) card slot, showing relative errors of less than 3.6%. Advantages of the proposed method include a reduced 3D surface reconstruction time compared to other methods, demonstrating its good suitability for reconstructing 3D surface morphology.

23 pages, 5887 KiB  
Article
Scale-Aware Multi-View Reconstruction Using an Active Triple-Camera System
by Hang Luo, Christian Pape and Eduard Reithmeier
Sensors 2020, 20(23), 6726; https://doi.org/10.3390/s20236726 - 25 Nov 2020
Cited by 3 | Viewed by 2204
Abstract
This paper presents an active wide-baseline triple-camera measurement system designed especially for 3D modeling in general outdoor environments, as well as a novel parallel surface refinement algorithm within the multi-view stereo (MVS) framework. Firstly, the pre-processing module converts the synchronized raw triple images from one single-shot acquisition of our setup to aligned RGB-Depth frames, which are then used for camera pose estimation using iterative closest point (ICP) and RANSAC perspective-n-point (PnP) approaches. Afterwards, an efficient dense reconstruction method, mostly implemented on the GPU in a grid manner, takes the raw depth data as input and optimizes the per-pixel depth values based on the multi-view photographic evidence, surface curvature and depth priors. Through a basic fusion scheme, an accurate and complete 3D model can be obtained from these enhanced depth maps. For a comprehensive test, the proposed MVS implementation is evaluated on benchmark and synthetic datasets, and a real-world reconstruction experiment is also conducted using our measurement system in an outdoor scenario. The results demonstrate that (1) our MVS method achieves very competitive performance in terms of modeling accuracy, surface completeness and noise reduction, given an input coarse geometry; and (2) despite some limitations, our triple-camera setup, in combination with the proposed reconstruction routine, can be applied to some practical 3D modeling tasks operated in outdoor environments where conventional stereo or depth sensors would normally suffer.

17 pages, 10195 KiB  
Article
Reconstruction of High-Precision Semantic Map
by Xinyuan Tu, Jian Zhang, Runhao Luo, Kai Wang, Qingji Zeng, Yu Zhou, Yao Yu and Sidan Du
Sensors 2020, 20(21), 6264; https://doi.org/10.3390/s20216264 - 3 Nov 2020
Cited by 3 | Viewed by 2834
Abstract
We present a real-time Truncated Signed Distance Field (TSDF)-based three-dimensional (3D) semantic reconstruction method for Light Detection and Ranging (LiDAR) point clouds, which achieves incremental surface reconstruction and highly accurate semantic segmentation. High-precision 3D semantic reconstruction in real time on LiDAR data is important but challenging, since high-accuracy LiDAR data are massive. We therefore propose a line-of-sight algorithm to update the implicit surface incrementally. Meanwhile, in order to use more semantic information effectively, an online attention-based spatial and temporal feature fusion method is proposed, which is well integrated into the reconstruction system. We implement parallel computation in the reconstruction and semantic fusion process, which achieves real-time performance. We demonstrate our approach on the CARLA dataset, the Apollo dataset, and our own dataset. Compared with state-of-the-art mapping methods, our method has a great advantage in terms of both quality and speed, which meets the needs of robotic mapping and navigation.
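
As a rough sketch of a line-of-sight TSDF update (not the paper's parallel implementation), the function below fuses one LiDAR ray into a voxel grid with a truncated signed distance and a running weighted average; the step size, truncation band, and weighting scheme are assumptions.

```python
import numpy as np

def update_tsdf(tsdf, weights, origin_voxel, hit_voxel, truncation=5.0):
    """Incremental line-of-sight TSDF update along one LiDAR ray (voxel units).

    Voxels between the sensor and the hit get positive (free-space) distances,
    voxels just behind the hit get negative distances; both are truncated and
    fused with a running weighted average.
    """
    origin = np.asarray(origin_voxel, dtype=float)
    hit = np.asarray(hit_voxel, dtype=float)
    direction = hit - origin
    length = np.linalg.norm(direction)
    direction /= length
    # Sample the ray one voxel at a time, continuing a little past the surface.
    for step in np.arange(0.0, length + truncation, 1.0):
        idx = tuple(np.round(origin + step * direction).astype(int))
        if any(i < 0 or i >= s for i, s in zip(idx, tsdf.shape)):
            continue
        sdf = np.clip(length - step, -truncation, truncation)  # signed distance to the hit
        tsdf[idx] = (tsdf[idx] * weights[idx] + sdf) / (weights[idx] + 1.0)
        weights[idx] += 1.0
    return tsdf, weights
```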

20 pages, 6369 KiB  
Article
Depth Estimation for Light-Field Images Using Stereo Matching and Convolutional Neural Networks
by Ségolène Rogge, Ionut Schiopu and Adrian Munteanu
Sensors 2020, 20(21), 6188; https://doi.org/10.3390/s20216188 - 30 Oct 2020
Cited by 12 | Viewed by 3241
Abstract
The paper presents a novel depth-estimation method for light-field (LF) images based on innovative multi-stereo matching and machine-learning techniques. In the first stage, a novel block-based stereo matching algorithm is employed to compute the initial estimation. The proposed algorithm is specifically designed to operate on any pair of sub-aperture images (SAIs) in the LF image and to compute the pair’s corresponding disparity map. For the central SAI, a disparity fusion technique is proposed to compute the initial disparity map based on all available pairwise disparities. In the second stage, a novel pixel-wise deep-learning (DL)-based method for residual error prediction is employed to further refine the disparity estimation. A novel neural network architecture is proposed based on a new structure of layers. The proposed DL-based method is employed to predict the residual error of the initial estimation and to refine the final disparity map. The experimental results demonstrate the superiority of the proposed framework and reveal that the proposed method achieves an average improvement of 15.65% in root mean squared error (RMSE), 43.62% in mean absolute error (MAE), and 5.03% in structural similarity index (SSIM) over machine-learning-based state-of-the-art methods.
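
A minimal stand-in for the disparity-fusion step: each pairwise disparity map is normalized by its baseline and a per-pixel median is taken. The paper's actual fusion is more elaborate; this is only meant to convey the idea, and the argument names are hypothetical.

```python
import numpy as np

def fuse_pairwise_disparities(disparity_maps, baselines):
    """Fuse disparity maps estimated between the central sub-aperture image and
    several other SAIs: normalize each to a unit baseline, then take a robust
    per-pixel median across the stack."""
    normalized = [d / b for d, b in zip(disparity_maps, baselines)]
    return np.median(np.stack(normalized, axis=0), axis=0)
```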

21 pages, 8516 KiB  
Article
Accurate and Efficient Intracranial Hemorrhage Detection and Subtype Classification in 3D CT Scans with Convolutional and Long Short-Term Memory Neural Networks
by Mihail Burduja, Radu Tudor Ionescu and Nicolae Verga
Sensors 2020, 20(19), 5611; https://doi.org/10.3390/s20195611 - 1 Oct 2020
Cited by 75 | Viewed by 7461
Abstract
In this paper, we present our system for the RSNA Intracranial Hemorrhage Detection challenge, which is based on the RSNA 2019 Brain CT Hemorrhage dataset. The proposed system is based on a lightweight deep neural network architecture composed of a convolutional neural network (CNN) that takes as input individual CT slices, and a Long Short-Term Memory (LSTM) network that takes as input multiple feature embeddings provided by the CNN. For efficient processing, we consider various feature selection methods to produce a subset of useful CNN features for the LSTM. Furthermore, we downsample the CT slices by a factor of 2×, which enables us to train the model faster. Even though our model is designed to balance speed and accuracy, we report a weighted mean log loss of 0.04989 on the final test set, which placed us in the top 30 (top 2%) out of 1345 participants. Although our computing infrastructure does not allow it, processing CT slices at their original scale is likely to improve performance. In order to enable others to reproduce our results, we provide our code as open source. After the challenge, we conducted a subjective intracranial hemorrhage detection assessment by radiologists, indicating that the performance of our deep model is on par with that of doctors specialized in reading CT scans. Another contribution of our work is the integration of Grad-CAM visualizations in our system, providing useful explanations for its predictions. We therefore consider our system a viable option when a fast diagnosis or a second opinion on intracranial hemorrhage detection is needed.
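
A hedged sketch of the overall CNN-plus-LSTM layout: a per-slice CNN (here a ResNet-18 stand-in rather than the authors' backbone) produces one embedding per CT slice, and a bidirectional LSTM aggregates the slice sequence before a per-slice multi-label head (five hemorrhage subtypes plus "any"). Feature sizes, the backbone, and the omission of the feature-selection step are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

class SliceSequenceClassifier(nn.Module):
    """Per-slice CNN features -> bidirectional LSTM over the slice sequence ->
    per-slice multi-label logits."""

    def __init__(self, feat_dim=256, hidden=128, num_labels=6):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        backbone.fc = nn.Linear(backbone.fc.in_features, feat_dim)
        self.cnn = backbone
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_labels)

    def forward(self, scans):                    # scans: (batch, slices, 3, H, W)
        b, s = scans.shape[:2]
        feats = self.cnn(scans.flatten(0, 1))    # (batch * slices, feat_dim)
        feats = feats.view(b, s, -1)
        seq, _ = self.lstm(feats)                # context across neighboring slices
        return self.head(seq)                    # per-slice logits: (batch, slices, 6)

# Example: logits = SliceSequenceClassifier()(torch.randn(2, 8, 3, 224, 224))
```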

Review


26 pages, 936 KiB  
Review
Active Mapping and Robot Exploration: A Survey
by Iker Lluvia, Elena Lazkano and Ander Ansuategi
Sensors 2021, 21(7), 2445; https://doi.org/10.3390/s21072445 - 2 Apr 2021
Cited by 69 | Viewed by 9800
Abstract
Simultaneous localization and mapping responds to the problem of building a map of the environment without any prior information and based on the data obtained from one or more sensors. In most situations, the robot is driven by a human operator, but some systems are capable of navigating autonomously while mapping, which is called active simultaneous localization and mapping. This strategy focuses on actively calculating the trajectories to explore the environment while building a map with a minimum error. In this paper, a comprehensive review of the research work developed in this field is provided, targeting the most relevant contributions in indoor mobile robotics.

20 pages, 1606 KiB  
Review
A Review on Map-Merging Methods for Typical Map Types in Multiple-Ground-Robot SLAM Solutions
by Shuien Yu, Chunyun Fu, Amirali K. Gostar and Minghui Hu
Sensors 2020, 20(23), 6988; https://doi.org/10.3390/s20236988 - 7 Dec 2020
Cited by 29 | Viewed by 6597
Abstract
When multiple robots are involved in the process of simultaneous localization and mapping (SLAM), a global map should be constructed by merging the local maps built by individual robots, so as to provide a better representation of the environment. Hence, map-merging methods play a crucial role in multi-robot systems and determine the performance of multi-robot SLAM. This paper looks into the key problem of map merging for multiple-ground-robot SLAM and reviews the typical map-merging methods for several important types of maps in SLAM applications: occupancy grid maps, feature-based maps, and topological maps. These map-merging approaches are classified based on their working mechanism or the type of features they deal with. The concepts and characteristics of these map-merging methods are elaborated in this review. The contents summarized in this paper provide insights and guidance for future multiple-ground-robot SLAM solutions.

Other


16 pages, 8804 KiB  
Letter
Two-Dimensional LiDAR Sensor-Based Three-Dimensional Point Cloud Modeling Method for Identification of Anomalies inside Tube Structures for Future Hypersonic Transportation
by Jongdae Baek
Sensors 2020, 20(24), 7235; https://doi.org/10.3390/s20247235 - 17 Dec 2020
Cited by 3 | Viewed by 3574
Abstract
The hyperloop transportation system has emerged as an innovative next-generation transportation system. In this system, a capsule-type vehicle inside a sealed near-vacuum tube moves at 1000 km/h or more. Not only must this transport tube span over long distances, but it must be clear of potential hazards to vehicles traveling at high speeds inside the tube. Therefore, an automated infrastructure anomaly detection system is essential. This study sought to confirm the applicability of advanced sensing technology such as Light Detection and Ranging (LiDAR) in the automatic anomaly detection of next-generation transportation infrastructure such as hyperloops. To this end, a prototype two-dimensional LiDAR sensor was constructed and used to generate three-dimensional (3D) point cloud models of a tube facility. A technique for detecting abnormal conditions or obstacles in the facility was used, which involved comparing the models and determining the changes. The design and development process of the 3D safety monitoring system using 3D point cloud models and the analytical results of experimental data using this system are presented. The tests on the developed system demonstrated that anomalies such as a 25 mm change in position were accurately detected. Thus, we confirm the applicability of the developed system in next-generation transportation infrastructure.
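
A minimal sketch of the model-comparison idea: points of a new scan lying farther than a threshold (25 mm in the paper's test) from the reference point-cloud model are flagged as potential anomalies. The nearest-neighbor formulation below is an assumption, not the authors' exact change-detection procedure.

```python
import numpy as np
from scipy.spatial import cKDTree

def detect_anomalies(reference_cloud, current_cloud, threshold_m=0.025):
    """Flag points of a newly acquired tube scan that lie farther than a
    threshold (here 25 mm) from the reference point-cloud model."""
    tree = cKDTree(reference_cloud)              # reference model, (N, 3) array
    dist, _ = tree.query(current_cloud, k=1)     # nearest-neighbor distance per point
    return current_cloud[dist > threshold_m], dist
```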

16 pages, 2329 KiB  
Letter
3DMGNet: 3D Model Generation Network Based on Multi-Modal Data Constraints and Multi-Level Feature Fusion
by Ende Wang, Lei Xue, Yong Li, Zhenxin Zhang and Xukui Hou
Sensors 2020, 20(17), 4875; https://doi.org/10.3390/s20174875 - 28 Aug 2020
Cited by 2 | Viewed by 2296
Abstract
Due to the limited information in a single image, it is very difficult to generate a high-precision 3D model from the image alone. There are also problems in the generation of 3D voxel models, e.g., the information loss at the upper levels of a network. To solve these problems, we design a 3D model generation network based on multi-modal data constraints and multi-level feature fusion, named 3DMGNet. Moreover, 3DMGNet is trained with a self-supervised method to achieve 3D voxel model generation from an image. An image feature extraction network (2DNet) and a 3D feature extraction network (3D auxiliary network) are used to extract the features of the image and the 3D voxel model. Then, feature fusion is used to integrate the low-level features into the high-level features in the 3D auxiliary network. To extract more effective features, each layer of the feature map in the feature extraction network is processed by an attention network. Finally, the extracted features generate 3D models through a 3D deconvolution network. The feature extraction of the 3D model and the generation of the voxelization play an auxiliary role in training the whole network for image-based 3D model generation. Additionally, a multi-view contour constraint method is proposed to enhance the quality of the generated 3D models. In the experiments, the ShapeNet dataset is adopted to evaluate 3DMGNet, which verifies the robust performance of the proposed method.
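
As an illustration of what a multi-view contour constraint can look like, the toy loss below compares orthographic silhouettes (max-projections) of the predicted voxel grid against target silhouettes; the projection model and loss function are assumptions, not the formulation used in 3DMGNet.

```python
import torch

def multi_view_contour_loss(pred_voxels, target_silhouettes):
    """Toy multi-view contour constraint.

    pred_voxels: (B, D, H, W) occupancy probabilities in [0, 1].
    target_silhouettes: three binary masks, one per projection axis.
    """
    loss = 0.0
    for axis, target in zip((1, 2, 3), target_silhouettes):
        silhouette = pred_voxels.amax(dim=axis)          # orthographic max-projection
        loss = loss + torch.nn.functional.binary_cross_entropy(
            silhouette, target.float())
    return loss / 3
```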
