Special Issue "Visual Sensors"

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Physical Sensors".

Deadline for manuscript submissions: closed (31 January 2019)

Special Issue Editors

Guest Editor
Prof. Dr. Luis Payá

System Engineering and Automation Department, Miguel Hernandez University, Elche (Alicante) 03202, Spain
Interests: computer vision, omnidirectional imaging, appearance descriptors, image processing, mobile robotics, environment modeling, and visual localization
Guest Editor
Prof. Dr. Oscar Reinoso

System Engineering and Automation Department, Miguel Hernandez University, Elche (Alicante) 03202, Spain
Interests: visual appearance; mobile robots; parallel robots; mapping

Special Issue Information

Dear Colleagues,

Visual sensors are able to capture a large quantity of information from the environment around them. Nowadays, a wide variety of visual systems can be found, from classical monocular systems to omnidirectional, RGB-D and more sophisticated 3D systems. Each configuration presents specific characteristics that make it useful for solving different problems. Their range of applications is wide and varied. Amongst them, we can find robotics, industry, agriculture, quality control, visual inspection, surveillance, autonomous driving and navigation aid systems.

Visual systems can be used to obtain relevant information from the environment, which can be processed to solve a specific problem. The aim of this Special Issue is to present some of the possibilities that vision systems offer, focusing on the different configurations that can be used and novel applications in any field. Furthermore, reviews presenting a deep analysis of a specific problem and the use of vision systems to address it would also be appropriate.

This Special Issue invites contributions on the following topics (but is not limited to them):

  • Image analysis.
  • Visual pattern recognition.
  • Object recognition by visual sensors.
  • Movement estimation or registration from images.
  • Visual sensors in robotics.
  • Visual sensors in industrial applications.
  • Computer vision for quality evaluation.
  • Visual sensors in agriculture.
  • Computer vision in autonomous driving.
  • Environment modeling and reconstruction from images.
  • Visual localization.
  • Visual SLAM.

Prof. Dr. Oscar Reinoso
Prof. Dr. Luis Payá
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • 3D imaging
  • Stereo visual systems
  • Omnidirectional visual systems
  • Quality assessment
  • Pattern recognition
  • Visual registration
  • Visual navigation
  • Visual mapping
  • LiDAR/vision system
  • Multi-visual sensors
  • RGB-D cameras
  • Fusion of visual information
  • Networks of visual sensors

Published Papers (34 papers)


Research

Open Access Article: Star Image Prediction and Restoration under Dynamic Conditions
Sensors 2019, 19(8), 1890; https://doi.org/10.3390/s19081890
Received: 24 March 2019 / Accepted: 16 April 2019 / Published: 20 April 2019
Abstract
The star sensor is widely used in attitude control systems of spacecraft for attitude measurement. However, under high dynamic conditions, frame loss and smearing of the star image may appear and result in decreased accuracy or even failure of the star centroid extraction and attitude determination. To improve the performance of the star sensor under dynamic conditions, a gyroscope-assisted star image prediction method and an improved Richardson-Lucy (RL) algorithm based on the ensemble back-propagation neural network (EBPNN) are proposed. First, for the frame loss problem of the star sensor, considering the distortion of the star sensor lens, a prediction model of the star spot position is obtained from the angular rates of the gyroscope. Second, to restore the smeared star image, the point spread function (PSF) is calculated from the angular velocity of the gyroscope. Then, we use the EBPNN to predict the number of iterations required by the RL algorithm to complete the star image deblurring. Finally, simulation experiments are performed to verify the effectiveness and real-time performance of the proposed algorithm.
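The restoration step summarised above pairs a motion point spread function, derived from the gyroscope's angular velocity, with Richardson-Lucy deconvolution. Below is a minimal NumPy/SciPy sketch of that combination; the linear-motion PSF and the fixed iteration count are simplifications standing in for the paper's lens-distortion-aware prediction model and the EBPNN-predicted iteration number, so it illustrates the idea rather than the authors' implementation.

```python
import numpy as np
from scipy.signal import fftconvolve

def motion_psf(length_px, angle_rad, size=15):
    """Approximate a linear motion-blur PSF for a smear of `length_px` pixels."""
    psf = np.zeros((size, size))
    c = size // 2
    for t in np.linspace(-length_px / 2, length_px / 2, 4 * size):
        r = int(round(c + t * np.sin(angle_rad)))
        col = int(round(c + t * np.cos(angle_rad)))
        if 0 <= r < size and 0 <= col < size:
            psf[r, col] += 1.0
    return psf / psf.sum()

def richardson_lucy(blurred, psf, num_iter=30):
    """Plain (non-blind) Richardson-Lucy deconvolution."""
    estimate = np.full_like(blurred, 0.5)
    psf_flip = psf[::-1, ::-1]
    for _ in range(num_iter):
        conv = fftconvolve(estimate, psf, mode="same")
        ratio = blurred / np.maximum(conv, 1e-12)
        estimate *= fftconvolve(ratio, psf_flip, mode="same")
    return estimate

# Example: smear length from a (hypothetical) angular rate, exposure time and focal length in pixels.
omega, exposure, focal_px = 0.05, 0.02, 1200.0   # rad/s, s, px
psf = motion_psf(length_px=omega * exposure * focal_px, angle_rad=0.3)
blurred = np.random.rand(64, 64)                 # stands in for a smeared star image
restored = richardson_lucy(blurred, psf, num_iter=30)
```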

Open Access Article: Vision for Robust Robot Manipulation
Sensors 2019, 19(7), 1648; https://doi.org/10.3390/s19071648
Received: 24 December 2018 / Revised: 1 April 2019 / Accepted: 3 April 2019 / Published: 6 April 2019
Abstract
Advances in Robotics are leading to a new generation of assistant robots working in ordinary, domestic settings. This evolution raises new challenges in the tasks to be accomplished by the robots. This is the case for object manipulation where the detect-approach-grasp loop requires a robust recovery stage, especially when the held object slides. Several proprioceptive sensors have been developed in the last decades, such as tactile sensors or contact switches, that can be used for that purpose; nevertheless, their implementation may considerably restrict the gripper’s flexibility and functionality, increasing their cost and complexity. Alternatively, vision can be used since it is an undoubtedly rich source of information, and in particular, depth vision sensors. We present an approach based on depth cameras to robustly evaluate the manipulation success, continuously reporting about any object loss and, consequently, allowing it to robustly recover from this situation. For that, a Lab-colour segmentation allows the robot to identify potential robot manipulators in the image. Then, the depth information is used to detect any edge resulting from two-object contact. The combination of those techniques allows the robot to accurately detect the presence or absence of contact points between the robot manipulator and a held object. An experimental evaluation in realistic indoor environments supports our approach. Full article
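As a rough illustration of the two cues described above, the OpenCV sketch below segments candidate gripper pixels in Lab colour space and looks for depth discontinuities near that region; the colour bounds, file names and contact threshold are assumptions, not the paper's calibrated values.

```python
import cv2
import numpy as np

# Placeholder inputs: a BGR colour image and an aligned depth map.
bgr = cv2.imread("frame_color.png")                        # hypothetical file name
depth = cv2.imread("frame_depth.png", cv2.IMREAD_UNCHANGED)

# 1) Lab-colour segmentation of the (assumed) gripper colour.
lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
lower, upper = np.array([20, 110, 110]), np.array([255, 145, 145])  # assumed bounds
gripper_mask = cv2.inRange(lab, lower, upper)

# 2) Depth discontinuities: edges in the depth map hint at two-object contact.
depth_8u = cv2.normalize(depth, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
depth_edges = cv2.Canny(depth_8u, 30, 90)

# 3) Contact evidence: depth edges that touch the dilated gripper region.
near_gripper = cv2.dilate(gripper_mask, np.ones((7, 7), np.uint8))
contact_points = cv2.bitwise_and(depth_edges, near_gripper)
holding_object = int(np.count_nonzero(contact_points)) > 50   # assumed threshold
```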

Open Access Article: 2D Rotation-Angle Measurement Utilizing Least Iterative Region Segmentation
Sensors 2019, 19(7), 1634; https://doi.org/10.3390/s19071634
Received: 3 March 2019 / Revised: 27 March 2019 / Accepted: 3 April 2019 / Published: 5 April 2019
Abstract
When geometric moments are used to measure the rotation-angle of plane workpieces, the same rotation angle would be obtained with dissimilar poses. Such a case would be shown as an error in an automatic sorting system. Here, we present an improved rotation-angle measurement method based on geometric moments, which is suitable for automatic sorting systems. The method can overcome this limitation to obtain accurate results. The accuracy, speed, and generality of this method are analyzed in detail. In addition, a rotation-angle measurement error model is established to study the effect of camera pose on the rotation-angle measurement accuracy. We find that a rotation-angle measurement error will occur with a non-ideal camera pose. Thus, a correction method is proposed to increase accuracy and reduce the measurement error caused by camera pose. Finally, an automatic sorting system is developed, and experiments are conducted to verify the effectiveness of our methods. The experimental results show that the rotation angles are accurately obtained and workpieces could be correctly placed by this system. Full article
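The geometric-moment orientation estimate that this work builds on can be written in a few lines. The sketch below computes the classical second-moment angle of a binary workpiece mask; note that it returns the same angle for poses that differ by 180°, which is precisely the ambiguity the improved method resolves (the disambiguation step itself is not reproduced here).

```python
import numpy as np

def moment_orientation(mask):
    """Rotation angle (radians) of a binary shape from its central second moments."""
    ys, xs = np.nonzero(mask)
    xc, yc = xs.mean(), ys.mean()
    mu20 = np.mean((xs - xc) ** 2)
    mu02 = np.mean((ys - yc) ** 2)
    mu11 = np.mean((xs - xc) * (ys - yc))
    # Standard result: theta = 0.5 * atan2(2*mu11, mu20 - mu02)
    return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)

# Toy example: a horizontal bar and its 180-degree flip yield the same moment angle,
# which is exactly the ambiguity discussed in the abstract.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[45:55, 20:80] = 1
print(np.degrees(moment_orientation(mask)))   # ~0 degrees for the horizontal bar
```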

Open Access Article: A Convenient Calibration Method for LRF-Camera Combination Systems Based on a Checkerboard
Sensors 2019, 19(6), 1315; https://doi.org/10.3390/s19061315
Received: 6 February 2019 / Revised: 1 March 2019 / Accepted: 11 March 2019 / Published: 15 March 2019
Abstract
In this paper, a simple and easy high-precision calibration method is proposed for the LRF-camera combined measurement system which is widely used at present. This method can be applied not only to mainstream 2D and 3D LRF-cameras, but also to calibrate newly developed 1D LRF-camera combined systems. It only needs a calibration board to record at least three sets of data. First, the camera parameters and distortion coefficients are decoupled by the distortion center. Then, the spatial coordinates of laser spots are solved using line and plane constraints, and the estimation of LRF-camera extrinsic parameters is realized. In addition, we establish a cost function for optimizing the system. Finally, the calibration accuracy and characteristics of the method are analyzed through simulation experiments, and the validity of the method is verified through the calibration of a real system. Full article

Open Access Article: A Vision Based Detection Method for Narrow Butt Joints and a Robotic Seam Tracking System
Sensors 2019, 19(5), 1144; https://doi.org/10.3390/s19051144
Received: 18 January 2019 / Revised: 28 February 2019 / Accepted: 1 March 2019 / Published: 6 March 2019
Abstract
Automatic joint detection is of vital importance for the teaching of robots before welding and the seam tracking during welding. For narrow butt joints, the traditional structured light method may be ineffective, and many existing detection methods designed for narrow butt joints can only detect their 2D position. However, for butt joints with narrow gaps and 3D trajectories, both the 3D position of the joint and the orientation of the workpiece surface are required. In this paper, a vision based detection method for narrow butt joints is proposed. A crosshair laser is projected onto the workpiece surface and an auxiliary light source is used to illuminate the workpiece surface continuously. Then, images with an appropriate grayscale distribution are grabbed with the auto exposure function of the camera. The 3D position of the joint and the normal vector of the workpiece surface are calculated by combining the 2D and 3D information in the images. In addition, the detection method is applied in a robotic seam tracking system for GTAW (gas tungsten arc welding). Different filtering methods are used to smooth the detection results; compared with the moving average method, the Kalman filter can reduce the dithering of the robot and improve the tracking accuracy significantly.
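The filtering comparison at the end of this abstract can be illustrated with a constant-velocity Kalman filter applied to a noisy sequence of detected joint positions; the time step and noise levels below are assumptions for the sketch, not values from the paper.

```python
import numpy as np

def kalman_smooth_1d(measurements, dt=0.05, q=1e-3, r=0.25):
    """Constant-velocity Kalman filter over scalar position measurements."""
    F = np.array([[1.0, dt], [0.0, 1.0]])        # state transition (position, velocity)
    H = np.array([[1.0, 0.0]])                   # we observe position only
    Q = q * np.array([[dt**4 / 4, dt**3 / 2],    # process noise
                      [dt**3 / 2, dt**2]])
    R = np.array([[r]])                          # measurement noise
    x = np.array([[measurements[0]], [0.0]])
    P = np.eye(2)
    out = []
    for z in measurements:
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (np.array([[z]]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0, 0])
    return np.array(out)

# Synthetic joint positions drifting slowly, corrupted by detection noise.
noisy = 10.0 + 0.02 * np.arange(200) + np.random.normal(0, 0.5, 200)
smooth = kalman_smooth_1d(noisy)
```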

Open Access Article: RGB-D SLAM with Manhattan Frame Estimation Using Orientation Relevance
Sensors 2019, 19(5), 1050; https://doi.org/10.3390/s19051050
Received: 31 January 2019 / Revised: 22 February 2019 / Accepted: 25 February 2019 / Published: 1 March 2019
Abstract
Due to image noise, image blur, and inconsistency between depth data and the color image, the accuracy and robustness of the pairwise spatial transformation computed by matching extracted features of detected key points in existing sparse Red Green Blue-Depth (RGB-D) Simultaneous Localization And Mapping (SLAM) algorithms are poor. Considering that most indoor environments follow the Manhattan World assumption and that the Manhattan Frame can be used as a reference to compute the pairwise spatial transformation, a new RGB-D SLAM algorithm is proposed. It first performs Manhattan Frame Estimation using the introduced concept of orientation relevance. Then the pairwise spatial transformation between two RGB-D frames is computed with the Manhattan Frame Estimation. Finally, the Manhattan Frame Estimation using orientation relevance is incorporated into the RGB-D SLAM to improve its performance. Experimental results show that the proposed RGB-D SLAM algorithm achieves definite improvements in accuracy, robustness, and runtime.

Open Access Article: Boosting Texture-Based Classification by Describing Statistical Information of Gray-Levels Differences
Sensors 2019, 19(5), 1048; https://doi.org/10.3390/s19051048
Received: 21 January 2019 / Revised: 22 February 2019 / Accepted: 25 February 2019 / Published: 1 March 2019
Abstract
This paper presents a new texture descriptor booster, Complete Local Oriented Statistical Information Booster (CLOSIB), based on statistical information of the image. Our proposal uses the statistical information of the texture provided by the image gray-levels differences to increase the discriminative capability of Local Binary Patterns (LBP)-based and other texture descriptors. We demonstrated that Half-CLOSIB and M-CLOSIB versions are more efficient and precise than the general one. H-CLOSIB may eliminate redundant statistical information and the multi-scale version, M-CLOSIB, is more robust. We evaluated our method using four datasets: KTH TIPS (2-a) for material recognition, UIUC and USPTex for general texture recognition and JAFFE for face recognition. The results show that when we combine CLOSIB with well-known LBP-based descriptors, the hit rate increases in all the cases, introducing in this way the idea that CLOSIB can be used to enhance the description of texture in a significant number of situations. Additionally, a comparison with recent algorithms demonstrates that a combination of LBP methods with CLOSIB variants obtains comparable results to those of the state-of-the-art. Full article
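To make the "booster" idea concrete, the sketch below concatenates a standard uniform-LBP histogram with simple statistics of local gray-level differences; the statistics and scales chosen here are illustrative and do not reproduce the exact CLOSIB definition.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_plus_difference_stats(gray, radius=1, points=8):
    """Uniform-LBP histogram concatenated with gray-level-difference statistics."""
    lbp = local_binary_pattern(gray, points, radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=points + 2, range=(0, points + 2), density=True)

    # Statistics of absolute differences between each pixel and its axial neighbours.
    g = gray.astype(np.float64)
    diffs = np.concatenate([
        np.abs(g[:, 1:] - g[:, :-1]).ravel(),   # horizontal neighbours
        np.abs(g[1:, :] - g[:-1, :]).ravel(),   # vertical neighbours
    ])
    stats = np.array([diffs.mean(), diffs.std()])
    return np.concatenate([hist, stats])

gray = (np.random.rand(128, 128) * 255).astype(np.uint8)   # stands in for a texture patch
descriptor = lbp_plus_difference_stats(gray)
```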

Open Access Article: A Stereo-Vision System for Measuring the Ram Speed of Steam Hammers in an Environment with a Large Field of View and Strong Vibrations
Sensors 2019, 19(5), 996; https://doi.org/10.3390/s19050996
Received: 24 January 2019 / Revised: 22 February 2019 / Accepted: 22 February 2019 / Published: 26 February 2019
Abstract
The ram speed of a steam hammer is an important parameter that directly affects the forming performance of forgers. This parameter must be monitored regularly in practical applications in industry. Because of the complex and dangerous industrial environment of forging equipment, non-contact measurement methods, such as stereo vision, might be optimal. However, in actual application, the field of view (FOV) required to measure the steam hammer is extremely large, with a value of 2–3 m, and the heavy steam hammer, moving at high speed, usually causes strong vibration. These two factors degrade the accuracy of measurements and can even cause measurement failure. To solve these issues, a bundle-adjustment-principle-based system calibration method is proposed to realize high-accuracy calibration for a large FOV, which can obtain accurate calibration results even when the calibration target is not precisely manufactured. To decrease the influence of strong vibration, a stationary world coordinate system was built, and the external parameters were recalibrated during the entire measurement process. The accuracy and effectiveness of the proposed technique were verified by an experiment to measure the ram speed of a counterblow steam hammer in a die forging device.

Open Access Article: High-Accuracy Globally Consistent Surface Reconstruction Using Fringe Projection Profilometry
Sensors 2019, 19(3), 668; https://doi.org/10.3390/s19030668
Received: 16 January 2019 / Revised: 3 February 2019 / Accepted: 4 February 2019 / Published: 6 February 2019
Abstract
This paper presents a high-accuracy method for globally consistent surface reconstruction using a single fringe projection profilometry (FPP) sensor. To solve the accumulated sensor pose estimation error problem encountered in a long scanning trajectory, we first present a novel 3D registration method which fuses both dense geometric and curvature consistency constraints to improve the accuracy of relative sensor pose estimation. Then we perform global sensor pose optimization by modeling the surface consistency information as a pre-computed covariance matrix and formulating the multi-view point cloud registration problem in a pose graph optimization framework. Experiments on reconstructing a 1300 mm × 400 mm workpiece with an FPP sensor are performed, verifying that our method can substantially reduce the accumulated error and achieve industrial-level surface model reconstruction without any external positional assistance, using only a single FPP sensor.

Open Access Article: Appearance-Based Salient Regions Detection Using Side-Specific Dictionaries
Sensors 2019, 19(2), 421; https://doi.org/10.3390/s19020421
Received: 29 November 2018 / Revised: 5 January 2019 / Accepted: 5 January 2019 / Published: 21 January 2019
Cited by 1
Abstract
Image saliency detection is a very helpful step in many computer vision-based smart systems to reduce the computational complexity by only focusing on the salient parts of the image. Currently, the image saliency is detected through representation-based generative schemes, as these schemes are helpful for extracting the concise representations of the stimuli and to capture the high-level semantics in visual information with a small number of active coefficients. In this paper, we propose a novel framework for salient region detection that uses appearance-based and regression-based schemes. The framework segments the image and forms reconstructive dictionaries from four sides of the image. These side-specific dictionaries are further utilized to obtain the saliency maps of the sides. A unified version of these maps is subsequently employed by a representation-based model to obtain a contrast-based salient region map. The map is used to obtain two regression-based maps with LAB and RGB color features that are unified through the optimization-based method to achieve the final saliency map. Furthermore, the side-specific reconstructive dictionaries are extracted from the boundary and the background pixels, which are enriched with geometrical and visual information. The approach has been thoroughly evaluated on five datasets and compared with the seven most recent approaches. The simulation results reveal that our model performs favorably in comparison with the current saliency detection schemes. Full article

Open Access Article: Pose Estimation for Straight Wing Aircraft Based on Consistent Line Clustering and Planes Intersection
Sensors 2019, 19(2), 342; https://doi.org/10.3390/s19020342
Received: 27 December 2018 / Revised: 10 January 2019 / Accepted: 11 January 2019 / Published: 16 January 2019
Abstract
Aircraft pose estimation is a necessary technology in aerospace applications, and accurate pose parameters are the foundation for many aerospace tasks. In this paper, we propose a novel pose estimation method for straight wing aircraft without relying on 3D models or other datasets, and two widely separated cameras are used to acquire the pose information. Because of the large baseline and long-distance imaging, feature point matching is difficult and inaccurate in this configuration. In our method, line features are extracted to describe the structure of straight wing aircraft in images, and pose estimation is performed based on the common geometry constraints of straight wing aircraft. The spatial and length consistency of the line features is used to exclude irrelevant line segments belonging to the background or other parts of the aircraft, and density-based parallel line clustering is utilized to extract the aircraft’s main structure. After identifying the orientation of the fuselage and wings in images, planes intersection is used to estimate the 3D localization and attitude of the aircraft. Experimental results show that our method estimates the aircraft pose accurately and robustly. Full article
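The planes-intersection step at the end of this pipeline reduces to elementary linear algebra: two non-parallel planes meet in a line whose direction is the cross product of their normals. A small NumPy sketch, independent of the paper's camera geometry, follows.

```python
import numpy as np

def intersect_planes(n1, d1, n2, d2):
    """Intersect planes n1.x = d1 and n2.x = d2; return (point, direction) of the line."""
    n1, n2 = np.asarray(n1, float), np.asarray(n2, float)
    direction = np.cross(n1, n2)
    if np.linalg.norm(direction) < 1e-9:
        raise ValueError("planes are parallel")
    # One point on the line: stack the two plane equations with a third
    # constraint fixing the component along the line direction to zero.
    A = np.vstack([n1, n2, direction])
    b = np.array([d1, d2, 0.0])
    point = np.linalg.solve(A, b)
    return point, direction / np.linalg.norm(direction)

# Two back-projected planes (one per camera through the imaged fuselage line).
p, d = intersect_planes([0, 0, 1], 5.0, [0, 1, 0], 2.0)
print(p, d)   # a point on the 3D axis and its unit direction
```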

Open Access Article: Local Parallel Cross Pattern: A Color Texture Descriptor for Image Retrieval
Sensors 2019, 19(2), 315; https://doi.org/10.3390/s19020315
Received: 2 December 2018 / Revised: 11 January 2019 / Accepted: 11 January 2019 / Published: 14 January 2019
Abstract
Riding the wave of visual sensor equipment (e.g., personal smartphones, home security cameras, vehicle cameras, and camcorders), image retrieval (IR) technology has received increasing attention due to its potential applications in e-commerce, visual surveillance, and intelligent traffic. However, determining how to design an effective feature descriptor has been proven to be the main bottleneck for retrieving a set of images of interest. In this paper, we first construct a six-layer color quantizer to extract a color map. Then, motivated by the human visual system, we design a local parallel cross pattern (LPCP) in which the local binary pattern (LBP) map is amalgamated with the color map in “parallel” and “cross” manners. Finally, to reduce the computational complexity and improve the robustness to image rotation, the LPCP is extended to the uniform local parallel cross pattern (ULPCP) and the rotation-invariant local parallel cross pattern (RILPCP), respectively. Extensive experiments are performed on eight benchmark datasets. The experimental results validate the effectiveness, efficiency, robustness, and computational complexity of the proposed descriptors against eight state-of-the-art color texture descriptors to produce an in-depth comparison. Additionally, compared with a series of Convolutional Neural Network (CNN)-based models, the proposed descriptors still achieve competitive results. Full article

Open Access Article: Camera Calibration Using Gray Code
Sensors 2019, 19(2), 246; https://doi.org/10.3390/s19020246
Received: 27 November 2018 / Revised: 21 December 2018 / Accepted: 4 January 2019 / Published: 10 January 2019
Abstract
In order to determine camera parameters, a calibration procedure involving camera recordings of a checkerboard is usually performed. In this paper, we propose an alternative approach that uses Gray-code patterns displayed on an LCD screen. Gray-code patterns allow us to decode the 3D location of points on the LCD screen at every pixel in the camera image. This is in contrast to checkerboard patterns, where the number of corresponding locations is limited to the number of checkerboard corners. We show that, for the case of a UEye CMOS camera, the focal-length estimation is 1.5 times more precise than when using a standard calibration with a checkerboard pattern.
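The core of this calibration target is the binary-reflected Gray code: displaying one stripe pattern per bit lets every camera pixel decode which screen column (and, with transposed patterns, which row) it observes. A generic sketch of pattern generation and decoding is given below; thresholding of real captured images is omitted.

```python
import numpy as np

def gray_code_patterns(width, n_bits):
    """Vertical stripe patterns (one per bit) encoding each column's Gray code."""
    columns = np.arange(width)
    gray = columns ^ (columns >> 1)                 # binary-reflected Gray code
    bits = (gray[None, :] >> np.arange(n_bits)[:, None]) & 1
    return bits.astype(np.uint8)                    # shape: (n_bits, width), values 0/1

def decode_gray(bit_images):
    """Recover the column index observed at each pixel from thresholded captures."""
    n_bits = bit_images.shape[0]
    gray = np.zeros(bit_images.shape[1:], dtype=np.int64)
    for b in range(n_bits):
        gray |= bit_images[b].astype(np.int64) << b
    binary = gray.copy()                            # Gray -> binary conversion
    shift = gray >> 1
    while shift.any():
        binary ^= shift
        shift >>= 1
    return binary

W, BITS = 1920, 11                                  # 2**11 = 2048 >= screen width
patterns = gray_code_patterns(W, BITS)              # what would be shown on the LCD
captured = np.repeat(patterns[:, None, :], 4, 1)    # fake 'camera' seeing it directly
assert np.array_equal(decode_gray(captured)[0], np.arange(W))
```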

Open Access Article: Motion-Aware Correlation Filters for Online Visual Tracking
Sensors 2018, 18(11), 3937; https://doi.org/10.3390/s18113937
Received: 25 September 2018 / Revised: 24 October 2018 / Accepted: 7 November 2018 / Published: 14 November 2018
Abstract
Discriminative correlation filter-based methods struggle to deal with fast motion and heavy occlusion, problems that can severely degrade the performance of trackers and ultimately lead to tracking failures. In this paper, a novel Motion-Aware Correlation Filters (MACF) framework is proposed for online visual object tracking, where a motion-aware strategy based on joint instantaneous motion estimation Kalman filters is integrated into the Discriminative Correlation Filters (DCFs). The proposed motion-aware strategy is used to predict the possible region and scale of the target in the current frame by utilizing the previously estimated 3D motion information. This strategy can prevent model drift caused by fast motion. On the basis of the predicted region and scale, the MACF detects the position and scale of the target by using the DCFs-based method in the current frame. Furthermore, an adaptive model updating strategy is proposed to address the problem of corrupted models caused by occlusions, where the learning rate is determined by the confidence of the response map. Extensive experiments on the popular Object Tracking Benchmarks OTB-100 and OTB-50 and on unmanned aerial vehicle (UAV) videos have demonstrated that the proposed MACF tracker performs better than most state-of-the-art trackers and achieves high real-time performance. In addition, the proposed approach can be integrated easily and flexibly into other visual tracking algorithms.

Open Access Article: Video-Based Person Re-Identification by an End-To-End Learning Architecture with Hybrid Deep Appearance-Temporal Feature
Sensors 2018, 18(11), 3669; https://doi.org/10.3390/s18113669
Received: 31 August 2018 / Revised: 26 October 2018 / Accepted: 26 October 2018 / Published: 29 October 2018
Abstract
Video-based person re-identification is an important task with the challenges of lighting variation, low-resolution images, background clutter, occlusion, and human appearance similarity in multi-camera visual sensor networks. In this paper, we propose a video-based person re-identification method called the end-to-end learning architecture with hybrid deep appearance-temporal feature. It can learn the appearance features of pivotal frames, the temporal features, and the independent distance metric of different features. This architecture consists of a two-stream deep feature structure and two Siamese networks. For the first-stream structure, we propose the Two-branch Appearance Feature (TAF) sub-structure to obtain the appearance information of persons, and use one of the two Siamese networks to learn the similarity of appearance features of a pairwise person. To utilize the temporal information, we designed the second-stream structure, consisting of the Optical flow Temporal Feature (OTF) sub-structure and another Siamese network, to learn the person's temporal features and the distances of pairwise features. In addition, we select the pivotal frames of the video as inputs to the Inception-V3 network in the Two-branch Appearance Feature sub-structure, and employ the salience-learning fusion layer to fuse the learned global and local appearance features. Extensive experimental results on the PRID2011, iLIDS-VID, and Motion Analysis and Re-identification Set (MARS) datasets showed that the proposed architecture reached 79%, 59% and 72% at Rank-1 on the respective datasets and had advantages over state-of-the-art algorithms. Meanwhile, it also improved the feature representation ability of persons.

Open Access Article: Improved Point-Line Feature Based Visual SLAM Method for Indoor Scenes
Sensors 2018, 18(10), 3559; https://doi.org/10.3390/s18103559
Received: 2 September 2018 / Revised: 15 October 2018 / Accepted: 16 October 2018 / Published: 20 October 2018
Cited by 1
Abstract
In the study of indoor simultaneous localization and mapping (SLAM) problems using a stereo camera, two types of primary features—point and line segments—have been widely used to calculate the pose of the camera. However, many feature-based SLAM systems are not robust when the camera moves sharply or turns too quickly. In this paper, an improved indoor visual SLAM method to better utilize the advantages of point and line segment features and achieve robust results in difficult environments is proposed. First, point and line segment features are automatically extracted and matched to build two kinds of projection models. Subsequently, for the optimization problem of line segment features, we add minimization of angle observation in addition to the traditional re-projection error of endpoints. Finally, our model of motion estimation, which is adaptive to the motion state of the camera, is applied to build a new combinational Hessian matrix and gradient vector for iterated pose estimation. Furthermore, our proposal has been tested on EuRoC MAV datasets and sequence images captured with our stereo camera. The experimental results demonstrate the effectiveness of our improved point-line feature based visual SLAM method in improving localization accuracy when the camera moves with rapid rotation or violent fluctuation. Full article

Open Access Article: Dense RGB-D Semantic Mapping with Pixel-Voxel Neural Network
Sensors 2018, 18(9), 3099; https://doi.org/10.3390/s18093099
Received: 6 August 2018 / Revised: 7 September 2018 / Accepted: 11 September 2018 / Published: 14 September 2018
Abstract
In this paper, a novel Pixel-Voxel network is proposed for dense 3D semantic mapping, which can perform dense 3D mapping while simultaneously recognizing and labelling the semantic category of each point in the 3D map. In our approach, we fully leverage the advantages of different modalities. That is, the PixelNet can learn the high-level contextual information from 2D RGB images, and the VoxelNet can learn 3D geometrical shapes from the 3D point cloud. Unlike existing architectures that fuse score maps from different modalities with equal weights, we propose a softmax weighted fusion stack that adaptively learns the varying contributions of PixelNet and VoxelNet and fuses the score maps according to their respective confidence levels. Our approach achieved competitive results on both the SUN RGB-D and NYU V2 benchmarks, while the runtime of the proposed system is boosted to around 13 Hz, enabling near-real-time performance on an eight-core i7 PC with a single Titan X GPU.
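The softmax weighted fusion stack mentioned above amounts to computing per-pixel weights from learned confidences and blending the two modality score maps accordingly. The sketch below shows only that fusion arithmetic, with random tensors standing in for PixelNet and VoxelNet outputs.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_score_maps(pixel_scores, voxel_scores, pixel_conf, voxel_conf):
    """Blend two (C, H, W) score maps with softmax weights from per-pixel confidences."""
    conf = np.stack([pixel_conf, voxel_conf], axis=0)      # (2, H, W)
    w = softmax(conf, axis=0)                              # adaptive, sums to 1 per pixel
    return w[0] * pixel_scores + w[1] * voxel_scores       # broadcast over channels

C, H, W = 13, 60, 80                                       # e.g. 13 semantic classes
pixel_scores = np.random.randn(C, H, W)                    # stand-in for PixelNet output
voxel_scores = np.random.randn(C, H, W)                    # stand-in for VoxelNet output
pixel_conf = np.random.randn(H, W)                         # stand-in learned confidences
voxel_conf = np.random.randn(H, W)
fused = fuse_score_maps(pixel_scores, voxel_scores, pixel_conf, voxel_conf)
labels = fused.argmax(axis=0)                              # per-pixel semantic label
```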

Open Access Article: Pose Estimation of Sweet Pepper through Symmetry Axis Detection
Sensors 2018, 18(9), 3083; https://doi.org/10.3390/s18093083
Received: 20 August 2018 / Revised: 6 September 2018 / Accepted: 7 September 2018 / Published: 13 September 2018
Cited by 1
Abstract
The space pose of fruits is necessary for accurate detachment in automatic harvesting. This study presents a novel pose estimation method for sweet pepper detachment. In this method, the normal to the local plane at each point in the sweet-pepper point cloud was first calculated. The point cloud was separated by a number of candidate planes, and the scores of each plane were then separately calculated using the scoring strategy. The plane with the lowest score was selected as the symmetry plane of the point cloud. The symmetry axis could be finally calculated from the selected symmetry plane, and the pose of sweet pepper in the space was obtained using the symmetry axis. The performance of the proposed method was evaluated by simulated and sweet-pepper cloud dataset tests. In the simulated test, the average angle error between the calculated symmetry and real axes was approximately 6.5°. In the sweet-pepper cloud dataset test, the average error was approximately 7.4° when the peduncle was removed. When the peduncle of sweet pepper was complete, the average error was approximately 6.9°. These results suggested that the proposed method was suitable for pose estimation of sweet peppers and could be adjusted for use with other fruits and vegetables. Full article
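The plane-scoring idea can be illustrated by reflecting the point cloud across a candidate plane and measuring how close the mirrored points lie to the original cloud; the nearest-neighbour score below is a simple stand-in for the paper's scoring strategy.

```python
import numpy as np
from scipy.spatial import cKDTree

def reflect(points, n, d):
    """Reflect 3D points across the plane n·x = d (n must be a unit vector)."""
    dist = points @ n - d
    return points - 2.0 * dist[:, None] * n[None, :]

def symmetry_score(points, n, d):
    """Lower is better: mean distance from reflected points to the original cloud."""
    tree = cKDTree(points)
    mirrored = reflect(points, n, d)
    dists, _ = tree.query(mirrored)
    return dists.mean()

# Toy cloud that is symmetric about the x = 0 plane.
half = np.random.rand(500, 3) * [0.5, 1.0, 1.0]
cloud = np.vstack([half, half * [-1.0, 1.0, 1.0]])
good = symmetry_score(cloud, np.array([1.0, 0.0, 0.0]), 0.0)
bad = symmetry_score(cloud, np.array([0.0, 1.0, 0.0]), 0.8)
print(good < bad)    # the true symmetry plane scores lower
```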

Open Access Article: Automatic Calibration of an Around View Monitor System Exploiting Lane Markings
Sensors 2018, 18(9), 2956; https://doi.org/10.3390/s18092956
Received: 26 July 2018 / Revised: 28 August 2018 / Accepted: 2 September 2018 / Published: 5 September 2018
Cited by 1
Abstract
This paper proposes a method that automatically calibrates four cameras of an around view monitor (AVM) system in a natural driving situation. The proposed method estimates orientation angles of four cameras composing the AVM system, and assumes that their locations and intrinsic parameters are known in advance. This method utilizes lane markings because they exist in almost all on-road situations and appear across images of adjacent cameras. It starts by detecting lane markings from images captured by four cameras of the AVM system in a cost-effective manner. False lane markings are rejected by analyzing the statistical properties of the detected lane markings. Once the correct lane markings are sufficiently gathered, this method first calibrates the front and rear cameras, and then calibrates the left and right cameras with the help of the calibration results of the front and rear cameras. This two-step approach is essential because side cameras cannot be fully calibrated by themselves, due to insufficient lane marking information. After this initial calibration, this method collects corresponding lane markings appearing across images of adjacent cameras and simultaneously refines the initial calibration results of four cameras to obtain seamless AVM images. In the case of a long image sequence, this method conducts the camera calibration multiple times, and then selects the medoid as the final result to reduce computational resources and dependency on a specific place. In the experiment, the proposed method was quantitatively and qualitatively evaluated in various real driving situations and showed promising results. Full article

Open Access Article: Lightweight Visual Odometry for Autonomous Mobile Robots
Sensors 2018, 18(9), 2837; https://doi.org/10.3390/s18092837
Received: 19 July 2018 / Revised: 23 August 2018 / Accepted: 25 August 2018 / Published: 28 August 2018
Cited by 3
Abstract
Vision-based motion estimation is an effective means for mobile robot localization and is often used in conjunction with other sensors for navigation and path planning. This paper presents a low-overhead real-time ego-motion estimation (visual odometry) system based on either a stereo or RGB-D sensor. The algorithm’s accuracy outperforms typical frame-to-frame approaches by maintaining a limited local map, while requiring significantly less memory and computational power in contrast to using global maps common in full visual SLAM methods. The algorithm is evaluated on common publicly available datasets that span different use-cases and performance is compared to other comparable open-source systems in terms of accuracy, frame rate and memory requirements. This paper accompanies the release of the source code as a modular software package for the robotics community compatible with the Robot Operating System (ROS). Full article
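For readers unfamiliar with the frame-to-frame baseline that this system improves upon, the OpenCV sketch below estimates relative camera motion between two grayscale frames with ORB features and essential-matrix decomposition; it is a generic building block rather than the paper's local-map method, and the intrinsics and file names are placeholders.

```python
import cv2
import numpy as np

def relative_pose(img0, img1, K):
    """Estimate rotation R and (unit-scale) translation t between two frames."""
    orb = cv2.ORB_create(2000)
    k0, d0 = orb.detectAndCompute(img0, None)
    k1, d1 = orb.detectAndCompute(img1, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d0, d1), key=lambda m: m.distance)[:500]
    p0 = np.float32([k0[m.queryIdx].pt for m in matches])
    p1 = np.float32([k1[m.trainIdx].pt for m in matches])
    E, inliers = cv2.findEssentialMat(p0, p1, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, p0, p1, K, mask=inliers)
    return R, t

K = np.array([[520.9, 0, 325.1],       # assumed pinhole intrinsics
              [0, 521.0, 249.7],
              [0, 0, 1.0]])
img0 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file names
img1 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
R, t = relative_pose(img0, img1, K)
```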

Open Access Article: Handshape Recognition Using Skeletal Data
Sensors 2018, 18(8), 2577; https://doi.org/10.3390/s18082577
Received: 10 July 2018 / Revised: 2 August 2018 / Accepted: 5 August 2018 / Published: 6 August 2018
Abstract
In this paper, a method of handshapes recognition based on skeletal data is described. A new feature vector is proposed. It encodes the relative differences between vectors associated with the pointing directions of the particular fingers and the palm normal. Different classifiers are tested on the demanding dataset, containing 48 handshapes performed 500 times by five users. Two different sensor configurations and significant variation in the hand rotation are considered. The late fusion at the decision level of individual models, as well as a comparative study carried out on a publicly available dataset, are also included. Full article
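The proposed feature vector is built from angles between finger pointing directions and the palm normal. A generic sketch of such an encoding, using made-up skeletal vectors rather than the sensor data or the exact feature definition from the paper, is shown below.

```python
import numpy as np

def angle_between(u, v):
    """Angle in radians between two 3D vectors."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))

def handshape_features(finger_dirs, palm_normal):
    """Encode a handshape as finger-vs-palm-normal and finger-vs-finger angles."""
    feats = [angle_between(f, palm_normal) for f in finger_dirs]      # 5 values
    for i in range(len(finger_dirs)):                                  # 10 pairwise values
        for j in range(i + 1, len(finger_dirs)):
            feats.append(angle_between(finger_dirs[i], finger_dirs[j]))
    return np.array(feats)          # only relative angles, so rotation has little effect

# Hypothetical skeletal data: pointing directions of the five fingers and the palm normal.
fingers = np.random.randn(5, 3)
palm_n = np.array([0.0, 0.0, 1.0])
x = handshape_features(fingers, palm_n)    # 15-D feature vector fed to a classifier
```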

Open Access Article: Fast Visual Odometry for a Low-Cost Underwater Embedded Stereo System
Sensors 2018, 18(7), 2313; https://doi.org/10.3390/s18072313
Received: 21 June 2018 / Revised: 10 July 2018 / Accepted: 14 July 2018 / Published: 17 July 2018
Cited by 1
Abstract
This paper provides details of hardware and software conception and realization of a stereo embedded system for underwater imaging. The system provides several functions that facilitate underwater surveys and run smoothly in real-time. A first post-image acquisition module provides direct visual feedback on the quality of the taken images which helps appropriate actions to be taken regarding movement speed and lighting conditions. Our main contribution is a light visual odometry method adapted to the underwater context. The proposed method uses the captured stereo image stream to provide real-time navigation and a site coverage map which is necessary to conduct a complete underwater survey. The visual odometry uses a stochastic pose representation and semi-global optimization approach to handle large sites and provides long-term autonomy, whereas a novel stereo matching approach adapted to underwater imaging and system attached lighting allows fast processing and suitability to low computational resource systems. The system is tested in a real context and shows its robustness and promising future potential. Full article

Open Access Article: Visual Information Fusion through Bayesian Inference for Adaptive Probability-Oriented Feature Matching
Sensors 2018, 18(7), 2041; https://doi.org/10.3390/s18072041
Received: 5 April 2018 / Revised: 23 June 2018 / Accepted: 24 June 2018 / Published: 26 June 2018
Cited by 8
Abstract
This work presents a visual information fusion approach for robust probability-oriented feature matching. It is sustained by omnidirectional imaging, and it is tested in a visual localization framework, in mobile robotics. General visual localization methods have been extensively studied and optimized in terms of performance. However, one of the main threats that jeopardizes the final estimation is the presence of outliers. In this paper, we present several contributions to deal with that issue. First, 3D information data, associated with SURF (Speeded-Up Robust Feature) points detected on the images, is inferred under the Bayesian framework established by Gaussian processes (GPs). Such information represents a probability distribution for the feature points’ existence, which is successively fused and updated throughout the robot’s poses. Secondly, this distribution can be properly sampled and projected onto the next 2D image frame in t+1, by means of a filter-motion prediction. This strategy permits obtaining relevant areas in the image reference system, from which probable matches could be detected, in terms of the accumulated probability of feature existence. This approach entails an adaptive probability-oriented matching search, which accounts for significant areas of the image, but it also considers unseen parts of the scene, thanks to an internal modulation of the probability distribution domain, computed in terms of the current uncertainty of the system. The main outcomes confirm a robust feature matching, which permits producing consistent localization estimates, aided by the odometer’s prior to estimate the scale factor. Publicly available datasets have been used to validate the design and operation of the approach. Moreover, the proposal has been compared, firstly with a standard feature matching and secondly with a localization method, based on an inverse depth parametrization. The results confirm the validity of the approach in terms of feature matching, localization accuracy, and time consumption. Full article

Open Access Article: Hybrid Histogram Descriptor: A Fusion Feature Representation for Image Retrieval
Sensors 2018, 18(6), 1943; https://doi.org/10.3390/s18061943
Received: 18 May 2018 / Revised: 10 June 2018 / Accepted: 12 June 2018 / Published: 15 June 2018
Cited by 2
Abstract
Currently, visual sensors are becoming increasingly affordable and fashionable, accelerating the growth of image data. Image retrieval has attracted increasing interest due to space exploration, industrial, and biomedical applications. Nevertheless, designing an effective feature representation is acknowledged as a hard yet fundamental issue. This paper presents a fusion feature representation called a hybrid histogram descriptor (HHD) for image retrieval. The proposed descriptor comprises two histograms: a perceptually uniform histogram, which is extracted by exploiting the color and edge orientation information in perceptually uniform regions; and a motif co-occurrence histogram, which is acquired by calculating the probability of a pair of motif patterns. To evaluate the performance, we benchmarked the proposed descriptor on the RSSCN7, AID, Outex-00013, Outex-00014 and ETHZ-53 datasets. Experimental results suggest that the proposed descriptor is more effective and robust than ten recent fusion-based descriptors under the content-based image retrieval framework. The computational complexity was also analyzed to give an in-depth evaluation. Furthermore, compared with state-of-the-art convolutional neural network (CNN)-based descriptors, the proposed descriptor also achieves comparable performance, but does not require any training process.

Open Access Article: Segment-Tube: Spatio-Temporal Action Localization in Untrimmed Videos with Per-Frame Segmentation
Sensors 2018, 18(5), 1657; https://doi.org/10.3390/s18051657
Received: 23 April 2018 / Revised: 16 May 2018 / Accepted: 16 May 2018 / Published: 22 May 2018
Cited by 1
Abstract
Inspired by the recent spatio-temporal action localization efforts with tubelets (sequences of bounding boxes), we present a new spatio-temporal action localization detector Segment-tube, which consists of sequences of per-frame segmentation masks. The proposed Segment-tube detector can temporally pinpoint the starting/ending frame of each action category in the presence of preceding/subsequent interference actions in untrimmed videos. Simultaneously, the Segment-tube detector produces per-frame segmentation masks instead of bounding boxes, offering superior spatial accuracy to tubelets. This is achieved by alternating iterative optimization between temporal action localization and spatial action segmentation. Experimental results on three datasets validated the efficacy of the proposed method, including (1) temporal action localization on the THUMOS 2014 dataset; (2) spatial action segmentation on the Segtrack dataset; and (3) joint spatio-temporal action localization on the newly proposed ActSeg dataset. It is shown that our method compares favorably with existing state-of-the-art methods. Full article

Open Access Article: Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images
Sensors 2018, 18(5), 1656; https://doi.org/10.3390/s18051656
Received: 28 March 2018 / Revised: 15 May 2018 / Accepted: 18 May 2018 / Published: 22 May 2018
Cited by 1
Abstract
Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition. Full article

Open Access Article: Lane Marking Detection and Reconstruction with Line-Scan Imaging Data
Sensors 2018, 18(5), 1635; https://doi.org/10.3390/s18051635
Received: 15 April 2018 / Revised: 10 May 2018 / Accepted: 17 May 2018 / Published: 20 May 2018
Cited by 2
Abstract
Lane marking detection and localization are crucial for autonomous driving and lane-based pavement surveys. Numerous studies have been conducted to detect and locate lane markings for advanced driver assistance systems, in which image data are usually captured by vision-based cameras. However, a limited number of studies have been done to identify lane markings using high-resolution laser images for road condition evaluation. In this study, the laser images are acquired with a digital highway data vehicle (DHDV). Subsequently, a novel methodology is presented for automated lane marking identification and reconstruction, implemented in four phases: (1) binarization of the laser images with a new threshold method (multi-box segmentation based threshold method); (2) determination of candidate lane markings with closing operations and a marching square algorithm; (3) identification of true lane markings by eliminating false positives (FPs) using a linear support vector machine method; and (4) reconstruction of the damaged and dashed lane marking segments to form a continuous lane marking based on geometry features such as adjacent lane marking location and lane width. Finally, a case study is given to validate the effectiveness of the novel methodology. The findings indicate that the new strategy is robust in image binarization and lane marking localization. This study would be beneficial in road lane-based pavement condition evaluation such as lane-based rutting measurement and crack classification.
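Phases (1) and (2) of the methodology, binarization followed by a marching-squares pass to obtain candidate marking outlines, can be approximated with scikit-image as shown below; the fixed global threshold is a stand-in for the paper's multi-box segmentation based threshold method.

```python
import numpy as np
from skimage import measure, morphology

def candidate_lane_markings(laser_image, threshold=200, min_area=150):
    """Binarize a grayscale laser image and trace candidate marking contours."""
    binary = laser_image >= threshold                        # stand-in for the paper's
                                                             # multi-box threshold method
    binary = morphology.binary_closing(binary, morphology.disk(3))
    binary = morphology.remove_small_objects(binary, min_size=min_area)
    # skimage's find_contours is a marching-squares implementation.
    contours = measure.find_contours(binary.astype(float), 0.5)
    return binary, contours

laser = (np.random.rand(400, 1000) * 255).astype(np.uint8)   # stands in for a DHDV image
laser[:, 480:520] = 250                                      # a fake bright marking stripe
binary, contours = candidate_lane_markings(laser)
print(len(contours), "candidate outlines")
```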
(This article belongs to the Special Issue Visual Sensors)
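Phases (2) and (3) of the described pipeline, candidate extraction with closing operations and marching squares followed by false-positive rejection with a linear SVM, could be sketched roughly as below. This is an illustrative approximation, not the DHDV processing chain: the threshold is a plain global value rather than the multi-box segmentation method, and the geometric features and training labels are placeholders.

```python
# Sketch: threshold -> closing -> marching squares -> linear SVM filtering.
import numpy as np
from skimage import measure, morphology
from sklearn.svm import LinearSVC

def candidate_regions(gray, threshold=180):
    binary = gray > threshold                               # stand-in global threshold
    closed = morphology.binary_closing(binary, np.ones((5, 5), dtype=bool))
    return measure.find_contours(closed.astype(float), 0.5)  # marching squares

def contour_features(contour, lane_width_px=60):
    ys, xs = contour[:, 0], contour[:, 1]
    width, height = xs.max() - xs.min(), ys.max() - ys.min()
    elongation = height / (width + 1e-6)
    return [width / lane_width_px, elongation, len(contour)]

# Illustrative labelled candidates: 1 = true marking, 0 = false positive.
X_train = np.array([[0.3, 8.0, 200], [2.5, 0.4, 50], [0.25, 10.0, 300]])
y_train = np.array([1, 0, 1])
clf = LinearSVC().fit(X_train, y_train)

gray = np.random.rand(512, 512) * 255                       # stand-in laser intensity image
markings = [c for c in candidate_regions(gray)
            if clf.predict([contour_features(c)])[0] == 1]
```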

Open Access Article IrisDenseNet: Robust Iris Segmentation Using Densely Connected Fully Convolutional Networks in the Images by Visible Light and Near-Infrared Light Camera Sensors
Sensors 2018, 18(5), 1501; https://doi.org/10.3390/s18051501
Received: 2 April 2018 / Revised: 1 May 2018 / Accepted: 8 May 2018 / Published: 10 May 2018
Cited by 6 | PDF Full-text (6054 KB) | HTML Full-text | XML Full-text
Abstract
The recent advancements in computer vision have opened new horizons for deploying biometric recognition algorithms in mobile and handheld devices. Similarly, accurate iris recognition is now much needed in unconstrained scenarios. These environments make the acquired iris image exhibit occlusion, low resolution, blur, unusual glint, ghost effects and off-angle views. The prevailing segmentation algorithms cannot cope with these constraints. In addition, owing to the unavailability of near-infrared (NIR) light, iris recognition in visible light environments makes iris segmentation challenging because of visible-light noise. Deep learning with convolutional neural networks (CNN) has brought a considerable breakthrough in various applications. To address the iris segmentation issues in challenging situations with visible light and near-infrared light camera sensors, this paper proposes a densely connected fully convolutional network (IrisDenseNet), which can determine the true iris boundary even in inferior-quality images by exploiting better information gradient flow between the dense blocks. In the experiments conducted, five datasets from visible light and NIR environments were used. For the visible light environment, the noisy iris challenge evaluation part-II (NICE-II, selected from the UBIRIS.v2 database) and mobile iris challenge evaluation (MICHE-I) datasets were used. For the NIR environment, the Institute of Automation, Chinese Academy of Sciences (CASIA) v4.0 Interval, CASIA v4.0 Distance and IIT Delhi v1.0 iris datasets were used. Experimental results showed that the proposed IrisDenseNet achieves optimal segmentation and excellent performance over existing algorithms on all five datasets. Full article
(This article belongs to the Special Issue Visual Sensors)
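A densely connected block, the core building element named in the abstract, concatenates each layer's output with all previous feature maps so that information and gradients flow directly between layers. The sketch below is a generic dense block in PyTorch, not the published IrisDenseNet architecture; the growth rate, layer count and the 1x1 head are arbitrary choices for illustration.

```python
# Generic dense block: each layer sees the concatenation of all earlier feature maps.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_channels, growth_rate=12, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1),
            ))
            channels += growth_rate

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))   # dense connectivity
            features.append(out)
        return torch.cat(features, dim=1)

# One-channel input; a 1x1 conv head would map the dense features to a mask logit.
block = DenseBlock(in_channels=1)
mask_logits = nn.Conv2d(1 + 4 * 12, 1, kernel_size=1)(block(torch.rand(1, 1, 64, 64)))
```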

Open Access Article Textile Retrieval Based on Image Content from CDC and Webcam Cameras in Indoor Environments
Sensors 2018, 18(5), 1329; https://doi.org/10.3390/s18051329
Received: 21 March 2018 / Revised: 18 April 2018 / Accepted: 21 April 2018 / Published: 25 April 2018
Cited by 3 | PDF Full-text (4274 KB) | HTML Full-text | XML Full-text
Abstract
Textile-based image retrieval for indoor environments can be used to retrieve images that contain the same textile, which may indicate that the scenes are related. This constitutes a useful approach for law enforcement agencies that want to find evidence based on matching between textiles. In this paper, we propose a novel pipeline that allows searching for and retrieving textiles that appear in pictures of real scenes. Our approach first obtains regions containing textiles by applying MSER to high-pass-filtered versions of the RGB, HSV and Hue channels of the original photo. To describe the textile regions, we show that the combination of HOG and HCLOSIB is the best option for our proposal when the correlation distance is used to match the query textile patch with the candidate regions. Furthermore, we introduce a new dataset, TextilTube, which comprises a total of 1913 textile regions labelled within 67 classes. We obtained a success rate of 84.94% within the 40 nearest matches and a precision of 37.44% considering only the first match, which outperforms the deep learning methods evaluated. Experimental results show that this pipeline can be used to set up an effective textile-based image retrieval system in indoor environments. Full article
(This article belongs to the Special Issue Visual Sensors)
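The retrieval steps described above (MSER regions on high-pass-filtered channels, HOG-based description, correlation-distance ranking) could look roughly like the following sketch. It is only an approximation: a single grey channel is used instead of the RGB, HSV and Hue channels, and HCLOSIB is replaced by a plain LBP histogram as a stand-in descriptor.

```python
# Sketch: MSER candidate regions -> HOG + LBP description -> correlation-distance ranking.
import cv2
import numpy as np
from skimage.feature import hog, local_binary_pattern
from scipy.spatial.distance import correlation

def textile_regions(gray):
    high_pass = cv2.subtract(gray, cv2.GaussianBlur(gray, (15, 15), 0))
    regions, _ = cv2.MSER_create().detectRegions(high_pass)
    return [cv2.boundingRect(r.reshape(-1, 1, 2)) for r in regions]

def describe(patch):
    patch = cv2.resize(patch, (64, 64))
    h = hog(patch, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    lbp = local_binary_pattern(patch, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([h, lbp_hist])

def rank_regions(query_patch, scene_gray):
    q = describe(query_patch)
    scored = []
    for (x, y, w, h) in textile_regions(scene_gray):
        if w < 16 or h < 16:
            continue
        d = correlation(q, describe(scene_gray[y:y + h, x:x + w]))
        scored.append((d, (x, y, w, h)))
    return sorted(scored)          # smallest correlation distance first
```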

Open Access Article Presentation Attack Detection for Iris Recognition System Using NIR Camera Sensor
Sensors 2018, 18(5), 1315; https://doi.org/10.3390/s18051315
Received: 28 March 2018 / Revised: 20 April 2018 / Accepted: 20 April 2018 / Published: 24 April 2018
Cited by 2 | PDF Full-text (4998 KB) | HTML Full-text | XML Full-text
Abstract
Among biometric recognition systems such as fingerprint, finger-vein, or face, the iris recognition system has proven to be effective in achieving a high recognition accuracy and security level. However, several recent studies have indicated that an iris recognition system can be fooled by presentation attack images that are recaptured from high-quality printed images or by contact lenses with printed iris patterns. As a result, this potential threat can reduce the security level of an iris recognition system. In this study, we propose a new presentation attack detection (PAD) method for an iris recognition system (iPAD) using a near-infrared light (NIR) camera image. To detect presentation attack images, we first localized the iris region of the input iris image using circular edge detection (CED). Based on the result of iris localization, we extracted image features using deep learning-based and handcrafted feature-based methods. The input iris images were then classified into real and presentation attack categories using a support vector machine (SVM). Through extensive experiments with two public datasets, we show that our proposed method effectively solves the iris recognition presentation attack detection problem and produces detection accuracy superior to that of previous studies. Full article
(This article belongs to the Special Issue Visual Sensors)
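A minimal sketch of the overall PAD flow, iris localization, feature extraction, SVM classification, is given below. It is not the authors' iPAD implementation: circular edge detection is approximated with a Hough circle transform, the deep features are omitted, and the training data are random placeholders.

```python
# Sketch: localize iris -> extract handcrafted features -> classify real vs. attack.
import cv2
import numpy as np
from sklearn.svm import SVC

def localize_iris(gray):
    blurred = cv2.medianBlur(gray, 5)
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1.5, minDist=100,
                               param1=100, param2=30, minRadius=30, maxRadius=120)
    if circles is None:
        return None
    x, y, r = np.round(circles[0, 0]).astype(int)
    return gray[max(y - r, 0):y + r, max(x - r, 0):x + r]

def handcrafted_features(iris_patch):
    patch = cv2.resize(iris_patch, (64, 64))
    hist = cv2.calcHist([patch], [0], None, [32], [0, 256]).ravel()
    return np.concatenate([hist / hist.sum(), [patch.std()]])   # 33-D feature vector

# Illustrative training data: 1 = live iris, 0 = presentation attack.
X = np.random.rand(20, 33)
y = np.array([1, 0] * 10)
clf = SVC(kernel="rbf").fit(X, y)
```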

Open Access Article Improved Seam-Line Searching Algorithm for UAV Image Mosaic with Optical Flow
Sensors 2018, 18(4), 1214; https://doi.org/10.3390/s18041214
Received: 6 March 2018 / Revised: 8 April 2018 / Accepted: 12 April 2018 / Published: 16 April 2018
Cited by 2 | PDF Full-text (56074 KB) | HTML Full-text | XML Full-text
Abstract
Ghosting and seams are two major challenges in creating unmanned aerial vehicle (UAV) image mosaics. In response to these problems, this paper proposes an improved method for UAV image seam-line searching. First, an image matching algorithm is used to extract and match the features of adjacent images, so that they can be transformed into the same coordinate system. Then, the gray-scale difference, the gradient minimum and the optical flow value of pixels in the overlapping area of adjacent images are calculated within a neighborhood, and these terms are combined into an energy function for seam-line searching. Based on that, an improved dynamic programming algorithm is proposed to search for the optimal seam-lines and complete the UAV image mosaic. This algorithm adopts a more adaptive energy aggregation and traversal strategy, which can find a more suitable splicing path between adjacent UAV images and better avoid ground objects. The experimental results show that the proposed method can effectively solve the problems of ghosting and seams in panoramic UAV images. Full article
(This article belongs to the Special Issue Visual Sensors)
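The seam-line search itself can be illustrated with a simple dynamic-programming pass over an energy map built from the overlap of two aligned images. The sketch below uses only gray-level difference and gradient magnitude (the paper additionally incorporates optical flow and an improved aggregation/traversal strategy), so it should be read as a baseline, not the proposed algorithm.

```python
# Baseline vertical seam search by dynamic programming over an overlap energy map.
import numpy as np
import cv2

def energy_map(overlap_a, overlap_b):
    diff = cv2.absdiff(overlap_a, overlap_b).astype(np.float32)
    grad = np.abs(cv2.Sobel(overlap_a, cv2.CV_32F, 1, 0)) + \
           np.abs(cv2.Sobel(overlap_a, cv2.CV_32F, 0, 1))
    return diff + 0.5 * grad          # weighting is an arbitrary example

def seam_line(energy):
    """Return, for each row, the column of the minimum-cost vertical seam."""
    h, w = energy.shape
    cost = energy.copy()
    for r in range(1, h):
        left = np.roll(cost[r - 1], 1);  left[0] = np.inf
        right = np.roll(cost[r - 1], -1); right[-1] = np.inf
        cost[r] += np.minimum(np.minimum(left, cost[r - 1]), right)
    seam = np.zeros(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for r in range(h - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 2, w)
        seam[r] = lo + int(np.argmin(cost[r, lo:hi]))
    return seam
```

Running seam_line on the energy map returns one column index per row, which marks where the blend switches from one image to the other.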

Open Access Article Comparative Analysis of Warp Function for Digital Image Correlation-Based Accurate Single-Shot 3D Shape Measurement
Sensors 2018, 18(4), 1208; https://doi.org/10.3390/s18041208
Received: 11 February 2018 / Revised: 23 March 2018 / Accepted: 12 April 2018 / Published: 16 April 2018
Cited by 1 | PDF Full-text (45969 KB) | HTML Full-text | XML Full-text
Abstract
Digital image correlation (DIC)-based stereo 3D shape measurement is a single-shot method that can achieve high precision and is robust to vibration as well as environmental noise. The efficiency of DIC has been greatly improved by the proposal of inverse compositional Gauss-Newton (IC-GN) operators for both first-order and second-order warp functions. Beyond the algorithm itself, both the registration accuracy and the efficiency of DIC-based stereo matching for shapes of different complexities are closely related to the selection of the warp function, the subset size and the convergence criteria. Understanding how the prescribed subset size and convergence criteria affect first-order and second-order warp functions, and how to choose a proper warp function and set an optimal subset size and convergence criteria for different shapes, are fundamental problems in realizing efficient and accurate 3D shape measurement. In this work, we present a comparative analysis of first-order and second-order warp functions for DIC-based 3D shape measurement using the IC-GN algorithm. The effects of the subset size and convergence criteria of first-order and second-order warp functions on the accuracy and efficiency of DIC are comparatively examined with both simulation tests and real experiments. Reference standards for the selection of the warp function for different kinds of 3D shape measurement and for the setting of proper convergence criteria are recommended. The effects of the subset size on the measurement precision with different warp functions are also summarized. Full article
(This article belongs to the Special Issue Visual Sensors)
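For readers unfamiliar with the two warp functions being compared, the sketch below writes out the commonly used first-order (affine) and second-order subset mappings in DIC; the parameter values and subset size are arbitrary examples, and no IC-GN iteration is shown.

```python
# First-order and second-order subset warp functions as plain coordinate mappings.
import numpy as np

def warp_first_order(xy, p):
    """p = [u, ux, uy, v, vx, vy]; xy is an (N, 2) array of local subset coordinates."""
    x, y = xy[:, 0], xy[:, 1]
    u, ux, uy, v, vx, vy = p
    return np.stack([x + u + ux * x + uy * y,
                     y + v + vx * x + vy * y], axis=1)

def warp_second_order(xy, p):
    """p adds the quadratic terms uxx, uxy, uyy, vxx, vxy, vyy."""
    x, y = xy[:, 0], xy[:, 1]
    (u, ux, uy, uxx, uxy, uyy,
     v, vx, vy, vxx, vxy, vyy) = p
    return np.stack([x + u + ux * x + uy * y
                     + 0.5 * uxx * x**2 + uxy * x * y + 0.5 * uyy * y**2,
                     y + v + vx * x + vy * y
                     + 0.5 * vxx * x**2 + vxy * x * y + 0.5 * vyy * y**2], axis=1)

# Local coordinates of a (2M+1) x (2M+1) subset, e.g. subset size 21 -> M = 10.
M = 10
grid = np.stack(np.meshgrid(np.arange(-M, M + 1), np.arange(-M, M + 1)),
                axis=-1).reshape(-1, 2)
warped = warp_first_order(grid, [0.2, 1e-3, 0.0, -0.1, 0.0, 1e-3])
```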

Open Access Article Delving Deep into Multiscale Pedestrian Detection via Single Scale Feature Maps
Sensors 2018, 18(4), 1063; https://doi.org/10.3390/s18041063
Received: 28 February 2018 / Revised: 25 March 2018 / Accepted: 26 March 2018 / Published: 2 April 2018
PDF Full-text (1820 KB) | HTML Full-text | XML Full-text
Abstract
The standard pipeline in pedestrian detection slides a pedestrian model over an image feature pyramid to detect pedestrians of different scales. In this pipeline, feature pyramid construction is time consuming and becomes the bottleneck for fast detection. Recently, a method called multiresolution filtered channels (MRFC) was proposed which uses only single-scale feature maps to achieve fast detection. However, MRFC has two shortcomings that limit its accuracy. One is that the receptive field correspondence across different scales is weak. The other is that the features used are not scale invariant. In this paper, two solutions are proposed to tackle these two shortcomings, respectively. Specifically, scale-aware pooling is proposed to establish a better receptive field correspondence, and a soft decision tree is proposed to relieve the scale variance problem. When coupled with an efficient sliding window classification strategy, our detector achieves fast detection speed while maintaining state-of-the-art accuracy. Full article
(This article belongs to the Special Issue Visual Sensors)
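The scale-aware pooling idea, pooling a single-scale feature map with cell sizes proportional to the candidate window so that descriptors stay comparable across scales, might be sketched as follows. This is an interpretation for illustration, not the paper's implementation; the feature map and boxes are synthetic.

```python
# Pool a single-scale feature map into a fixed grid whose cells scale with the box.
import numpy as np

def scale_aware_pool(feature_map, box, out_size=4):
    """feature_map: (H, W, C); box: (x0, y0, x1, y1) in feature-map coordinates."""
    x0, y0, x1, y1 = box
    h, w, c = feature_map.shape
    xs = np.linspace(max(x0, 0), min(x1, w), out_size + 1).astype(int)
    ys = np.linspace(max(y0, 0), min(y1, h), out_size + 1).astype(int)
    pooled = np.zeros((out_size, out_size, c), dtype=feature_map.dtype)
    for i in range(out_size):
        for j in range(out_size):
            cell = feature_map[ys[i]:max(ys[i + 1], ys[i] + 1),
                               xs[j]:max(xs[j + 1], xs[j] + 1)]
            pooled[i, j] = cell.max(axis=(0, 1))     # max pooling within each cell
    return pooled

# The same 4x4 pooled descriptor is produced for a small and a large candidate window.
fm = np.random.rand(64, 128, 8)
small = scale_aware_pool(fm, (10, 10, 26, 42))
large = scale_aware_pool(fm, (40, 5, 104, 60))
```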

Open Access Article Dynamic Non-Rigid Objects Reconstruction with a Single RGB-D Sensor
Sensors 2018, 18(3), 886; https://doi.org/10.3390/s18030886
Received: 21 January 2018 / Revised: 12 March 2018 / Accepted: 14 March 2018 / Published: 16 March 2018
Cited by 2 | PDF Full-text (5227 KB) | HTML Full-text | XML Full-text
Abstract
This paper deals with the 3D reconstruction problem for dynamic non-rigid objects with a single RGB-D sensor. It is a challenging task given the almost inevitable accumulation error in previous sequential fusion methods and the possible failure of surface tracking over a long sequence. Therefore, we propose a global non-rigid registration framework and tackle the drifting problem via an explicit loop closure. Our novel scheme starts with a fusion step to obtain multiple partial scans from the input sequence, followed by a pairwise non-rigid registration and loop detection step to obtain correspondences between neighboring partial pieces and between those pieces that form a loop. Then, we perform a global registration procedure to align all the pieces into a consistent canonical space, guided by the matches we have established. Finally, our proposed model-update step helps fix potential misalignments that remain after the global registration. Both geometric and appearance constraints are enforced during the alignment; therefore, we are able to recover a model with accurate geometry as well as high-fidelity color maps for the mesh. Experiments on both synthetic and various real datasets have demonstrated the capability of our approach to reconstruct complete and watertight deformable objects. Full article
(This article belongs to the Special Issue Visual Sensors)
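The loop-closure step can be pictured with a heavily simplified example: partial scans chained by pairwise transforms accumulate drift, a loop constraint between the last and first scan measures it, and the error is distributed along the chain. The sketch below is rigid and 2D purely for illustration; the actual method performs global non-rigid registration with geometric and appearance constraints.

```python
# Toy drift correction: spread the loop-closure error evenly along a chain of scans.
import numpy as np

def correct_drift(pairwise_offsets, loop_offset):
    """pairwise_offsets: list of 2D translations, scan i -> scan i+1.
    loop_offset: measured translation from the last scan back to the first."""
    n = len(pairwise_offsets)
    chained = np.cumsum(np.vstack([[0.0, 0.0]] + list(pairwise_offsets)), axis=0)
    drift = chained[-1] + np.asarray(loop_offset)    # would be ~0 with no drift
    weights = np.linspace(0.0, 1.0, n + 1)[:, None]
    return chained - weights * drift                 # distribute the error along the loop

offsets = [np.array([1.0, 0.1]), np.array([1.0, -0.05]), np.array([1.0, 0.0])]
poses = correct_drift(offsets, loop_offset=np.array([-3.1, -0.02]))
```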