Computer Vision and Image Processing

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (28 February 2023) | Viewed by 35923

Special Issue Editors

Computing Informatics and Decision Systems Engineering, Arizona State University, Tempe, AZ 85281, USA
Interests: computer vision and machine learning; transfer learning; active learning; zero-shot learning
School of Mathematics, Northwest University, Xi'an 710127, China
Interests: remote sensing image registration and image denoising; low rank representation; manifold learning; non-negative matrix factorization; deep learning; pattern recognition
Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
Interests: computer vision; multimedia; distribution learning; image processing; active vision; perceptual factors

Special Issue Information

Dear Colleagues,

This Special Issue on Computer Vision and Image Processing will explore new directions that integrate emerging techniques in machine learning and optimization. For example, we welcome contributions in distribution learning, active learning, non-negative matrix factorization, and many related areas for processing or understanding images and videos. Topics of interest include, but are not limited to:

  • Image, video, and 3D scene processing
  • Remote sensing and satellite image processing
  • 3D imaging, visualization, animation, virtual reality, and 3DTV
  • Classification, clustering, and machine learning for multimedia
  • Image and video processing and understanding for smart cars and smart homes
  • Perceptually guided imaging and vision
  • Neural networks and learning-based optimization
  • Emerging techniques in learning for image, video, and 3D vision

Dr. Hemanth Venkateswara
Dr. Chengcai Leng
Dr. Anup Basu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Computer Vision
  • Image Processing
  • Remote Sensing
  • InSAR Satellite Data Analysis
  • Distribution Learning
  • Active Learning
  • Smart Multimedia

Published Papers (19 papers)


Research


19 pages, 10511 KiB  
Article
Optical Flow and Expansion Based Deep Temporal Up-Sampling of LIDAR Point Clouds
Remote Sens. 2023, 15(10), 2487; https://doi.org/10.3390/rs15102487 - 09 May 2023
Cited by 1 | Viewed by 1256
Abstract
This paper proposes a framework that enables the online generation of virtual point clouds relying only on previous camera frames, previous point clouds, and current camera measurements. Continuous use of this pipeline for generating virtual LIDAR measurements makes the temporal up-sampling of point clouds possible. The only requirement of the system is a camera with a higher frame rate than the LIDAR mounted on the same vehicle, which is usually the case. The pipeline first computes optical flow estimates from the available camera frames. Next, optical expansion is used to upgrade the optical flow to 3D scene flow. Ground plane fitting is then performed on the previous LIDAR point cloud. Finally, the estimated scene flow is applied to the previously measured object points to generate the new point cloud. The framework's efficiency is demonstrated by state-of-the-art performance on the KITTI dataset.
(This article belongs to the Special Issue Computer Vision and Image Processing)
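The final step of the pipeline described in this abstract, applying an estimated scene flow to the previous LIDAR points to synthesize a virtual point cloud, reduces to a per-point translation. Below is a minimal pure-Python sketch; the function name and the flat point/flow lists are illustrative assumptions, not the paper's code:

```python
def apply_scene_flow(points, scene_flow):
    """Translate each 3D point by its estimated scene-flow vector.

    points:     list of (x, y, z) tuples from the previous LIDAR sweep
    scene_flow: list of (dx, dy, dz) per-point displacement estimates
    Returns the virtual point cloud at the new timestamp.
    """
    if len(points) != len(scene_flow):
        raise ValueError("one flow vector is required per point")
    return [(x + dx, y + dy, z + dz)
            for (x, y, z), (dx, dy, dz) in zip(points, scene_flow)]

# A point 5 m ahead that moved 1 m along x between the two camera frames:
virtual = apply_scene_flow([(0.0, 0.0, 5.0)], [(1.0, 0.0, 0.0)])
```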

20 pages, 12133 KiB  
Article
LL-CSFormer: A Novel Image Denoiser for Intensified CMOS Sensing Images under a Low Light Environment
Remote Sens. 2023, 15(10), 2483; https://doi.org/10.3390/rs15102483 - 09 May 2023
Cited by 1 | Viewed by 981
Abstract
Intensified complementary metal-oxide semiconductor (ICMOS) sensors can capture images under extremely low-light conditions (≤0.01 lux illumination), but the results exhibit spatially clustered noise that seriously damages the structural information. Existing image-denoising methods mainly focus on simulated noise and real noise from normal CMOS sensors, and can easily mistake ICMOS noise for latent image texture. To solve this problem, we propose a low-light cross-scale transformer (LL-CSFormer) that adopts multi-scale and multi-range learning to better distinguish between the noise and signal in ICMOS sensing images. For the multi-scale aspect, the proposed LL-CSFormer designs parallel multi-scale streams and ensures information exchange across different scales to maintain high-resolution spatial information and low-resolution contextual information. For multi-range learning, the network contains both convolutions and transformer blocks, which are able to extract noise-wise local features and signal-wise global features. To enable training, we establish a novel ICMOS image dataset of still noisy bursts under different illumination levels. We also design a two-stream noise-to-noise training strategy for interactive learning and data augmentation. Experiments were conducted on our proposed ICMOS image dataset, and the results demonstrate that our method effectively removes ICMOS image noise compared with other image-denoising methods under both objective and subjective metrics.
(This article belongs to the Special Issue Computer Vision and Image Processing)

21 pages, 10484 KiB  
Article
A Novel Method for Obstacle Detection in Front of Vehicles Based on the Local Spatial Features of Point Cloud
Remote Sens. 2023, 15(4), 1044; https://doi.org/10.3390/rs15041044 - 14 Feb 2023
Viewed by 1264
Abstract
Obstacle detection is the primary task of an Advanced Driving Assistance System (ADAS). However, it is very difficult to achieve accurate obstacle detection in complex traffic scenes. To this end, this paper proposes an obstacle detection method based on the local spatial features of point clouds. Firstly, the local spatial point cloud of a superpixel is obtained through stereo matching and the SLIC image segmentation algorithm. Then, the probability of an obstacle in the corresponding area is estimated from two spatial features: the local plane normal vector and the superpixel point-cloud height. Finally, the detection results of the two methods are input into a Bayesian framework in the form of probabilities for the final decision. In order to describe the traffic scene efficiently and accurately, the detection results are further transformed into a multi-layer stixel representation. We carried out experiments on the KITTI dataset and compared several obstacle detection methods. The experimental results indicate that the proposed method has advantages in terms of its Pixel-wise True Positive Rate (PTPR) and Pixel-wise False Positive Rate (PFPR), particularly in complex traffic scenes, such as uneven roads.
(This article belongs to the Special Issue Computer Vision and Image Processing)
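The abstract's final fusion step, combining the plane-normal and point-height obstacle probabilities in a Bayesian framework, might look like the following naive-Bayes odds combination. This is a hedged sketch assuming the two cues are conditionally independent given the obstacle label, not the paper's exact formulation:

```python
def fuse_obstacle_probabilities(p1, p2, prior=0.5):
    """Fuse two per-superpixel obstacle probabilities into one decision score.

    p1, p2 are P(obstacle | cue_i) from the two detectors (plane normal,
    point-cloud height); prior is P(obstacle). Assumes conditional
    independence of the cues, as in a naive-Bayes combination.
    """
    odds = (p1 / (1 - p1)) * (p2 / (1 - p2)) * ((1 - prior) / prior)
    return odds / (1 + odds)

# Two moderately confident cues reinforce each other:
fused = fuse_obstacle_probabilities(0.7, 0.7)
```

Note that two weak agreements yield a stronger posterior than either cue alone, which is the point of fusing the detectors rather than thresholding them separately.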

17 pages, 2132 KiB  
Article
Learning Domain-Adaptive Landmark Detection-Based Self-Supervised Video Synchronization for Remote Sensing Panorama
Remote Sens. 2023, 15(4), 953; https://doi.org/10.3390/rs15040953 - 09 Feb 2023
Viewed by 1660
Abstract
The synchronization of videos is an essential pre-processing step for multi-view reconstruction, such as image mosaicking by UAV remote sensing; it is often solved with hardware solutions in motion capture studios. However, traditional synchronization setups rely on manual intervention or software solutions and fit only a particular domain of motions. In this paper, we propose a self-supervised video synchronization algorithm that attains high accuracy in diverse scenarios without cumbersome manual intervention. At its core is a motion-based video synchronization algorithm that infers temporal offsets from the trajectories of moving objects in the videos. It is complemented by a self-supervised scene decomposition algorithm that detects common parts and their motion tracks in two or more videos, without requiring any manual positional supervision. We evaluate our approach on three different datasets, covering the motion of humans, animals, and simulated objects, and use it to build the view panorama of a remote sensing field. All experiments demonstrate that the proposed location-based synchronization is more effective than the state-of-the-art methods, and our self-supervised inference approaches the accuracy of supervised solutions while being much easier to adapt to a new target domain.
(This article belongs to the Special Issue Computer Vision and Image Processing)
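The core idea of motion-based synchronization, inferring a temporal offset from trajectories of the same moving object in two videos, can be illustrated with a brute-force search over integer frame shifts. This is a toy stand-in for the paper's method; the function and its trajectory format are assumptions:

```python
def estimate_offset(traj_a, traj_b, max_shift):
    """Estimate the integer temporal offset between two trajectories.

    traj_a, traj_b: lists of (x, y) positions of the same moving object,
    one entry per frame. Returns the shift s minimizing the mean squared
    distance between traj_a[i] and traj_b[i + s] over overlapping frames.
    """
    def cost(shift):
        pairs = [(traj_a[i], traj_b[i + shift])
                 for i in range(len(traj_a))
                 if 0 <= i + shift < len(traj_b)]
        return sum((ax - bx) ** 2 + (ay - by) ** 2
                   for (ax, ay), (bx, by) in pairs) / len(pairs)
    return min(range(-max_shift, max_shift + 1), key=cost)

# Camera B started two frames later than camera A:
a = [(float(t), 0.0) for t in range(10)]
b = [(float(t) - 2.0, 0.0) for t in range(10)]
offset = estimate_offset(a, b, 3)
```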

18 pages, 3185 KiB  
Article
Boost Correlation Features with 3D-MiIoU-Based Camera-LiDAR Fusion for MODT in Autonomous Driving
Remote Sens. 2023, 15(4), 874; https://doi.org/10.3390/rs15040874 - 04 Feb 2023
Cited by 3 | Viewed by 1344
Abstract
Three-dimensional (3D) object tracking is critical in 3D computer vision, with applications in autonomous driving, robotics, and human–computer interaction. However, how to use multimodal information among objects to increase multi-object detection and tracking (MOT) accuracy remains a critical focus of research. We therefore present a multimodal MOT framework for autonomous driving, boost correlation multi-object detection and tracking (BcMODT), which provides more trustworthy features and correlation scores for real-time detection and tracking using both camera and LiDAR measurement data. Specifically, we propose an end-to-end deep neural network using 2D and 3D data for joint object detection and association. A new 3D mixed IoU (3D-MiIoU) computational module is also developed to acquire more precise geometric affinity by incorporating the aspect ratio and length-to-height ratio between linked frames. Meanwhile, a boost correlation feature (BcF) module is proposed for appearance affinity calculation: the appearance affinity of similar objects in adjacent frames is calculated directly from the feature distance and the similarity of feature directions. The KITTI tracking benchmark shows that our method outperforms other methods with respect to tracking accuracy.
(This article belongs to the Special Issue Computer Vision and Image Processing)
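Affinity modules such as the paper's 3D-MiIoU build on a base 3D IoU between boxes. The following sketch computes plain axis-aligned 3D IoU; the paper's module adds aspect-ratio and length-to-height terms on top, which are not reproduced here:

```python
def iou_3d(box_a, box_b):
    """IoU of two axis-aligned 3D boxes.

    Boxes are (xmin, ymin, zmin, xmax, ymax, zmax). Real tracking boxes
    are usually oriented; axis alignment keeps this sketch simple.
    """
    inter = 1.0
    for i in range(3):  # overlap along each of the three axes
        lo = max(box_a[i], box_b[i])
        hi = min(box_a[i + 3], box_b[i + 3])
        if hi <= lo:
            return 0.0
        inter *= hi - lo
    vol = lambda b: (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol(box_a) + vol(box_b) - inter)
```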

29 pages, 22142 KiB  
Article
A Global-Information-Constrained Deep Learning Network for Digital Elevation Model Super-Resolution
Remote Sens. 2023, 15(2), 305; https://doi.org/10.3390/rs15020305 - 04 Jan 2023
Cited by 5 | Viewed by 1916
Abstract
High-resolution DEMs can provide accurate geographic information and can be widely used in hydrological analysis, path planning, and urban design. As the main complementary means of producing high-resolution DEMs, the DEM super-resolution (SR) method based on deep learning has reached a bottleneck. The reason for this phenomenon is that the DEM super-resolution method based on deep learning lacks a part of the global information it requires. Specifically, the multilevel aggregation process of deep learning has difficulty sufficiently capturing the low-level features with dependencies, which leads to a lack of global relationships with high-level information. To address this problem, we propose a global-information-constrained deep learning network for DEM SR (GISR). Specifically, our proposed GISR method consists of a global information supplement module and a local feature generation module. The former uses the Kriging method to supplement global information, considering the spatial autocorrelation rule. The latter includes a residual module and the PixelShuffle module, which is used to restore the detailed features of the terrain. Compared with the bicubic, Kriging, SRCNN, SRResNet, and TfaSR methods, the experimental results of our method show a better ability to retain terrain features, and the generation effect is more consistent with the ground truth DEM. Meanwhile, compared with the deep learning method, the RMSE of our results is improved by 20.5% to 68.8%.
(This article belongs to the Special Issue Computer Vision and Image Processing)
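The PixelShuffle module mentioned above performs a fixed channel-to-space rearrangement: r×r low-resolution channel maps are interleaved into a single map upscaled by r in each spatial dimension. A pure-Python illustration for one output channel (real implementations operate on batched tensors):

```python
def pixel_shuffle(feature_maps, r):
    """Rearrange r*r channel maps of shape (H, W) into one (H*r, W*r) map.

    feature_maps: list of r*r nested lists, each H rows by W columns.
    Channel c fills the sub-pixel position (dy, dx) = divmod(c, r),
    matching the standard PixelShuffle channel ordering.
    """
    assert len(feature_maps) == r * r
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for c, fmap in enumerate(feature_maps):
        dy, dx = divmod(c, r)
        for y in range(h):
            for x in range(w):
                out[y * r + dy][x * r + dx] = fmap[y][x]
    return out
```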

22 pages, 9099 KiB  
Article
A Fast and Robust Heterologous Image Matching Method for Visual Geo-Localization of Low-Altitude UAVs
Remote Sens. 2022, 14(22), 5879; https://doi.org/10.3390/rs14225879 - 20 Nov 2022
Cited by 3 | Viewed by 2142
Abstract
Visual geo-localization can determine the positions of UAVs (Unmanned Aerial Vehicles) during GNSS (Global Navigation Satellite System) denial or restriction. However, the performance of visual geo-localization is seriously impaired by illumination variation, scale differences, viewpoint differences, sparse texture, and the limited computing power of UAVs. In this paper, a fast detector-free two-stage matching method is proposed to improve the visual geo-localization of low-altitude UAVs. A detector-free matching method and a perspective transformation module are incorporated into the coarse and fine matching stages to improve robustness on weak-texture and large-viewpoint-change data. The minimum Euclidean distance is used to accelerate the coarse matching, and coordinate regression based on the DSNT (Differentiable Spatial to Numerical) transform is used to improve the fine matching accuracy. The experimental results show that the average localization precision of the proposed method is 2.24 m, which is 0.33 m better than that of current typical matching methods. In addition, the method has clear advantages in localization robustness and inference efficiency on a Jetson Xavier NX: it matched and localized all images in the dataset and achieved the best localization frequency.
(This article belongs to the Special Issue Computer Vision and Image Processing)

24 pages, 58116 KiB  
Article
Extracting High-Precision Vehicle Motion Data from Unmanned Aerial Vehicle Video Captured under Various Weather Conditions
Remote Sens. 2022, 14(21), 5513; https://doi.org/10.3390/rs14215513 - 01 Nov 2022
Cited by 6 | Viewed by 2131
Abstract
At present, there are many aerial-view datasets that contain motion data from vehicles in a variety of traffic scenarios. However, few datasets have been collected under different weather conditions in an urban mixed-traffic scenario. In this study, we propose a framework for extracting vehicle motion data from UAV videos captured under various weather conditions. Within this framework, we improve YOLOv5 (you only look once) with image-adaptive enhancement for detecting vehicles in different environments. In addition, a new vehicle-tracking algorithm called SORT++ is proposed to extract high-precision vehicle motion data from the detection results. Moreover, we present a new dataset of 7133 traffic images (1311 under sunny, 961 under night, 3366 under rainy, and 1495 under snowy conditions) containing 106,995 vehicles. The images were captured by a UAV to evaluate the proposed method for vehicle orientation detection. In order to evaluate the accuracy of the extracted traffic data, we also present a new dataset of four UAV videos, each having 30,000+ frames, with approximately 3K vehicle trajectories collected under sunny, night, rainy, and snowy conditions, respectively. The experimental results show the high accuracy and stability of the proposed methods.
(This article belongs to the Special Issue Computer Vision and Image Processing)
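SORT-family trackers such as the proposed SORT++ associate new detections with existing tracks frame by frame; a common baseline is greedy matching by descending IoU. The sketch below shows that generic baseline, not the paper's SORT++, whose internals are not detailed in the abstract:

```python
def iou_2d(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def greedy_associate(tracks, detections, iou_min=0.3):
    """Match track boxes to detection boxes greedily by descending IoU.

    Returns (track_index, detection_index) pairs; boxes whose best IoU
    falls below iou_min stay unmatched (new tracks / lost tracks).
    """
    pairs = sorted(((iou_2d(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)),
                   reverse=True)
    used_t, used_d, matches = set(), set(), []
    for iou, ti, di in pairs:
        if iou < iou_min:
            break
        if ti not in used_t and di not in used_d:
            matches.append((ti, di))
            used_t.add(ti)
            used_d.add(di)
    return matches
```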

19 pages, 37981 KiB  
Article
An Adaptive Joint Bilateral Interpolation-Based Color Blending Method for Stitched UAV Images
Remote Sens. 2022, 14(21), 5440; https://doi.org/10.3390/rs14215440 - 29 Oct 2022
Viewed by 1074
Abstract
Given a source UAV (unmanned aerial vehicle) image I_s and a target UAV image I_t, it is a challenging problem to correct the color of all target pixels so that the subjective and objective quality between I_s and I_t can be as consistent as possible. Recently, by referring to all stitching color difference values on the stitching line, a global bilateral joint interpolation-based (GBJI-based) color correction method was proposed. However, because the stitching color difference values may contain both aligned and misaligned stitching pixels, the GBJI-based method suffers from a perceptual artifact near the misaligned stitching pixels. To remedy this artifact, in this paper, we propose an adaptive joint bilateral interpolation-based (AJBI-based) color blending method in which each target pixel adaptively refers only to an adequate local interval of stitching color difference values. Based on several testing stitched UAV images under different brightness and misalignment situations, comprehensive experimental results demonstrate that in terms of PSNR (peak signal-to-noise ratio), SSIM (structural similarity index), and FSIM (feature similarity index), our method achieves higher objective quality and also better perceptual quality, particularly near the misaligned stitching pixels, when compared with the GBJI-based method and other state-of-the-art methods.
(This article belongs to the Special Issue Computer Vision and Image Processing)

18 pages, 4593 KiB  
Article
A Novel Error Criterion of Fundamental Matrix Based on Principal Component Analysis
Remote Sens. 2022, 14(21), 5341; https://doi.org/10.3390/rs14215341 - 25 Oct 2022
Viewed by 906
Abstract
Estimating the fundamental matrix (FM) from known corresponding points is a key step in three-dimensional (3D) scene reconstruction, and its uncertainty directly affects camera calibration and point-cloud calculation. The symmetric epipolar distance is the most popular criterion for estimating FM error, but it depends on the accuracy, number, and distribution of the known corresponding points and is biased. This study focuses on a quantitative error criterion for the FM itself. First, the FM calculation process with known corresponding points is reviewed. Matrix differential theory is then used to derive the covariance equation of FMs in detail. Subsequently, principal component analysis is used to construct a scalar function as a novel error criterion for measuring FM error. Finally, three experiments with different types of stereo images are performed to verify the rationality of the proposed method. The experiments found that the scalar function had approximately a 90% correlation with the Manhattan norm, and greater than 80% with the epipolar geometric distance. Consequently, the proposed method is also appropriate for estimating FM error, in which the error ellipse or normal distribution curve is a reasonable error boundary of the FM. When the error criterion value of this method falls within a normal distribution curve or an error ellipse, its corresponding FM is considered to have less error and to be credible. Otherwise, it may be necessary to recalculate the FM to reconstruct 3D models.
(This article belongs to the Special Issue Computer Vision and Image Processing)
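For reference, the symmetric epipolar distance that the paper cites as the most popular FM error criterion can be computed directly from the matrix and one correspondence. A self-contained sketch using nested lists for F (no third-party libraries):

```python
def symmetric_epipolar_distance(F, x1, x2):
    """Symmetric epipolar distance of a correspondence (x1, x2) under F.

    F is a 3x3 fundamental matrix (nested lists); x1, x2 are (u, v) pixel
    coordinates. With l2 = F x1 the epipolar line in image 2 and
    l1 = F^T x2 the line in image 1, the distance is
    (x2^T F x1)^2 * (1/(l2_a^2 + l2_b^2) + 1/(l1_a^2 + l1_b^2)).
    """
    p1 = (x1[0], x1[1], 1.0)
    p2 = (x2[0], x2[1], 1.0)
    l2 = [sum(F[i][j] * p1[j] for j in range(3)) for i in range(3)]  # F x1
    l1 = [sum(F[i][j] * p2[i] for i in range(3)) for j in range(3)]  # F^T x2
    e = sum(p2[i] * l2[i] for i in range(3))                         # x2^T F x1
    return e * e * (1.0 / (l2[0] ** 2 + l2[1] ** 2)
                    + 1.0 / (l1[0] ** 2 + l1[1] ** 2))

# F for a pure horizontal translation (skew-symmetric form of e = (1,0,0)):
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
```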

19 pages, 28120 KiB  
Article
Real-Time Detection of Winter Jujubes Based on Improved YOLOX-Nano Network
Remote Sens. 2022, 14(19), 4833; https://doi.org/10.3390/rs14194833 - 28 Sep 2022
Cited by 8 | Viewed by 1716
Abstract
Achieving rapid and accurate localization of winter jujubes in trees is an indispensable step in the development of automated harvesting equipment. Unlike larger fruits such as apples, winter jujubes are smaller, grow at a higher density, and suffer from serious occlusion, which imposes higher requirements on identification and positioning. To address these issues, an accurate winter jujube localization method using an improved YOLOX-Nano network is proposed. First, a winter jujube dataset containing a variety of complex scenes, such as backlit and occluded scenes and different fields of view, was established to train our model. Then, to improve its feature learning ability, an attention feature enhancement module was designed to strengthen useful features and weaken irrelevant features. Moreover, DIoU loss was used to optimize training and obtain a more robust model. A 3D positioning error experiment and a comparative experiment were conducted to validate the effectiveness of our method. The comparative experiment results showed that our method outperforms state-of-the-art object detection networks and lightweight networks. Specifically, the precision, recall, and AP of our method reached 93.08%, 87.83%, and 95.56%, respectively. The positioning error experiment results showed that the average positioning errors along the X, Y, and Z coordinate axes were 5.8 mm, 5.4 mm, and 3.8 mm, respectively. The model size is only 4.47 MB and can meet the requirements of winter jujube picking in terms of detection accuracy, positioning error, and deployment on embedded systems.
(This article belongs to the Special Issue Computer Vision and Image Processing)
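The DIoU loss used above to optimize training penalizes both missing overlap and center misalignment: L = 1 - IoU + rho^2 / c^2, where rho is the distance between box centers and c is the diagonal of the smallest enclosing box. A minimal sketch for axis-aligned 2D boxes:

```python
def diou_loss(a, b):
    """DIoU loss for two boxes given as (xmin, ymin, xmax, ymax)."""
    # Intersection-over-union term
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    iou = inter / (area(a) + area(b) - inter)
    # Squared distance between box centers
    ca = ((a[0] + a[2]) / 2, (a[1] + a[3]) / 2)
    cb = ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)
    rho2 = (ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2
    # Squared diagonal of the smallest enclosing box
    ex = max(a[2], b[2]) - min(a[0], b[0])
    ey = max(a[3], b[3]) - min(a[1], b[1])
    c2 = ex ** 2 + ey ** 2
    return 1.0 - iou + rho2 / c2
```

Unlike plain IoU loss, the distance term still provides a gradient when the boxes do not overlap at all, which is why DIoU tends to converge faster on small, dense targets such as jujubes.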

19 pages, 10801 KiB  
Article
BACA: Superpixel Segmentation with Boundary Awareness and Content Adaptation
Remote Sens. 2022, 14(18), 4572; https://doi.org/10.3390/rs14184572 - 13 Sep 2022
Cited by 3 | Viewed by 1668
Abstract
Superpixels aggregate pixels with similar properties, thus reducing the number of image primitives for subsequent advanced computer vision tasks. Nevertheless, existing algorithms are not effective enough at tackling computational redundancy and inaccurate segmentation. To this end, an optimized superpixel generation framework termed Boundary Awareness and Content Adaptation (BACA) is presented. First, an adaptive seed sampling method based on content complexity is proposed for the initialization stage. Unlike conventional uniform mesh initialization, it takes content differentiation into consideration to eliminate redundancy in the seed distribution from the outset. In addition to this efficient initialization strategy, this work also leverages contour prior information to strengthen boundary adherence from whole to part. During the similarity calculation for the unlabeled pixels in the non-iterative clustering framework, a multi-feature associated measurement is put forward to ameliorate the misclassification of boundary pixels. Experimental results indicate that the two optimizations generate a synergistic effect. The integrated BACA achieves an outstanding under-segmentation error (3.34%) on the BSD dataset, exceeding state-of-the-art performance with a minimum number of superpixels (345). Furthermore, it is not limited to image segmentation and can also facilitate remote sensing image analysis.
(This article belongs to the Special Issue Computer Vision and Image Processing)
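Content-adaptive seed initialization allocates more seeds where the image content is more complex. One simple way to realize that budget split is largest-remainder rounding over per-region complexity scores; this is a hypothetical scheme for illustration, not BACA's actual sampling method:

```python
def allocate_seeds(complexities, total_seeds):
    """Distribute superpixel seeds across image regions.

    complexities: one non-negative score per region (e.g. local gradient
    energy). Seeds are allocated proportionally, with largest-remainder
    rounding so the counts sum exactly to total_seeds.
    """
    total = float(sum(complexities))
    quotas = [total_seeds * c / total for c in complexities]
    counts = [int(q) for q in quotas]
    leftover = total_seeds - sum(counts)
    # hand remaining seeds to the regions with the largest fractional parts
    order = sorted(range(len(quotas)),
                   key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts
```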

24 pages, 11251 KiB  
Article
A Specular Highlight Removal Algorithm for Quality Inspection of Fresh Fruits
Remote Sens. 2022, 14(13), 3215; https://doi.org/10.3390/rs14133215 - 04 Jul 2022
Cited by 3 | Viewed by 2086
Abstract
Nondestructive inspection technology based on machine vision can effectively improve the efficiency of fresh fruit quality inspection. However, fruits with smooth skin and little texture are easily affected by specular highlights during image acquisition, resulting in light spots on the surface of the fruit, which severely affect subsequent quality inspection. To address this issue, we propose a new specular highlight removal algorithm based on multi-band polarization imaging. First, we achieve real-time image acquisition by designing a new multi-band polarization imager, which can acquire all the spectral and polarization information in a single image capture. We then propose a joint multi-band-polarization characteristic vector constraint for specular highlight detection, followed by a Max-Min multi-band-polarization differencing scheme combined with ergodic least-squares separation for specular highlight removal; finally, chromaticity consistency regularization is used to compensate for missing details. Experimental results demonstrate that the proposed algorithm can effectively and stably remove specular highlights and provide more accurate information for subsequent fruit quality inspection. In addition, a comparison of algorithm speeds shows that our algorithm achieves a good tradeoff between accuracy and complexity.
(This article belongs to the Special Issue Computer Vision and Image Processing)
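A classic building block behind polarization-based highlight removal is that specular reflection varies strongly across polarization channels while diffuse reflection is nearly constant, so a per-pixel minimum over channels gives a crude diffuse estimate. The sketch below shows only this baseline idea, not the paper's Max-Min differencing scheme:

```python
def min_channel_diffuse(bands):
    """Per-pixel minimum across multi-band/polarization channels.

    bands: list of equally sized 2D intensity grids (nested lists),
    one per band/polarization channel. Returns a crude diffuse
    estimate; residual specular energy remains in the minimum.
    """
    h, w = len(bands[0]), len(bands[0][0])
    return [[min(b[y][x] for b in bands) for x in range(w)]
            for y in range(h)]
```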

24 pages, 9369 KiB  
Article
PFD-SLAM: A New RGB-D SLAM for Dynamic Indoor Environments Based on Non-Prior Semantic Segmentation
Remote Sens. 2022, 14(10), 2445; https://doi.org/10.3390/rs14102445 - 19 May 2022
Cited by 10 | Viewed by 2049
Abstract
Most existing dynamic RGB-D SLAM methods are based on deep learning or mathematical models. Deep learning requires abundant training data, and the diversity of semantic samples and camera motion modes is closely related to the robust detection of moving targets. Mathematical models, in turn, are implemented at the feature level of segmentation, which is likely to cause under- or over-segmentation of dynamic features. To address this problem, unlike most feature-level dynamic segmentation methods based on mathematical models, a non-prior semantic dynamic segmentation method based on a particle filter is proposed in this paper, which aims to attain moving-object segmentation. First, GMS and optical flow are used to calculate an inter-frame difference image, which is treated as the observation measurement in a posterior estimation. Then, the motion equation of the particle filter is established using a Gaussian distribution. Finally, our proposed segmentation method is integrated into the front end of visual SLAM to establish a new dynamic SLAM system, PFD-SLAM. Extensive experiments on the public TUM datasets and real dynamic scenes are conducted to verify the localization accuracy and practical performance of PFD-SLAM. We also compare the experimental results with several state-of-the-art dynamic SLAM methods in terms of two evaluation indexes, RPE and ATE, and provide visual comparisons between the estimated camera trajectories and the ground truth. The comprehensive verification and testing experiments demonstrate that PFD-SLAM achieves better dynamic segmentation results and robust performance.
(This article belongs to the Special Issue Computer Vision and Image Processing)
19 pages, 4385 KiB  
Article
Traffic Anomaly Prediction System Using Predictive Network
Remote Sens. 2022, 14(3), 447; https://doi.org/10.3390/rs14030447 - 18 Jan 2022
Cited by 5 | Viewed by 2142 | Correction
Abstract
Anomaly anticipation in traffic scenarios is one of the primary challenges in action recognition. It is believed that greater accuracy can be obtained by using semantic details and motion information along with the input frames. Most state-of-the-art models extract semantic details and pre-computed optical flow from RGB frames and combine them using deep neural networks. Many previous models fail to extract motion information from pre-processed optical flow. Our study shows that optical flow provides better detection of objects in video streams, which is an essential feature for subsequent accident prediction. In addition, we propose a model that utilizes a recurrent neural network which instantaneously propagates predictive coding errors across layers and time steps. By assessing over time the representations from a pre-trained action recognition model for a given video, the use of pre-processed optical flow as input becomes redundant. Based on the final predictive score, we show the effectiveness of our proposed model on three different anomaly classes, namely Speeding Vehicle, Vehicle Accident, and Close Merging Vehicle, from the state-of-the-art KITTI, D2City, and HTA datasets. Full article
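The predictive-coding idea of scoring anomalies by how badly the next frame is predicted can be sketched in a few lines: prediction error is split into rectified positive and negative parts (as in PredNet-style layers), and a rising mean error over time signals an impending anomaly. This is a hypothetical illustration, not the authors' network; `predictor` stands in for any one-step frame predictor such as a recurrent model.

```python
import numpy as np

def prediction_error(frame, prediction):
    """Rectified positive/negative error channels, PredNet-style."""
    pos = np.maximum(frame - prediction, 0.0)
    neg = np.maximum(prediction - frame, 0.0)
    return np.concatenate([pos, neg], axis=-1)

def anomaly_score(frames, predictor):
    """Mean rectified prediction error per frame transition."""
    scores = []
    prev = frames[0]
    for frame in frames[1:]:
        pred = predictor(prev)  # one-step prediction from the previous frame
        scores.append(prediction_error(frame, pred).mean())
        prev = frame
    return np.array(scores)

# Toy usage with an identity predictor ("next frame = current frame"):
# a sudden scene change yields a spike in the score.
frames = [np.zeros((4, 4, 1)), np.zeros((4, 4, 1)), np.ones((4, 4, 1))]
scores = anomaly_score(frames, predictor=lambda f: f)
```

In this toy run the first transition scores 0 (nothing changed) while the abrupt second transition scores 0.5, which is the kind of jump a final predictive score would flag.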
(This article belongs to the Special Issue Computer Vision and Image Processing)
27 pages, 17958 KiB  
Article
Integrated Preprocessing of Multitemporal Very-High-Resolution Satellite Images via Conjugate Points-Based Pseudo-Invariant Feature Extraction
Remote Sens. 2021, 13(19), 3990; https://doi.org/10.3390/rs13193990 - 06 Oct 2021
Cited by 12 | Viewed by 2432
Abstract
Multitemporal very-high-resolution (VHR) satellite images are used as core data in the field of remote sensing because they express the topography and features of a region of interest in detail. However, geometric misalignment and radiometric dissimilarity occur when acquiring multitemporal VHR satellite images owing to external environmental factors, and these errors cause various inaccuracies, thereby hindering the effective use of such images. These errors can be minimized by applying preprocessing methods such as image registration and relative radiometric normalization (RRN). However, because the data used in image registration and RRN differ, data consistency and computational efficiency are impaired, particularly when processing large amounts of data, such as a large volume of multitemporal VHR satellite images. To resolve these issues, we propose an integrated preprocessing method that extracts the pseudo-invariant features (PIFs) used for RRN from the conjugate points (CPs) extracted for image registration. To this end, image registration is performed using CPs extracted with the speeded-up robust features (SURF) algorithm. Then, PIFs are extracted from the CPs by removing vegetation areas and applying a region-growing algorithm. Experiments were conducted on two sites constructed under different acquisition conditions to confirm the robustness of the proposed method. Various analyses based on visual and quantitative evaluation of the experimental results were performed from geometric and radiometric perspectives. The results demonstrate the successful integration of the image registration and RRN preprocessing steps, achieving reasonable and stable performance. Full article
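The RRN step described here is commonly realized as a per-band linear gain/offset fit over the PIF pixels. The sketch below shows that fit in isolation (the SURF matching, vegetation removal, and region-growing steps are omitted); it is an illustrative assumption about the normalization model, not the authors' code, and all names are hypothetical.

```python
import numpy as np

def relative_radiometric_normalization(subject, reference, pif_mask):
    """Fit a linear gain/offset on pseudo-invariant feature (PIF) pixels
    and radiometrically normalize the subject image to the reference."""
    s = subject[pif_mask].astype(float)
    r = reference[pif_mask].astype(float)
    gain, offset = np.polyfit(s, r, 1)  # least squares: r ≈ gain * s + offset
    return gain * subject + offset

# Toy single-band example: subject differs from reference by gain 2, offset 10
ref = np.arange(100, dtype=float).reshape(10, 10)
subj = (ref - 10.0) / 2.0
mask = np.zeros((10, 10), dtype=bool)
mask[::2, ::2] = True  # sparse PIF pixels (e.g. roofs, bare soil)
normalized = relative_radiometric_normalization(subj, ref, mask)
```

Because the toy radiometric difference is exactly linear, the normalized subject matches the reference; on real imagery the PIF selection quality determines how well the fit generalizes off the mask.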
(This article belongs to the Special Issue Computer Vision and Image Processing)
19 pages, 11038 KiB  
Article
Motion Estimation Using Region-Level Segmentation and Extended Kalman Filter for Autonomous Driving
Remote Sens. 2021, 13(9), 1828; https://doi.org/10.3390/rs13091828 - 07 May 2021
Cited by 10 | Viewed by 2564
Abstract
Motion estimation is crucial for predicting where other traffic participants will be at a given time and, accordingly, for planning the route of the ego-vehicle. This paper presents a novel approach to estimating the motion state using region-level instance segmentation and an extended Kalman filter (EKF). Motion estimation involves three stages: object detection, tracking, and parameter estimation. We first use region-level segmentation to accurately locate the object region for the latter two stages. The region-level segmentation combines color, temporal (optical flow), and spatial (depth) information as the basis for segmentation, using super-pixels and a Conditional Random Field. Optical flow is then employed to track the feature points within the object area. In the parameter estimation stage, we develop a relative motion model of the ego-vehicle and the object, and accordingly establish an EKF model for point tracking and parameter estimation. The EKF model integrates the ego-motion, optical flow, and disparity to generate optimized motion parameters. During tracking and parameter estimation, we apply an edge-point constraint and a consistency constraint to eliminate outliers among the tracked points, so that the feature points used for tracking are ensured to lie within the object body and the parameter estimates are refined using inner points. Experiments conducted on the KITTI dataset demonstrate that our method performs excellently and outperforms other state-of-the-art methods in both object segmentation and parameter estimation. Full article
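For a one-dimensional constant-velocity target observed through position measurements, the EKF machinery described in this abstract reduces to the standard linear Kalman filter below. This is a generic textbook sketch with invented noise parameters, not the authors' model, which additionally fuses ego-motion, optical flow, and disparity.

```python
import numpy as np

def kf_step(x, P, z, dt=0.1, q=1e-3, r=0.05):
    """One predict/update cycle of a constant-velocity Kalman filter
    (the linear special case of an EKF) on state x = [position, velocity]."""
    F = np.array([[1.0, dt], [0.0, 1.0]])  # state transition
    H = np.array([[1.0, 0.0]])             # we observe position only
    Q = q * np.eye(2)                      # process noise covariance
    R = np.array([[r]])                    # measurement noise covariance
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Track an object moving at 2 m/s from position measurements
x, P = np.array([0.0, 0.0]), np.eye(2)
for k in range(1, 60):
    x, P = kf_step(x, P, np.array([2.0 * 0.1 * k]))
```

After a few dozen updates the velocity estimate converges to 2 m/s even though velocity is never measured directly; the paper's EKF plays the same role for the relative ego/object motion parameters.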
(This article belongs to the Special Issue Computer Vision and Image Processing)
Review

Jump to: Research, Other

45 pages, 6205 KiB  
Review
Computer Vision and Pattern Recognition for the Analysis of 2D/3D Remote Sensing Data in Geoscience: A Survey
Remote Sens. 2022, 14(23), 6017; https://doi.org/10.3390/rs14236017 - 27 Nov 2022
Cited by 7 | Viewed by 2898
Abstract
Historically, geoscience has been a prominent domain for applications of computer vision and pattern recognition. The numerous challenges associated with geoscience-related imaging data, which include poor imaging quality, noise, missing values, a lack of precise boundaries defining various geoscience objects and processes, as well as non-stationarity in space and/or time, provide an ideal test bed for advanced computer vision techniques. On the other hand, developments in pattern recognition, especially with the rapid evolution of powerful graphical processing units (GPUs) and the subsequent deep learning breakthrough, enable valuable computational tools that can aid geoscientists in important problems such as land cover mapping, target detection, pattern mining in imaging data, boundary extraction, and change detection. In this landscape, classical computer vision approaches, such as active contours, superpixels, or descriptor-guided classification, provide alternatives that remain relevant when domain-expert labelling of large sample collections is not feasible. This issue persists despite efforts toward the standardization of geoscience datasets, such as Microsoft's AI for Earth initiative or Google Earth. This work covers developments in applications of computer vision and pattern recognition to geoscience-related imaging data, following both pre-deep-learning and post-deep-learning paradigms. Various imaging modalities are addressed, including multispectral images, hyperspectral images (HSIs), synthetic aperture radar (SAR) images, and point clouds obtained from light detection and ranging (LiDAR) sensors or digital elevation models (DEMs). Full article
(This article belongs to the Special Issue Computer Vision and Image Processing)
Other

Jump to: Research, Review

7 pages, 1364 KiB  
Correction
Correction: Riaz et al. Traffic Anomaly Prediction System Using Predictive Network. Remote Sens. 2022, 14, 447
Remote Sens. 2023, 15(12), 3019; https://doi.org/10.3390/rs15123019 - 09 Jun 2023
Viewed by 484
Abstract
Since the article “Traffic Anomaly Prediction System Using Predictive Network” by Riaz et al. [...] Full article
(This article belongs to the Special Issue Computer Vision and Image Processing)