Search Results (23)

Search Parameters:
Keywords = earth’s mover distance (EMD)

21 pages, 23041 KiB  
Article
An Elastic Fine-Tuning Dual Recurrent Framework for Non-Rigid Point Cloud Registration
by Munan Yuan, Xiru Li and Haibao Tan
Sensors 2025, 25(11), 3525; https://doi.org/10.3390/s25113525 - 3 Jun 2025
Viewed by 435
Abstract
Non-rigid transformation builds on rigid transformation by adding distortions to form a more complex but more consistent common scene. Many advanced non-rigid alignment models are implemented using supervised learning; however, the large number of labels required for training makes them difficult to apply. Here, an elastic fine-tuning dual recurrent computation for unsupervised non-rigid registration is proposed. First, we transform a non-rigid transformation into a series of combinations of rigid transformations using an outer recurrent computational network. Then, the inner loop layer computes elastic-controlled rigid incremental transformations by controlling the threshold to obtain a finely coherent rigid transformation. Finally, we design and implement loss functions that constrain deformations and keep transformations as rigid as possible. Extensive experiments validate that the proposed method achieves state-of-the-art performance, with an earth mover’s distance (EMD) of 0.01219 in non-rigid scenes and a root mean square error (RMSE) of 0.0153 in rigid scenes. Full article

29 pages, 8212 KiB  
Article
ApproxGeoMap: An Efficient System for Generating Approximate Geo-Maps from Big Geospatial Data with Quality of Service Guarantees
by Reem Abdelaziz Alshamsi, Isam Mashhour Al Jawarneh, Luca Foschini and Antonio Corradi
Computers 2025, 14(2), 35; https://doi.org/10.3390/computers14020035 - 23 Jan 2025
Cited by 1 | Viewed by 1518
Abstract
Timely, region-based geo-maps like choropleths are essential for smart city applications like traffic monitoring and urban planning because they can reveal statistical patterns in geotagged data. However, because data overloading is brought on by the quick inflow of massive geospatial data, creating these visualizations in real time presents serious difficulties. This paper introduces ApproxGeoMap, a novel system designed to efficiently generate approximate geo-maps from fast-arriving georeferenced data streams. ApproxGeoMap employs a stratified spatial sampling method, leveraging geohash tessellation and Earth Mover’s Distance (EMD) to maintain both accuracy and processing speed. We developed a prototype system and tested it on real-world smart city datasets, demonstrating that ApproxGeoMap meets time-based and accuracy-based quality of service (QoS) constraints. Results indicate that ApproxGeoMap significantly enhances efficiency in both running time and map accuracy, offering a reliable solution for high-speed data environments where traditional methods fall short. Full article
(This article belongs to the Special Issue Feature Papers in Computers 2024)

24 pages, 8360 KiB  
Article
An Approach to Fall Detection Using Statistical Distributions of Thermal Signatures Obtained by a Stand-Alone Low-Resolution IR Array Sensor Device
by Nishat Tasnim Newaz and Eisuke Hanada
Sensors 2025, 25(2), 504; https://doi.org/10.3390/s25020504 - 16 Jan 2025
Cited by 1 | Viewed by 1166
Abstract
Infrared array sensor-based fall detection and activity recognition systems have gained momentum as promising solutions for enhancing healthcare monitoring and safety in various environments. Unlike camera-based systems, which can be privacy-intrusive, IR array sensors offer a non-invasive, reliable approach for fall detection and activity recognition while preserving privacy. This work proposes a novel method to distinguish between normal motion and fall incidents by analyzing thermal patterns captured by infrared array sensors. Data were collected from two subjects who performed a range of activities of daily living, including sitting, standing, walking, and falling. Data for each state were collected over multiple trials and extended periods to ensure robustness and variability in the measurements. The collected thermal data were compared with multiple statistical distributions using Earth Mover’s Distance. Experimental results showed that normal activities exhibited low EMD values with Beta and Normal distributions, suggesting that these distributions closely matched the thermal patterns associated with regular movements. Conversely, fall events exhibited high EMD values, indicating greater variability in thermal signatures. The system was implemented using a Raspberry Pi-based stand-alone device that provides a cost-effective solution without the need for additional computational devices. This study demonstrates the effectiveness of using IR array sensors for non-invasive, real-time fall detection and activity recognition, which offer significant potential for improving healthcare monitoring and ensuring the safety of fall-prone individuals. Full article
(This article belongs to the Special Issue Sensors for Human Posture and Movement)
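The comparison described in this abstract, scoring how far measured data sit from a fitted reference distribution, can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the readings are synthetic, only a Normal fit is shown, `emd_to_fitted_normal` is a hypothetical helper name, and SciPy's `wasserstein_distance` (the one-dimensional EMD) stands in for whatever EMD routine the paper used.

```python
import numpy as np
from scipy import stats


def emd_to_fitted_normal(samples, n_draws=5000, seed=0):
    """EMD between samples and draws from a Normal fitted to those samples.

    Hypothetical helper for illustration only.
    """
    loc, scale = stats.norm.fit(samples)  # maximum-likelihood mean and std
    rng = np.random.default_rng(seed)
    reference = stats.norm.rvs(loc=loc, scale=scale, size=n_draws, random_state=rng)
    # 1D earth mover's distance between the two empirical distributions
    return stats.wasserstein_distance(samples, reference)


# Synthetic stand-ins for thermal readings (assumed values, not real sensor data).
rng = np.random.default_rng(42)
steady = rng.normal(25.0, 0.5, size=2000)            # unimodal, "normal activity"-like
fall = np.concatenate([rng.normal(24.0, 0.3, size=1000),
                       rng.normal(30.0, 0.3, size=1000)])  # bimodal, "fall"-like

steady_emd = emd_to_fitted_normal(steady)
fall_emd = emd_to_fitted_normal(fall)
print(f"steady: {steady_emd:.3f}, fall-like: {fall_emd:.3f}")
```

In this toy setup the bimodal readings are poorly described by a single Normal, so their EMD comes out much larger than that of the unimodal readings, which mirrors the qualitative pattern the abstract reports.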

24 pages, 2803 KiB  
Article
Explainable Self-Supervised Dynamic Neuroimaging Using Time Reversal
by Zafar Iqbal, Md. Mahfuzur Rahman, Usman Mahmood, Qasim Zia, Zening Fu, Vince D. Calhoun and Sergey Plis
Brain Sci. 2025, 15(1), 60; https://doi.org/10.3390/brainsci15010060 - 11 Jan 2025
Viewed by 1053
Abstract
Objective: Functional magnetic resonance imaging data pose significant challenges due to their inherently noisy and complex nature, making traditional statistical models less effective in capturing predictive features. While deep learning models offer superior performance through their non-linear capabilities, they often lack transparency, reducing trust in their predictions. This study introduces the Time Reversal (TR) pretraining method to address these challenges. TR aims to learn temporal dependencies in data, leveraging large datasets for pretraining and applying this knowledge to improve schizophrenia classification on smaller datasets. Methods: We pretrained an LSTM-based model with attention using the TR approach, focusing on learning the direction of time in fMRI data, achieving over 98 % accuracy on HCP and UK Biobank datasets. For downstream schizophrenia classification, TR-pretrained weights were transferred to models evaluated on FBIRN, COBRE, and B-SNIP datasets. Saliency maps were generated using Integrated Gradients (IG) to provide post hoc explanations for pretraining, while Earth Mover’s Distance (EMD) quantified the temporal dynamics of salient features in the downstream tasks. Results: TR pretraining significantly improved schizophrenia classification performance across all datasets: median AUC scores increased from 0.7958 to 0.8359 (FBIRN), 0.6825 to 0.7778 (COBRE), and 0.6341 to 0.7224 (B-SNIP). The saliency maps revealed more concentrated and biologically meaningful salient features along the time axis, aligning with the episodic nature of schizophrenia. TR consistently outperformed baseline pretraining methods, including OCP and PCL, in terms of AUC, balanced accuracy, and robustness. Conclusions: This study demonstrates the dual benefits of the TR method: enhanced predictive performance and improved interpretability. 
By aligning model predictions with meaningful temporal patterns in brain activity, TR bridges the gap between deep learning and clinical relevance. These findings emphasize the potential of explainable AI tools for aiding clinicians in diagnostics and treatment planning, especially in conditions characterized by disrupted temporal dynamics. Full article
(This article belongs to the Special Issue Application of Brain Imaging in Mental Illness)

26 pages, 11943 KiB  
Article
3D Point Cloud Fusion Method Based on EMD Auto-Evolution and Local Parametric Network
by Wen Chen, Hao Chen and Shuting Yang
Remote Sens. 2024, 16(22), 4219; https://doi.org/10.3390/rs16224219 - 12 Nov 2024
Cited by 4 | Viewed by 1472
Abstract
Although the development of high-resolution remote sensing satellite technology has made it possible to reconstruct the 3D structure of object-level features using satellite imagery, the results from a single reconstruction are often insufficient to comprehensively describe the 3D structure of the target. Therefore, developing an effective 3D point cloud fusion method can fully utilize information from multiple observations to improve the accuracy of 3D reconstruction. To this end, this paper addresses the problems of shape distortion and sparse point cloud density in existing 3D point cloud fusion methods by proposing a 3D point cloud fusion method based on Earth mover’s distance (EMD) auto-evolution and local parameterization network. Our method is divided into two stages. In the first stage, EMD is introduced as a key metric for evaluating the fusion results, and a point cloud fusion method based on EMD auto-evolution is constructed. The method uses an alternating iterative technique to sequentially update the variables and produce an initial fusion result. The second stage focuses on point cloud optimization by constructing a local parameterization network for the point cloud, mapping the upsampled point cloud in the 2D parameter domain back to the 3D space to complete the optimization. Through these two steps, the method achieves the fusion of two sets of non-uniform point cloud data obtained from satellite stereo images into a single, denser 3D point cloud that more closely resembles the true target shape. Experimental results demonstrate that our fusion method outperforms other classical comparison algorithms for targets such as buildings, planes, and ships, and achieves a fused RMSE of approximately 2 m and an EMD accuracy better than 0.5. Full article

18 pages, 6553 KiB  
Article
Digitized Seedbed Soil Quality Assessment from Worn and Edge Hardened Cultivator Sweeps
by Jong-Myung Noh, Lijie Liu, Mehari Z. Tekeste, Qing Li, Jerry Hatfield and David Eisenmann
Sensors 2024, 24(21), 6951; https://doi.org/10.3390/s24216951 - 29 Oct 2024
Viewed by 1128
Abstract
Tillage tools for seedbed soil management are often subjected to low stress abrasion wear, which could negatively affect seedbed quality and crop productivity. Limited studies exist that quantify the effects of worn tillage tools on seedbed quality and crop yield. This research investigated the influence of tillage tool wear on seedbed preparation by evaluating the effect of cultivator sweep wear on soil tilth utilizing a light detection and ranging (LiDAR) sensor. The framework consists of a seedbed tillage field experiment using a Completely Randomized Design (CRD) experiment in six replicates of two-tillage treatments (new and worn cultivator sweeps). After seedbed tillage, loosely tilled soil aggregates were removed to expose the seedbed soil profile, and then seedbed roughness statistical measures were estimated from LiDAR-scanned seedbed soil surface. Three statistical analyses (Analysis of Variance (ANOVA), Kolmogorov–Smirnov (KS), and Earth Mover’s Distance (EMD)) were compared to quantitatively evaluate the soil roughness estimated from the LiDAR seedbed surface data. Seedbed prepared by new and worn cultivator sweeps showed significant differences (p < 0.05) in soil roughness variables of standard deviation, coefficient of variation, and kurtosis. Data analysis from the ANOVA and KS methods revealed that LiDAR-extracted soil roughness patterns were statistically influenced by tillage treatment. EMD analysis detected noticeable disparities between the tillage treatments and new versus worn cultivator sweeps. This study concludes that tillage tool wear substantively affects seedbed quality, as evidenced by LiDAR soil profile estimated attributes of soil roughness and three statistical methods (ANOVA, KS, and EMD). Our study supports the adoption of LiDAR technology for seedbed management, highlighting its applicability to evaluate seedbed quality that accounts for the wear life cycle of cultivator sweeps. Full article
(This article belongs to the Section Smart Agriculture)

27 pages, 79059 KiB  
Article
Unsupervised Noise-Resistant Remote-Sensing Image Change Detection: A Self-Supervised Denoising Network-, FCM_SICM-, and EMD Metric-Based Approach
by Jiangling Xie, Yikun Li, Shuwen Yang and Xiaojun Li
Remote Sens. 2024, 16(17), 3209; https://doi.org/10.3390/rs16173209 - 30 Aug 2024
Viewed by 1582
Abstract
The detection of change in remote-sensing images is broadly applicable to many fields. In recent years, both supervised and unsupervised methods have demonstrated excellent capacity to detect changes in high-resolution images. However, most of these methods are sensitive to noise, and their performance significantly deteriorates when dealing with remote-sensing images that have been contaminated by mixed random noises. Moreover, supervised methods require that samples are manually labeled for training, which is time-consuming and labor-intensive. This study proposes a new unsupervised change-detection (CD) framework that is resilient to mixed random noise called self-supervised denoising network-based unsupervised change-detection coupling FCM_SICM and EMD (SSDNet-FSE). It consists of two components, namely a denoising module and a CD module. The proposed method first utilizes a self-supervised denoising network with real 3D weight attention mechanisms to reconstruct noisy images. Then, a noise-resistant fuzzy C-means clustering algorithm (FCM_SICM) is used to decompose the mixed pixels of reconstructed images into multiple signal classes by exploiting local spatial information, spectral information, and membership linkage. Next, the noise-resistant Earth mover’s distance (EMD) is used to calculate the distance between signal-class centers and the corresponding fuzzy memberships of bitemporal pixels and generate a map of the magnitude of change. Finally, automatic thresholding is undertaken to binarize the change-magnitude map into the final CD map. The results of experiments conducted on five public datasets prove the superior noise-resistant performance of the proposed method over six state-of-the-art CD competitors and confirm its effectiveness and potential for practical application. Full article

18 pages, 34513 KiB  
Article
Automatic Removal of Non-Architectural Elements in 3D Models of Historic Buildings with Language Embedded Radiance Fields
by Alexander Rusnak, Bryan G. Pantoja-Rosero, Frédéric Kaplan and Katrin Beyer
Heritage 2024, 7(6), 3332-3349; https://doi.org/10.3390/heritage7060157 - 18 Jun 2024
Cited by 1 | Viewed by 1438
Abstract
Neural radiance fields have emerged as a dominant paradigm for creating complex 3D environments incorporating synthetic novel views. However, 3D object removal applications utilizing neural radiance fields have lagged behind in effectiveness, particularly when open set queries are necessary for determining the relevant objects. One such application area is in architectural heritage preservation, where the automatic removal of non-architectural objects from 3D environments is necessary for many downstream tasks. Furthermore, when modeling occupied buildings, it is crucial for modeling techniques to be privacy preserving by default; this also motivates the removal of non-architectural elements. In this paper, we propose a pipeline for the automatic creation of cleaned, architectural structure only point clouds utilizing a language embedded radiance field (LERF) with a specific application toward generating suitable point clouds for the structural integrity assessment of occupied buildings. We then validated the efficacy of our approach on the rooms of the historic Sion hospital, a national historic monument in Valais, Switzerland. By using our automatic removal pipeline on the point clouds of rooms filled with furniture, we decreased the average earth mover’s distance (EMD) to the ground truth point clouds of the physically emptied rooms by 31 percent. The success of our research points the way toward new paradigms in architectural modeling and cultural preservation. Full article
(This article belongs to the Special Issue 3D Reconstruction of Cultural Heritage and 3D Assets Utilisation)

15 pages, 5200 KiB  
Article
Few-Shot Image Classification Based on Swin Transformer + CSAM + EMD
by Huadong Sun, Pengyi Zhang, Xu Zhang and Xiaowei Han
Electronics 2024, 13(11), 2121; https://doi.org/10.3390/electronics13112121 - 29 May 2024
Cited by 2 | Viewed by 1460
Abstract
In few-shot image classification (FSIC), the feature extraction module of traditional convolutional neural networks is often constrained by the local nature of the convolutional kernel. As a result, it becomes challenging to handle global information and long-distance dependencies effectively. To address this problem, this paper proposes an FSIC method, STCE, that integrates the Swin Transformer, CSAM, and Earth Mover’s Distance (EMD). We use the Swin Transformer network for image feature extraction and apply CSAM attention weighting to the output feature map, and we adopt the EMD algorithm to generate the optimal matching flow between the structural units, minimizing the matching cost. This approach allows for a more precise representation of the classification distance between images. We have conducted numerous experiments to validate the effectiveness of our algorithm. On three commonly used few-shot datasets, namely mini-ImageNet, tiered-ImageNet, and FC100, the one-shot and five-shot accuracy reaches the state of the art (SOTA) in FSIC: mini-ImageNet achieves an accuracy of 98.65 ± 0.1% for one-shot and 99.6 ± 0.2% for five-shot tasks, tiered-ImageNet achieves 91.6 ± 0.1% for one-shot and 96.55 ± 0.27% for five-shot tasks, and FC100 achieves 64.1 ± 0.3% for one-shot and 79.8 ± 0.69% for five-shot tasks. On two further commonly used few-shot datasets, CUB and CIFAR-FS, CUB achieves an accuracy of 83.1 ± 0.4% for one-shot and 92.88 ± 0.4% for five-shot tasks, while CIFAR-FS achieves 86.95 ± 0.2% for one-shot and 94 ± 0.4% for five-shot tasks. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)

24 pages, 5751 KiB  
Article
HMFN-FSL: Heterogeneous Metric Fusion Network-Based Few-Shot Learning for Crop Disease Recognition
by Wenbo Yan, Quan Feng, Sen Yang, Jianhua Zhang and Wanxia Yang
Agronomy 2023, 13(12), 2876; https://doi.org/10.3390/agronomy13122876 - 23 Nov 2023
Cited by 4 | Viewed by 1747
Abstract
The high performance of deep learning networks relies mainly on massive data. However, collecting enough samples of crop disease is impractical, which significantly limits the intelligent diagnosis of diseases. In this study, we propose Heterogeneous Metric Fusion Network-based Few-Shot Learning (HMFN-FSL), which aims to recognize crop diseases with unseen categories using only a small number of labeled samples. Specifically, CBAM (Convolutional Block Attention Module) was embedded in the feature encoders to improve the feature representation capability. Second, an improved few-shot learning network, namely HMFN-FSL, was built by fusing three metric networks (Prototypical Network, Matching Network, and DeepEMD (Differentiable Earth Mover’s Distance)) under the framework of meta-learning, which solves the problem of the insufficient accuracy of a single metric model. Finally, pre-training and meta-training strategies were optimized to improve the ability to generalize to new tasks in meta-testing. In this study, two datasets named Plantvillage and Field-PV (covering 38 categories of 14 crops and containing 50,403 and 665 images, respectively) are used for extensive comparison and ablation experiments. The results show that the HMFN-FSL proposed in this study outperforms the original metric networks and other state-of-the-art FSL methods. HMFN-FSL achieves 91.21% and 98.29% accuracy for crop disease recognition on 5way-1shot, 5way-5shot tasks on the Plantvillage dataset. The accuracy is improved by 14.86% and 3.96%, respectively, compared to the state-of-the-art method (DeepEMD) in past work. Furthermore, HMFN-FSL was still robust on the field scenes dataset (Field-PV), with average recognition accuracies of 73.80% and 85.86% on 5way-1shot, 5way-5shot tasks, respectively. In addition, domain variation and fine granularity directly affect the performance of the model. 
In conclusion, the few-shot method proposed in this study for crop disease recognition not only performs well in laboratory scenes but also remains effective in field scenes. Our results outperform those of existing related works. This study provides technical references for subsequent few-shot disease recognition in complex field environments. Full article
(This article belongs to the Special Issue Computer Vision and Deep Learning Technology in Agriculture)

16 pages, 2702 KiB  
Article
Wavelet and Earth Mover’s Distance Coupling Denoising Techniques
by Zhihua Zhang, Xudong Xu and M. James C. Crabbe
Electronics 2023, 12(17), 3588; https://doi.org/10.3390/electronics12173588 - 25 Aug 2023
Viewed by 1463
Abstract
The widely used wavelet-thresholding techniques (DWT-H and DWT-S) have a near-optimal behavior that cannot be enhanced by any local denoising filter, but they cannot utilize the similarity of small-size image patches to enhance the denoising performance. Two of the latest improvements (WNLM and NLMW) introduced the Euclidean distance to measure the similarity of image patches, and then used the non-local means of similar patches for further denoising. Since the Euclidean distance is not a good similarity measurement, these two improvements are limited. In this study, we introduced the earth mover’s distance (EMD) as the similarity measure of small-scale patches within the wavelet sub-bands of noisy images. Moreover, at higher noise levels, we further incorporated joint bilateral filtering, which can filter both the spatial domain and the intensity domain of images. Denoising simulation experiments on BSDS500 demonstrated that our algorithm outperformed the DWT-H, DWT-S, WNLM, and NLMW algorithms by 4.197 dB, 3.326 dB, 2.097 dB, and 1.162 dB, respectively, in terms of the average PSNR, and by 0.230, 0.213, 0.132, and 0.085, respectively, in terms of the average SSIM. Full article
(This article belongs to the Section Computer Science & Engineering)

14 pages, 3584 KiB  
Article
Graph-Regularized, Sparsity-Constrained Non-Negative Matrix Factorization with Earth Mover’s Distance Metric
by Shunli Li, Linzhang Lu, Qilong Liu and Zhen Chen
Mathematics 2023, 11(8), 1894; https://doi.org/10.3390/math11081894 - 17 Apr 2023
Cited by 2 | Viewed by 2013
Abstract
Non-negative matrix factorization (NMF) is widely used as a powerful matrix factorization tool in data representation. However, traditional NMF, measured by the Euclidean distance or the Kullback–Leibler divergence, does not take into account the intrinsic geometric information of the dataset and cannot measure the distance between samples well. To remedy these defects, in this paper we propose an NMF method that uses the Earth mover’s distance as its metric, GSNMF-EMD for short. It combines graph regularization and L1/2 smooth constraints. The GSNMF-EMD method takes into account the intrinsic geometric information of the dataset and can produce sparser and more stable local solutions. Experiments on two specific image datasets showed that the proposed method outperforms related state-of-the-art methods. Full article

14 pages, 1058 KiB  
Article
An Improved SVM with Earth Mover’s Distance Regularization and Its Application in Pattern Recognition
by Rui Feng, Haitao Dong, Xuri Li, Zhaochuang Gu, Runyang Tian and Houde Li
Electronics 2023, 12(3), 645; https://doi.org/10.3390/electronics12030645 - 28 Jan 2023
Cited by 1 | Viewed by 1602
Abstract
A support vector machine (SVM) aims to achieve an optimal hyperplane with a maximum interclass margin and has been widely utilized in pattern recognition. Traditionally, an SVM mainly considers the separability of boundary points (i.e., support vectors), while the underlying data structure information is commonly ignored. In this paper, an improved support vector machine with earth mover’s distance (EMD-SVM) is proposed. It can be regarded as an improved generalization of the standard SVM, and can automatically learn the distribution between the classes. To validate its performance, we discuss the necessity of the structural information of EMD-SVM in the linear and nonlinear cases, respectively. Experimental validation was designed and conducted in different application fields, and the results show its superior and robust performance. Full article
(This article belongs to the Special Issue Machine Learning for Radar and Communication Signal Processing)

13 pages, 2112 KiB  
Article
A Single Stage and Single View 3D Point Cloud Reconstruction Network Based on DetNet
by Bin Li, Shiao Zhu and Yi Lu
Sensors 2022, 22(21), 8235; https://doi.org/10.3390/s22218235 - 27 Oct 2022
Cited by 10 | Viewed by 4577
Abstract
It is a challenging problem to infer objects with reasonable shapes and appearance from a single picture. Existing research often pays more attention to the structure of the point cloud generation network, while neglecting both the feature extraction of 2D images and the reduction of loss during feature propagation in the network. In this paper, a single-stage and single-view 3D point cloud reconstruction network, 3D-SSRecNet, is proposed. The proposed 3D-SSRecNet is a simple single-stage network composed of a 2D image feature extraction network and a point cloud prediction network. The single-stage network structure can reduce the loss of the extracted 2D image features. The 2D image feature extraction network takes DetNet as the backbone. DetNet can extract more details from 2D images. In order to generate point clouds with better shape and appearance, in the point cloud prediction network, the exponential linear unit (ELU) is used as the activation function, and the joint function of chamfer distance (CD) and Earth mover’s distance (EMD) is used as the loss function of 3D-SSRecNet. In order to verify the effectiveness of 3D-SSRecNet, we conducted a series of experiments on the ShapeNet and Pix3D datasets. The experimental results measured by CD and EMD have shown that 3D-SSRecNet outperforms the state-of-the-art reconstruction methods. Full article
(This article belongs to the Special Issue Intelligent Point Cloud Processing, Sensing and Understanding)
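A joint chamfer-plus-EMD objective of the kind this abstract mentions can be sketched for small point clouds as follows. This is a rough illustration under stated assumptions rather than the 3D-SSRecNet loss itself: the function names and the `emd_weight` parameter are hypothetical, both clouds are assumed to have the same number of points with uniform weights, unsquared Euclidean distances are used, and the EMD is computed exactly by optimal one-to-one matching instead of the approximate, differentiable solvers typically used in training.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def chamfer_distance(a, b):
    """Mean nearest-neighbour distance from a to b plus from b to a."""
    d = cdist(a, b)  # pairwise Euclidean distances, shape (len(a), len(b))
    return d.min(axis=1).mean() + d.min(axis=0).mean()


def earth_movers_distance(a, b):
    """Exact EMD for equal-size, uniformly weighted clouds via optimal matching."""
    d = cdist(a, b)
    rows, cols = linear_sum_assignment(d)  # minimum-cost one-to-one assignment
    return d[rows, cols].mean()


def joint_loss(pred, target, emd_weight=1.0):
    """Hypothetical combination: CD plus a weighted EMD term."""
    return chamfer_distance(pred, target) + emd_weight * earth_movers_distance(pred, target)


rng = np.random.default_rng(0)
cloud = rng.random((64, 3))                    # toy "ground truth" cloud
shifted = cloud + np.array([1.0, 0.0, 0.0])    # same shape, translated by one unit

print(joint_loss(cloud, cloud))
print(chamfer_distance(cloud, shifted), earth_movers_distance(cloud, shifted))
```

For a pure translation the EMD equals the length of the shift, while the chamfer term can be smaller because each point may pick whichever neighbour happens to be closest; this is one common motivation for combining the two measures.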

26 pages, 13027 KiB  
Article
View-Agnostic Point Cloud Generation for Occlusion Reduction in Aerial Lidar
by Nina Singer and Vijayan K. Asari
Remote Sens. 2022, 14(13), 2955; https://doi.org/10.3390/rs14132955 - 21 Jun 2022
Cited by 5 | Viewed by 3159
Abstract
Occlusions are one of the leading causes of data degradation in lidar. The presence of occlusions reduces the overall aesthetic quality of a point cloud, creating a signature that is specific to that viewpoint and sensor modality. Typically, datasets consist of a series of point clouds with one type of sensor and a limited range of viewpoints. Therefore, when training on a dataset with a particular signature, it is challenging to infer scenes outside of the original range of viewpoints in the training dataset. This work develops a generative network that can predict the area in which an occlusion occurs and furnish the missing points. The output is a complete point cloud that is a more general representation and agnostic to the original viewpoint. We can then use the resulting point cloud as an input for a secondary method such as semantic or instance segmentation. We propose a learned sampling technique that uses the features to inform the point sampling instead of relying strictly on spatial information. We also introduce a new network structure that considers multiple point locations and augmentations to generate parallel features. The network is tested against other methods using our aerial occlusion dataset, DALES Viewpoints Version 2, and also against other point cloud completion networks on the Point Cloud Network (PCN) dataset. We show that it reduces occlusions visually and outperforms state-of-the-art point cloud completion networks in both Chamfer distance and Earth Mover’s Distance (EMD) metrics. We also show that using our occlusion reduction method as a pre-processing step improves semantic segmentation results compared to the same scenes processed without using our method. Full article