Search Results (32)

Search Parameters:
Keywords = convolutional sparse coding

28 pages, 10772 KB  
Article
PBC-Transformer: Interpreting Poultry Behavior Classification Using Image Caption Generation Techniques
by Jun Li, Bing Yang, Jiaxin Liu, Felix Kwame Amevor, Yating Guo, Yuheng Zhou, Qinwen Deng and Xiaoling Zhao
Animals 2025, 15(11), 1546; https://doi.org/10.3390/ani15111546 - 25 May 2025
Cited by 1 | Viewed by 985
Abstract
Accurate classification of poultry behavior is critical for assessing welfare and health, yet most existing methods predict behavior categories without providing explanations for the image content. This study introduces PBC-Transformer, a novel model that integrates image captioning techniques to enhance poultry behavior classification, mimicking expert assessment processes. The model employs a multi-head concentrated attention mechanism, Head Spatial Position Coding (HSPC), to enhance spatial information; a learnable sparse mechanism (LSM) and RNorm function to reduce noise and strengthen feature correlation; and a depth-wise separable convolutional network for improved local feature extraction. Furthermore, a multi-level attention differentiator dynamically selects image regions for precise behavior descriptions. To balance caption generation with classification, we introduce the ICL-Loss function, which adaptively adjusts loss weights. Extensive experiments on the PBC-CapLabels dataset demonstrate that PBC-Transformer outperforms 13 commonly used classification models, improving accuracy by 15% and achieving the highest scores across image captioning metrics: Bleu4 (0.498), RougeL (0.794), Meteor (0.393), and Spice (0.613). Full article
(This article belongs to the Special Issue Animal–Computer Interaction: New Horizons in Animal Welfare)
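
The depth-wise separable convolution mentioned in this abstract factors a standard convolution into a per-channel spatial step and a 1x1 channel-mixing step. A minimal illustrative sketch, not the authors' implementation; the array shapes and 'valid' padding are assumptions:

```python
import numpy as np

def depthwise_separable_conv(x, depthwise_k, pointwise_w):
    """Depth-wise separable convolution on a (C, H, W) feature map.

    depthwise_k: (C, kH, kW), one spatial kernel per input channel.
    pointwise_w: (C_out, C), a 1x1 convolution mixing channels.
    'valid' padding, stride 1 (illustrative choices).
    """
    C, H, W = x.shape
    kH, kW = depthwise_k.shape[1:]
    oH, oW = H - kH + 1, W - kW + 1
    # Depth-wise step: each channel is convolved with its own kernel.
    dw = np.empty((C, oH, oW))
    for c in range(C):
        for i in range(oH):
            for j in range(oW):
                dw[c, i, j] = np.sum(x[c, i:i + kH, j:j + kW] * depthwise_k[c])
    # Point-wise step: a 1x1 conv mixes channels at every spatial location.
    return np.tensordot(pointwise_w, dw, axes=([1], [0]))

x = np.random.rand(3, 8, 8)
# Mean filter per channel, then pick the first two channels unchanged.
out = depthwise_separable_conv(x, np.ones((3, 3, 3)) / 9.0, np.eye(2, 3))
```

Compared with a full convolution, this uses C*kH*kW + C_out*C weights instead of C_out*C*kH*kW, which is the efficiency argument for using it in local feature extraction.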

22 pages, 7303 KB  
Article
Ground Segmentation for LiDAR Point Clouds in Structured and Unstructured Environments Using a Hybrid Neural–Geometric Approach
by Antonio Santo, Enrique Heredia, Carlos Viegas, David Valiente and Arturo Gil
Technologies 2025, 13(4), 162; https://doi.org/10.3390/technologies13040162 - 16 Apr 2025
Cited by 1 | Viewed by 4520
Abstract
Ground segmentation in LiDAR point clouds is a foundational capability for autonomous systems, enabling safe navigation in applications ranging from urban self-driving vehicles to planetary exploration rovers. Reliably distinguishing traversable surfaces in geometrically irregular or sensor-sparse environments remains a critical challenge. This paper introduces a hybrid framework that synergizes multi-resolution polar discretization with sparse convolutional neural networks (SCNNs) to address these challenges. The method hierarchically partitions point clouds into adaptive sectors, leveraging PCA-derived geometric features and dynamic variance thresholds for robust terrain modeling, while a SCNN resolves ambiguities in data-sparse regions. Evaluated in structured (SemanticKITTI) and unstructured (Rellis-3D) environments, two different versions of the proposed method are studied, including a purely geometric method and a hybrid approach that exploits deep learning techniques. A comparison of the proposed method with its purely geometric version is made for the purpose of highlighting the strengths of each approach. The hybrid approach achieves state-of-the-art performance, attaining an F1-score of 95.4% in urban environments, surpassing the purely geometric (91.4%) and learning-based baselines. Conversely, in unstructured terrains, the geometric variant demonstrates superior metric balance (80.8% F1) compared to the hybrid method (75.8% F1), highlighting context-dependent trade-offs between precision and recall. The framework’s generalization is further validated on custom datasets (UMH-Gardens, Coimbra-Liv), showcasing robustness to sensor variations and environmental complexity. The code and datasets are openly available to facilitate reproducibility. Full article
(This article belongs to the Special Issue Advanced Autonomous Systems and Artificial Intelligence Stage)
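
The PCA-derived geometric features mentioned above are commonly the eigenvalue-based shape descriptors of a local point neighborhood; a sketch under that assumption (the paper's exact feature set may differ):

```python
import numpy as np

def pca_geometric_features(points):
    """Eigenvalue-based shape descriptors of a 3D point neighborhood.

    points: (N, 3) array. Returns (linearity, planarity, sphericity);
    a near-planar patch (e.g. ground) scores high on planarity.
    """
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    # Eigenvalues sorted descending: l1 >= l2 >= l3.
    l1, l2, l3 = np.sort(np.linalg.eigvalsh(cov))[::-1]
    linearity = (l1 - l2) / l1
    planarity = (l2 - l3) / l1
    sphericity = l3 / l1
    return linearity, planarity, sphericity

# A flat ground-like patch: x and y spread widely, z nearly constant.
rng = np.random.default_rng(0)
patch = np.column_stack([rng.uniform(-1, 1, 200),
                         rng.uniform(-1, 1, 200),
                         rng.normal(0, 0.01, 200)])
lin, pla, sph = pca_geometric_features(patch)
```

Thresholding such descriptors per sector, with variance-adaptive cut-offs, is the kind of geometric terrain test the abstract describes.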

23 pages, 10294 KB  
Article
Machine Learning-Based 3D Soil Layer Reconstruction in Foundation Pit Engineering
by Chenxi Zhang, Nan Li, Xiuping Dong, Bin Liu and Meilian Liu
Appl. Sci. 2025, 15(8), 4078; https://doi.org/10.3390/app15084078 - 8 Apr 2025
Cited by 1 | Viewed by 1173
Abstract
In the construction of deep foundation pits, early warning measures are essential to reduce construction risks and prevent personnel injuries. In underground structure and pressure analysis, soil layer and support structure data are indispensable. Therefore, soil layer reconstruction serves as a critical step, while sparse borehole data limit the accuracy of traditional reconstruction methods. This paper proposes a machine learning-based soil layer reconstruction method to address this issue. First, various types of borehole and soil layer data are generated by simulating the formation process of Earth’s soil layers, thereby providing sufficient training data. Subsequently, a coding algorithm is designed to extract soil layer features as inputs for the convolutional neural network. Finally, 3D meshing is performed on the soil layer generated from real boreholes, and soil model rendering is achieved through a voxel clustering algorithm. The algorithm achieved an accuracy rate of over 90% in tests and demonstrated excellent robustness. By applying this algorithm, we successfully reconstructed the soil layers at a typical foundation pit site in a Chinese city, validating its effectiveness in real-world scenarios and its potential for large-scale engineering applications. Full article

22 pages, 2362 KB  
Article
Fast Coding Unit Partitioning Method for Video-Based Point Cloud Compression: Combining Convolutional Neural Networks and Bayesian Optimization
by Wenjun Song, Xinqi Liu and Qiuwen Zhang
Electronics 2025, 14(7), 1295; https://doi.org/10.3390/electronics14071295 - 25 Mar 2025
Cited by 1 | Viewed by 1311
Abstract
As 5G technology and 3D capture techniques have been rapidly developing, there has been a remarkable increase in the demand for effectively compressing dynamic 3D point cloud data. Video-based point cloud compression (V-PCC), which is an innovative method for 3D point cloud compression, makes use of High-Efficiency Video Coding (HEVC) to carry out the compression of 3D point clouds. This is accomplished through the projection of the point clouds onto two-dimensional video frames. However, V-PCC faces significant coding complexity, particularly for dynamic 3D point clouds, which can be up to four times more complex to process than a conventional video. To address this challenge, we propose an adaptive coding unit (CU) partitioning method that integrates occupancy graphs, convolutional neural networks (CNNs), and Bayesian optimization. In this approach, the coding units (CUs) are first divided into dense regions, sparse regions, and complex composite regions by calculating the occupancy rate R of the CUs, and then an initial classification decision is made using a convolutional neural network (CNN) framework. For regions where the CNN outputs low-confidence classifications, Bayesian optimization is employed to refine the partitioning and enhance accuracy. The findings from the experiments show that the suggested method can efficiently decrease the coding complexity of V-PCC, all the while maintaining a high level of coding quality. Specifically, the average coding time of the geometric graph is reduced by 57.37%, the attribute graph by 54.43%, and the overall coding time by 54.75%. Although the BD rate slightly increases compared with that of the baseline V-PCC method, the impact on video quality is negligible. Additionally, the proposed algorithm outperforms existing methods in terms of geometric compression efficiency and computational time savings. This study’s innovation lies in combining deep learning with Bayesian optimization to deliver an efficient CU partitioning strategy for V-PCC, improving coding speed and reducing computational resource consumption, thereby advancing the practical application of V-PCC. Full article
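
The occupancy-rate classification step described above can be sketched as follows; the thresholds here are illustrative placeholders, not the paper's values:

```python
import numpy as np

def classify_cu(occupancy_block, dense_thr=0.9, sparse_thr=0.1):
    """Classify a CU by its occupancy rate R (fraction of occupied pixels).

    occupancy_block: 2D 0/1 array cropped from the V-PCC occupancy map.
    The threshold values are illustrative, not taken from the paper.
    """
    r = occupancy_block.mean()
    if r >= dense_thr:
        return "dense", r
    if r <= sparse_thr:
        return "sparse", r
    return "composite", r

block = np.zeros((16, 16), dtype=int)
block[:8, :] = 1                 # half of the CU is occupied
label, r = classify_cu(block)    # a complex composite region
```

Dense and sparse regions can then take fast partitioning decisions directly, while only composite regions are passed to the CNN (and, on low confidence, to Bayesian optimization).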

20 pages, 4387 KB  
Article
Convolutional Sparse Modular Fusion Algorithm for Non-Rigid Registration of Visible–Infrared Images
by Tao Luo, Ning Chen, Xianyou Zhu, Heyuan Yi and Weiwen Duan
Appl. Sci. 2025, 15(5), 2508; https://doi.org/10.3390/app15052508 - 26 Feb 2025
Cited by 1 | Viewed by 995
Abstract
Existing image fusion algorithms involve extensive models and high computational demands when processing source images that require non-rigid registration, which may not align with the practical needs of engineering applications. To tackle this challenge, this study proposes a comprehensive framework for convolutional sparse fusion in the context of non-rigid registration of visible–infrared images. Our approach begins with an attention-based convolutional sparse encoder to extract cross-modal feature encodings from source images. To enhance feature extraction, we introduce a feature-guided loss and an information entropy loss to guide the extraction of homogeneous and isolated features, resulting in a feature decomposition network. Next, we create a registration module that estimates the registration parameters based on homogeneous feature pairs. Finally, we develop an image fusion module by applying homogeneous and isolated feature filtering to the feature groups, resulting in high-quality fused images with maximized information retention. Experimental results on multiple datasets indicate that, compared with similar studies, the proposed algorithm achieves an average improvement of 8.3% in image registration and 30.6% in fusion performance, as measured by mutual information. In addition, in downstream target recognition tasks, the fusion images generated by the proposed algorithm show a maximum improvement of 20.1% in average relative accuracy compared with the original images. Importantly, our algorithm maintains a relatively lightweight computational and parameter load. Full article
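
Mutual information, the metric used above to report registration and fusion gains, is commonly estimated from a joint intensity histogram; a minimal sketch (the bin count is an arbitrary choice):

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Mutual information (in nats) between two equally sized grayscale
    images, estimated from their joint intensity histogram."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)       # marginal of img_a
    py = pxy.sum(axis=0, keepdims=True)       # marginal of img_b
    mask = pxy > 0                            # avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

rng = np.random.default_rng(1)
a = rng.random((64, 64))
mi_self = mutual_information(a, a)                     # perfectly aligned
mi_rand = mutual_information(a, rng.random((64, 64)))  # unrelated image
```

Well-registered and information-rich fused images score higher, which is why MI serves as both a registration and a fusion quality measure.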

18 pages, 6989 KB  
Article
A Deep Unfolding Network for Multispectral and Hyperspectral Image Fusion
by Bihui Zhang, Xiangyong Cao and Deyu Meng
Remote Sens. 2024, 16(21), 3979; https://doi.org/10.3390/rs16213979 - 26 Oct 2024
Cited by 4 | Viewed by 2891
Abstract
Multispectral and hyperspectral image fusion (MS/HS fusion) aims to generate a high-resolution hyperspectral (HRHS) image by fusing a high-resolution multispectral (HRMS) image with a low-resolution hyperspectral (LRHS) image. The deep unfolding-based MS/HS fusion method is a representative deep learning paradigm due to its excellent performance and sufficient interpretability. However, existing deep unfolding-based MS/HS fusion methods rely only on a fixed linear degradation model, which focuses on modeling the relationships between HRHS and HRMS, as well as HRHS and LRHS. In this paper, we break free from this observation model framework and propose a new observation model. Firstly, the proposed observation model is built based on the convolutional sparse coding (CSC) technique, and then a proximal gradient algorithm is designed to solve this model. Secondly, we unfold the iterative algorithm into a deep network, dubbed MHF-CSCNet, where the proximal operators are learned using convolutional neural networks. Finally, all trainable parameters can be automatically learned end-to-end from the training pairs. Experimental evaluations conducted on various benchmark datasets demonstrate the superiority of our method both quantitatively and qualitatively compared to other state-of-the-art methods. Full article
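
The proximal gradient algorithm mentioned above reduces, in the non-convolutional case, to the classic ISTA iteration for sparse coding; a sketch of that baseline (the convolutional variant replaces the dictionary product with convolutions, and unfolding turns each iteration into a network layer):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam=0.1, n_iter=500):
    """ISTA for min_z 0.5*||D z - x||^2 + lam*||z||_1.

    Each iteration is a gradient step on the quadratic term followed by
    the l1 proximal (shrinkage) step; this is the structure that deep
    unfolding networks learn layer by layer.
    """
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)
        z = soft_threshold(z - grad / L, lam / L)
    return z

rng = np.random.default_rng(2)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)           # unit-norm dictionary atoms
z_true = np.zeros(50)
z_true[[3, 17]] = [2.0, -1.5]            # a 2-sparse ground-truth code
x = D @ z_true
z = ista(D, x, lam=0.05)
```

With the step size 1/L the objective decreases monotonically, which is the convergence guarantee the unfolded network inherits in spirit.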

16 pages, 2564 KB  
Article
Modeling Chickpea Productivity with Artificial Image Objects and Convolutional Neural Network
by Mikhail Bankin, Yaroslav Tyrykin, Maria Duk, Maria Samsonova and Konstantin Kozlov
Plants 2024, 13(17), 2444; https://doi.org/10.3390/plants13172444 - 1 Sep 2024
Cited by 1 | Viewed by 1569
Abstract
The chickpea plays a significant role in global agriculture and occupies an increasing share in the human diet. The main aim of the research was to develop a model for the prediction of two chickpea productivity traits in the available dataset. Genomic data for accessions were encoded in Artificial Image Objects, and a model for the thousand-seed weight (TSW) and number of seeds per plant (SNpP) prediction was constructed using a Convolutional Neural Network, dictionary learning and sparse coding for feature extraction, and extreme gradient boosting for regression. The model was capable of predicting both traits with an acceptable accuracy of 84–85%. The most important factors for model solution were identified using the dense regression attention maps method. The SNPs important for the SNpP and TSW traits were found in 34 and 49 genes, respectively. Genomic prediction with a constructed model can help breeding programs harness genotypic and phenotypic diversity to more effectively produce varieties with a desired phenotype. Full article
(This article belongs to the Section Plant Genetics, Genomics and Biotechnology)

20 pages, 5382 KB  
Article
Activated Sparsely Sub-Pixel Transformer for Remote Sensing Image Super-Resolution
by Yongde Guo, Chengying Gong and Jun Yan
Remote Sens. 2024, 16(11), 1895; https://doi.org/10.3390/rs16111895 - 24 May 2024
Cited by 8 | Viewed by 2885
Abstract
Transformers have recently achieved significant breakthroughs in various visual tasks. However, these methods often overlook the optimization of interactions between convolution and transformer blocks. Although the basic attention module strengthens the feature selection ability, it is still weak in generating superior quality output. In order to address this challenge, we propose the integration of sub-pixel space and the application of sparse coding theory in the calculation of self-attention. This approach aims to enhance the network’s generation capability, leading to the development of a sparse-activated sub-pixel transformer network (SSTNet). The experimental results show that compared with several state-of-the-art methods, our proposed network can obtain better generation results, improving the sharpness of object edges and the richness of detail texture information in super-resolution generated images. Full article
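
The sub-pixel space referred to above is built with the standard pixel-shuffle rearrangement; a numpy sketch of that operation (the usual sub-pixel convolution layout, not SSTNet itself):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) tensor into (C, H*r, W*r).

    This is the sub-pixel convolution layout: each group of r*r input
    channels supplies the r x r sub-pixel block of one output channel.
    """
    C2, H, W = x.shape
    assert C2 % (r * r) == 0
    C = C2 // (r * r)
    x = x.reshape(C, r, r, H, W)
    x = x.transpose(0, 3, 1, 4, 2)       # (C, H, r, W, r)
    return x.reshape(C, H * r, W * r)

# 4 channels with r=2 fold into a single channel at twice the resolution.
x = np.arange(16, dtype=float).reshape(4, 2, 2)
y = pixel_shuffle(x, 2)
```

Computing attention in this rearranged space lets a network trade channel depth for spatial resolution without a learned upsampling step.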

14 pages, 2003 KB  
Article
Abnormal Traffic Detection System Based on Feature Fusion and Sparse Transformer
by Xinjian Zhao, Weiwei Miao, Guoquan Yuan, Yu Jiang, Song Zhang and Qianmu Li
Mathematics 2024, 12(11), 1643; https://doi.org/10.3390/math12111643 - 24 May 2024
Cited by 4 | Viewed by 2660
Abstract
This paper presents a feature fusion and sparse transformer-based anomalous traffic detection system (FSTDS). FSTDS utilizes a feature fusion network to encode the traffic data sequences and extract features, fusing them into coding vectors through shallow and deep convolutional networks, followed by deep coding using a sparse transformer to capture the complex relationships between network flows; finally, a multilayer perceptron is used to classify the traffic and achieve anomaly traffic detection. The feature fusion network of FSTDS improves feature extraction from small sample data, the deep encoder enhances the understanding of complex traffic patterns, and the sparse transformer reduces the computational and storage overhead and improves the scalability of the model. Experiments demonstrate that the number of FSTDS parameters is reduced by up to nearly half compared to the baseline, and the success rate of anomalous flow detection is close to 100%. Full article

17 pages, 4862 KB  
Article
RETRACTED: A Global Structural Hypergraph Convolutional Model for Bundle Recommendation
by Xingtong Liu and Man Yuan
Electronics 2023, 12(18), 3952; https://doi.org/10.3390/electronics12183952 - 19 Sep 2023
Cited by 1 | Viewed by 2381 | Retraction
Abstract
Bundle recommendations provide personalized suggestions to users by combining related items into bundles, aiming to enhance users’ shopping experiences and boost merchants’ sales revenue. Existing solutions based on graph neural networks (GNN) face several significant challenges: (1) it is demanding to explicitly model multiple complex associations using standard graph neural networks, (2) numerous additional nodes and edges are introduced to approximate higher-order associations, and (3) the user–bundle historical interaction data are highly sparse. In this work, we propose a global structural hypergraph convolutional model for bundle recommendation (SHCBR) to address the above problems. Specifically, we jointly incorporate multiple complex interactions between users, items, and bundles into a relational hypergraph without introducing additional nodes and edges. The hypergraph structure inherently incorporates higher-order associations, thereby alleviating the training burden on neural networks and the dilemma of scarce data effectively. In addition, we design a special matrix propagation rule that captures non-pairwise complex relationships between entities. Using item nodes as links, structural hypergraph convolutional networks learn representations of users and bundles on a relational hypergraph. Experiments conducted on two real-world datasets demonstrate that the SHCBR outperforms the state-of-the-art baselines by 11.07–25.66% on Recall and 16.81–33.53% on NDCG. Experimental results further indicate that the approach based on hypergraphs can offer new insights for addressing bundle recommendation challenges. The codes and datasets have been publicly released on GitHub. Full article
(This article belongs to the Special Issue Recommender Systems and Data Mining)
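
For context, the standard hypergraph convolution propagation rule, X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Theta, can be sketched as below; SHCBR's special matrix propagation rule is a variant of this idea and is not reproduced here:

```python
import numpy as np

def hypergraph_conv(X, H, Theta, edge_w=None):
    """One step of the standard hypergraph convolution.

    H: (nodes x hyperedges) incidence matrix; a hyperedge can join any
    number of nodes (e.g. a user, several items, and a bundle at once),
    which is how non-pairwise associations are captured.
    """
    n, m = H.shape
    w = np.ones(m) if edge_w is None else edge_w
    Dv = (H * w).sum(axis=1)                 # weighted node degrees
    De = H.sum(axis=0)                       # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(Dv))
    A = Dv_inv_sqrt @ H @ np.diag(w) @ np.diag(1.0 / De) @ H.T @ Dv_inv_sqrt
    return A @ X @ Theta

# 4 nodes, 2 hyperedges; node 1 sits in both hyperedges.
H = np.array([[1., 0.],
              [1., 1.],
              [0., 1.],
              [0., 1.]])
X = np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])
out = hypergraph_conv(X, H, np.eye(2))
A = hypergraph_conv(np.eye(4), H, np.eye(4))   # the propagation matrix itself
```

Because one hyperedge links all entities of an interaction, no auxiliary nodes or edges are needed to approximate higher-order associations.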

19 pages, 5894 KB  
Article
Noise Attenuation for CSEM Data via Deep Residual Denoising Convolutional Neural Network and Shift-Invariant Sparse Coding
by Xin Wang, Ximin Bai, Guang Li, Liwei Sun, Hailong Ye and Tao Tong
Remote Sens. 2023, 15(18), 4456; https://doi.org/10.3390/rs15184456 - 10 Sep 2023
Cited by 11 | Viewed by 2695
Abstract
To overcome the interference of noise on the exploration effectiveness of the controlled-source electromagnetic method (CSEM), we improved the deep learning algorithm by combining the denoising convolutional neural network (DnCNN) with the residual network (ResNet), and propose a method based on the residual denoising convolutional neural network (ResDnCNN) and shift-invariant sparse coding (SISC) for denoising CSEM data. Firstly, a sample library was constructed by adding simulated noise of different types and amplitudes to collected high-quality CSEM data. Then, the sample library was used for model training in the ResDnCNN, resulting in a network model specifically designed for denoising CSEM data. Subsequently, the trained model was employed to denoise the measured data, generating preliminary denoised data. Finally, the preliminary denoised data were processed using SISC to obtain the final high-quality denoised data. Comparative experiments with the ResNet, DnCNN, U-Net, and long short-term memory (LSTM) networks demonstrated the significant advantages of our proposed method. It effectively removed strong noise such as Gaussian, impulse, and square-wave noise, improving the signal-to-noise ratio by nearly 20 dB. Testing on CSEM data from Sichuan Province, China, showed that the apparent resistivity curves plotted using our method were smoother and more credible. Full article
(This article belongs to the Special Issue Multi-Scale Remote Sensed Imagery for Mineral Exploration)

14 pages, 2911 KB  
Technical Note
Fast Frequency-Diverse Radar Imaging Based on Adaptive Sampling Iterative Soft-Thresholding Deep Unfolding Network
by Zhenhua Wu, Fafa Zhao, Lei Zhang, Yice Cao, Jun Qian, Jiafei Xu and Lixia Yang
Remote Sens. 2023, 15(13), 3284; https://doi.org/10.3390/rs15133284 - 26 Jun 2023
Cited by 3 | Viewed by 2440
Abstract
Frequency-diverse radar imaging is an emerging field that combines computational imaging with frequency-diverse techniques to recover high-quality images of objects. Despite the success of deep reconstruction networks in improving scene image reconstruction from noisy or under-sampled frequency-diverse measurements, their reliance on large amounts of high-quality training data and their inherently uninterpretable features pose significant challenges in the design and optimization of imaging networks, particularly in the face of dynamic variations in radar operating frequency bands. Here, aiming at reducing the latency and processing burden involved in scene image reconstruction, we propose an adaptive sampling iterative soft-thresholding deep unfolding network (ASISTA-Net). Specifically, we embed an adaptive sampling module into the iterative soft-thresholding (ISTA) unfolding network, which contains multiple measurement matrices with different compressed sampling ratios. The outputs of the convolutional layers are then passed through a series of ISTA layers that perform a sparse coding step followed by a thresholding step. The proposed method requires neither heavy matrix operations nor a massive amount of training scene targets and measurement datasets. Unlike recent work using matrix-inversion-based and data-driven deep reconstruction networks, our generic approach is directly adapted to multi-compressed sampling ratios and multi-scene target image reconstruction, and no restrictions on the types of imageable scenes are imposed. Multiple measurement matrices with different scene compressed sampling ratios are trained in parallel, which enables the frequency-diverse radar to select operation frequency bands flexibly. In general, the application of the proposed approach paves the way for the widespread deployment of computational microwave and millimeter wave frequency-diverse radar imagers to achieve real-time imaging. Extensive imaging simulations demonstrate the effectiveness of our proposed method. Full article
(This article belongs to the Special Issue Advanced Radar Signal Processing and Applications)

18 pages, 1972 KB  
Article
Inferior Alveolar Canal Automatic Detection with Deep Learning CNNs on CBCTs: Development of a Novel Model and Release of Open-Source Dataset and Algorithm
by Mattia Di Bartolomeo, Arrigo Pellacani, Federico Bolelli, Marco Cipriano, Luca Lumetti, Sara Negrello, Stefano Allegretti, Paolo Minafra, Federico Pollastri, Riccardo Nocini, Giacomo Colletti, Luigi Chiarini, Costantino Grana and Alexandre Anesi
Appl. Sci. 2023, 13(5), 3271; https://doi.org/10.3390/app13053271 - 3 Mar 2023
Cited by 14 | Viewed by 4324
Abstract
Introduction: The need for accurate three-dimensional data of anatomical structures is increasing in the surgical field. The development of convolutional neural networks (CNNs) has been helping to fill this gap by trying to provide efficient tools to clinicians. Nonetheless, the lack of fully accessible datasets and open-source algorithms is slowing the improvements in this field. In this paper, we focus on the fully automatic segmentation of the Inferior Alveolar Canal (IAC), which is of immense interest in dental and maxillo-facial surgery. Conventionally, only a bidimensional annotation of the IAC is used in common clinical practice. A reliable convolutional neural network (CNN) might be timesaving in daily practice and improve the quality of assistance. Materials and methods: Cone Beam Computed Tomography (CBCT) volumes obtained from a single radiological center using the same machine were gathered and annotated. The course of the IAC was annotated on the CBCT volumes. A secondary dataset with sparse annotations and a primary dataset with both dense and sparse annotations were generated. Three separate experiments were conducted in order to evaluate the CNN. The IoU and Dice scores of every experiment were recorded as the primary endpoint, while the time needed to achieve the annotation was assessed as the secondary endpoint. Results: A total of 347 CBCT volumes were collected, then divided into primary and secondary datasets. Among the three experiments, an IoU score of 0.64 and a Dice score of 0.79 were obtained thanks to the pre-training of the CNN on the secondary dataset and the creation of a novel deep label propagation model, followed by proper training on the primary dataset. To the best of our knowledge, these results are the best ever published in the segmentation of the IAC. The datasets are publicly available, and the algorithm is published as open-source software. On average, the CNN could produce a 3D annotation of the IAC in 6.33 s, compared to 87.3 s needed by the radiology technician to produce a bidimensional annotation. Conclusions: To summarize, the following achievements have been reached. A new state of the art in terms of Dice score was achieved, overcoming the threshold of 0.75 commonly considered sufficient for use in clinical practice. The CNN could fully automatically produce accurate three-dimensional segmentation of the IAC in a rapid setting, compared to the bidimensional annotations commonly used in clinical practice and generated in a time-consuming manner. We introduced our innovative deep label propagation method to optimize the performance of the CNN in the segmentation of the IAC. For the first time in this field, the datasets and the source codes used were publicly released, granting reproducibility of the experiments and helping in the improvement of IAC segmentation. Full article
(This article belongs to the Special Issue Current Advances in Dentistry)
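
The IoU and Dice endpoints reported above are computed from binary masks; a sketch, which also illustrates the identity Dice = 2*IoU/(1+IoU) (consistent with the reported 0.64/0.79 pair):

```python
import numpy as np

def iou_dice(pred, gt):
    """IoU and Dice coefficients for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / union
    dice = 2 * inter / (pred.sum() + gt.sum())
    return iou, dice

# Two 4x4 squares offset by one pixel: 16 px each, 3x3 = 9 px overlap.
pred = np.zeros((8, 8), dtype=int)
pred[2:6, 2:6] = 1
gt = np.zeros((8, 8), dtype=int)
gt[3:7, 3:7] = 1
iou, dice = iou_dice(pred, gt)
```

In 3D the same formulas apply voxel-wise, which is how a volumetric IAC segmentation is scored against its annotation.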

13 pages, 2344 KB  
Article
MTGEA: A Multimodal Two-Stream GNN Framework for Efficient Point Cloud and Skeleton Data Alignment
by Gawon Lee and Jihie Kim
Sensors 2023, 23(5), 2787; https://doi.org/10.3390/s23052787 - 3 Mar 2023
Cited by 9 | Viewed by 3807
Abstract
Because of societal changes, human activity recognition, part of home care systems, has become increasingly important. Camera-based recognition is mainstream but has privacy concerns and is less accurate under dim lighting. In contrast, radar sensors do not record sensitive information, avoid the invasion of privacy, and work in poor lighting. However, the collected data are often sparse. To address this issue, we propose a novel Multimodal Two-stream GNN Framework for Efficient Point Cloud and Skeleton Data Alignment (MTGEA), which improves recognition accuracy through accurate skeletal features from Kinect models. We first collected two datasets using the mmWave radar and Kinect v4 sensors. Then, we used zero-padding, Gaussian Noise (GN), and Agglomerative Hierarchical Clustering (AHC) to increase the number of collected point clouds to 25 per frame to match the skeleton data. Next, we used the Spatial Temporal Graph Convolutional Network (ST-GCN) architecture to acquire multimodal representations in the spatio-temporal domain, focusing on skeletal features. Finally, we implemented an attention mechanism aligning the two multimodal features to capture the correlation between point clouds and skeleton data. The resulting model was evaluated empirically on human activity data and shown to improve human activity recognition with radar data only. All datasets and code are available on our GitHub. Full article
(This article belongs to the Special Issue Artificial Intelligence and Deep Learning in Sensors and Applications)
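
The zero-padding step described above, which brings each sparse radar frame up to 25 points, can be sketched as follows (zero-padding only; the GN and AHC augmentations are omitted):

```python
import numpy as np

def pad_frame(points, target=25):
    """Zero-pad (or truncate) an (N, 3) radar point cloud to (target, 3),
    so every frame aligns with a fixed-size skeleton representation."""
    points = np.asarray(points, dtype=float).reshape(-1, 3)
    if len(points) >= target:
        return points[:target]
    pad = np.zeros((target - len(points), 3))
    return np.vstack([points, pad])

frame = np.random.rand(7, 3)    # a sparse frame with only 7 radar points
padded = pad_frame(frame)       # now 25 points, matching skeleton joints
```

A fixed point count per frame is what lets the two streams (radar points and skeleton joints) share a graph structure downstream.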

17 pages, 5094 KB  
Article
Rain Removal of Single Image Based on Directional Gradient Priors
by Shuying Huang, Yating Xu, Mingyang Ren, Yong Yang and Weiguo Wan
Appl. Sci. 2022, 12(22), 11628; https://doi.org/10.3390/app122211628 - 16 Nov 2022
Cited by 5 | Viewed by 2369
Abstract
Images taken on rainy days often lose a significant amount of detailed information owing to the coverage of rain streaks, which interfere with the recognition and detection performed by intelligent vision systems. It is, therefore, extremely important to recover clean rain-free images from rain images. In this paper, we propose a rain removal method based on directional gradient priors, which aims to retain the structural information of the original rain image to the greatest extent possible while removing the rain streaks. First, to address the problem of residual rain streaks, two directional gradient regularization terms are proposed on the basis of the convolutional sparse coding model to constrain the direction information of the rain streaks. Then, for the rain layer coding in the directional gradient prior terms, a multi-scale dictionary is designed for convolutional sparse coding to detect rain streaks of different widths. Finally, to obtain a more accurate solution, the alternating direction method of multipliers (ADMM) is used to update the multi-scale dictionary and coding coefficients alternately to obtain a rain-free image with rich details. Experiments verify that the proposed algorithm achieves good results both subjectively and objectively. Full article
(This article belongs to the Special Issue Image Enhancement and Restoration Based on Deep Learning Technology)
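
The directional gradients underlying the priors above are plain finite differences; a sketch showing why an idealized vertical streak has zero vertical but large horizontal gradient energy (illustrative only, not the paper's exact regularizers):

```python
import numpy as np

def directional_gradients(img):
    """Forward-difference gradients along x (columns) and y (rows).

    The l1 norms of such directional gradients are a common form for
    regularization terms constraining a rain layer's orientation:
    near-vertical streaks have small vertical but large horizontal
    gradient energy, so penalizing the vertical term and tolerating
    the horizontal one steers the coding toward streak-like layers.
    """
    gx = np.diff(img, axis=1)   # horizontal differences
    gy = np.diff(img, axis=0)   # vertical differences
    return gx, gy

streaks = np.zeros((16, 16))
streaks[:, 5] = 1.0             # one idealized vertical rain streak
gx, gy = directional_gradients(streaks)
```

Here the streak contributes nothing to |gy| but two unit jumps per row to |gx|, which is exactly the asymmetry a directional prior exploits.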
