Search Results (21)

Search Parameters:
Keywords = middle-level feature fusion

19 pages, 2794 KB  
Article
Estimating Soil Moisture Content in Winter Wheat in Southern Xinjiang by Fusing UAV Texture Feature with Novel Three-Dimensional Texture Indexes
by Tao Sun, Zhijun Li, Zijun Tang, Wei Zhang, Wangyang Li, Zhiying Liu, Jinqi Wu, Shiqi Liu, Youzhen Xiang and Fucang Zhang
Plants 2025, 14(19), 2948; https://doi.org/10.3390/plants14192948 - 23 Sep 2025
Cited by 2 | Viewed by 780
Abstract
Winter wheat is a major staple crop worldwide, and real-time monitoring of soil moisture content (SMC) is critical for yield security. Targeting the monitoring needs under arid conditions in southern Xinjiang, this study proposes a UAV multispectral-based SMC estimation method that constructs novel three-dimensional (3-D) texture indices. Field experiments were conducted over two consecutive growing seasons in Kunyu City, southern Xinjiang, China, with four irrigation and four fertilization levels. High-resolution multispectral imagery was acquired at the jointing stage using a UAV-mounted camera. From the imagery, conventional texture features were extracted, and six two-dimensional (2-D) and four 3-D texture indices were constructed. A correlation matrix approach was used to screen feature combinations significantly associated with SMC. Random forest (RF), partial least squares regression (PLSR), and back-propagation neural networks (BPNN) were then used to develop SMC models for three soil depths (0–20, 20–40, and 40–60 cm). Results showed that estimation accuracy for the shallow layer (0–20 cm) was markedly higher than for the middle and deep layers. Under single-source input, using 3-D texture indices (Combination 3) with RF achieved the best shallow-layer performance (validation R2 = 0.827, RMSE = 0.534, MRE = 2.686%). With multi-source fusion inputs (Combination 7: texture features + 2-D texture indices + 3-D texture indices) combined with RF, shallow-layer SMC estimation further improved (R2 = 0.890, RMSE = 0.395, MRE = 1.91%). Relative to models using only conventional texture features, fusion increased R2 by approximately 11.4%, 11.7%, and 18.1% for the shallow, middle, and deep layers, respectively. The findings indicate that 3-D texture indices (e.g., DTTI), which integrate multi-band texture information, more comprehensively capture canopy spatial structure and are more sensitive to shallow-layer moisture dynamics. Multi-source fusion provides complementary information and substantially enhances model accuracy. The proposed approach offers a new pathway for accurate SMC monitoring in arid croplands and is of practical significance for remote sensing-based moisture estimation and precision irrigation.
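
The modeling recipe here (screen texture indices by correlation, then fit RF/PLSR/BPNN regressors) is easy to prototype. Below is a minimal, hypothetical sketch of the correlation screening plus random forest step using scikit-learn; the data, feature count, and threshold are synthetic placeholders rather than the authors' values.

```python
# Hedged sketch: correlation-matrix screening of texture indices, then RF regression.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 10))            # 10 candidate 2-D/3-D texture indices (synthetic)
y = X[:, 0] * 0.6 - X[:, 3] * 0.4 + rng.normal(scale=0.3, size=120)  # SMC proxy

# Keep only indices whose correlation with SMC exceeds an illustrative threshold.
corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
selected = corr > 0.3
X_train, X_test, y_train, y_test = train_test_split(X[:, selected], y, random_state=0)

rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_train, y_train)
pred = rf.predict(X_test)
print(f"R2={r2_score(y_test, pred):.3f}  RMSE={mean_squared_error(y_test, pred) ** 0.5:.3f}")
```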

18 pages, 5623 KB  
Article
Rapid and Quantitative Prediction of Tea Pigments Content During the Rolling of Black Tea by Multi-Source Information Fusion and System Analysis Methods
by Hanting Zou, Ranyang Li, Xuan Xuan, Yongwen Jiang, Haibo Yuan and Ting An
Foods 2025, 14(16), 2829; https://doi.org/10.3390/foods14162829 - 15 Aug 2025
Viewed by 828
Abstract
Efficient and convenient intelligent online detection methods can provide important technical support for the standardization of processing flow in the tea industry. Hence, this study focuses on tea pigments, the key chemical indicators in the rolling process of black tea, and uses multi-source information fusion methods to predict changes in tea pigment content. Firstly, the tea pigment content of samples under different rolling time series of black tea is determined by system analysis methods. Secondly, the spectra and images of the corresponding samples under different rolling time series are simultaneously obtained through a portable near-infrared spectrometer and a machine vision system. Then, by extracting the principal components of the image feature information and screening characteristic wavelengths from the spectral information, low-level and middle-level data fusion strategies are chosen to effectively integrate sensor data from different sources. At last, linear (PLSR) and nonlinear (SVR and LSSVR) models are established based on the different characteristic data. The research results show that the LSSVR model based on the middle-level data fusion strategy performs best. In the prediction results of theaflavins, thearubigins, and theabrownins, the correlation coefficients of the testing sets are all greater than 0.98, and the relative percentage deviations are all greater than 5. The complementary fusion of spectrum and image information effectively compensates for the problems of information redundancy and missing features in the quantitative analysis of tea pigment content using single-modal data.
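
A middle-level fusion strategy of this kind concatenates features extracted separately from each sensor before a single regressor. The sketch below illustrates the idea with synthetic arrays: image principal components plus correlation-screened wavelengths feeding a kernel regressor. scikit-learn has no LSSVR, so SVR stands in, and all names and shapes are assumptions.

```python
# Hedged sketch of middle-level data fusion: PCA image features + screened wavelengths.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
img_feats = rng.normal(size=(80, 50))    # color/texture features per sample (synthetic)
spectra = rng.normal(size=(80, 600))     # NIR spectrum per sample (synthetic)
y = rng.normal(size=80)                  # tea pigment content (placeholder)

img_pc = PCA(n_components=5).fit_transform(img_feats)       # image principal components
top_bands = np.argsort(np.abs(np.corrcoef(spectra.T, y)[-1, :-1]))[-20:]  # wavelength screening
mid_level = np.hstack([img_pc, spectra[:, top_bands]])      # middle-level fused features

model = SVR(kernel="rbf").fit(StandardScaler().fit_transform(mid_level), y)
```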

14 pages, 8051 KB  
Article
Evaluation of Withering Quality of Black Tea Based on Multi-Information Fusion Strategy
by Ting An, Yongwen Jiang, Hanting Zou, Xuan Xuan, Jian Zhang and Haibo Yuan
Foods 2025, 14(9), 1442; https://doi.org/10.3390/foods14091442 - 22 Apr 2025
Cited by 2 | Viewed by 3224
Abstract
The intelligent perception of moisture content (MC) for tea leaves during the black tea withering process remains an unsolved task because only limited sample characteristic information can be acquired. In this study, both the external and internal features of withering samples were simultaneously acquired based on near-infrared spectroscopy (NIRS) and machine vision (MV) technology. Different data fusion strategies, including low-, middle- and high-level strategies, were employed to integrate the two types of heterogeneous information. Subsequently, the different fused features were combined with a support vector regression (SVR) algorithm to establish moisture perception models of withering leaves. The middle-level strategy combined with the variable iterative space shrinkage approach (VISSA) displayed the best performance, with a relative percent deviation (RPD) of 5.7705. Therefore, the proposed multi-information fusion strategy can achieve intelligent perception of the moisture content of tea leaves in the black tea withering process. The integration of NIRS and MV technology overcomes the limitations of single-technology approaches in black tea withering assessment, providing a robust methodology for precision processing and targeted quality control of black tea.
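
The RPD quoted above is a standard chemometrics ratio: the standard deviation of the reference values divided by the standard error of prediction (values above about 3 are usually read as strong models). A minimal sketch with placeholder arrays:

```python
# Relative percent deviation (RPD): SD of reference values / standard error of prediction.
import numpy as np

def rpd(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    sep = np.sqrt(np.mean((y_true - y_pred) ** 2))  # standard error of prediction
    return np.std(y_true, ddof=1) / sep

print(rpd(np.array([62.1, 58.4, 55.0, 51.2]), np.array([61.8, 58.9, 54.6, 51.5])))
```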

21 pages, 5838 KB  
Article
A Study on the Spatial Perception and Inclusive Characteristics of Outdoor Activity Spaces in Residential Areas for Diverse Populations from the Perspective of All-Age Friendly Design
by Biao Yin, Lijun Wang, Yuan Xu and Kiang Chye Heng
Buildings 2025, 15(6), 895; https://doi.org/10.3390/buildings15060895 - 13 Mar 2025
Cited by 5 | Viewed by 2395
Abstract
With the transformation of urban development patterns and profound changes in population structure in China, outdoor activity spaces in residential areas are facing common issues such as obsolete infrastructure, insufficient barrier-free facilities, and intergenerational conflicts, which severely impact residents’ quality of life and hinder high-quality urban development. Guided by the principles of all-age friendly and inclusive design, this study innovatively integrates eye-tracking and multi-modal physiological monitoring technologies to collect both subjective and objective perception data of different age groups regarding outdoor activity spaces in residential areas through human factor experiments and empirical interviews. Machine learning methods are utilized to analyze the data, uncovering the differentiated response mechanisms among diverse groups and clarifying the inclusive characteristics of these spaces. The findings reveal that: (1) Common Demands: All groups prioritize spatial features such as unobstructed views, adequate space, diverse landscapes, proximity accessibility, and smooth pavement surfaces, with similar levels of concern. (2) Differentiated Characteristics: Children place greater emphasis on environmental familiarity and children’s play facilities, while middle-aged and elderly groups show heightened concern for adequate space, efficient parking management, and barrier-free facilities. (3) Technical Validation: Heart Rate Variability (HRV) was identified as the core perception indicator for spatial inclusivity through dimensionality reduction using Self-Organizing Maps (SOM), and the Extra Trees model demonstrated superior performance in spatial inclusivity prediction. By integrating multi-group perception data, standardizing experimental environments, and applying intelligent data mining, this study achieves multi-modal data fusion and in-depth analysis, providing theoretical and methodological support for precisely optimizing outdoor activity spaces in residential areas and advancing the development of all-age friendly communities.
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)
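
Only the Extra Trees prediction step lends itself to a compact sketch; the following is a hypothetical illustration with invented perception features (HRV among them) and a synthetic inclusivity score, not the study's data or pipeline.

```python
# Hedged sketch: Extra Trees regression of a spatial-inclusivity score from perception features.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
# columns (invented): HRV, fixation duration, gaze dispersion, self-reported comfort
X = rng.normal(size=(60, 4))
y = 0.7 * X[:, 0] + 0.2 * X[:, 3] + rng.normal(scale=0.2, size=60)  # inclusivity score

model = ExtraTreesRegressor(n_estimators=200, random_state=0)
print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```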

19 pages, 8648 KB  
Article
Automatic Extraction of Water Body from SAR Images Considering Enhanced Feature Fusion and Noise Suppression
by Meijun Gao, Wenjie Dong, Lifu Chen and Zhongwu Wu
Appl. Sci. 2025, 15(5), 2366; https://doi.org/10.3390/app15052366 - 22 Feb 2025
Cited by 3 | Viewed by 1255
Abstract
Water extraction from Synthetic Aperture Radar (SAR) images is crucial for water resource management and maintaining the sustainability of ecosystems. Though great progress has been achieved, there are still some challenges, such as an insufficient ability to extract water edge details, an inability to detect small water bodies, and a weak ability to suppress background noise. To address these problems, we propose the Global Context Attention Feature Fusion Network (GCAFF-Net) in this article. It includes an encoder module for hierarchical feature extraction and a decoder module for merging multi-scale features. The encoder utilizes ResNet-101 as the backbone network to generate four-level features of different resolutions. In the middle-level feature fusion stage, the Attention Feature Fusion Module (AFFM) is presented for multi-scale feature learning to improve the performance of fine water segmentation. In the advanced feature encoding stage, the Global Context Atrous Spatial Pyramid Pooling (GCASPP) is constructed to adaptively integrate the water information in SAR images from a global perspective, thereby enhancing the network’s ability to express water boundaries. In the decoder module, an attention modulation module (AMM) is introduced to rearrange the distribution of feature importance from the channel-space sequence perspective, so as to better extract the detailed features of water bodies. In the experiments, SAR images from the Sentinel-1 system are utilized, and three water areas with different features and scales are selected for independent testing. The Pixel Accuracy (PA) and Intersection over Union (IoU) values for water extraction are 95.24% and 91.63%, respectively. The results indicate that the network can extract more integral water edges and better detailed features, enhancing the accuracy and generalization of water body extraction. Compared with several existing classical semantic segmentation models, GCAFF-Net shows superior performance and can also be used for typical target segmentation from SAR images.
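
The paper's AFFM is not specified in detail here, but attention-weighted fusion of two feature maps is a common pattern. The sketch below shows a generic channel-attention gate that mixes two same-shaped feature maps; it is an illustration of the pattern, not the authors' module.

```python
# Generic attention-weighted fusion of two feature maps (illustrative, not AFFM itself).
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, low: torch.Tensor, mid: torch.Tensor) -> torch.Tensor:
        w = self.gate(low + mid)          # per-channel mixing weights
        return w * low + (1 - w) * mid    # attention-weighted sum

fused = AttentionFusion(64)(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```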

31 pages, 6413 KB  
Article
Noise-to-Convex: A Hierarchical Framework for SAR Oriented Object Detection via Scattering Keypoint Feature Fusion and Convex Contour Refinement
by Shuoyang Liu, Ming Tong, Bokun He, Jiu Jiang and Chu He
Electronics 2025, 14(3), 569; https://doi.org/10.3390/electronics14030569 - 31 Jan 2025
Cited by 1 | Viewed by 1007
Abstract
Oriented object detection has become a hot topic in SAR image interpretation. Due to the unique imaging mechanism, SAR objects are represented as clusters of scattering points surrounded by coherent speckle noise, leading to blurred outlines and increased false alarms in complex scenes. To address these challenges, we propose a novel noise-to-convex detection paradigm with a hierarchical framework based on the scattering-keypoint-guided diffusion detection transformer (SKG-DDT), which consists of three levels. At the bottom level, the strong-scattering-region generation (SSRG) module constructs the spatial distribution of strong scattering regions via a diffusion model, enabling the direct identification of approximate object regions. At the middle level, the scattering-keypoint feature fusion (SKFF) module dynamically locates scattering keypoints across multiple scales, capturing their spatial and structural relationships with the attention mechanism. Finally, the convex contour prediction (CCP) module at the top level refines the object outline by predicting fine-grained convex contours. Furthermore, we unify the three-level framework into an end-to-end pipeline via a detection transformer. The proposed method was comprehensively evaluated on three public SAR datasets, including HRSID, RSDD-SAR, and SAR-Aircraft-v1.0. The experimental results demonstrate that the proposed method attains an AP50 of 86.5%, 92.7%, and 89.2% on these three datasets, respectively, which is an increase of 0.7%, 0.6%, and 1.0% compared to the existing state-of-the-art method. These results indicate that our approach outperforms existing algorithms across multiple object categories and diverse scenes.
(This article belongs to the Section Artificial Intelligence)
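
As a loose illustration of the final noise-to-convex step, a convex contour can be recovered from a cluster of keypoints with SciPy's ConvexHull; the learned convex contour prediction in the paper is, of course, more involved.

```python
# Loose illustration: convex contour around a cluster of (stand-in) scattering keypoints.
import numpy as np
from scipy.spatial import ConvexHull

keypoints = np.random.default_rng(3).normal(size=(30, 2))  # stand-in scattering keypoints
hull = ConvexHull(keypoints)
contour = keypoints[hull.vertices]  # ordered convex contour vertices
```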

19 pages, 15351 KB  
Article
A Deep-Learning-Based Algorithm for Landslide Detection over Wide Areas Using InSAR Images Considering Topographic Features
by Ning Li, Guangcai Feng, Yinggang Zhao, Zhiqiang Xiong, Lijia He, Xiuhua Wang, Wenxin Wang and Qi An
Sensors 2024, 24(14), 4583; https://doi.org/10.3390/s24144583 - 15 Jul 2024
Cited by 9 | Viewed by 5449
Abstract
The joint action of human activities and environmental changes contributes to the frequent occurrence of landslides, causing major hazards. The Interferometric Synthetic Aperture Radar (InSAR) technique enables the detailed detection of surface deformation, facilitating early landslide detection. The growing availability of SAR data and the development of artificial intelligence have spurred the integration of deep learning methods with InSAR for intelligent geological identification. However, existing studies using deep learning methods to detect landslides in InSAR deformation often rely on InSAR data alone, which leaves other types of geological hazards in the identification results and limits the accuracy of landslide identification. Landslides are affected by many factors, especially topographic features. To enhance the accuracy of landslide identification, this study improves the existing geological hazard detection model and proposes a multi-source data fusion network termed MSFD-Net. MSFD-Net employs a pseudo-Siamese network without weight sharing, enabling the extraction of texture features from the wrapped deformation data and topographic features from topographic data, which are then fused in higher-level feature layers. We conducted comparative experiments on different networks and ablation experiments, and the results show that the proposed method achieved the best performance. We applied our method to the middle and upper reaches of the Yellow River in eastern Qinghai Province, China, and obtained deformation rates using Sentinel-1 SAR data from 2018 to 2020 in the region, ultimately identifying 254 landslides. Quantitative evaluations reveal that most detected landslides in the study area occurred at an elevation of 2500–3700 m with slope angles of 10–30°. The proposed landslide detection algorithm holds significant promise for quickly and accurately detecting wide-area landslides, facilitating timely preventive and control measures.
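
A pseudo-Siamese network, as described, uses two structurally identical branches with independent weights. The sketch below wires two such branches (wrapped deformation and topography) and fuses their higher-level features; layer sizes and the two-class head are arbitrary assumptions.

```python
# Hedged sketch of a pseudo-Siamese fusion network: independent branch weights, late fusion.
import torch
import torch.nn as nn

def branch() -> nn.Sequential:  # separate instances, so no weight sharing between branches
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
    )

class PseudoSiameseFusion(nn.Module):
    def __init__(self):
        super().__init__()
        self.deform_branch, self.topo_branch = branch(), branch()
        self.head = nn.Conv2d(64, 2, 1)   # landslide / background logits (assumed head)

    def forward(self, deform, topo):
        fused = torch.cat([self.deform_branch(deform), self.topo_branch(topo)], dim=1)
        return self.head(fused)

logits = PseudoSiameseFusion()(torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64))
```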

16 pages, 11469 KB  
Article
MTUW-GAN: A Multi-Teacher Knowledge Distillation Generative Adversarial Network for Underwater Image Enhancement
by Tianchi Zhang and Yuxuan Liu
Appl. Sci. 2024, 14(2), 529; https://doi.org/10.3390/app14020529 - 8 Jan 2024
Cited by 4 | Viewed by 3131
Abstract
Underwater imagery is plagued by issues such as image blurring and color distortion, which significantly impede the detection and operational capabilities of underwater robots, specifically Autonomous Underwater Vehicles (AUVs). Previous approaches to image fusion or multi-scale feature fusion based on deep learning necessitated multi-branch image preprocessing prior to merging through fusion modules. However, these methods have intricate network structures and a high demand for computational resources, rendering them unsuitable for deployment on AUVs, which have limited resources at their disposal. To tackle these challenges, we propose a multi-teacher knowledge distillation GAN for underwater image enhancement (MTUW-GAN). Our approach entails multiple teacher networks instructing student networks simultaneously, enabling them to enhance color and detail in degraded images from various perspectives, thus achieving an image-fusion-level performance. Additionally, we employ middle layer channel distillation in conjunction with the attention mechanism to extract and transfer rich middle layer feature information from the teacher model to the student model. By eliminating multiplexed branching and fusion modules, our lightweight student model can directly generate enhanced underwater images through model compression. Furthermore, we introduce a multimodal objective enhancement function to refine the overall framework training, striking a balance between a low computational effort and high-quality image enhancement. Experimental results, obtained by comparing our method with existing approaches, demonstrate the clear advantages of our proposed method in terms of visual quality, model parameters, and real-time performance. Consequently, our method serves as an effective solution for real-time underwater image enhancement, specifically tailored for deployment on AUVs.
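
Middle-layer channel distillation from several teachers can be sketched as pulling the student's intermediate features toward each teacher's, with per-channel weighting standing in for the attention mechanism; the loss below is illustrative, not the MTUW-GAN objective.

```python
# Hedged sketch of a multi-teacher middle-layer distillation loss with channel weighting.
import torch
import torch.nn.functional as F

def distill_loss(student_feat: torch.Tensor, teacher_feats: list[torch.Tensor]) -> torch.Tensor:
    # Per-channel weights: emphasize channels where each teacher is most active.
    loss = torch.tensor(0.0)
    for t in teacher_feats:
        w = torch.sigmoid(t.mean(dim=(2, 3), keepdim=True))   # channel weights
        loss = loss + F.mse_loss(w * student_feat, w * t)
    return loss / len(teacher_feats)

s = torch.randn(2, 64, 32, 32)
loss = distill_loss(s, [torch.randn(2, 64, 32, 32) for _ in range(3)])
```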

28 pages, 22162 KB  
Article
MF-DCMANet: A Multi-Feature Dual-Stage Cross Manifold Attention Network for PolSAR Target Recognition
by Feng Li, Chaoqi Zhang, Xin Zhang and Yang Li
Remote Sens. 2023, 15(9), 2292; https://doi.org/10.3390/rs15092292 - 26 Apr 2023
Cited by 7 | Viewed by 3077
Abstract
The distinctive polarization information of polarimetric SAR (PolSAR) has been widely applied to terrain classification but is rarely used for PolSAR target recognition. Target recognition strategies built upon multiple features have gained favor among researchers due to their ability to provide diverse classification information. This paper introduces a robust multi-feature cross-fusion approach, the multi-feature dual-stage cross manifold attention network (MF-DCMANet), which essentially relies on the complementary information between different features to enhance the representation ability of targets. In the first-stage process, a Cross-Feature-Network (CFN) module is proposed to mine the middle-level semantic information of monogenic features and polarization features extracted from the PolSAR target. In the second-stage process, a Cross-Manifold-Attention (CMA) transformer is proposed, which takes the input features represented on the Grassmann manifold to mine the nonlinear relationship between features so that rich and fine-grained features can be captured to compute attention weights. Furthermore, a local window is used instead of the global window in the attention mechanism to improve the local feature representation capabilities and reduce the computation. The proposed MF-DCMANet achieves competitive performance on the GOTCHA dataset, with a recognition accuracy of 99.75%. Furthermore, it maintains a high accuracy rate in the few-shot recognition and open-set recognition scenarios, outperforming the current state-of-the-art method by about 2%.
(This article belongs to the Special Issue Pattern Recognition in Remote Sensing)

16 pages, 4836 KB  
Article
A Deep Learning Workflow for Mass-Forming Intrahepatic Cholangiocarcinoma and Hepatocellular Carcinoma Classification Based on MRI
by Yangling Liu, Bin Wang, Xiao Mo, Kang Tang, Jianfeng He and Jingang Hao
Curr. Oncol. 2023, 30(1), 529-544; https://doi.org/10.3390/curroncol30010042 - 30 Dec 2022
Cited by 10 | Viewed by 4056
Abstract
Objective: Precise classification of mass-forming intrahepatic cholangiocarcinoma (MF-ICC) and hepatocellular carcinoma (HCC) based on magnetic resonance imaging (MRI) is crucial for personalized treatment strategy. The purpose of the present study was to differentiate MF-ICC from HCC by applying a novel deep-learning-based workflow with stronger feature extraction ability and fusion capability to improve the classification performance of deep learning on small datasets. Methods: To retain more effective lesion features, we propose a preprocessing method called semi-segmented preprocessing (Semi-SP) to select the region of interest (ROI). Then, the ROIs were sent to the strided feature fusion residual network (SFFNet) for training and classification. The SFFNet model is composed of three parts: a multilayer feature fusion module (MFF), proposed to extract discriminative features of MF-ICC/HCC and integrate features of different levels; a new stationary residual block (SRB), proposed to solve the problem of information loss and network instability during training; and the convolutional block attention module (CBAM), adopted in the middle layer of the network to extract the correlation of multi-spatial feature information and filter out irrelevant feature information at the pixel level. Results: The SFFNet model achieved an overall accuracy of 92.26% and an AUC of 0.9680, with high sensitivity (86.21%) and specificity (94.70%) for MF-ICC. Conclusion: In this paper, we proposed a specifically designed Semi-SP method and SFFNet model to differentiate MF-ICC from HCC. This workflow achieves good MF-ICC/HCC classification performance due to stronger feature extraction and fusion capabilities, providing complementary information for personalized treatment strategies.
(This article belongs to the Special Issue Machine Learning for Imaging-Based Cancer Diagnostics)
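
CBAM itself is a published, well-known module: channel attention from pooled descriptors through a shared MLP, followed by spatial attention from a 7x7 convolution. A minimal PyTorch rendition (not the full SFFNet):

```python
# Minimal CBAM: channel attention, then spatial attention.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, c: int, r: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(), nn.Linear(c // r, c))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        ca = torch.sigmoid(self.mlp(x.mean((2, 3))) + self.mlp(x.amax((2, 3)))).view(b, c, 1, 1)
        x = x * ca                                            # channel attention
        sa = torch.sigmoid(self.spatial(torch.cat(
            [x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa                                         # spatial attention

out = CBAM(64)(torch.randn(1, 64, 28, 28))
```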

24 pages, 4408 KB  
Article
An Improved Human-Body-Segmentation Algorithm with Attention-Based Feature Fusion and a Refined Stereo-Matching Scheme Working at the Sub-Pixel Level for the Anthropometric System
by Lei Yang, Xiaoyu Guo, Xiaowei Song, Deyuan Lu, Wenjing Cai and Zixiang Xiong
Entropy 2022, 24(11), 1647; https://doi.org/10.3390/e24111647 - 13 Nov 2022
Viewed by 2539
Abstract
This paper proposes an improved human-body-segmentation algorithm with attention-based feature fusion and a refined corner-based feature-point design with sub-pixel stereo matching for the anthropometric system. In the human-body-segmentation algorithm, four CBAMs are embedded in the four middle convolution layers of the backbone network (ResNet101) of PSPNet to achieve better feature fusion in space and channels, so as to improve accuracy. The common convolution in the residual blocks of ResNet101 is substituted by group convolution to reduce model parameters and computational cost, thereby optimizing efficiency. For the stereo-matching scheme, a corner-based feature point is designed to obtain the feature-point coordinates at sub-pixel level, so that precision is refined. A regional constraint is applied according to the characteristic of the checkerboard corner points, thereby reducing complexity. Experimental results demonstrated that the anthropometric system with the proposed CBAM-based human-body-segmentation algorithm and corner-based stereo-matching scheme can significantly outperform the state-of-the-art system in accuracy. It can also meet the national standards GB/T 2664-2017, GA 258-2009 and GB/T 2665-2017; and the textile industry standards FZ/T 73029-2019, FZ/T 73017-2014, FZ/T 73059-2017 and FZ/T 73022-2019.
(This article belongs to the Special Issue Advances in Image Fusion)
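
The parameter saving from group convolution is simple arithmetic: with g groups, a convolution's weight count drops by roughly a factor of g. The group number used in the paper is not stated in this abstract, so 4 is assumed below.

```python
# Parameter count: standard 3x3 convolution vs. the same layer with groups=4 (assumed).
import torch.nn as nn

def n_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

standard = nn.Conv2d(256, 256, kernel_size=3, padding=1)
grouped = nn.Conv2d(256, 256, kernel_size=3, padding=1, groups=4)
print(n_params(standard), n_params(grouped))  # grouped uses roughly 1/4 of the weights
```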

23 pages, 7799 KB  
Article
Multi-Scale Hybrid Network for Polyp Detection in Wireless Capsule Endoscopy and Colonoscopy Images
by Meryem Souaidi and Mohamed El Ansari
Diagnostics 2022, 12(8), 2030; https://doi.org/10.3390/diagnostics12082030 - 22 Aug 2022
Cited by 24 | Viewed by 3354
Abstract
The trade-off between speed and precision is a key issue in the detection of small polyps in wireless capsule endoscopy (WCE) images. In this paper, we propose a hybrid network of an inception v4 architecture-based single-shot multibox detector (Hyb-SSDNet) to detect small polyp regions in both WCE and colonoscopy frames. Medical privacy concerns are considered the main barriers to WCE image acquisition. To satisfy the object detection requirements, we enlarged the training datasets and investigated deep transfer learning techniques. The Hyb-SSDNet framework adopts inception blocks to alleviate the inherent limitations of the convolution operation and to incorporate contextual features and semantic information into deep networks. It consists of three main components: (a) multi-scale encoding of small polyp regions, (b) use of the inception v4 backbone to enhance contextual features in shallow and middle layers, and (c) concatenation of weighted mid-level feature maps, giving them greater importance so as to extract richer semantic information. The fused feature map is then delivered to the next layer, followed by downsampling blocks that generate new pyramidal layers. Finally, the feature maps are fed to multibox detectors, consistent with the VGG16-based SSD pipeline. The Hyb-SSDNet achieved a 93.29% mean average precision (mAP) and a testing speed of 44.5 FPS on the WCE dataset. This work demonstrates that deep learning has the potential to advance future research in polyp detection and classification tasks.

18 pages, 7722 KB  
Article
Dual-Coupled CNN-GCN-Based Classification for Hyperspectral and LiDAR Data
by Lei Wang and Xili Wang
Sensors 2022, 22(15), 5735; https://doi.org/10.3390/s22155735 - 31 Jul 2022
Cited by 17 | Viewed by 4585
Abstract
Deep learning techniques have brought substantial performance gains to remote sensing image classification. Among them, convolutional neural networks (CNN) can extract rich spatial and spectral features from hyperspectral (HS) images in a short-range region, whereas graph convolutional networks (GCN) can model middle- and long-range spatial relations (or structural features) between samples on their graph structure. These different features make it possible to classify remote sensing images finely. In addition, hyperspectral images and light detection and ranging (LiDAR) images can provide spatial-spectral information and elevation information of targets on the Earth’s surface, respectively. These multi-source remote sensing data can further improve classification accuracy in complex scenes. This paper proposes a classification method for HS and LiDAR data based on a dual-coupled CNN-GCN structure. The model can be divided into a coupled CNN and a coupled GCN. The former employs a weight-sharing mechanism to structurally fuse and simplify the dual CNN models and extract the spatial features from HS and LiDAR data. The latter first concatenates the HS and LiDAR data to construct a uniform graph structure. Then, the dual GCN models perform structural fusion by sharing the graph structures and weight matrices of some layers to extract their structural information. Finally, the final hybrid features are fed into a standard classifier for the pixel-level classification task under a unified feature fusion module. Extensive experiments on two real-world hyperspectral and LiDAR datasets demonstrate the effectiveness and superiority of the proposed method compared to other state-of-the-art baseline methods, such as two-branch CNN and context CNN. In particular, the overall accuracy (99.11%) on Trento achieves the best classification performance reported so far.
(This article belongs to the Section Remote Sensors)
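
Weight sharing across modalities means one encoder instance processes both inputs. Below is a minimal sketch of that coupling: a single shared CNN applied to a hyperspectral band patch and a LiDAR patch before concatenation and a standard classifier. Sizes are assumptions, and real HS inputs would first need their spectral depth handled.

```python
# Hedged sketch of a coupled (weight-shared) CNN over two modalities, with late fusion.
import torch
import torch.nn as nn

encoder = nn.Sequential(                       # one instance => shared weights for both inputs
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(inplace=True), nn.AdaptiveAvgPool2d(1)
)
hs_band, lidar = torch.randn(4, 1, 11, 11), torch.randn(4, 1, 11, 11)
fused = torch.cat([encoder(hs_band).flatten(1), encoder(lidar).flatten(1)], dim=1)
classifier = nn.Linear(64, 6)                  # standard classifier on the hybrid features
logits = classifier(fused)
```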

15 pages, 4350 KB  
Article
Optimized Score Level Fusion for Multi-Instance Finger Vein Recognition
by Jackson Horlick Teng, Thian Song Ong, Tee Connie, Kalaiarasi Sonai Muthu Anbananthen and Pa Pa Min
Algorithms 2022, 15(5), 161; https://doi.org/10.3390/a15050161 - 11 May 2022
Cited by 6 | Viewed by 3649
Abstract
The finger vein recognition system uses blood vessels inside the finger of an individual for identity verification. The public is in favor of finger vein recognition over conventional passwords or ID cards, as the biometric technology is harder to forge, misplace, and share. In this study, histogram of oriented gradients (HOG) features, which are robust against changes in illumination and position, are extracted from the finger vein for personal recognition. To further increase the amount of information that can be used for recognition, different instances of the finger vein, from the index, middle, and ring fingers, are combined to form a multi-instance finger vein representation. This fusion approach is preferred since it can be performed without requiring additional sensors or feature extractors. To combine different instances of the finger vein effectively, score level fusion is adopted to allow greater compatibility among the wide range of matches. Towards this end, two methods are proposed: Bayesian optimized support vector machine (SVM) score fusion (BSSF) and Bayesian optimized SVM based fusion (BSBF). The fusion results are incrementally improved by optimizing the hyperparameters of the HOG feature, the SVM matcher, and the weighted sum of score level fusion using the Bayesian optimization approach. This is a kind of knowledge-based approach that takes previous optimization trials into account to determine the next trial, making it an efficient optimizer. By using stratified cross-validation in the training process, the proposed method achieves the lowest EER of 0.48% and 0.22% on the SDUMLA-HMT and UTFVP datasets, respectively.
(This article belongs to the Special Issue Metaheuristic Algorithms and Applications)
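
Score-level fusion with a weighted sum, plus an equal error rate (EER) readout, fits in a few lines. In the sketch below the weights are fixed for illustration, whereas the paper tunes them (together with the HOG and SVM hyperparameters) via Bayesian optimization; all scores are simulated.

```python
# Hedged sketch: weighted-sum score-level fusion of three finger instances, with EER.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(4)
labels = rng.integers(0, 2, 500)                       # 1 = genuine, 0 = impostor
scores = {f: labels + rng.normal(scale=0.8, size=500)  # simulated per-finger match scores
          for f in ("index", "middle", "ring")}

w = {"index": 0.4, "middle": 0.35, "ring": 0.25}       # assumed fusion weights
fused = sum(w[f] * scores[f] for f in w)

fpr, tpr, _ = roc_curve(labels, fused)
eer = fpr[np.argmin(np.abs(fpr - (1 - tpr)))]          # EER: where FAR equals FRR
print(f"EER ~ {eer:.3%}")
```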

19 pages, 10810 KB  
Article
A New Spatial–Temporal Depthwise Separable Convolutional Fusion Network for Generating Landsat 8-Day Surface Reflectance Time Series over Forest Regions
by Yuzhen Zhang, Jindong Liu, Shunlin Liang and Manyao Li
Remote Sens. 2022, 14(9), 2199; https://doi.org/10.3390/rs14092199 - 4 May 2022
Cited by 7 | Viewed by 2739
Abstract
Landsat has provided the longest fine-resolution data archive of Earth’s environment since 1972; however, one of the challenges in using Landsat data for various applications is its frequent large data gaps and heavy cloud contamination. One pressing research topic is to generate regular time series by integrating coarse-resolution satellite data through data fusion techniques. This study presents a novel spatiotemporal fusion (STF) method based on a depthwise separable convolutional neural network (DSC), namely STFDSC, to generate Landsat surface reflectance time series at 8-day intervals by fusing Landsat 30 m with high-quality Moderate Resolution Imaging Spectroradiometer (MODIS) 500 m surface reflectance data. The STFDSC method consists of three main stages: feature extraction, feature fusion and prediction. Features were first extracted from Landsat and MODIS surface reflectance changes, and the extracted multilevel features were then stacked and fused. Both low-level and middle-level features that are generally ignored in convolutional neural network (CNN)-based fusion models were included in STFDSC to avoid key information loss and thus ensure high prediction accuracy. The prediction stage generates a Landsat residual image, which is combined with the original Landsat data to obtain predictions of Landsat imagery at the target date. The performance of STFDSC was evaluated in the Greater Khingan Mountains (GKM) in Northeast China and the Ziwuling (ZWL) forest region in Northwest China. A comparison of STFDSC with four published fusion methods, including two classic fusion methods (FSDAF and ESTARFM) and two machine learning methods (EDCSTFN and STFNET), was also carried out. The results showed that STFDSC made stable and more accurate predictions of Landsat surface reflectance than the other methods in both the GKM and ZWL regions. The root-mean-square errors (RMSEs) of TM bands 2, 3, 4, and 7 were 0.0046, 0.0038, 0.0143, and 0.0055 in GKM and 0.0246, 0.0176, 0.0280, and 0.0141 in ZWL, respectively. STFDSC can potentially be used to generate global surface reflectance and other high-level land products.
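
The block behind the "DSC" in STFDSC is the depthwise separable convolution: a per-channel (depthwise) filter followed by a 1x1 (pointwise) mix. A generic PyTorch sketch, not the authors' exact architecture:

```python
# Generic depthwise separable convolution block.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)  # per-channel filter
        self.pointwise = nn.Conv2d(c_in, c_out, 1)                         # 1x1 channel mix

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

y = DepthwiseSeparableConv(6, 32)(torch.randn(1, 6, 64, 64))  # e.g., 6 reflectance bands
```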
