Search Results (18)

Search Parameters:
Keywords = multi-constraint fully convolutional networks

26 pages, 7561 KB  
Article
Satellite Optical Target Edge Detection Based on Knowledge Distillation
by Ying Meng, Luping Zhang, Yan Zhang, Moufa Hu, Fei Zhao and Xinglin Shen
Remote Sens. 2025, 17(17), 3008; https://doi.org/10.3390/rs17173008 - 29 Aug 2025
Viewed by 693
Abstract
Edge detection of space targets is vital in aerospace applications, such as satellite monitoring and analysis, yet it faces challenges due to diverse target shapes and complex backgrounds. While deep learning-based edge detection methods dominate due to their powerful feature representation capabilities, they often suffer from large parameter sizes and lack explicit geometric prior constraints for space targets. This paper proposes a novel edge detection method for satellite targets based on knowledge distillation, namely STED-KD. Firstly, a multi-stage distillation strategy is proposed to guide a lightweight, fully convolutional network with fewer parameters to learn key features and decision boundaries from a complex teacher model, achieving model efficiency. Next, a shape prior guidance module is integrated into the student branch, incorporating geometric shape information through shape prior model construction, similarity metric calculation, and feature reconstruction, enhancing adaptability to space targets and improving detection accuracy. Additionally, a curvature-guided edge loss function is designed to ensure continuous and complete edges, minimizing local discontinuities. Experimental results on the UESD space target dataset demonstrate superior performance, with ODS, OIS, and AP scores of 0.659, 0.715, and 0.596, respectively. On the BSDS500, STED-KD achieves ODS, OIS, and AP scores of 0.818, 0.829, and 0.850, respectively, demonstrating strong competitiveness and further confirming its stability. Full article

27 pages, 1220 KB  
Article
Robust Supervised Deep Discrete Hashing for Cross-Modal Retrieval
by Xiwei Dong, Fei Wu, Junqiu Zhai, Fei Ma, Guangxing Wang, Tao Liu, Xiaogang Dong and Xiao-Yuan Jing
Technologies 2025, 13(9), 383; https://doi.org/10.3390/technologies13090383 - 29 Aug 2025
Viewed by 509
Abstract
The exponential growth of multi-modal data in the real world poses significant challenges to efficient retrieval, and traditional single-modal methods are no longer suitable for the growth of multi-modal data. To address this issue, hashing retrieval methods play an important role in cross-modal retrieval tasks when referring to a large amount of multi-modal data. However, effectively embedding multi-modal data into a common low-dimensional Hamming space remains challenging. A critical issue is that feature redundancies in existing methods lead to suboptimal hash codes, severely degrading retrieval performance; yet, selecting optimal features remains an open problem in deep cross-modal hashing. In this paper, we propose an end-to-end approach, named Robust Supervised Deep Discrete Hashing (RSDDH), which can accomplish feature learning and hashing learning simultaneously. RSDDH has a hybrid deep architecture consisting of a convolutional neural network and a multilayer perceptron adaptively learning modality-specific representations. Moreover, it utilizes a non-redundant feature selection strategy to select optimal features for generating discriminative hash codes. Furthermore, it employs a direct discrete hashing scheme (SVDDH) to solve the binary constraint optimization problem without relaxation, fully preserving the intrinsic properties of hash codes. Additionally, RSDDH employs inter-modal and intra-modal consistency preservation strategies to reduce the gap between modalities and improve the discriminability of learned Hamming space. Extensive experiments on four benchmark datasets demonstrate that RSDDH significantly outperforms state-of-the-art cross-modal hashing methods. Full article
(This article belongs to the Special Issue Image Analysis and Processing)

20 pages, 1688 KB  
Article
Spectrum Sensing for Noncircular Signals Using Augmented Covariance-Matrix-Aware Deep Convolutional Neural Network
by Songlin Chen, Zhenqing He, Wenze Song and Guohao Sun
Sensors 2025, 25(15), 4791; https://doi.org/10.3390/s25154791 - 4 Aug 2025
Viewed by 592
Abstract
This work investigates spectrum sensing in cognitive radio networks, where multi-antenna secondary users aim to detect the spectral occupancy of noncircular signals transmitted by primary users. Specifically, we propose a deep-learning-based spectrum sensing approach using an augmented covariance-matrix-aware convolutional neural network (CNN). The core innovation of our approach lies in employing an augmented sample covariance matrix, which integrates both a standard covariance matrix and complementary covariance matrix, thereby fully exploiting the statistical properties of noncircular signals. By feeding augmented sample covariance matrices into the designed CNN architecture, the proposed approach effectively learns discriminative patterns from the underlying data structure, without stringent model constraints. Meanwhile, our approach eliminates the need for restrictive model assumptions and significantly enhances the detection performance by fully exploiting noncircular signal characteristics. Various experimental results demonstrate the significant performance improvement and generalization capability of the proposed approach compared to existing benchmark methods. Full article
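The augmented sample covariance matrix mentioned in this abstract can be illustrated with a short sketch. It uses the textbook definition for a noncircular signal (standard covariance R, complementary covariance C, stacked as [[R, C], [C*, R*]]); the function names and the toy BPSK/QPSK inputs are assumptions for illustration, and the paper's exact construction and CNN input layout may differ.

```python
def sample_covariances(x):
    """x: list of M antenna channels, each a list of N complex samples."""
    m, n = len(x), len(x[0])
    # Standard covariance R = (1/N) * X * X^H (second factor conjugated)
    r = [[sum(x[i][k] * x[j][k].conjugate() for k in range(n)) / n
          for j in range(m)] for i in range(m)]
    # Complementary covariance C = (1/N) * X * X^T (no conjugate)
    c = [[sum(x[i][k] * x[j][k] for k in range(n)) / n
          for j in range(m)] for i in range(m)]
    return r, c


def augmented_covariance(x):
    """Return the 2M x 2M block matrix [[R, C], [conj(C), conj(R)]]."""
    r, c = sample_covariances(x)
    m = len(r)
    top = [r[i] + c[i] for i in range(m)]
    bottom = [[v.conjugate() for v in c[i]] + [v.conjugate() for v in r[i]]
              for i in range(m)]
    return top + bottom


# Hypothetical single-antenna toy signals:
bpsk = [[1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]]   # real-valued, noncircular
qpsk = [[1 + 0j, 1j, -1 + 0j, -1j]]           # circular over these samples

print(augmented_covariance(bpsk))  # off-diagonal (complementary) block is nonzero
print(augmented_covariance(qpsk))  # complementary block vanishes
```

For the BPSK samples the complementary block carries as much energy as the standard covariance, which is the extra statistical information a detector for noncircular signals can exploit; for the QPSK samples it is zero.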

26 pages, 4899 KB  
Article
SDDGRNets: Level–Level Semantically Decomposed Dynamic Graph Reasoning Network for Remote Sensing Semantic Change Detection
by Zhuli Xie, Gang Wan, Yunxia Yin, Guangde Sun and Dongdong Bu
Remote Sens. 2025, 17(15), 2641; https://doi.org/10.3390/rs17152641 - 30 Jul 2025
Cited by 1 | Viewed by 835
Abstract
Semantic change detection technology based on remote sensing data holds significant importance for urban and rural planning decisions and the monitoring of ground objects. However, simple convolutional networks are limited by the receptive field, cannot fully capture detailed semantic information, and cannot effectively perceive subtle changes and constrain edge information. Therefore, a dynamic graph reasoning network with layer-by-layer semantic decomposition for semantic change detection in remote sensing data is developed in response to these limitations. This network aims to understand and perceive subtle changes in the semantic content of remote sensing data from the image pixel level. On the one hand, low-level semantic information and cross-scale spatial local feature details are obtained by dividing subspaces and decomposing convolutional layers with significant kernel expansion. Semantic selection aggregation is used to enhance the characterization of global and contextual semantics. Meanwhile, the initial multi-scale local spatial semantics are screened and re-aggregated to improve the characterization of significant features. On the other hand, at the encoding stage, the weight-sharing approach is employed to align the positions of ground objects in the change area and generate more comprehensive encoding information. Meanwhile, the dynamic graph reasoning module is used to decode the encoded semantics layer by layer to investigate the hidden associations between pixels in the neighborhood. In addition, the edge constraint module is used to constrain boundary pixels and reduce semantic ambiguity. The weighted loss function supervises and optimizes each module separately to enable the network to acquire the optimal feature representation. 
Finally, experimental results on three open-source datasets (SECOND, HIUSD, and Landsat-SCD) show that the proposed method achieves good performance, with SCD scores of 35.65%, 98.33%, and 67.29%, respectively.

19 pages, 10010 KB  
Article
MCANet: An Unsupervised Multi-Constraint Cascaded Attention Network for Accurate and Smooth Brain Medical Image Registration
by Min Huang, Haoyu Wang and Guanyu Ren
Appl. Sci. 2025, 15(9), 4629; https://doi.org/10.3390/app15094629 - 22 Apr 2025
Viewed by 544
Abstract
Brain medical image registration is a fundamental prerequisite for the computer-assisted treatment of brain diseases. The brain is one of the most important and complex organs of the human body, and accurate, fast registration of brain images is very challenging. To address voxel folding in the deformation field and low registration accuracy on complex, fine structures, this paper proposes a fully convolutional multi-constraint cascaded attention network (MCANet). The network cascades two registration sub-networks and performs coarse-to-fine registration of input image pairs in an iterative manner. Each registration sub-network, called the dilated self-attention network (DSNet), combines dilated convolutions with different dilation rates and attention gate modules. During the training of MCANet, a double regularization constraint is applied to specifically penalize excessive deformation, so that the network generates relatively smooth deformation fields while maintaining high registration accuracy. Experimental results on the Mindboggle101 dataset show that the registration accuracy of MCANet is significantly better than that of several existing advanced registration methods, and that the network produces relatively smooth registrations.

22 pages, 2839 KB  
Article
Narrowband Radar Micromotion Targets Recognition Strategy Based on Graph Fusion Network Constructed by Cross-Modal Attention Mechanism
by Yuanjie Zhang, Ting Gao, Hongtu Xie, Haozong Liu, Mengfan Ge, Bin Xu, Nannan Zhu and Zheng Lu
Remote Sens. 2025, 17(4), 641; https://doi.org/10.3390/rs17040641 - 13 Feb 2025
Cited by 3 | Viewed by 931
Abstract
In the domain of micromotion target recognition, target characteristics can be extracted through either broadband or narrowband radar echoes. However, due to technical limitations and cost constraints in acquiring broadband radar waveform data, recognition can often only be performed through narrowband radar waveforms. To fully utilize the information embedded within narrowband radar waveforms, it is necessary to conduct in-depth research on multi-dimensional features of micromotion targets, including radar cross-sections (RCSs), time frequency (TF) images, and cadence velocity diagrams (CVDs). To address the limitations of existing identification methodologies in achieving accurate recognition with narrowband echoes, this paper proposes a graph fusion network based on a cross-modal attention mechanism, termed GF-AM Net. The network first adopts convolutional neural networks (CNNs) to extract unimodal features from RCSs, TF images, and CVDs independently. Subsequently, a cross-modal attention mechanism integrates these extracted features into a graph structure, achieving multi-level interactions among unimodal, bimodal, and trimodal features. Finally, the fused features are input into a classification module to accomplish narrowband radar micromotion target identification. Experimental results demonstrate that the proposed methodology successfully captures potential correlations between modal features by incorporating cross-modal multi-level information interactions across different processing stages, exhibiting exceptional accuracy and robustness in narrowband radar micromotion target identification tasks. Full article
(This article belongs to the Special Issue Ocean Remote Sensing Based on Radar, Sonar and Optical Techniques)

28 pages, 3337 KB  
Article
Lung and Colon Cancer Classification Using Multiscale Deep Features Integration of Compact Convolutional Neural Networks and Feature Selection
by Omneya Attallah
Technologies 2025, 13(2), 54; https://doi.org/10.3390/technologies13020054 - 1 Feb 2025
Cited by 6 | Viewed by 2940
Abstract
The automated and precise classification of lung and colon cancer from histopathological photos continues to pose a significant challenge in medical diagnosis, as current computer-aided diagnosis (CAD) systems are frequently constrained by their dependence on singular deep learning architectures, elevated computational complexity, and their ineffectiveness in utilising multiscale features. To this end, the present research introduces a CAD system that integrates several lightweight convolutional neural networks (CNNs) with dual-layer feature extraction and feature selection to overcome the aforementioned constraints. Initially, it extracts deep attributes from two separate layers (pooling and fully connected) of three pre-trained CNNs (MobileNet, ResNet-18, and EfficientNetB0). Second, the system uses the benefits of canonical correlation analysis for dimensionality reduction in pooling layer attributes to reduce complexity. In addition, it integrates the dual-layer features to encapsulate both high- and low-level representations. Finally, to benefit from multiple deep network architectures while reducing classification complexity, the proposed CAD merges dual deep layer variables of the three CNNs and then applies the analysis of variance (ANOVA) and Chi-Squared for the selection of the most discriminative features from the integrated CNN architectures. The CAD is assessed on the LC25000 dataset leveraging eight distinct classifiers, encompassing various Support Vector Machine (SVM) variants, Decision Trees, Linear Discriminant Analysis, and k-nearest neighbours. The experimental results exhibited outstanding performance, attaining 99.8% classification accuracy with cubic SVM classifiers employing merely 50 ANOVA-selected features, exceeding the performance of individual CNNs while markedly diminishing computational complexity. 
The framework’s capacity to sustain exceptional accuracy with a limited feature set renders it especially advantageous for clinical applications where diagnostic precision and efficiency are critical. These findings confirm the efficacy of the multi-CNN, multi-layer methodology in enhancing cancer classification precision while mitigating the computational constraints of current systems. Full article

22 pages, 16731 KB  
Article
Advanced Global Prototypical Segmentation Framework for Few-Shot Hyperspectral Image Classification
by Kunming Xia, Guowu Yuan, Mengen Xia, Xiaosen Li, Jinkang Gui and Hao Zhou
Sensors 2024, 24(16), 5386; https://doi.org/10.3390/s24165386 - 21 Aug 2024
Cited by 2 | Viewed by 1896
Abstract
With the advancement of deep learning, related networks have shown strong performance for Hyperspectral Image (HSI) classification. However, these methods face two main challenges in HSI classification: (1) the inability to capture global information of HSI due to the restriction of patch input and (2) insufficient utilization of information from limited labeled samples. To overcome these challenges, we propose an Advanced Global Prototypical Segmentation (AGPS) framework. Within the AGPS framework, we design a patch-free feature extractor segmentation network (SegNet) based on a fully convolutional network (FCN), which processes the entire HSI to capture global information. To enrich the global information extracted by SegNet, we propose a Fusion of Lateral Connection (FLC) structure that fuses the low-level detailed features of the encoder output with the high-level features of the decoder output. Additionally, we propose an Atrous Spatial Pyramid Pooling-Position Attention (ASPP-PA) module to capture multi-scale spatial positional information. Finally, to explore more valuable information from limited labeled samples, we propose an advanced global prototypical representation learning strategy. Building upon the dual constraints of the global prototypical representation learning strategy, we introduce supervised contrastive learning (CL), which optimizes our network with three different constraints. The experimental results of three public datasets demonstrate that our method outperforms the existing state-of-the-art methods. Full article
(This article belongs to the Section Sensing and Imaging)

28 pages, 3045 KB  
Article
LJCD-Net: Cross-Domain Jamming Generalization Diagnostic Network Based on Deep Adversarial Transfer
by Zhichao Zhang, Zhongliang Deng, Jingrong Liu, Zhenke Ding and Bingxun Liu
Sensors 2024, 24(11), 3266; https://doi.org/10.3390/s24113266 - 21 May 2024
Cited by 3 | Viewed by 1494
Abstract
Global Navigation Satellite Systems (GNSS) offer comprehensive position, navigation, and timing (PNT) estimates worldwide. Given the growing demand for reliable location awareness in both indoor and outdoor contexts, the advent of fifth-generation mobile communication technology (5G) has enabled expansive coverage and precise positioning services. However, the power received by the signal of interest (SOI) at terminals is notably low. This can lead to significant jamming, whether intentional or unintentional, which can adversely affect positioning receivers. The diagnosis of jamming types, such as classification, assists receivers in spectrum sensing and choosing effective mitigation strategies. Traditional jamming diagnosis methodologies predominantly depend on the expertise of classification experts, often demonstrating a lack of adaptability for diverse tasks. Recently, researchers have begun utilizing convolutional neural networks to re-conceptualize a jamming diagnosis as an image classification issue, thereby augmenting recognition performance. However, in real-world scenarios, the assumptions of independent and homogeneous distributions are frequently violated. This discrepancy between the source and target distributions frequently leads to subpar model performance on the test set or an inability to procure usable evaluation samples during training. In this paper, we introduce LJCD-Net, a deep adversarial migration-based cross-domain jamming generalization diagnostic network. LJCD-Net capitalizes on a fully labeled source domain and multiple unlabeled auxiliary domains to generate shared feature representations with generalization capabilities. Initially, our paper proposes an uncertainty-guided auxiliary domain labeling weighting strategy, which estimates the multi-domain sample uncertainty to re-weight the classification loss and specify the gradient optimization direction. 
Subsequently, from a probabilistic distribution standpoint, the spatial constraint imposed on the cross-domain global jamming time-frequency feature distribution facilitates the optimization of collaborative objectives. These objectives include minimizing both the source domain classification loss and auxiliary domain classification loss, as well as optimizing the inter-domain marginal probability and conditional probability distribution. Experimental results demonstrate that LJCD-Net enhances the recognition accuracy and confidence compared to five other diagnostic methods. Full article

25 pages, 6437 KB  
Article
A Refined Wind Power Forecasting Method with High Temporal Resolution Based on Light Convolutional Neural Network Architecture
by Fei Zhang, Xiaoying Ren and Yongqian Liu
Energies 2024, 17(5), 1183; https://doi.org/10.3390/en17051183 - 1 Mar 2024
Cited by 2 | Viewed by 1580
Abstract
As a large proportion of wind farms are connected to the power grid, the stable operation of power systems comes under greater pressure at shorter time scales. Efficient and accurate scheduling, operation control, and decision making require power forecasting algorithms with high temporal resolution, higher accuracy, and real-time performance. In this paper, we propose a high temporal resolution wind power forecasting method based on a light convolutional architecture, DC_LCNN. The method starts from the source data and uses a dual-channel data input mode to provide different combinations of feature data to the model, thus raising the upper limit of the model's learning ability. The dual-channel convolutional neural network (CNN) structure extracts different spatial and temporal constraints from the input features. A light global max pooling layer replaces the flatten operation combined with the fully connected (FC) forecasting layer of the traditional CNN; it extracts the most significant global features and reduces the data dimensionality at the same time, which significantly improves the forecasting accuracy and efficiency of the model. Experiments are carried out on 1 s resolution data from an actual wind farm, covering a single-step forecasting task 1 s ahead and a multi-step forecasting task 1~10 s ahead. Compared with classical deep learning models in the field, the proposed model shows clear accuracy advantages on both forecasting tasks. This also shows that a light architecture built on simple deep learning models is a good solution for high temporal resolution wind power forecasting tasks.
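The difference between global max pooling and a flatten-plus-FC head, as described in this abstract, can be seen in a minimal sketch. The feature-map values and function names below are hypothetical; this only contrasts the output sizes of the two operations and is not the DC_LCNN implementation.

```python
def global_max_pool(feature_maps):
    """Keep only the largest activation of each channel (one value per channel)."""
    return [max(channel) for channel in feature_maps]


def flatten(feature_maps):
    """Concatenate every activation of every channel into one long vector."""
    return [value for channel in feature_maps for value in channel]


# Hypothetical output of a 1D convolution: 2 channels x 3 time steps.
maps = [[0.1, 0.9, 0.3],
        [0.5, 0.2, 0.4]]

pooled = global_max_pool(maps)   # one value per channel
flat = flatten(maps)             # channels * time steps values

print(pooled, len(flat))
```

A dense layer placed after `flatten` needs one weight per activation for every output unit, whereas a head placed after `global_max_pool` needs only one weight per channel, which is where the reduction in parameters and computation comes from.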
(This article belongs to the Section A3: Wind, Wave and Tidal Energy)

20 pages, 15556 KB  
Article
Efficient Multi-Scale Stereo-Matching Network Using Adaptive Cost Volume Filtering
by Suyeon Jeon and Yong Seok Heo
Sensors 2022, 22(15), 5500; https://doi.org/10.3390/s22155500 - 23 Jul 2022
Cited by 5 | Viewed by 3066
Abstract
While recent deep learning-based stereo-matching networks have shown outstanding advances, there are still some unsolved challenges. First, most state-of-the-art stereo models employ 3D convolutions for 4D cost volume aggregation, which limits the deployment of networks in resource-limited mobile environments owing to heavy consumption of computation and memory. Although there are some efficient networks, most of them are still too computationally heavy to run on mobile computing devices in real time. Second, most stereo networks indirectly supervise cost volumes through a disparity regression loss using the soft-argmax function. This causes problems in ambiguous regions, such as the boundaries of objects, because many unreasonable cost distributions are possible, which results in overfitting. A few works deal with this problem by generating an artificial cost distribution using only the ground truth disparity value, which is insufficient to fully regularize the cost volume. To address these problems, we first propose an efficient multi-scale sequential feature fusion network (MSFFNet). Specifically, we connect multi-scale SFF modules in parallel with a cross-scale fusion function to generate a set of cost volumes with different scales. These cost volumes are then effectively combined using the proposed interlaced concatenation method. Second, we propose an adaptive cost-volume-filtering (ACVF) loss function that directly supervises our estimated cost volume. The proposed ACVF loss directly adds constraints to the cost volume using the probability distribution generated from the ground truth disparity map and that estimated from the teacher network, which achieves higher accuracy. Results of several experiments using representative datasets for stereo matching show that our proposed method is more efficient than previous methods. 
Our network architecture uses fewer parameters and generates reasonable disparity maps faster than the existing state-of-the-art stereo models. Concretely, our network achieves 1.01 EPE with a runtime of 42 ms, 2.92 M parameters, and 97.96 G FLOPs on the Scene Flow test set. Compared with PSMNet, our method is 89% faster and 7% more accurate with 45% fewer parameters.
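The ambiguity that soft-argmax disparity regression leaves in the cost volume can be shown with a small sketch. It assumes the common formulation (softmax over negated matching costs, then the expected disparity); the cost values are hypothetical and this is not the MSFFNet or ACVF code.

```python
import math


def soft_argmax(costs):
    """Expected disparity under softmax(-cost); lower cost means better match."""
    lowest = min(costs)
    # Subtract the minimum before exponentiating for numerical stability.
    weights = [math.exp(-(c - lowest)) for c in costs]
    total = sum(weights)
    return sum(d * w / total for d, w in enumerate(weights))


# Hypothetical cost slice for one pixel over disparities 0..10.
unimodal = [10.0] * 11
unimodal[3] = 0.0            # one clear match at disparity 3

bimodal = [10.0] * 11
bimodal[2] = 0.0             # two equally good matches, e.g. at an
bimodal[8] = 0.0             # object boundary

print(soft_argmax(unimodal))  # close to 3
print(soft_argmax(bimodal))   # close to 5, where neither match lies
```

The bimodal slice regresses to a disparity between the two candidate matches, so a loss applied only to this regressed value says little about the shape of the underlying distribution. That is the motivation the abstract gives for supervising the cost volume directly.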
(This article belongs to the Section Sensor Networks)

18 pages, 22778 KB  
Article
Phenotype Tracking of Leafy Greens Based on Weakly Supervised Instance Segmentation and Data Association
by Zhuang Qiang, Jingmin Shi and Fanhuai Shi
Agronomy 2022, 12(7), 1567; https://doi.org/10.3390/agronomy12071567 - 29 Jun 2022
Cited by 7 | Viewed by 2172
Abstract
Phenotype analysis of leafy green vegetables in the planting environment is a key technology of precision agriculture. In this paper, a deep convolutional neural network is employed to perform instance segmentation of leafy greens through weakly supervised learning based on box-level annotations and Excess Green (ExG) color similarity. Weeds are then filtered out based on an area threshold, K-means clustering, and a time context constraint. Finally, leafy greens are tracked by bipartite graph matching based on the mask IoU measure. Under this phenotype tracking framework, time-context-dependent phenotype analysis tasks such as growth monitoring can be performed. Experiments show that the proposed method achieves a 0.95 F1-score and 76.3 sMOTSA (soft multi-object tracking and segmentation accuracy) using weakly supervised annotation data. Compared with the fully supervised approach, the proposed method effectively reduces the requirements for agricultural data annotation and therefore has more potential in practical applications.
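The tracking step, bipartite matching on mask IoU, can be sketched as follows. Masks are represented as sets of pixel coordinates and the optimal assignment is found by brute force, which is only practical for a handful of objects; the function names, the `min_iou` threshold, and the toy masks are assumptions for illustration, and the paper's matching solver is not specified in the abstract.

```python
from itertools import permutations


def mask_iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def match_masks(prev, curr, min_iou=0.5):
    """Pair masks from two frames so that the total IoU is maximal.

    Returns (prev_index, curr_index) pairs whose IoU is at least min_iou.
    Brute force over assignments: suitable for small examples only.
    """
    if len(prev) > len(curr):
        # IoU is symmetric, so solve the transposed problem and flip the pairs.
        return [(i, j) for j, i in match_masks(curr, prev, min_iou)]
    best, best_score = (), -1.0
    for perm in permutations(range(len(curr)), len(prev)):
        score = sum(mask_iou(prev[i], curr[j]) for i, j in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return [(i, j) for i, j in enumerate(best)
            if mask_iou(prev[i], curr[j]) >= min_iou]


# Hypothetical masks: two plants in frame t, the same plants reordered in t+1.
frame_t = [{(0, 0), (0, 1)}, {(5, 5), (5, 6)}]
frame_t1 = [{(5, 5), (5, 6), (5, 7)}, {(0, 0), (0, 1)}]

print(match_masks(frame_t, frame_t1))  # [(0, 1), (1, 0)]
```

For larger scenes the same objective is usually solved with the Hungarian algorithm instead of enumerating permutations.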

17 pages, 2218 KB  
Article
Multi-Stage Convolutional Broad Learning with Block Diagonal Constraint for Hyperspectral Image Classification
by Yi Kong, Xuesong Wang, Yuhu Cheng and C. L. Philip Chen
Remote Sens. 2021, 13(17), 3412; https://doi.org/10.3390/rs13173412 - 27 Aug 2021
Cited by 5 | Viewed by 2346
Abstract
By combining the broad learning and a convolutional neural network (CNN), a block-diagonal constrained multi-stage convolutional broad learning (MSCBL-BD) method is proposed for hyperspectral image (HSI) classification. Firstly, as the linear sparse feature extracted by the conventional broad learning method cannot fully characterize the complex spatial-spectral features of HSIs, we replace the linear sparse features in the mapped feature (MF) with the features extracted by the CNN to achieve more complex nonlinear mapping. Then, in the multi-layer mapping process of the CNN, information loss occurs to a certain degree. To this end, the multi-stage convolutional features (MSCFs) extracted by the CNN are expanded to obtain the multi-stage broad features (MSBFs). MSCFs and MSBFs are further spliced to obtain multi-stage convolutional broad features (MSCBFs). Additionally, in order to enhance the mutual independence between MSCBFs, a block diagonal constraint is introduced, and MSCBFs are mapped by a block diagonal matrix, so that each feature is represented linearly only by features of the same stage. Finally, the output layer weights of MSCBL-BD and the desired block-diagonal matrix are solved by the alternating direction method of multipliers. Experimental results on three popular HSI datasets demonstrate the superiority of MSCBL-BD. Full article

24 pages, 14902 KB  
Article
Building Multi-Feature Fusion Refined Network for Building Extraction from High-Resolution Remote Sensing Images
by Shuhao Ran, Xianjun Gao, Yuanwei Yang, Shaohua Li, Guangbin Zhang and Ping Wang
Remote Sens. 2021, 13(14), 2794; https://doi.org/10.3390/rs13142794 - 16 Jul 2021
Cited by 39 | Viewed by 5215
Abstract
Deep learning approaches have been widely used in automatic building extraction tasks and have made great progress in recent years. However, missed and false detections caused by spectral confusion remain a major challenge. Existing fully convolutional networks (FCNs) cannot effectively distinguish whether feature differences arise within one building or between a building and its adjacent non-building objects. To overcome these limitations, a building multi-feature fusion refined network (BMFR-Net) is presented in this paper to extract buildings accurately and completely. BMFR-Net is based on an encoding and decoding structure and mainly consists of two parts: the continuous atrous convolution pyramid (CACP) module and the multiscale output fusion constraint (MOFC) structure. The CACP module is positioned at the end of the contracting path and minimizes the loss of effective information in multiscale feature extraction and fusion by using parallel continuous small-scale atrous convolution. To improve the ability to aggregate semantic information from the context, the MOFC structure performs predictive output at each stage of the expanding path and integrates the results into the network. Furthermore, the multilevel joint weighted loss function effectively updates parameters far from the output layer, enhancing the learning capacity of the network for low-level abstract features. The experimental results demonstrate that the proposed BMFR-Net outperforms five other state-of-the-art approaches in both visual interpretation and quantitative evaluation.
(This article belongs to the Special Issue Techniques and Applications of UAV-Based Photogrammetric 3D Mapping)

12 pages, 2077 KB  
Article
FCN-Based 3D Reconstruction with Multi-Source Photometric Stereo
by Ruixin Wang, Xin Wang, Di He, Lei Wang and Ke Xu
Appl. Sci. 2020, 10(8), 2914; https://doi.org/10.3390/app10082914 - 23 Apr 2020
Cited by 3 | Viewed by 3336
Abstract
As a classical method widely used in 3D reconstruction tasks, multi-source Photometric Stereo can obtain more accurate 3D reconstruction results than basic Photometric Stereo, but its complex calibration and solution process reduces the efficiency of the algorithm. In this paper, we propose a multi-source Photometric Stereo 3D reconstruction method based on the fully convolutional network (FCN). We first represent the 3D shape of the object as a depth value for each pixel, which serves as the optimization target. After end-to-end training, our network can efficiently obtain 3D information on the object surface. In addition, we add two regularization constraints to the general loss function, which help the network to optimize. Under the same light source configuration, our method obtains higher accuracy than the classic multi-source Photometric Stereo. At the same time, our new loss function helps the deep learning method produce a more realistic 3D reconstruction result. We have also used our own real dataset to verify our method experimentally. The experimental results show that our method effectively addresses the main problems faced by the classical method.
(This article belongs to the Special Issue Augmented Reality, Virtual Reality & Semantic 3D Reconstruction)