Search Results (16)

Search Parameters:
Keywords = building residual refine network

22 pages, 3438 KiB  
Article
Revolutionizing Detection of Minimal Residual Disease in Breast Cancer Using Patient-Derived Gene Signature
by Chen Yeh, Hung-Chih Lai, Nathan Grabbe, Xavier Willett and Shu-Ti Lin
Onco 2025, 5(3), 35; https://doi.org/10.3390/onco5030035 - 12 Jul 2025
Viewed by 294
Abstract
Background: Many patients harbor minimal residual disease (MRD)—small clusters of residual tumor cells that survive therapy and evade conventional detection but drive recurrence. Although advances in molecular and computational methods have improved circulating tumor DNA (ctDNA)-based MRD detection, these approaches face challenges: ctDNA shedding fluctuates widely across tumor types, disease stages, and histological features. Additionally, low levels of driver mutations originating from healthy tissues can create background noise, complicating the accurate identification of bona fide tumor-specific signals. These limitations underscore the need for refined technologies to further enhance MRD detection beyond DNA sequences in solid malignancies. Methods: Profiling circulating cell-free mRNA (cfmRNA), which is hyperactive in tumor and non-tumor microenvironments, could address these limitations to inform postoperative surveillance and treatment strategies. This study reported the development of OncoMRD BREAST, a customized, gene signature-informed cfmRNA assay for residual disease monitoring in breast cancer. OncoMRD BREAST introduces several advanced technologies that distinguish it from the existing ctDNA-MRD tests. It builds on the patient-derived gene signature for capturing tumor activities while introducing significant upgrades to its liquid biopsy transcriptomic profiling, digital scoring systems, and tracking capabilities. Results: The OncoMRD BREAST test processes inputs from multiple cutting-edge biomarkers—tumor and non-tumor microenvironment—to provide enhanced awareness of tumor activities in real time. By fusing data from these diverse intra- and inter-cellular networks, OncoMRD BREAST significantly improves the sensitivity and reliability of MRD detection and prognosis analysis, even under challenging and complex conditions. In a proof-of-concept real-world pilot trial, OncoMRD BREAST’s rapid quantification of potential tumor activity helped reduce the risk of incorrect treatment strategies, while advanced predictive analytics contributed to the overall benefits and improved outcomes of patients. Conclusions: By tailoring the assay to individual tumor profiles, we aimed to enhance early identification of residual disease and optimize therapeutic decision-making. OncoMRD BREAST is the world’s first and only gene signature-powered test for monitoring residual disease in solid tumors. Full article

24 pages, 8074 KiB  
Article
MMRAD-Net: A Multi-Scale Model for Precise Building Extraction from High-Resolution Remote Sensing Imagery with DSM Integration
by Yu Gao, Huiming Chai and Xiaolei Lv
Remote Sens. 2025, 17(6), 952; https://doi.org/10.3390/rs17060952 - 7 Mar 2025
Viewed by 723
Abstract
High-resolution remote sensing imagery (HRRSI) presents significant challenges for building extraction tasks due to its complex terrain structures, multi-scale features, and rich spectral and geometric information. Traditional methods often face limitations in effectively integrating multi-scale features while maintaining a balance between detailed and global semantic information. To address these challenges, this paper proposes an innovative deep learning network, Multi-Source Multi-Scale Residual Attention Network (MMRAD-Net). This model is built upon the classical encoder–decoder framework and introduces two key components: the GCN OA-SWinT Dense Module (GSTDM) and the Res DualAttention Dense Fusion Block (R-DDFB). Additionally, it incorporates Digital Surface Model (DSM) data, presenting a novel feature extraction and fusion strategy. Specifically, the model enhances building extraction accuracy and robustness through hierarchical feature modeling and a refined cross-scale fusion mechanism, while effectively preserving both detail information and global semantic relationships. Furthermore, we propose a Hybrid Loss, which combines Binary Cross-Entropy Loss (BCE Loss), Dice Loss, and an edge-sensitive term to further improve the precision of building edges and foreground reconstruction capabilities. Experiments conducted on the GF-7 and WHU datasets validate the performance of MMRAD-Net, demonstrating its superiority over traditional methods in boundary handling, detail recovery, and adaptability to complex scenes. On the GF-7 Dataset, MMRAD-Net achieved an F1-score of 91.12% and an IoU of 83.01%. On the WHU Building Dataset, the F1-score and IoU were 94.04% and 88.99%, respectively. Ablation studies and transfer learning experiments further confirm the rationality of the model design and its strong generalization ability. These results highlight that innovations in multi-source data fusion, multi-scale feature modeling, and detailed feature fusion mechanisms have enhanced the accuracy and robustness of building extraction. Full article
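The abstract names the Hybrid Loss only by its ingredients. A minimal PyTorch sketch of such a BCE + Dice + edge-sensitive combination might look like the following; the weights and the Sobel-based edge term are illustrative assumptions, not the authors' published settings:

```python
import torch
import torch.nn.functional as F

def hybrid_loss(logits, target, w_bce=1.0, w_dice=1.0, w_edge=0.5):
    """BCE + Dice + an edge-sensitive term. Weights and the Sobel-based
    edge term are assumptions, not the paper's settings."""
    prob = torch.sigmoid(logits)
    bce = F.binary_cross_entropy_with_logits(logits, target)

    # Soft Dice loss, robust to foreground/background imbalance.
    inter = (prob * target).sum()
    dice = 1.0 - (2.0 * inter + 1e-6) / (prob.sum() + target.sum() + 1e-6)

    # Edge term: L1 distance between Sobel gradients of prediction and truth.
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=prob.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    grads = lambda x: torch.cat([F.conv2d(x, kx, padding=1),
                                 F.conv2d(x, ky, padding=1)], dim=1)
    edge = F.l1_loss(grads(prob), grads(target))

    return w_bce * bce + w_dice * dice + w_edge * edge
```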

19 pages, 10750 KiB  
Article
AM-ESRGAN: Super-Resolution Reconstruction of Ancient Murals Based on Attention Mechanism and Multi-Level Residual Network
by Ci Xiao, Yajun Chen, Chaoyue Sun, Longxiang You and Rongzhen Li
Electronics 2024, 13(16), 3142; https://doi.org/10.3390/electronics13163142 - 8 Aug 2024
Cited by 4 | Viewed by 2627
Abstract
To address the issues of blurred edges and contours, insufficient extraction of low-frequency information, and unclear texture details in ancient murals, which reduce the murals' ornamental value and limit their research significance, this paper proposes a novel ancient mural super-resolution reconstruction method based on an attention mechanism and a multi-level residual network, termed AM-ESRGAN. This network builds a module for Multi-Scale Dense Feature Fusion (MDFF) to adaptively fuse features at different levels and recover more complete structural information of the image. The deep feature extraction module is improved with a new Sim-RRDB module, which expands capacity without increasing complexity. Additionally, a Simple Parameter-Free Attention Module for Convolutional Neural Networks (SimAM) is introduced to address insufficient feature extraction in the nonlinear mapping stage of image super-resolution reconstruction. A new feature refinement module (DEABlock) extracts image feature information without changing the resolution, thereby avoiding excessive loss of image information and ensuring richer generated details. The experimental results indicate that at a ×4 scale factor the proposed method improves PSNR by 3.4738 dB and SSIM by 0.2060, while reducing MSE by 123.8436 and NIQE by 0.1651. At a ×2 scale factor, PSNR improves by 4.0280 dB, SSIM increases by 3.38%, MSE decreases by 62.2746, and NIQE drops by 0.1242. Compared with mainstream models, the reconstructed images achieve the best objective evaluation metrics, and the reconstructed ancient mural images exhibit more detailed textures and clearer edges. Full article
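SimAM itself is a published, parameter-free module (Yang et al., 2021), so its core can be shown compactly; how AM-ESRGAN wires it into the Sim-RRDB block is not reproduced here:

```python
import torch

def simam(x, e_lambda=1e-4):
    """Parameter-free SimAM attention: each unit's weight comes from a
    closed-form energy function over its channel. x: (N, C, H, W)."""
    n = x.shape[2] * x.shape[3] - 1                      # pixels per channel minus one
    d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)    # squared deviation from channel mean
    v = d.sum(dim=(2, 3), keepdim=True) / n              # channel variance estimate
    e_inv = d / (4 * (v + e_lambda)) + 0.5               # inverse energy per unit
    return x * torch.sigmoid(e_inv)                      # reweight, no learned parameters
```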

22 pages, 7134 KiB  
Article
End-to-End Edge-Guided Multi-Scale Matching Network for Optical Satellite Stereo Image Pairs
by Yixin Luo, Hao Wang and Xiaolei Lv
Remote Sens. 2024, 16(5), 882; https://doi.org/10.3390/rs16050882 - 2 Mar 2024
Cited by 3 | Viewed by 2011
Abstract
Acquiring disparity maps by dense stereo matching is one of the most important methods for producing digital surface models. However, the characteristics of optical satellite imagery, including significant occlusions and long baselines, increase the challenges of dense matching. In this study, we propose an end-to-end edge-guided multi-scale matching network (EGMS-Net) tailored for optical satellite stereo image pairs. Using small convolutional filters and residual blocks, the EGMS-Net captures rich high-frequency signals during the initial feature extraction phase. Subsequently, pyramid features are derived through efficient down-sampling and consolidated into cost volumes. To regularize these cost volumes, we design a top–down multi-scale fusion network that integrates an attention mechanism. Finally, we innovate the use of trainable guided filter layers in disparity refinement to improve edge detail recovery. The network is trained and evaluated using the Urban Semantic 3D and WHU-Stereo datasets, with subsequent analysis of the disparity maps. The results show that the EGMS-Net provides superior results, achieving endpoint errors of 1.515 and 2.459 pixels, respectively. In challenging scenarios, particularly in regions with textureless surfaces and dense buildings, our network consistently delivers satisfactory matching performance. In addition, EGMS-Net reduces training time and increases network efficiency, improving overall results. Full article
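The abstract mentions consolidating pyramid features into cost volumes. A common concatenation-based construction (in the style of PSMNet; the exact EGMS-Net variant is not specified in the abstract) looks like this:

```python
import torch

def build_concat_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation cost volume over candidate disparities; a standard
    stereo-network construction, assumed here rather than the paper's exact one."""
    n, c, h, w = left_feat.shape
    volume = left_feat.new_zeros(n, 2 * c, max_disp, h, w)
    for d in range(max_disp):
        if d == 0:
            volume[:, :c, d] = left_feat
            volume[:, c:, d] = right_feat
        else:
            # Shift the right view by d pixels before pairing features.
            volume[:, :c, d, :, d:] = left_feat[:, :, :, d:]
            volume[:, c:, d, :, d:] = right_feat[:, :, :, :-d]
    return volume  # (N, 2C, D, H, W); regularized by 3D convolutions downstream
```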

21 pages, 4781 KiB  
Article
A Physics-Guided Bi-Fidelity Fourier-Featured Operator Learning Framework for Predicting Time Evolution of Drag and Lift Coefficients
by Amirhossein Mollaali, Izzet Sahin, Iqrar Raza, Christian Moya, Guillermo Paniagua and Guang Lin
Fluids 2023, 8(12), 323; https://doi.org/10.3390/fluids8120323 - 18 Dec 2023
Cited by 2 | Viewed by 2912
Abstract
In the pursuit of accurate experimental and computational data while minimizing effort, there is a constant need for high-fidelity results. However, achieving such results often requires significant computational resources. To address this challenge, this paper proposes a deep operator learning-based framework that requires a limited high-fidelity dataset for training. We introduce a novel physics-guided, bi-fidelity, Fourier-featured deep operator network (DeepONet) framework that effectively combines low- and high-fidelity datasets, leveraging the strengths of each. In our methodology, we begin by designing a physics-guided Fourier-featured DeepONet, drawing inspiration from the intrinsic physical behavior of the target solution. Subsequently, we train this network to primarily learn the low-fidelity solution, utilizing an extensive dataset. This process ensures a comprehensive grasp of the foundational solution patterns. Following this foundational learning, the low-fidelity deep operator network’s output is enhanced using a physics-guided Fourier-featured residual deep operator network. This network refines the initial low-fidelity output, achieving the high-fidelity solution by employing a small high-fidelity dataset for training. Notably, in our framework, we employ the Fourier feature network as the trunk network for the DeepONets, given its proficiency in capturing and learning the oscillatory nature of the target solution with high precision. We validate our approach using a well-known 2D benchmark cylinder problem, which aims to predict the time trajectories of lift and drag coefficients. The results highlight that the physics-guided Fourier-featured deep operator network, serving as a foundational building block of our framework, possesses superior predictive capability for the lift and drag coefficients compared to its data-driven counterparts. The bi-fidelity learning framework, built upon the physics-guided Fourier-featured deep operator, accurately forecasts the time trajectories of lift and drag coefficients. A thorough evaluation of the proposed bi-fidelity framework confirms that our approach closely matches the high-fidelity solution, with an error rate under 2%. This confirms the effectiveness and reliability of our framework, particularly given the limited high-fidelity dataset used during training. Full article
(This article belongs to the Special Issue Challenges and Directions in Fluid Structure Interaction)
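As a rough illustration of the trunk design the abstract describes, here is a Fourier-featured trunk network for a DeepONet; the random Gaussian frequencies and layer sizes are assumptions, since the paper derives its features from the physics of the target solution:

```python
import torch
import torch.nn as nn

class FourierTrunk(nn.Module):
    """Fourier-featured trunk: maps a query time t to a sin/cos basis, well
    suited to oscillatory targets such as lift/drag trajectories."""
    def __init__(self, n_freq=32, width=128, out_dim=100, sigma=1.0):
        super().__init__()
        self.register_buffer("B", sigma * torch.randn(1, n_freq))  # assumed random frequencies
        self.net = nn.Sequential(
            nn.Linear(2 * n_freq, width), nn.Tanh(),
            nn.Linear(width, out_dim),
        )

    def forward(self, t):                      # t: (N, 1) query coordinates
        z = 2 * torch.pi * t @ self.B          # (N, n_freq)
        feats = torch.cat([torch.sin(z), torch.cos(z)], dim=-1)
        return self.net(feats)                 # (N, out_dim) trunk basis

# DeepONet prediction: u(t) ~ sum_k branch_k(input_function) * trunk_k(t).
# The bi-fidelity framework stacks a second, residual DeepONet on top of the
# low-fidelity network's output to reach the high-fidelity solution.
```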

28 pages, 6569 KiB  
Article
A Novel Building Extraction Network via Multi-Scale Foreground Modeling and Gated Boundary Refinement
by Junlin Liu, Ying Xia, Jiangfan Feng and Peng Bai
Remote Sens. 2023, 15(24), 5638; https://doi.org/10.3390/rs15245638 - 5 Dec 2023
Cited by 2 | Viewed by 1841
Abstract
Deep learning-based methods for building extraction from remote sensing images have been widely applied in fields such as land management and urban planning. However, extracting buildings from remote sensing images commonly faces challenges due to specific shooting angles. First, there exists a foreground–background imbalance issue, and the model excessively learns features unrelated to buildings, resulting in performance degradation and propagative interference. Second, buildings have complex boundary information, while conventional network architectures fail to capture fine boundaries. In this paper, we designed a multi-task U-shaped network (BFL-Net) to solve these problems. This network enhances the expression of the foreground and boundary features in the prediction results through foreground learning and boundary refinement, respectively. Specifically, the Foreground Mining Module (FMM) utilizes the relationship between buildings and multi-scale scene spaces to explicitly model, extract, and learn foreground features, which can enhance foreground and related contextual features. The Dense Dilated Convolutional Residual Block (DDCResBlock) and the Dual Gate Boundary Refinement Module (DGBRM) individually process the diverted regular stream and boundary stream. The former can effectively expand the receptive field, and the latter utilizes spatial and channel gates to activate boundary features in low-level feature maps, helping the network refine boundaries. The predictions of the network for the building, foreground, and boundary are respectively supervised by ground truth. The experimental results on the WHU Building Aerial Imagery and Massachusetts Buildings Datasets show that the IoU scores of BFL-Net are 91.37% and 74.50%, respectively, surpassing state-of-the-art models. Full article
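A minimal sketch of the dual-gate idea described for the DGBRM, a spatial gate followed by a channel gate over a low-level feature map; the kernel sizes and reduction ratio are assumptions:

```python
import torch
import torch.nn as nn

class DualGate(nn.Module):
    """Spatial + channel gating to activate boundary cues in low-level
    features; a sketch, not the published DGBRM design."""
    def __init__(self, channels):
        super().__init__()
        self.spatial = nn.Sequential(nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.channel = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.spatial(x)              # spatial gate: where boundaries are
        x = x * self.channel(self.pool(x))   # channel gate: which features matter
        return x
```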

25 pages, 10450 KiB  
Article
MAFF-Net: Multi-Attention Guided Feature Fusion Network for Change Detection in Remote Sensing Images
by Jinming Ma, Gang Shi, Yanxiang Li and Ziyu Zhao
Sensors 2022, 22(3), 888; https://doi.org/10.3390/s22030888 - 24 Jan 2022
Cited by 14 | Viewed by 4258
Abstract
One of the most important tasks in remote sensing image analysis is remote sensing image Change Detection (CD), which is key to helping people obtain more accurate information about changes on the Earth’s surface. A Multi-Attention Guided Feature Fusion Network (MAFF-Net) is designed for CD tasks. The network enhances feature extraction and feature fusion by building different blocks. First, a Feature Enhancement Module (FEM) is proposed. The FEM introduces Coordinate Attention (CA). The CA block embeds position information into the channel attention to obtain accurate position information and channel relationships of the remote sensing images. An updated feature map is obtained by element-wise summation of the FEM’s input and the CA’s output. The FEM enhances the feature representation in the network. Then, an attention-based Feature Fusion Module (FFM) is designed. It replaces the previous idea of layer-by-layer fusion with cross-layer aggregation. The FFM compensates for semantic information that is lost as the number of layers increases and plays an important role in the communication of feature maps at different scales. To further refine the feature representation, a Refinement Residual Block (RRB) is proposed. The RRB changes the number of channels of the aggregated features and uses convolutional blocks to further refine the feature representation. Compared with all baseline methods, MAFF-Net improves F1-scores by 4.9%, 3.2%, and 1.7% on three publicly available benchmark datasets, the CDD, LEVIR-CD, and WHU-CD datasets, respectively. The experimental results show that MAFF-Net achieves state-of-the-art (SOTA) CD performance on these three challenging datasets. Full article
(This article belongs to the Special Issue Deep Learning Methods for Remote Sensing)
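The FEM's core operation, Coordinate Attention followed by element-wise summation with the module input, can be sketched as follows; the CA block here follows the general published formulation of Hou et al. (2021), with widths and reduction ratio assumed:

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    """Coordinate Attention: pool along H and W separately so positional
    information survives into the channel attention."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Sequential(nn.Conv2d(channels, mid, 1),
                                   nn.BatchNorm2d(mid), nn.ReLU())
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        xh = x.mean(dim=3, keepdim=True)                       # (N, C, H, 1)
        xw = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (N, C, W, 1)
        y = self.conv1(torch.cat([xh, xw], dim=2))             # shared transform
        yh, yw = torch.split(y, [h, w], dim=2)
        ah = torch.sigmoid(self.conv_h(yh))                            # height gate
        aw = torch.sigmoid(self.conv_w(yw.permute(0, 1, 3, 2)))        # width gate
        return x * ah * aw

# FEM update as the abstract describes: out = x + CoordAttention(x)
```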

23 pages, 13339 KiB  
Article
Multiscale Semantic Feature Optimization and Fusion Network for Building Extraction Using High-Resolution Aerial Images and LiDAR Data
by Qinglie Yuan, Helmi Zulhaidi Mohd Shafri, Aidi Hizami Alias and Shaiful Jahari bin Hashim
Remote Sens. 2021, 13(13), 2473; https://doi.org/10.3390/rs13132473 - 24 Jun 2021
Cited by 22 | Viewed by 3420
Abstract
Automatic building extraction has been applied in many domains. It remains a challenging problem because of complex scenes and multiscale objects. Deep learning algorithms, especially fully convolutional neural networks (FCNs), have shown more robust feature extraction ability than traditional remote sensing data processing methods. However, hierarchical features from encoders with a fixed receptive field are weak at capturing global semantic information. Local features in multiscale subregions cannot construct contextual interdependence and correlation, especially for large-scale building areas, which probably causes fragmentary extraction results due to intra-class feature variability. In addition, low-level features carry accurate, fine-grained spatial information for tiny building structures but lack refinement and selection, and the semantic gap between across-level features is not conducive to feature fusion. To address these problems, this paper proposes an FCN framework based on the residual network and provides a training pattern for multi-modal data that combines the advantages of high-resolution aerial images and LiDAR data for building extraction. Two novel modules are proposed for the optimization and integration of multiscale and across-level features. In particular, a multiscale context optimization module is designed to adaptively generate feature representations for different subregions and effectively aggregate global context. A semantic-guided spatial attention mechanism is introduced to refine shallow features and alleviate the semantic gap. Finally, hierarchical features are fused via the feature pyramid network. Compared with other state-of-the-art methods, experimental results demonstrate superior performance, with 93.19% IoU and 97.56% OA on the WHU dataset and 94.72% IoU and 97.84% OA on the Boston dataset, showing that the proposed network improves accuracy and achieves better performance for building extraction. Full article
(This article belongs to the Special Issue Big Remotely Sensed Data)
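As a simple illustration of multi-modal input handling (not the paper's full fusion strategy, which adds multiscale context optimization and semantic-guided attention), aerial RGB and a LiDAR-derived nDSM channel can be early-fused into a residual encoder:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class MultiModalEncoder(nn.Module):
    """Early fusion of RGB + nDSM into a ResNet encoder; the 4-channel stem
    is an assumption for illustration."""
    def __init__(self):
        super().__init__()
        backbone = resnet50(weights=None)
        # Widen the stem to accept 4 channels (RGB + nDSM).
        backbone.conv1 = nn.Conv2d(4, 64, 7, stride=2, padding=3, bias=False)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.stages = nn.ModuleList([backbone.layer1, backbone.layer2,
                                     backbone.layer3, backbone.layer4])

    def forward(self, rgb, ndsm):
        x = torch.cat([rgb, ndsm], dim=1)   # (N, 4, H, W)
        x = self.stem(x)
        feats = []
        for stage in self.stages:           # hierarchical features for an FPN decoder
            x = stage(x)
            feats.append(x)
        return feats
```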

22 pages, 8110 KiB  
Article
Combining Deep Semantic Segmentation Network and Graph Convolutional Neural Network for Semantic Segmentation of Remote Sensing Imagery
by Song Ouyang and Yansheng Li
Remote Sens. 2021, 13(1), 119; https://doi.org/10.3390/rs13010119 - 31 Dec 2020
Cited by 76 | Viewed by 10225
Abstract
Although the deep semantic segmentation network (DSSN) has been widely used in remote sensing (RS) image semantic segmentation, it still does not fully exploit the spatial relationship cues between objects when extracting deep visual features through convolutional filters and pooling layers. In fact, the spatial distribution of objects from different classes has a strong correlation characteristic; for example, buildings tend to be close to roads. In view of the strong appearance extraction ability of the DSSN and the powerful topological relationship modeling capability of the graph convolutional neural network (GCN), a DSSN-GCN framework, which combines the advantages of both, is proposed in this paper for RS image semantic segmentation. To lift the appearance extraction ability, this paper proposes a new DSSN called the attention residual U-shaped network (AttResUNet), which leverages residual blocks to encode feature maps and an attention module to refine the features. For the GCN, a graph is built whose nodes are superpixels and whose edge weights are calculated from the spectral and spatial information of the nodes. The AttResUNet is trained to extract the high-level features that initialize the graph nodes. The GCN then combines node features and spatial relationships between nodes to conduct classification. It is worth noting that the use of spatial relationship knowledge boosts the performance and robustness of the classification module. In addition, because the GCN is modeled at the superpixel level, the boundaries of objects are restored to a certain extent and there is less pixel-level noise in the final classification result. Extensive experiments on two publicly open datasets show that the DSSN-GCN model outperforms the competitive baseline (i.e., the DSSN model) and that DSSN-GCN with AttResUNet achieves the best performance, which demonstrates the advantage of our method. Full article
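The graph construction the abstract describes, superpixel nodes with weights from spectral and spatial information, can be sketched as follows; the SLIC segmentation, Gaussian kernels, and bandwidths are assumptions:

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_graph(image, n_segments=500, sigma_spec=0.1, sigma_spat=50.0):
    """Nodes = SLIC superpixels; weights combine spectral and spatial
    affinity via Gaussian kernels (an assumed weighting).
    image: float array (H, W, 3) in [0, 1]."""
    labels = slic(image, n_segments=n_segments, start_label=0)
    n = labels.max() + 1
    means = np.zeros((n, image.shape[2]))
    cents = np.zeros((n, 2))
    for k in range(n):
        mask = labels == k
        means[k] = image[mask].mean(axis=0)          # spectral descriptor
        cents[k] = np.argwhere(mask).mean(axis=0)    # spatial centroid
    d_spec = np.linalg.norm(means[:, None] - means[None], axis=-1)
    d_spat = np.linalg.norm(cents[:, None] - cents[None], axis=-1)
    W = np.exp(-(d_spec / sigma_spec) ** 2) * np.exp(-(d_spat / sigma_spat) ** 2)
    return labels, W  # W feeds the GCN adjacency; node features come from AttResUNet
```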

17 pages, 1490 KiB  
Article
BRRNet: A Fully Convolutional Neural Network for Automatic Building Extraction From High-Resolution Remote Sensing Images
by Zhenfeng Shao, Penghao Tang, Zhongyuan Wang, Nayyer Saleem, Sarath Yam and Chatpong Sommai
Remote Sens. 2020, 12(6), 1050; https://doi.org/10.3390/rs12061050 - 24 Mar 2020
Cited by 230 | Viewed by 13342
Abstract
Building extraction from high-resolution remote sensing images is of great significance in urban planning, population statistics, and economic forecasting. However, automatic building extraction from high-resolution remote sensing images remains challenging. On the one hand, building extraction results are partially missing and incomplete due to variation of hue and texture within a building, especially when the building is large. On the other hand, footprint extraction for buildings with complex shapes is often inaccurate. To this end, we propose a new deep learning network, termed the Building Residual Refine Network (BRRNet), for accurate and complete building extraction. BRRNet consists of two parts: a prediction module and a residual refinement module. The prediction module, based on an encoder–decoder structure, introduces atrous convolutions with different dilation rates to extract more global features by gradually increasing the receptive field during feature extraction. Once the prediction module outputs a preliminary building extraction result for the input image, the residual refinement module takes that output as its input. It learns the residual between the preliminary result and the ground truth, thus improving the accuracy of building extraction. In addition, we use Dice loss as the loss function during training, which effectively alleviates data imbalance and further improves the accuracy of building extraction. Experimental results on the Massachusetts Building Dataset show that our method outperforms five other state-of-the-art methods in terms of the integrity of buildings and the accuracy of complex building footprints. Full article
(This article belongs to the Section Remote Sensing Image Processing)
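The residual refinement module's role, learning the residual between the preliminary map and the ground truth and adding it back, can be sketched as follows; the channel widths, depth, and dilation rates are assumptions:

```python
import torch.nn as nn

class ResidualRefinement(nn.Module):
    """Second-stage refinement in the spirit of BRRNet: predict a residual
    correction to the preliminary map and add it back. A sketch only."""
    def __init__(self, ch=64):
        super().__init__()
        # Dilated convolutions widen the receptive field cheaply, echoing the
        # prediction module's multi-dilation design.
        self.body = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=2, dilation=2), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=4, dilation=4), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, prelim_logits):
        # refined logits = preliminary logits + learned residual correction;
        # sigmoid and the Dice loss are applied downstream.
        return prelim_logits + self.body(prelim_logits)
```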

20 pages, 9349 KiB  
Article
Post-Disaster Building Database Updating Using Automated Deep Learning: An Integration of Pre-Disaster OpenStreetMap and Multi-Temporal Satellite Data
by Saman Ghaffarian, Norman Kerle, Edoardo Pasolli and Jamal Jokar Arsanjani
Remote Sens. 2019, 11(20), 2427; https://doi.org/10.3390/rs11202427 - 19 Oct 2019
Cited by 63 | Viewed by 7739
Abstract
First responders and recovery planners need accurate and quickly derived information about the status of buildings, as well as newly built ones, to both help victims and make decisions for reconstruction after a disaster. Deep learning and, in particular, convolutional neural network (CNN)-based approaches have recently become state-of-the-art methods for extracting information from remote sensing images, in particular for image-based structural damage assessment. However, they are predominantly based on manually extracted training samples. In the present study, we use pre-disaster OpenStreetMap building data to automatically generate training samples for the proposed deep learning approach after co-registration of the map and the satellite images. The proposed deep learning framework is based on the U-net design with residual connections, which has been shown to be an effective way to increase the efficiency of CNN-based models. The ResUnet is followed by a Conditional Random Field (CRF) implementation to further refine the results. Experimental analysis was carried out on selected very high resolution (VHR) satellite images representing various scenarios after the 2013 Super Typhoon Haiyan, in both the damage and the recovery phases, in Tacloban, the Philippines. The results show the robustness of the proposed ResUnet-CRF framework in updating the building map after a disaster for both damage and recovery situations, producing an overall F1-score of 84.2%. Full article
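The building block behind the ResUnet design, a convolutional block with an identity shortcut, can be sketched as follows (the CRF post-processing stage is omitted; widths and ordering are assumptions):

```python
import torch.nn as nn

class ResBlock(nn.Module):
    """Pre-activation residual block of the kind used in ResUnet encoders
    and decoders; a sketch, not the paper's exact configuration."""
    def __init__(self, cin, cout, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(cin), nn.ReLU(inplace=True),
            nn.Conv2d(cin, cout, 3, stride=stride, padding=1),
            nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
            nn.Conv2d(cout, cout, 3, padding=1),
        )
        # 1x1 projection when shape changes, identity otherwise.
        self.skip = (nn.Conv2d(cin, cout, 1, stride=stride)
                     if (cin != cout or stride > 1) else nn.Identity())

    def forward(self, x):
        return self.body(x) + self.skip(x)   # residual shortcut eases optimization
```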

20 pages, 3796 KiB  
Article
Unsupervised Monocular Depth Estimation Based on Residual Neural Network of Coarse–Refined Feature Extractions for Drone
by Tao Huang, Shuanfeng Zhao, Longlong Geng and Qian Xu
Electronics 2019, 8(10), 1179; https://doi.org/10.3390/electronics8101179 - 17 Oct 2019
Cited by 8 | Viewed by 4955
Abstract
To take full advantage of the information in images captured by drones, and given that most existing monocular depth estimation methods based on supervised learning require vast quantities of corresponding ground-truth depth data for training, we propose an unsupervised monocular depth estimation model for drones based on a residual neural network with coarse–refined feature extraction. By introducing a virtual camera through a deep residual convolutional neural network with coarse–refined feature extraction, inspired by the principle of binocular depth estimation, unsupervised monocular depth estimation becomes an image reconstruction problem. To improve the performance of our model for monocular depth estimation, the following innovations are proposed. First, pyramid processing of the input image is proposed to build a topological relationship between the resolution of the input image and the depth of the input image, which improves the sensitivity to depth information in a single image and reduces the impact of input image resolution on depth estimation. Second, the residual neural network of coarse–refined feature extraction for corresponding image reconstruction is designed to improve the accuracy of feature extraction and to resolve the tension between computation time and the number of network layers. In addition, to predict highly detailed output depth maps, long skip connections are designed between corresponding layers in the coarse-feature-extraction network and the deconvolutional refined-feature-extraction network. Third, the corresponding image reconstruction loss based on the structural similarity index (SSIM), an approximate disparity smoothness loss, and a depth map loss are united as a novel training loss to better train our model. Experimental results show that our model, trained on KITTI, outperforms state-of-the-art monocular depth estimation methods on the KITTI dataset (composed of corresponding left and right views) and the Make3D dataset (composed of images and corresponding ground-truth depth maps), and basically meets the requirements for depth information in images captured by drones. Full article
(This article belongs to the Special Issue Bioinspired Computer Vision)
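The SSIM-based reconstruction loss the abstract describes is commonly implemented as an SSIM/L1 mixture in unsupervised depth estimation; a sketch, with the usual alpha = 0.85 assumed rather than taken from the paper:

```python
import torch
import torch.nn.functional as F

def photometric_loss(recon, target, alpha=0.85):
    """Appearance-matching loss mixing a windowed SSIM with L1; a common
    unsupervised-depth formulation, assumed here as an illustration."""
    mu_x = F.avg_pool2d(recon, 3, 1, 1)
    mu_y = F.avg_pool2d(target, 3, 1, 1)
    var_x = F.avg_pool2d(recon ** 2, 3, 1, 1) - mu_x ** 2
    var_y = F.avg_pool2d(target ** 2, 3, 1, 1) - mu_y ** 2
    cov = F.avg_pool2d(recon * target, 3, 1, 1) - mu_x * mu_y
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    ssim_loss = torch.clamp((1 - ssim) / 2, 0, 1).mean()
    return alpha * ssim_loss + (1 - alpha) * F.l1_loss(recon, target)
```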

22 pages, 7790 KiB  
Article
Multi-Task cGAN for Simultaneous Spaceborne DSM Refinement and Roof-Type Classification
by Ksenia Bittner, Marco Körner, Friedrich Fraundorfer and Peter Reinartz
Remote Sens. 2019, 11(11), 1262; https://doi.org/10.3390/rs11111262 - 28 May 2019
Cited by 21 | Viewed by 6314
Abstract
Various deep learning applications benefit from multi-task learning with multiple regression and classification objectives by taking advantage of the similarities between individual tasks. This can result in improved learning efficiency and prediction accuracy for the task-specific models compared to separately trained models. In this paper, we observe such influences for important remote sensing applications, namely elevation model generation and semantic segmentation, from stereo half-meter-resolution satellite digital surface models (DSMs). Mainly, we aim to generate good-quality DSMs with complete, accurate level of detail (LoD)2-like building forms and to assign an object class label to each pixel in the DSMs. For the label assignment task, we select the roof type classification problem to distinguish between flat, non-flat, and background pixels. To realize these tasks, we train a conditional generative adversarial network (cGAN) with an objective function based on least-squares residuals and an auxiliary term based on normal vectors for further roof surface refinement. We also investigate recently published deep learning architectures for both tasks and develop the final end-to-end network, which combines different models, since, when used separately, they provide the best results for their individual tasks. Full article
(This article belongs to the Special Issue 3D Reconstruction Based on Aerial and Satellite Imagery)
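A sketch of a least-squares cGAN generator objective with an auxiliary normal-vector term, as the abstract outlines; the weights `lam`/`mu` and the cosine form of the normal term are assumptions:

```python
import torch
import torch.nn.functional as F

def lsgan_generator_loss(d_fake, dsm_pred, dsm_gt,
                         normals_pred, normals_gt, lam=10.0, mu=1.0):
    """Least-squares adversarial term + DSM residual + normal-vector term.
    Weights and the cosine normal term are assumed, not the paper's values."""
    adv = ((d_fake - 1) ** 2).mean()             # LSGAN: push D(fake) toward 1
    recon = F.l1_loss(dsm_pred, dsm_gt)          # least-squares/L1 residual on heights
    # Penalize misaligned surface normals to refine roof planes; normals: (N, 3, H, W).
    normal = (1 - F.cosine_similarity(normals_pred, normals_gt, dim=1)).mean()
    return adv + lam * recon + mu * normal
```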

19 pages, 4666 KiB  
Article
Building Footprint Extraction from High-Resolution Images via Spatial Residual Inception Convolutional Neural Network
by Penghua Liu, Xiaoping Liu, Mengxi Liu, Qian Shi, Jinxing Yang, Xiaocong Xu and Yuanying Zhang
Remote Sens. 2019, 11(7), 830; https://doi.org/10.3390/rs11070830 - 7 Apr 2019
Cited by 191 | Viewed by 11409
Abstract
The rapid development in deep learning and computer vision has introduced new opportunities and paradigms for building extraction from remote sensing images. In this paper, we propose a novel fully convolutional network (FCN), in which a spatial residual inception (SRI) module is proposed to capture and aggregate multi-scale contexts for semantic understanding by successively fusing multi-level features. The proposed SRI-Net is capable of accurately detecting large buildings that might be easily omitted while retaining global morphological characteristics and local details. On the other hand, to improve computational efficiency, depthwise separable convolutions and convolution factorization are introduced to significantly decrease the number of model parameters. The proposed model is evaluated on the Inria Aerial Image Labeling Dataset and the Wuhan University (WHU) Aerial Building Dataset. The experimental results show that the proposed methods exhibit significant improvements compared with several state-of-the-art FCNs, including SegNet, U-Net, RefineNet, and DeepLab v3+. The proposed model shows promising potential for building detection from remote sensing images on a large scale. Full article
(This article belongs to the Special Issue Advanced Topics in Remote Sensing)
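Depthwise separable convolution, the parameter-saving device the abstract cites, factors a standard convolution into a per-channel spatial filter plus a 1×1 pointwise mix, cutting parameters roughly by the square of the kernel size:

```python
import torch.nn as nn

def depthwise_separable(cin, cout, stride=1):
    """Depthwise 3x3 (groups=cin) followed by pointwise 1x1; the norm/activation
    placement is a common choice, assumed rather than taken from SRI-Net."""
    return nn.Sequential(
        nn.Conv2d(cin, cin, 3, stride=stride, padding=1, groups=cin, bias=False),
        nn.BatchNorm2d(cin), nn.ReLU(inplace=True),
        nn.Conv2d(cin, cout, 1, bias=False),
        nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
    )
```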

16 pages, 15828 KiB  
Article
Road Extraction from High-Resolution Remote Sensing Imagery Using Refined Deep Residual Convolutional Neural Network
by Lin Gao, Weidong Song, Jiguang Dai and Yang Chen
Remote Sens. 2019, 11(5), 552; https://doi.org/10.3390/rs11050552 - 6 Mar 2019
Cited by 114 | Viewed by 9999
Abstract
Road extraction is one of the most significant tasks for modern transportation systems. The task is normally difficult due to complex backgrounds, such as rural roads with heterogeneous appearances and large intraclass but low interclass variations, and urban roads covered by vehicles, pedestrians, and the shadows of surrounding trees or buildings. In this paper, we propose a novel method for extracting roads from optical satellite images using a refined deep residual convolutional neural network (RDRCNN) with a postprocessing stage. RDRCNN consists of a residual connected unit (RCU) and a dilated perception unit (DPU). The RDRCNN structure is symmetric, generating outputs of the same size as the inputs. Mathematical morphology and a tensor voting algorithm are used to improve RDRCNN performance during postprocessing. Experiments conducted on two datasets of high-resolution images demonstrate the performance of the proposed network architectures, and the results are compared with those of other network architectures. The results demonstrate the effective performance of the proposed method for extracting roads from complex scenes. Full article
(This article belongs to the Special Issue Convolutional Neural Networks Applications in Remote Sensing)
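A sketch in the spirit of the dilated perception unit: parallel atrous branches with a residual shortcut, which widens the receptive field along elongated roads. The dilation rates, widths, and merge are assumptions, not the published design:

```python
import torch
import torch.nn as nn

class DilatedPerceptionUnit(nn.Module):
    """Parallel dilated branches + residual merge; illustrative only."""
    def __init__(self, ch):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(ch, ch, 3, padding=r, dilation=r) for r in (1, 2, 4, 8)]
        )
        self.merge = nn.Conv2d(4 * ch, ch, 1)

    def forward(self, x):
        y = torch.cat([b(x) for b in self.branches], dim=1)
        return x + self.merge(y)   # residual shortcut, echoing the RCU
```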
