Advances in Understanding and 3D Semantic Modeling of Large-Scale Urban Scenes from Point Clouds

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Urban Remote Sensing".

Deadline for manuscript submissions: closed (31 August 2023) | Viewed by 15579

Special Issue Editors

Department of Geomatics Engineering, College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China
Interests: image- and LiDAR-based segmentation and reconstruction; full-waveform LiDAR data processing; related remote sensing applications in the field of forest ecosystems
Special Issues, Collections and Topics in MDPI journals
Department of Remote Sensing Technology, College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
Interests: light detection and ranging data processing; quality analysis of geographic information systems; remote sensing image processing; algorithm development
Special Issues, Collections and Topics in MDPI journals
Department of Mathematics and Computing Science, Faculty of Science, Saint Mary’s University, Halifax, NS B3H 3C2, Canada
Interests: computer graphics; 3D computer vision; geometric deep learning; related applications including motion capture for VR/AR and LiDAR-based urban modeling
Special Issues, Collections and Topics in MDPI journals
Department of Informatics and Telecommunications, School of Science, National and Kapodistrian University of Athens, 157 84 Athens, Greece
Interests: remote sensing image processing; 3D urban reconstruction; spatial object recognition

Special Issue Information

Dear Colleagues,

Driven by many applications and by advances in 3D data acquisition technology, the computer vision and remote sensing communities are now focusing on deep learning-based and knowledge-based algorithms to tackle the challenges of understanding and 3D semantic modeling of large-scale urban scenes. Scene understanding and physical modeling from point clouds include enhancement and segmentation at the scene, object, and part levels; shape recognition; indoor and outdoor abstraction and reconstruction; and optional simplification to make the 3D model web- and/or mobile-compatible. Although recent deep-learning algorithms exhibit powerful performance on low-level recognition tasks such as classification and segmentation, scant attention has been given to deep learning for large-scale 3D urban modeling, owing to a lack of available training data and benchmark repositories. Other challenges include detailed modeling from imperfect (occluded, noisy) scans, free-form building modeling, lightweight modeling for web/mobile compatibility, flexible modeling that generates multiple Levels of Detail (LoDs) on the fly, and automated reconstruction from large-scale urban point clouds, to name a few. We position our Special Issue to support ongoing efforts in the 3D scanning and modeling industry and applications of LiDAR/RGB-D/photogrammetric point clouds. The topics of this Special Issue include, but are not limited to:

  • Enhancement, registration, filtering of point clouds;
  • Semantic, instance, panoptic, and part-level segmentation;
  • Large-scale outdoor scene and indoor scene reconstruction;
  • Detail synthesis and implicit modeling of urban scenes;
  • 3D modeling of buildings, bridges, roads, trees, and utilities;
  • Rendering and visualization of urban scenes;
  • Polyhedral meshes, procedural models and model simplification;
  • Innovative applications in smart cities, VR/AR, autonomous driving, indoor navigation, etc.

Dr. Dong Chen
Dr. Zhengxin Zhang
Dr. Jiju Poovvancheri
Prof. Dr. Takis Mathiopoulos
Prof. Dr. Sisi Zlatanova
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • point cloud enhancement
  • semantic segmentation
  • instance segmentation
  • panoptic segmentation
  • outdoor reconstruction
  • indoor reconstruction
  • urban utilities modeling
  • efficient data structures
  • model simplification
  • intelligent applications

Published Papers (9 papers)


Research

21 pages, 2594 KiB  
Article
Regional-to-Local Point-Voxel Transformer for Large-Scale Indoor 3D Point Cloud Semantic Segmentation
by Shuai Li and Hongjun Li
Remote Sens. 2023, 15(19), 4832; https://doi.org/10.3390/rs15194832 - 05 Oct 2023
Cited by 1 | Viewed by 909
Abstract
Semantic segmentation of large-scale indoor 3D point cloud scenes is crucial for scene understanding but faces challenges in effectively modeling long-range dependencies and multi-scale features. In this paper, we present RegionPVT, a novel Regional-to-Local Point-Voxel Transformer that synergistically integrates voxel-based regional self-attention and window-based point-voxel self-attention for concurrent coarse-grained and fine-grained feature learning. The voxel-based regional branch focuses on capturing regional context and facilitating inter-window communication. The window-based point-voxel branch concentrates on local feature learning while integrating voxel-level information within each window. This design enables the model to jointly extract local details and regional structures efficiently, providing an effective solution for multi-scale feature fusion and a comprehensive understanding of 3D point clouds. Extensive experiments on the S3DIS and ScanNet v2 datasets demonstrate that RegionPVT achieves competitive or superior performance compared with state-of-the-art approaches, attaining mIoUs of 71.0% and 73.9%, respectively, with a significantly lower memory footprint.
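The two granularities in this design can be pictured with a simple indexing step: fine voxel indices for the point-voxel branch and coarser window indices that group voxels for the regional branch. Below is a minimal sketch of that grouping, assuming axis-aligned grids; `voxel_size` and `window` are illustrative parameters, not the paper's settings.

```python
import torch

def voxel_and_window_keys(xyz, voxel_size=0.04, window=8):
    """Assign each point a fine voxel key and a coarse window key.

    xyz: (N, 3) point coordinates. Points sharing a voxel key would feed a
    point-voxel branch; voxels sharing a window key would exchange information
    in a regional self-attention branch.
    """
    origin = xyz.min(dim=0).values
    vox = torch.div(xyz - origin, voxel_size, rounding_mode="floor").long()
    win = torch.div(vox, window, rounding_mode="floor")  # window of voxels

    def flatten(ijk):
        # hash non-negative 3D integer indices into unique 1D keys
        m = int(ijk.max().item()) + 1
        return (ijk[:, 0] * m + ijk[:, 1]) * m + ijk[:, 2]

    return flatten(vox), flatten(win)

# usage: points with equal window keys form one attention group
xyz = torch.rand(10000, 3) * 20.0
vox_key, win_key = voxel_and_window_keys(xyz)
```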

18 pages, 14913 KiB  
Article
Camera and LiDAR Fusion for Urban Scene Reconstruction and Novel View Synthesis via Voxel-Based Neural Radiance Fields
by Xuanzhu Chen, Zhenbo Song, Jun Zhou, Dong Xie and Jianfeng Lu
Remote Sens. 2023, 15(18), 4628; https://doi.org/10.3390/rs15184628 - 20 Sep 2023
Cited by 1 | Viewed by 2160
Abstract
3D reconstruction of urban scenes is an important research topic in remote sensing. Neural Radiance Fields (NeRFs) offer an efficient solution for both structure recovery and novel view synthesis. The realistic 3D urban models generated by NeRFs have potential future applications in simulation for autonomous driving, as well as in Augmented and Virtual Reality (AR/VR) experiences. However, previous NeRF methods struggle with large-scale urban environments: due to the limited capacity of NeRF models, directly applying them may result in noticeable artifacts in synthesized images and inferior visual fidelity. To address this challenge, we propose a sparse voxel-based NeRF. First, our approach leverages LiDAR odometry to refine frame-by-frame LiDAR point cloud alignment and derives accurate initial camera poses through joint LiDAR-camera calibration. Second, we partition the space into sparse voxels, perform voxel interpolation based on the 3D LiDAR point clouds, and construct a voxel octree structure that lets subsequent ray sampling in the NeRF skip empty voxels, increasing the rendering speed. Finally, the depth information provided by the 3D point cloud on each viewpoint image supervises our NeRF model, which is further optimized using a depth consistency loss function and a plane constraint loss function. In real-world urban scenes, our method significantly reduces the training time to around an hour and enhances reconstruction quality with a PSNR improvement of 1–2 dB, outperforming other state-of-the-art NeRF models.
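In its simplest form, the LiDAR depth supervision described above reduces to a masked regression between rendered ray depths and projected LiDAR depths. The sketch below illustrates that idea; it is not the authors' exact loss, and the plane constraint term and any weighting are omitted.

```python
import torch

def render_depth(weights, z_vals):
    # expected ray termination depth from per-sample NeRF weights, (B, n_samples)
    return (weights * z_vals).sum(dim=-1)

def depth_consistency_loss(weights, z_vals, lidar_depth, valid_mask):
    # L1 penalty between rendered ray depth and projected LiDAR depth,
    # evaluated only on rays that actually received a LiDAR return
    pred = render_depth(weights, z_vals)
    return (pred - lidar_depth).abs()[valid_mask].mean()
```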

20 pages, 13848 KiB  
Article
Learning Contours for Point Cloud Completion
by Jiabo Xu, Zeyun Wan and Jingbo Wei
Remote Sens. 2023, 15(17), 4338; https://doi.org/10.3390/rs15174338 - 03 Sep 2023
Cited by 1 | Viewed by 1096
Abstract
The integrity of a point cloud frequently suffers from discontinuous material surfaces or coarse sensor resolutions. Existing methods focus on reconstructing the overall structure, but salient points and small irregular surfaces are difficult to predict. To address this issue, we propose a new end-to-end neural network for point cloud completion. To avoid non-uniform point density, regular voxel centers are selected as reference points. The encoder and decoder are designed with Patchify, transformers, and multilayer perceptrons. An implicit classifier is incorporated in the decoder to mark the valid voxels that are allowed to diffuse after vacant grids have been removed from completion. With a newly designed loss function, the classifier is trained to learn the contours, which helps to identify the grids that are difficult to judge for diffusion. The effectiveness of the proposed model is validated in experiments on the indoor ShapeNet dataset, the outdoor KITTI dataset, and an airborne laser dataset, in comparison with state-of-the-art methods; the results show that our method predicts more accurate point coordinates with rich details and uniform point distributions.
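Choosing occupied voxel centers as reference points, as the abstract describes, enforces even spacing regardless of the raw scan density. A minimal sketch of that selection step, assuming a fixed axis-aligned grid (`voxel_size` is an illustrative parameter):

```python
import numpy as np

def occupied_voxel_centers(points, voxel_size=0.05):
    """Snap points to a regular grid and return the centers of occupied voxels,
    yielding uniformly spaced reference points regardless of scan density."""
    origin = points.min(axis=0)
    ijk = np.floor((points - origin) / voxel_size).astype(np.int64)
    occupied = np.unique(ijk, axis=0)          # one entry per non-empty voxel
    return origin + (occupied + 0.5) * voxel_size
```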

22 pages, 5773 KiB  
Article
Saint Petersburg 3D: Creating a Large-Scale Hybrid Mobile LiDAR Point Cloud Dataset for Geospatial Applications
by Sergey Lytkin, Vladimir Badenko, Alexander Fedotov, Konstantin Vinogradov, Anton Chervak, Yevgeny Milanov and Dmitry Zotov
Remote Sens. 2023, 15(11), 2735; https://doi.org/10.3390/rs15112735 - 24 May 2023
Viewed by 1149
Abstract
Many publicly available point cloud datasets exist at present, but they are mainly focused on autonomous driving. The objective of this study is to develop a new large-scale mobile 3D LiDAR point cloud dataset for outdoor scene semantic segmentation tasks with a classification scheme suitable for geospatial applications. Our dataset (Saint Petersburg 3D) contains both real-world (34 million points) and synthetic (34 million points) subsets that were acquired using real and virtual sensors with the same characteristics. An original classification scheme is proposed that contains a set of 10 universal object categories into which any scene represented by dense outdoor mobile LiDAR point clouds can be divided. The evaluation procedure for semantic segmentation of point clouds for geospatial applications is described. An experiment with the Kernel Point Fully Convolutional Neural Network model trained on the proposed dataset was carried out. We obtained an overall mIoU of 92.56%, which demonstrates the high efficiency of deep learning models for point cloud semantic segmentation in geospatial applications under the proposed classification scheme.
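The mIoU reported above is the standard evaluation for this task: build a confusion matrix over the 10 categories, compute the per-class IoU, and average. A minimal sketch of that computation:

```python
import numpy as np

def mean_iou(pred, gt, num_classes=10):
    """Confusion-matrix mIoU for semantic segmentation; pred and gt are flat
    integer label arrays over all evaluated points."""
    cm = np.bincount(num_classes * gt + pred, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes).astype(np.float64)
    inter = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    iou = inter / np.maximum(union, 1.0)   # guard against empty classes
    return iou.mean(), iou
```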

18 pages, 2021 KiB  
Article
R-PCR: Recurrent Point Cloud Registration Using High-Order Markov Decision
by Xiaoya Cheng, Shen Yan, Yan Liu, Maojun Zhang and Chen Chen
Remote Sens. 2023, 15(7), 1889; https://doi.org/10.3390/rs15071889 - 31 Mar 2023
Cited by 1 | Viewed by 1105
Abstract
Although point cloud registration under noisy conditions has recently begun to be tackled by several non-correspondence algorithms, these methods either struggle to fuse global features or abandon early state estimates during iterative alignment. To solve this problem, we propose a novel method named R-PCR (recurrent point cloud registration). R-PCR employs a lightweight cross-concatenation module and a large receptive network to improve global feature performance. More importantly, it treats the point registration procedure as a high-order Markov decision process and introduces a recurrent neural network for end-to-end optimization. Experiments on indoor and outdoor benchmarks show that R-PCR outperforms state-of-the-art counterparts. The mean average errors of rotation and translation of the aligned point cloud pairs are reduced by 75% and 66%, respectively, on the indoor benchmark (ScanObjectNN), and by 50% and 37.5% on the outdoor benchmark (AirLoc).
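The "high-order Markov" idea is that a recurrent hidden state carries information from all earlier alignment steps rather than only the latest pose estimate. The sketch below is our illustrative reading of such a recurrent refinement loop, not the R-PCR architecture: the encoder, dimensions, and pose parameterization are invented for the example.

```python
import torch
import torch.nn as nn

def hat(w):
    # skew-symmetric matrix of an axis-angle vector w: (B, 3) -> (B, 3, 3)
    zero = torch.zeros_like(w[..., 0])
    return torch.stack([
        torch.stack([zero, -w[..., 2], w[..., 1]], -1),
        torch.stack([w[..., 2], zero, -w[..., 0]], -1),
        torch.stack([-w[..., 1], w[..., 0], zero], -1),
    ], -2)

class RecurrentRegistration(nn.Module):
    def __init__(self, feat_dim=256, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                     nn.Linear(64, feat_dim))
        self.gru = nn.GRUCell(2 * feat_dim, hidden)
        self.head = nn.Linear(hidden, 6)   # axis-angle rotation + translation

    def global_feat(self, pts):
        # simple max-pooled global descriptor over the point set
        return self.encoder(pts).max(dim=1).values

    def forward(self, src, tgt, iters=8):
        B = src.shape[0]
        R = torch.eye(3, device=src.device).expand(B, 3, 3).contiguous()
        t = torch.zeros(B, 3, device=src.device)
        h = torch.zeros(B, self.gru.hidden_size, device=src.device)
        cur = src
        for _ in range(iters):
            f = torch.cat([self.global_feat(cur), self.global_feat(tgt)], -1)
            h = self.gru(f, h)             # hidden state remembers past steps
            dwt = self.head(h)
            dR = torch.matrix_exp(hat(dwt[:, :3]))   # SO(3) exponential map
            dt = dwt[:, 3:]
            cur = cur @ dR.transpose(1, 2) + dt[:, None, :]
            R, t = dR @ R, (dR @ t[..., None]).squeeze(-1) + dt
        return R, t
```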

22 pages, 7923 KiB  
Article
Reconstruction of LoD-2 Building Models Guided by Façade Structures from Oblique Photogrammetric Point Cloud
by Feng Wang, Guoqing Zhou, Han Hu, Yuefeng Wang, Bolin Fu, Shiming Li and Jiali Xie
Remote Sens. 2023, 15(2), 400; https://doi.org/10.3390/rs15020400 - 09 Jan 2023
Cited by 7 | Viewed by 3943
Abstract
Owing to its façade visibility, intuitive expression, and multi-view redundancy, oblique photogrammetry can provide a practical data source for large-scale urban LoD-2 reconstruction. However, the noise inherent in oblique photogrammetric point clouds, resulting from dense image matching, limits further model reconstruction applications. This paper therefore proposes a novel method for the efficient reconstruction of LoD-2 building models guided by façade structures from an oblique photogrammetric point cloud. First, a building planar layout is constructed by combining footprint data with the vertical planes of the building under spatial consistency constraints. The cells in the planar layout represent roof structures with distinct altitude differences. Then, we introduce regularity constraints and a binary integer programming model to abstract the façade with the best-fitting monotonic regularized profiles. Combining the planar layout and the regularized profiles, a 2D building topology is constructed. Finally, the vertices of the building roof facets are derived from the 2D building topology, generating a LoD-2 building model. Experimental results on real datasets indicate that the proposed method generates reliable reconstructions compared with two state-of-the-art methods.
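The binary integer program in the abstract can be pictured as selecting one candidate height per profile cell while enforcing monotonicity along the profile. Below is a hedged sketch of such a formulation using SciPy's MILP solver; the cost terms and constraints are illustrative, not the paper's exact model.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def fit_monotonic_profile(errors, heights):
    """Pick one candidate height per profile cell so that the selected heights
    are monotonically non-increasing and the total fitting error is minimal.

    errors:  (n_cells, n_candidates) fitting error of each candidate height
    heights: (n_candidates,) candidate height values
    """
    n, m = errors.shape
    c = errors.ravel()                       # objective: total fitting error
    # selection constraint: each cell chooses exactly one candidate
    A_sel = np.zeros((n, n * m))
    for i in range(n):
        A_sel[i, i * m:(i + 1) * m] = 1.0
    cons = [LinearConstraint(A_sel, 1.0, 1.0)]
    # monotonicity: height(cell i) >= height(cell i + 1)
    A_mon = np.zeros((n - 1, n * m))
    for i in range(n - 1):
        A_mon[i, i * m:(i + 1) * m] = heights
        A_mon[i, (i + 1) * m:(i + 2) * m] = -heights
    cons.append(LinearConstraint(A_mon, 0.0, np.inf))
    res = milp(c, integrality=np.ones(n * m), bounds=Bounds(0, 1),
               constraints=cons)
    return res.x.reshape(n, m).argmax(axis=1)   # chosen candidate per cell
```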

21 pages, 7697 KiB  
Article
Using Relative Projection Density for Classification of Terrestrial Laser Scanning Data with Unknown Angular Resolution
by Maolin Chen, Xinyi Zhang, Cuicui Ji, Jianping Pan and Fengyun Mu
Remote Sens. 2022, 14(23), 6043; https://doi.org/10.3390/rs14236043 - 29 Nov 2022
Viewed by 1019
Abstract
Point cloud classification is a key step in three-dimensional (3D) scene analysis with terrestrial laser scanning but is commonly affected by density variation. Many density-adaptive methods are used to weaken the impact of density variation; the angular resolution, which denotes the angle between two horizontally or vertically adjacent laser beams, is commonly used as a known parameter in those methods. However, the angular resolution is not always known, which limits the generality of such methods. Focusing on these problems, we propose a density-adaptive feature extraction method that handles the case of unknown angular resolution. First, we present a method for angular resolution estimation called neighborhood analysis of randomly picked points (NARP). In NARP, n points are randomly picked from the original data, and the k nearest points of each picked point are searched to form its neighborhood. The angles between the beams of each picked point and its corresponding neighboring points are used to construct a histogram, and the angular resolution is calculated by finding the adjacent beams of each picked point under this histogram. Then, a grid feature called relative projection density is proposed to weaken the effect of density variation based on the estimated angular resolution. Finally, a 12-dimensional feature vector is constructed by combining the relative projection density with other commonly used geometric features, and semantic labels are generated using a Random Forest classifier. Five datasets with known angular resolution are used to validate the NARP method, and an urban scene with a scanning distance of up to 1 km is used to compare the relative projection density with traditional projection density. The results demonstrate that our method achieves an estimation error of less than 0.001° in most cases and is stable with respect to different types of targets and parameter settings. Compared with traditional projection density, the proposed relative projection density improves classification performance, particularly for small objects such as cars, poles, and scanning artifacts.
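The NARP estimate can be prototyped directly from that description: sample points, measure the angles their beams make with neighboring beams, and read the dominant small angle off a histogram. A simplified sketch follows, assuming the scanner sits at the origin; the paper's adjacent-beam search is more careful than simply taking the histogram peak.

```python
import numpy as np

def estimate_angular_resolution(points, n=1000, k=8, bins=2000, max_angle=1.0):
    """Simplified NARP-style estimate (degrees). points: (N, 3) in scanner
    coordinates, scanner assumed at the origin."""
    rng = np.random.default_rng(0)
    picked = points[rng.choice(len(points), size=min(n, len(points)),
                               replace=False)]
    dirs = points / np.linalg.norm(points, axis=1, keepdims=True)
    pdirs = picked / np.linalg.norm(picked, axis=1, keepdims=True)
    # brute-force k nearest neighbors of each picked point (use a KD-tree
    # for large clouds); index 0 is the point itself, so it is dropped
    d2 = ((picked[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    knn = np.argsort(d2, axis=1)[:, 1:k + 1]
    # angles (degrees) between each picked beam and its neighboring beams
    cosang = np.clip((pdirs[:, None, :] * dirs[knn]).sum(-1), -1.0, 1.0)
    ang = np.degrees(np.arccos(cosang)).ravel()
    ang = ang[ang < max_angle]               # keep only near-adjacent beams
    hist, edges = np.histogram(ang, bins=bins)
    peak = hist.argmax()                     # dominant small angle
    return 0.5 * (edges[peak] + edges[peak + 1])
```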

21 pages, 8295 KiB  
Article
A Method for the Automatic Extraction of Support Devices in an Overhead Catenary System Based on MLS Point Clouds
by Shengyuan Zhang, Qingxiang Meng, Yulong Hu, Zhongliang Fu and Lijin Chen
Remote Sens. 2022, 14(23), 5915; https://doi.org/10.3390/rs14235915 - 22 Nov 2022
Cited by 1 | Viewed by 1158
Abstract
A mobile laser scanning (MLS) system can acquire railway scene information quickly and provide a data foundation for regular railway inspections. The location of the catenary support device in an electrified railway system has a direct impact on the regular operation of the power supply system. However, multi-type support device data account for a tiny proportion of the whole railway scene, so their features are poorly expressed in the scene, and traditional point cloud filtering or segmentation methods alone can hardly achieve effective segmentation and extraction of the support devices. This paper therefore proposes an automatic extraction algorithm for complex railway support devices based on MLS point clouds. First, the algorithm separates the layers containing the pillar point clouds and the support device point clouds in the railway scene through height stratification, thereby filtering scene noise from the point cloud. Then, the center point of each pillar device is retrieved from the pillar corridor by a neighborhood search, and the support device is located and initially extracted based on the relatively stable spatial topological relationship between the pillar and the support device. Finally, a post-processing optimization method integrating a pillar filter and a voxelized projection filter is designed to achieve accurate and efficient extraction of the support device, based on the feature differences between the support device and other devices in the initial extraction results. In the experimental part, we evaluate the algorithm on six types of support devices, three types of support device distribution scenes, and two types of railway units. The experimental results show that the average extraction IoUs for the multi-type support devices, the support device distribution scenes, and the railway units were 97.20%, 94.29%, and 96.11%, respectively. In general, the proposed algorithm achieves accurate and efficient extraction of various support devices in different scenes; the influence of the algorithm parameters on extraction accuracy and efficiency is elaborated in the discussion section.
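A voxelized projection filter of the kind mentioned in the post-processing step can be pictured as rasterizing the points onto a plane and keeping only cells with enough returns. The sketch below is our reading of that idea, not the paper's implementation; the projection plane, cell size, and threshold are illustrative.

```python
import numpy as np

def voxelized_projection_filter(points, cell=0.05, min_count=10):
    """Project points onto the XZ plane, rasterize into cells, and keep points
    whose cell contains enough returns to belong to a solid structure."""
    xz = points[:, [0, 2]]
    ij = np.floor((xz - xz.min(axis=0)) / cell).astype(np.int64)
    keys = ij[:, 0] * (ij[:, 1].max() + 1) + ij[:, 1]   # hash 2D cell indices
    _, inv, counts = np.unique(keys, return_inverse=True, return_counts=True)
    return points[counts[inv] >= min_count]
```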

22 pages, 5943 KiB  
Article
MFNet: Multi-Level Feature Extraction and Fusion Network for Large-Scale Point Cloud Classification
by Yong Li, Qi Lin, Zhenxin Zhang, Liqiang Zhang, Dong Chen and Feng Shuang
Remote Sens. 2022, 14(22), 5707; https://doi.org/10.3390/rs14225707 - 11 Nov 2022
Cited by 6 | Viewed by 1904
Abstract
The accuracy with which a neural network interprets a point cloud depends on the quality of the features expressed by the network. Addressing this issue, we propose a multi-level feature extraction layer (MFEL) that collects local contextual features and global information by modeling point clouds at different levels. The MFEL is mainly composed of three independent modules, the aggregated GAPLayer, the spatial position perceptron, and the RBFLayer, which learn point cloud features at three different scales. The aggregated GAPLayer aggregates the geometric features of neighboring points, expressed in a local coordinate system, to the centroid by graph convolution. The spatial position perceptron then independently learns the position features of each point in the world coordinate system. Finally, the RBFLayer aggregates points into pointsets according to the correlation between features and extracts features at the pointset scale through a quantization layer. Based on the MFEL, end-to-end classification and segmentation networks, namely MFNet and MFNet-S, are proposed; in these networks, a channel-attention mechanism is employed to better aggregate multi-level features. We conduct classification and semantic segmentation experiments on four standard datasets. The results show that the proposed method outperforms the compared methods on multiple datasets, achieving 93.1% classification accuracy on ModelNet40. Furthermore, the mIoU of part semantic segmentation on ShapeNet is 85.4%, and the mIoU for semantic segmentation on S3DIS and Semantic3D is 62.9% and 71.9%, respectively.
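The GAPLayer-style branch is, at its core, an edge-style graph convolution: express each neighbor in a frame centered at the point, embed the offsets, and pool. A minimal sketch of that local aggregation, assuming brute-force kNN graphs; the paper's attention weighting and multi-head aggregation are omitted.

```python
import torch
import torch.nn as nn

class LocalGeometryAggregation(nn.Module):
    """Sketch of a GAPLayer-style branch: embed neighbor offsets expressed in
    a local frame at each point and aggregate them by max-pooling."""
    def __init__(self, k=16, out_dim=64):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(3, out_dim), nn.ReLU(),
                                 nn.Linear(out_dim, out_dim))

    def forward(self, xyz):                       # xyz: (B, N, 3)
        d = torch.cdist(xyz, xyz)                 # pairwise distances
        knn = d.topk(self.k + 1, largest=False).indices[..., 1:]  # drop self
        nbrs = torch.gather(
            xyz.unsqueeze(1).expand(-1, xyz.shape[1], -1, -1), 2,
            knn.unsqueeze(-1).expand(-1, -1, -1, 3))
        local = nbrs - xyz.unsqueeze(2)           # offsets in the local frame
        return self.mlp(local).max(dim=2).values  # (B, N, out_dim)
```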
