
Geospatial Foundation Model in Urban Environments: Challenges and New Technologies

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Urban Remote Sensing".

Deadline for manuscript submissions: closed (31 October 2023) | Viewed by 12,905

Special Issue Editors

School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
Interests: GIS; remote sensing; machine learning; sparse representation; brain theory
Guangdong Key Laboratory of Urban Informatics, School of Architecture & Urban Planning, Shenzhen University, Shenzhen 518060, China
Interests: GIS; urban informatics; smart mobility; urban studies
Department of Geography, Environment and Society, University of Minnesota, Twin Cities, MN 55455, USA
Interests: GIScience; social sensing; GeoAI; intelligent spatial analytics; urban complexity
Department of Land-Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Interests: space-time GIS; human mobility mining and modelling; urban data analytics and visualization; transport geography; computational social science

Special Issue Information

Dear Colleagues,

As Guest Editors for Remote Sensing, we are very happy to announce this Special Issue on “Geospatial Foundation Model in Urban Environments: Challenges and New Technologies”. We warmly invite you to contribute state-of-the-art research papers on urban environments, urban dynamics, urban public health, urban human mobility, urban spatial networks, urban socioeconomic sustainability, and urban crime, using modern machine learning theory and technology.

Innovative contributions employing CNNs, Transformer models, GNNs, and foundation models to support the Sustainable Development Goals (SDGs) in urban contexts are welcome, as are papers on the combined use of remote sensing and human trajectory data to discover hidden human behavior patterns. We also look forward to research results on innovative applications and progress at the intersection of urban studies and carbon neutrality employing remote sensing and artificial intelligence.

Review contributions are welcome, as are papers describing new measurement concepts and sensors.

Dr. Haifeng Li
Dr. Wei Tu
Dr. Di Zhu
Dr. Yang Xu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • urban environments
  • urban human mobility
  • urban socioeconomic sustainability
  • urban crime
  • SDGs
  • carbon neutrality
  • deep learning
  • foundation models
  • CNN
  • transformer model
  • GNN

Published Papers (9 papers)


Research

Jump to: Review

18 pages, 7220 KiB  
Article
Co-ECL: Covariant Network with Equivariant Contrastive Learning for Oriented Object Detection in Remote Sensing Images
by Yunsheng Zhang, Zijing Ren, Zichen Ding, Hong Qian, Haiqiang Li and Chao Tao
Remote Sens. 2024, 16(3), 516; https://doi.org/10.3390/rs16030516 - 29 Jan 2024
Viewed by 698
Abstract
Contrastive learning allows us to learn general features for downstream tasks without labeled data by leveraging intrinsic signals within remote sensing images. Existing contrastive learning methods encourage invariant feature learning by pulling closer positive samples defined by random transformations in feature space, where transformed samples of the same image at different intensities are considered equivalent. However, remote sensing images differ from natural images: their top-down perspective results in arbitrarily oriented objects, and the images contain rich in-plane rotation information. Maintaining invariance to rotation transformations can cause rotation information to be lost from the features, thereby affecting angle predictions for differently rotated samples in downstream tasks. Therefore, we believe that contrastive learning should not focus only on strict invariance but should encourage features to be equivariant to rotation while remaining invariant to other transformations. To achieve this goal, we propose an invariant–equivariant covariant network (Co-ECL) based on collaborative and reverse mechanisms. The collaborative mechanism encourages rotation equivariance by predicting the rotation transformations of input images and combines invariant and equivariant learning tasks to jointly supervise the feature learning process. The reverse mechanism introduces a reverse rotation module in the feature learning stage, applying reverse rotation transformations of equal intensity to the features in the invariant learning task as in the data transformation stage, thereby ensuring the two tasks are realized independently. In experiments on three publicly available oriented object detection datasets of remote sensing images, our method consistently achieved the best performance. Experiments on multi-angle datasets further demonstrated that our method is robust on rotation-related tasks. Full article
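The reverse-rotation mechanism rests on the encoder's rotation equivariance. As a hedged illustration (the pooling "encoder" below is a hypothetical stand-in for the real network, not the paper's model), encoding a rotated image and then reverse-rotating the feature map recovers the features of the original image:

```python
import numpy as np

def encode(img):
    # toy "encoder": 2x2 average pooling, which is equivariant to 90-degree rotations
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

img = np.arange(16.0).reshape(4, 4)
for k in range(4):
    feat = encode(np.rot90(img, k))   # encode a rotated view
    recovered = np.rot90(feat, -k)    # reverse-rotation module: undo the rotation in feature space
    assert np.allclose(recovered, encode(img))
```

In Co-ECL the analogous step lets the invariant loss see rotation-free features while a separate head predicts the rotation, so the two objectives do not conflict.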

18 pages, 3181 KiB  
Article
HAVANA: Hard Negative Sample-Aware Self-Supervised Contrastive Learning for Airborne Laser Scanning Point Cloud Semantic Segmentation
by Yunsheng Zhang, Jianguo Yao, Ruixiang Zhang, Xuying Wang, Siyang Chen and Han Fu
Remote Sens. 2024, 16(3), 485; https://doi.org/10.3390/rs16030485 - 26 Jan 2024
Viewed by 810
Abstract
Deep Neural Network (DNN)-based point cloud semantic segmentation has achieved significant breakthroughs using large-scale labeled aerial laser point cloud datasets. However, annotating such large-scale point clouds is time-consuming. Self-Supervised Learning (SSL) is a promising approach to this problem: a DNN model is pre-trained on unlabeled samples and then fine-tuned on a downstream task with very limited labels. Traditional contrastive learning for point clouds selects the hardest negative samples by relying solely on the distance between embedded features derived during learning, which can admit negative samples from the same class and reduce the effectiveness of contrastive learning. This work proposes a hard-negative sample-aware self-supervised contrastive learning algorithm to pre-train the model for semantic segmentation. We designed a k-means clustering-based Absolute Positive And Negative samples (AbsPAN) strategy to filter possible false-negative samples. Experiments on two typical ALS benchmark datasets demonstrate that the proposed method outperforms supervised training schemes without pre-training. Especially when labels are severely inadequate (10% of the ISPRS training set), the results obtained by the proposed HAVANA method still exceed 94% of the supervised paradigm's performance with the full training set. Full article
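The full AbsPAN strategy is described in the paper; its core filtering idea, discarding candidate negatives that fall into the same k-means cluster as the anchor because they likely share its class, can be sketched as follows (the random features and cluster count are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))  # stand-in for embedded point features
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(feats)

anchor = 0
candidates = np.arange(1, len(feats))
# keep only candidates whose cluster differs from the anchor's: likely true negatives
negatives = candidates[labels[candidates] != labels[anchor]]
assert labels[anchor] not in labels[negatives]
```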

18 pages, 8127 KiB  
Article
GFCNet: Contrastive Learning Network with Geography Feature Space Joint Negative Sample Correction for Land Cover Classification
by Zhaoyang Zhang, Wenxuan Jing, Haifeng Li, Chao Tao and Yunsheng Zhang
Remote Sens. 2023, 15(20), 5056; https://doi.org/10.3390/rs15205056 - 21 Oct 2023
Cited by 1 | Viewed by 871
Abstract
With the continuous improvement in the volume and spatial resolution of remote sensing images, the self-supervised contrastive learning paradigm, driven by large amounts of unlabeled data, is a promising solution for large-scale land cover classification with limited labeled data. However, due to the richness and scale diversity of ground objects in remote sensing images, self-supervised contrastive learning faces two challenges in large-scale land cover classification: (1) models treat random spatial–spectral transformations of different images as negative samples even though they may contain the same ground objects, which leads to serious class confusion; and (2) existing models use only the single-scale features extracted by the feature extractor, which limits their ability to capture ground objects at different scales. In this study, we propose a contrastive learning network with Geography Feature space joint negative sample Correction (GFCNet) for land cover classification. To address class confusion, we propose a Geography Feature space joint negative sample Correction Strategy (GFCS), which integrates the geography-space and feature-space relationships of different images when constructing negative samples, reducing the risk that negative samples contain the same ground object. To improve the model's ability to capture ground objects at different scales, we adopt a Multi-scale Feature joint Fine-tuning Strategy (MFFS) that integrates the different-scale features obtained by the self-supervised contrastive learning network for the land cover classification task. We evaluate the proposed GFCNet on three public land cover classification datasets and achieve the best results compared with seven self-supervised contrastive learning baselines. Specifically, on the LoveDA Rural dataset, GFCNet improves on the best baseline by 3.87% in Kappa and 1.54% in mIoU. Full article
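GFCNet's actual correction strategy is more involved, but the joint idea, excluding candidate negatives that are either geographically close to the anchor or too similar in feature space, can be sketched like this (patch coordinates, embeddings, and thresholds are all illustrative, not the paper's values):

```python
import numpy as np

rng = np.random.default_rng(3)
coords = rng.uniform(0, 100, size=(50, 2))  # patch centers: geography space
feats = rng.normal(size=(50, 8))            # patch embeddings: feature space
feats /= np.linalg.norm(feats, axis=1, keepdims=True)

anchor, geo_radius, sim_thresh = 0, 10.0, 0.9
geo_close = np.linalg.norm(coords - coords[anchor], axis=1) < geo_radius
feat_close = feats @ feats[anchor] > sim_thresh
# candidates flagged in either space may depict the same ground object: drop them
negatives = np.where(~(geo_close | feat_close))[0]
assert anchor not in negatives  # the anchor always excludes itself (distance 0)
```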

20 pages, 5064 KiB  
Article
Delineating Peri-Urban Areas Using Multi-Source Geo-Data: A Neural Network Approach and SHAP Explanation
by Xiaomeng Sun, Xingjian Liu and Yang Zhou
Remote Sens. 2023, 15(16), 4106; https://doi.org/10.3390/rs15164106 - 21 Aug 2023
Cited by 1 | Viewed by 1231
Abstract
Delineating urban and peri-urban areas often uses information from multiple sources, including remote sensing images, nighttime light images, and points-of-interest (POIs). Human mobility from big geospatial data could also be relevant for delineating peri-urban areas, but its use has not been fully explored. Moreover, it is necessary to assess how individual data sources are associated with identification results. To address these gaps, we apply a neural network model that integrates indicators from multiple sources, including land cover maps and nighttime light imagery, and incorporates information about human movement from taxi trips to identify peri-urban areas. SHapley Additive exPlanations (SHAP) values are used as an explanation tool to assess how different data sources and indicators are associated with delineation results. Wuhan, China is selected as a case study. Our findings highlight that socioeconomic indicators, such as nighttime light intensity, have significant impacts on the identification of peri-urban areas, whereas spatial/physical attributes derived from land cover images and road density show relatively low associations. Moreover, taxi intensity, as a typical human movement dataset, may complement nighttime light and POI datasets, especially in refining boundaries between peri-urban and urban areas. Our study can inform the selection of data sources for identifying peri-urban areas, especially when data availability is an issue. Full article
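The study applies the SHAP library to a neural network; for intuition, a linear model admits exact SHAP values in closed form, with feature i contributing w_i * (x_i - E[x_i]). The weights and the mapping of columns to indicators below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))   # columns: e.g. nightlight, road density, taxi intensity
w = np.array([2.0, 0.5, -1.0])  # a hypothetical fitted linear "model"
base = X.mean(axis=0)

def shap_linear(x):
    # exact SHAP values for a linear model with independent features
    return w * (x - base)

phi = shap_linear(X[0])
# local accuracy: contributions sum to f(x) minus the expected prediction
assert np.isclose(phi.sum(), w @ X[0] - w @ base)
```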

22 pages, 15308 KiB  
Article
Sensing Travel Source–Sink Spatiotemporal Ranges Using Dockless Bicycle Trajectory via Density-Based Adaptive Clustering
by Yan Shi, Da Wang, Xiaolong Wang, Bingrong Chen, Chen Ding and Shijuan Gao
Remote Sens. 2023, 15(15), 3874; https://doi.org/10.3390/rs15153874 - 4 Aug 2023
Cited by 1 | Viewed by 925
Abstract
The travel source–sink phenomenon is a typical urban traffic anomaly reflecting the imbalanced dissipation and aggregation of human mobility activities. Accurately sensing the spatiotemporal ranges of travel source–sinks is useful for balancing urban facilities and optimizing urban structures, for example in public transportation station optimization, shared resource configuration, or stampede precautions among moving crowds. Unlike remote sensing based on visual features, sensing imbalanced and arbitrarily shaped source–sink areas from human mobility trajectories is challenging. This paper proposes a density-based adaptive clustering method to identify the spatiotemporal ranges of travel source–sink patterns. Firstly, a spatiotemporal field is utilized to construct a stable neighborhood of origin and destination points. Then, binary spatiotemporal statistical hypothesis tests are proposed to identify the source and sink core points. Finally, a density-based expansion strategy is employed to detect the spatial areas and temporal durations of sources and sinks. Experiments on bicycle trajectory data in Shanghai show that the proposed method can accurately extract significantly imbalanced dissipation and aggregation events. The detected travel source–sink patterns are of practical relevance, providing useful insights into the redistribution of bike-sharing and station resources. Full article
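The paper adapts neighborhoods with statistical hypothesis tests; plain DBSCAN on (x, y, t) points, shown here on synthetic data rather than the Shanghai trajectories, illustrates the final density-based expansion step that grows a source or sink region from core points:

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
# a dense burst of trip origins (a candidate "source") plus background noise
burst = rng.normal([0.0, 0.0, 0.0], 0.1, size=(40, 3))
noise = rng.uniform(-5, 5, size=(20, 3))
points = np.vstack([burst, noise])

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(points)
assert (labels[:40] >= 0).all()  # every burst point joins a cluster
```

The cluster's spatial extent and time span then give the source's spatiotemporal range.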

19 pages, 1748 KiB  
Article
RiSSNet: Contrastive Learning Network with a Relaxed Identity Sampling Strategy for Remote Sensing Image Semantic Segmentation
by Haifeng Li, Wenxuan Jing, Guo Wei, Kai Wu, Mingming Su, Lu Liu, Hao Wu, Penglong Li and Ji Qi
Remote Sens. 2023, 15(13), 3427; https://doi.org/10.3390/rs15133427 - 6 Jul 2023
Cited by 1 | Viewed by 1126
Abstract
Contrastive learning techniques make it possible to pretrain a general model in a self-supervised paradigm using a large number of unlabeled remote sensing images. The core idea is to pull positive samples, defined by data augmentation, closer together while pushing apart randomly sampled negative samples, which serve as the supervised learning signal. This strategy rests on a strict identity hypothesis: positive samples are strictly defined by each (anchor) sample's own augmentation transformations. However, this leads to over-instancing of the learned features and a loss of the ability to fully identify ground objects. We therefore propose a relaxed identity hypothesis governing the feature distribution of different instances within the same class. Implementing it requires sampling and discriminating relaxed identical samples. To sample relaxed identical samples under the unsupervised paradigm, we exploit the fact that nearby objects in remote sensing images are often strongly correlated: neighborhood sampling is carried out around each anchor sample, and the similarity between the sampled patches and the anchor is defined as semantic similarity. To discriminate samples under the relaxed identity hypothesis, the feature loss is calculated and reordered for the samples in the relaxed identical sample queue against the anchor, and the feature loss between the anchor and the sample queue is defined as feature similarity. Through this sampling and discrimination, a leap from instance-level to class-level features is achieved to a certain extent while enhancing the network's invariant feature learning. We validated the proposed method on three datasets; it achieved the best results on all three compared with six self-supervised methods. Full article
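The relaxed-identity sampling step, drawing crops from the spatial neighborhood of the anchor on the premise that nearby ground objects are correlated, might look like this (image size, crop size, and offsets are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
image = rng.normal(size=(256, 256))  # stand-in for a remote sensing tile

def neighborhood_samples(cy, cx, size=64, offset=32, n=4):
    # crops near the anchor serve as "relaxed identical" positives
    crops = []
    for _ in range(n):
        dy, dx = rng.integers(-offset, offset + 1, size=2)
        y = int(np.clip(cy + dy, 0, image.shape[0] - size))
        x = int(np.clip(cx + dx, 0, image.shape[1] - size))
        crops.append(image[y:y + size, x:x + size])
    return crops

crops = neighborhood_samples(96, 96)
assert all(c.shape == (64, 64) for c in crops)
```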

21 pages, 1755 KiB  
Article
A Siamese Network with a Multiscale Window-Based Transformer via an Adaptive Fusion Strategy for High-Resolution Remote Sensing Image Change Detection
by Chao Tao, Dongsheng Kuang, Kai Wu, Xiaomei Zhao, Chunyan Zhao, Xin Du and Yunsheng Zhang
Remote Sens. 2023, 15(9), 2433; https://doi.org/10.3390/rs15092433 - 5 May 2023
Viewed by 1687
Abstract
Remote sensing image change detection (RS-CD) has made impressive progress with the help of deep learning techniques, but small object change detection (SoCD) still faces many challenges. On the one hand, when the scale of changing objects varies greatly, deep learning models optimized for overall accuracy tend to focus on large object changes and, to some extent, ignore small ones. On the other hand, RS-CD models based on deep convolutional networks perform multiple spatial pooling operations on the feature map to obtain deep semantic features, which loses small object feature-level information in the local space. Therefore, we propose a Siamese transformer change detection network with multiscale windows and an adaptive fusion strategy (SWaF-Trans). To avoid ignoring small object changes, we compute self-attention in windows of different scales to model changing objects at the corresponding scales, and we establish semantic links through a moving-window mechanism to capture more comprehensive small object features in small-scale windows, thereby enhancing the feature representation of multiscale objects. To fuse multiscale features and alleviate the loss of small object information, we propose a channel-related fusion mechanism that explicitly models the global correlation between channels and adaptively adjusts channel fusion weights, enabling the network to capture more discriminative features of interest. Experiments on the CDD and WHU-CD datasets show that SWaF-Trans exceeds eight advanced baseline methods, with F1 scores of 97.10% and 93.90%, maximum improvements of 2% and 5.6%, respectively, over the baselines. Full article
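Multiscale window-based self-attention starts by partitioning the feature map into non-overlapping windows of each scale; a minimal partition routine (shapes chosen for illustration, attention itself omitted) is:

```python
import numpy as np

def window_partition(feat, win):
    # split an (H, W, C) feature map into non-overlapping win x win windows
    H, W, C = feat.shape
    return (feat.reshape(H // win, win, W // win, win, C)
                .transpose(0, 2, 1, 3, 4)
                .reshape(-1, win, win, C))

feat = np.arange(8 * 8 * 3, dtype=float).reshape(8, 8, 3)
for win in (2, 4):  # self-attention would run within each window at each scale
    windows = window_partition(feat, win)
    assert windows.shape == ((8 // win) ** 2, win, win, 3)
```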

15 pages, 2378 KiB  
Article
The Dynamic Heterogeneous Relationship between Urban Population Distribution and Built Environment in Xi’an, China: A Case Study
by Xiping Yang, Zhiyuan Zhao, Chaoyang Shi, Lin Luo and Wei Tu
Remote Sens. 2023, 15(9), 2257; https://doi.org/10.3390/rs15092257 - 25 Apr 2023
Cited by 1 | Viewed by 1484
Abstract
The interaction between population and the built environment is a constant topic in urban spaces and the main driving force of urban evolution. Understanding urban population distribution and its relationship with the built environment can guide urban planning, traffic management, and disaster management. Following this line of thought, this study conducted an empirical analysis in Xi'an, a rapidly developing city in western China. Mobile phone location data with high population penetration were used to represent the spatiotemporal dynamics of the population, and the built environment was characterized from five perspectives (transportation, location, building, greenery, and land use) using multi-source geospatial data. Finally, the dynamic heterogeneous influence of built environment factors on population distribution was examined using multiscale geographically weighted regression (MGWR). Overall, the influence coefficients changed significantly over time and exhibited spatial nonstationarity. The specific findings about each built environment factor provide deeper insight into dynamic population distribution and its determinants. Full article
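MGWR extends geographically weighted regression by fitting one bandwidth per covariate; the single-covariate core, weighted least squares with a Gaussian distance kernel around each focal location, can be sketched on synthetic data (one spatially varying coefficient is assumed, and the intercept is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))  # observation locations
x = rng.normal(size=n)                    # one built-environment factor
beta_true = 1.0 + 0.2 * coords[:, 0]      # coefficient drifts across space
y = beta_true * x + rng.normal(scale=0.1, size=n)

def gwr_coef(i, bandwidth=2.0):
    # Gaussian kernel weights by distance to the focal location i
    d = np.linalg.norm(coords - coords[i], axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    # closed-form weighted least squares for a single covariate
    return (w * x * y).sum() / (w * x * x).sum()

assert abs(gwr_coef(0) - beta_true[0]) < 0.6  # local fit tracks the local coefficient
```

A smaller bandwidth makes the fit more local; MGWR's contribution is choosing that bandwidth separately for each covariate.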

Review

Jump to: Research

34 pages, 3055 KiB  
Review
Deep Learning Methods for Semantic Segmentation in Remote Sensing with Small Data: A Survey
by Anzhu Yu, Yujun Quan, Ru Yu, Wenyue Guo, Xin Wang, Danyang Hong, Haodi Zhang, Junming Chen, Qingfeng Hu and Peipei He
Remote Sens. 2023, 15(20), 4987; https://doi.org/10.3390/rs15204987 - 16 Oct 2023
Cited by 3 | Viewed by 2593
Abstract
The annotations used during training are crucial for the inference results of deep learning frameworks on remote sensing images (RSIs). Unlabeled RSIs can be obtained relatively easily, but pixel-level annotation requires a high level of expertise and experience. Consequently, small-sample training methods have attracted widespread attention, as they help alleviate the reliance of current deep learning methods on large amounts of high-quality labeled data. Moreover, research on small-sample learning is still in its infancy owing to the unique challenges of semantic segmentation tasks on RSIs. To better understand and stimulate future research on semantic segmentation with small data, we summarize supervised learning methods and the challenges they face. We also review currently popular approaches for working with limited data, to help elucidate how to efficiently utilize a limited number of samples for semantic segmentation in RSIs. The main methods discussed are self-supervised learning, semi-supervised learning, weakly supervised learning, and few-shot methods. Solutions to cross-domain challenges are also discussed. Furthermore, multi-modal methods, prior-knowledge-constrained methods, and future research needed to optimize deep learning models for various downstream RSI tasks are identified. Full article
