Advanced Machine Learning and Deep Learning Approaches for Remote Sensing II


Introduction
This Special Issue marks the third edition of "Advanced Machine Learning and Deep Learning Approaches for Remote Sensing". In this edition, I aim to showcase the latest developments and trends in the application of sophisticated machine learning and deep learning methodologies to the challenges of processing remote sensing data. Remote sensing, the practice of acquiring information about objects or phenomena without direct contact, has been revolutionized by artificial intelligence. Techniques such as machine learning and deep learning have shown substantial promise in addressing the intricacies of processing signals, images, and videos obtained through remote sensing. These AI methodologies require significant computational resources, often necessitating the use of GPUs, because of the vast volumes of high-resolution Earth observation data that are now available for global monitoring. Such advancements have propelled remote sensing into a new era of accuracy, reliability, and speed in data analysis. I invited submissions that explore both theoretical and practical aspects of these artificial intelligence and data science techniques within the remote sensing domain. Contributions that offer novel insights and methodologies for the remote sensing research community were particularly encouraged. This edition continues the commitment to advancing the field by addressing the many challenges presented by the analysis of time series remote sensing data through advanced machine learning approaches. A total of 16 papers were published in this Special Issue.

Overview of Contributions
The contribution by Liu et al., "Simulation of the Ecological Service Value and Ecological Compensation in Arid Area: A Case Study of Ecologically Vulnerable Oasis", leverages a deep learning framework that combines convolutional neural networks (CNNs) with Gated Recurrent Units (GRUs) to forecast the ecological service value across the Wuwei arid oasis for the coming decade (paper 1 in the list of contributions). It employs the ecology-economy harmony index to guide ecological compensation priorities, while the GeoDetector method is used to ascertain the impact of various drivers on the ecological service value from 2000 to 2030. The authors' key findings reveal that the integrated model surpasses alternative methodologies in accuracy, indicate an upward trajectory in the ecological service value despite urban expansion pressures, and underscore the critical need for ecological compensation to achieve sustainable development in arid regions, with geographic and environmental factors playing pivotal roles in influencing ecological service outcomes.
The authors of the paper on "Hybrid-Scale Hierarchical Transformer for Remote Sensing Image Super-Resolution" introduce the Hybrid-Scale Hierarchical Transformer Network (HSTNet) to enhance the spatial resolution of remote sensing images, overcoming the constraints of current spaceborne imaging technologies (paper 2 in the list of contributions). This approach employs a hybrid-scale feature exploitation module to harness recursive self-similarity across various scales. Furthermore, a cross-scale enhancement transformer is developed to exploit long-range dependencies, enabling precise interaction between high- and low-dimensional features and thus significantly improving super-resolution quality. HSTNet sets new benchmarks on the UCMerced and AID datasets, as evidenced by its superior performance in terms of the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) metrics. Comparative analyses confirm HSTNet's advantage over contemporary models in both quantitative and qualitative assessments.
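For readers less familiar with the PSNR metric cited above, a minimal sketch of its standard definition follows. The function name, the nested-list image representation, and the 8-bit peak value of 255 are illustrative assumptions and are not taken from the HSTNet paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized images.

    Images are nested lists (rows of pixel values). `max_value` is the
    largest possible pixel value; 255 assumes 8-bit imagery.
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if not ref or len(ref) != len(est):
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    high_res = [[10, 20], [30, 40]]
    restored = [[11, 19], [31, 39]]
    print(f"PSNR: {psnr(high_res, restored):.2f} dB")
```

A higher PSNR indicates a super-resolved image that is closer, pixel by pixel, to the reference; SSIM complements it by comparing local structure rather than raw error.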
The contribution by Montañez et al., "Application of Data Sensor Fusion Using Extended Kalman Filter Algorithm for Identification and Tracking of Moving Targets from LiDAR-Radar Data", explores the enhancement of surveillance and monitoring systems through the integration of mobile vehicles or unmanned aerial vehicles (UAVs), specifically drones, which offer superior access, range, maneuverability, and safety by enabling omnidirectional movement for exploration, identification, and execution of security operations (paper 3 in the list of contributions). Addressing the challenge of errors and uncertainties in environmental data, this work advances the data acquisition resolution by incorporating a data sensor fusion system that combines inputs from multiple sensors, such as LiDAR and Radar, observing a single physical phenomenon. Using the constant turn rate and velocity (CTRV) kinematic model with an added focus on angular velocity, which had not previously been considered, they apply an extended Kalman filter (EKF) for enhanced detection of moving targets. Their evaluation on a dataset combining LiDAR and Radar sensor data of an object undergoing abrupt trajectory changes, with added white Gaussian noise, demonstrates an improvement of 0.4 in detection accuracy, as measured by the root mean square error (RMSE), compared to traditional models that neglect such dynamic path variations.
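To make the CTRV model concrete, the sketch below shows only the textbook CTRV state propagation that an EKF would use in its prediction step; it is not the authors' implementation. The state layout [x, y, v, yaw, yaw_rate], the function name, and the straight-line threshold `eps` are assumptions made for illustration.

```python
import math


def ctrv_predict(state, dt, eps=1e-6):
    """Propagate a CTRV state [x, y, v, yaw, yaw_rate] forward by dt seconds.

    Speed and turn rate are held constant over the step, which is the
    defining assumption of the CTRV model.
    """
    x, y, v, yaw, yaw_rate = state
    if abs(yaw_rate) < eps:
        # Nearly zero turn rate: use the straight-line limit to avoid
        # dividing by a value close to zero.
        x_new = x + v * math.cos(yaw) * dt
        y_new = y + v * math.sin(yaw) * dt
    else:
        # Motion along a circular arc of radius v / yaw_rate.
        ratio = v / yaw_rate
        x_new = x + ratio * (math.sin(yaw + yaw_rate * dt) - math.sin(yaw))
        y_new = y + ratio * (math.cos(yaw) - math.cos(yaw + yaw_rate * dt))
    return [x_new, y_new, v, yaw + yaw_rate * dt, yaw_rate]


if __name__ == "__main__":
    # Target moving at 1 m/s while turning at 90 degrees per second.
    print(ctrv_predict([0.0, 0.0, 1.0, 0.0, math.pi / 2], dt=1.0))
```

In a full EKF, this nonlinear function would be linearized through its Jacobian to propagate the state covariance, and the LiDAR and Radar measurements would then correct the prediction in separate update steps.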
The contribution by Wang et al., "A Prediction Method of Ionospheric hmF2 Based on Machine Learning", introduces an interpretable, long-term prediction model for the peak electron density height (hmF2) of the ionospheric F2 layer, which is crucial for enhancing high-frequency (HF) radio wave propagation forecasting and communication frequency selection (paper 4 in the list of contributions). Built on statistical machine learning (SML), the model was validated with ionospheric data from the Moscow station covering August 2011 to October 2016. Inputs such as the sunspot number, month, and universal time enable accurate hmF2 predictions. A comparison with the International Reference Ionosphere (IRI) model demonstrates the proposed model's enhanced stability and reliability, with a decrease in the average statistical root mean square error (RMSE) of 5.20 km and in the relative root mean square error (RRMSE) of 1.78%. This methodology holds promise for globally accurate ionospheric parameter predictions.
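The two error measures quoted above can be sketched as follows. RMSE has a standard definition; RRMSE is defined in more than one way in the literature, and the version here (RMSE divided by the mean observation, in percent) is an assumption that may differ from the normalization used in the paper. The function names and sample values are illustrative only.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the data (e.g. km)."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rrmse_percent(observed, predicted):
    """RMSE normalized by the mean observation, in percent.

    This is one common convention; other papers normalize differently.
    """
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed


if __name__ == "__main__":
    hmf2_observed = [300.0, 300.0]   # made-up heights in km
    hmf2_predicted = [310.0, 290.0]
    print(rmse(hmf2_observed, hmf2_predicted))
    print(rrmse_percent(hmf2_observed, hmf2_predicted))
```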
The contribution by Sheng et al., "An Integrated Framework for Spatiotemporally Merging Multi-Sources Precipitation Based on F-SVD and ConvLSTM", introduces a framework designed to refine the accuracy and reliability of precipitation estimates by leveraging machine learning to integrate diverse data sources (paper 5 in the list of contributions). The novelty of their approach lies in using the Funk-Singular Value Decomposition (F-SVD), a recommender-system technique, to interpolate the spatial distribution of precipitation from rain gauge data, coupled with Convolutional Long Short-Term Memory (ConvLSTM) to fuse the interpolated and satellite-derived precipitation data by exploring their spatiotemporal correlations. Applied to the Jianxi Basin in Southeast China, using data from 2006 to 2018, their FS-ConvLSTM framework markedly outperforms traditional models, exhibiting enhanced precision in precipitation distribution and superiority in capturing rainfall variations. Notably, it achieves a reduction in the root mean square error (RMSE) of 63.6% and an increase in the probability of detection (POD) of 22.9% over baseline Global Precipitation Measurement (GPM) data. This method not only consolidates the strengths of various data sources but also addresses their limitations, ensuring close alignment with observed precipitation patterns. Their framework is thus a valuable advance in precipitation estimation, offering significant benefits for water resource management and disaster preparedness efforts.
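The probability of detection mentioned above is a standard categorical score: the fraction of observed rain events that the estimate also flags as rain. The sketch below illustrates that definition; the function name and the 0.1 mm rain/no-rain threshold are assumptions for illustration, not values taken from the paper.

```python
def probability_of_detection(observed, estimated, threshold=0.1):
    """POD = hits / (hits + misses) for paired precipitation values.

    A sample counts as a rain event when the gauge observation reaches
    `threshold`; it is a hit if the estimate reaches it as well.
    """
    hits = 0
    misses = 0
    for obs, est in zip(observed, estimated):
        if obs >= threshold:
            if est >= threshold:
                hits += 1
            else:
                misses += 1
    if hits + misses == 0:
        raise ValueError("no observed rain events at this threshold")
    return hits / (hits + misses)


if __name__ == "__main__":
    gauge = [0.0, 1.0, 2.0, 0.5]       # made-up observations in mm
    satellite = [0.3, 0.8, 0.0, 0.2]   # made-up estimates in mm
    print(probability_of_detection(gauge, satellite))
```

Note that POD ignores false alarms (the first sample above), which is why it is usually reported alongside error measures such as RMSE.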
The contribution by Hu et al., "Estimation of the Two-Dimensional Direction of Arrival for Low-Elevation and Non-Low-Elevation Targets Based on Dilated Convolutional Networks", addresses the challenge of estimating the two-dimensional direction of arrival (2D DOA) for targets at varying elevations using L-shaped uniform and sparse arrays (paper 6 in the list of contributions). The authors introduce a 2D DOA estimation algorithm built on a dilated convolutional network framework, comprising a dilated convolutional autoencoder for multipath signal suppression and a dilated convolutional neural network for direct 2D DOA estimation. This dual-component approach enables effective multipath signal mitigation and precise DOA estimation for both low-elevation and higher targets. The study's simulation results demonstrate the algorithm's superior performance, achieving near-perfect accuracy in elevation and azimuth angle estimation across various arrays, even in low signal-to-noise ratio scenarios. The method significantly outperforms traditional DOA estimation methods, offering a robust solution for complex signal environments.
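As a brief illustration of what "dilated" means here, the sketch below implements a single one-dimensional dilated convolution in plain Python. It shows only the generic operation, in which kernel taps are spaced `dilation` samples apart so that the receptive field grows without adding weights; it does not reproduce the authors' network, and the function name is an assumption.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation with taps spaced `dilation` samples apart."""
    # Number of input samples covered by one output sample.
    span = (len(kernel) - 1) * dilation + 1
    if len(signal) < span:
        raise ValueError("signal is shorter than the dilated kernel span")
    return [
        sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 5, 6]
    # Same three weights, but dilation 2 covers five input samples.
    print(dilated_conv1d(samples, [1, 1, 1], dilation=1))
    print(dilated_conv1d(samples, [1, 1, 1], dilation=2))
```

With three weights, a dilation of 2 widens the span from three to five samples, which is the property that lets stacked dilated layers capture long-range structure at modest cost.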
The contribution by Lei et al., "Network Collaborative Pruning Method for Hyperspectral Image Classification Based on Evolutionary Multi-Task Optimization", presents a network collaborative pruning method tailored for hyperspectral image classification that leverages evolutionary multi-task optimization (paper 7 in the list of contributions). Aimed at the difficulty of deploying neural network models on mobile platforms, the method improves model efficiency by reducing storage and processing requirements without compromising accuracy. Because it is automated, it removes the need for manually crafted pruning rules while tackling the challenge of search efficiency in complex networks. By running on multiple hyperspectral images simultaneously, the method facilitates knowledge transfer between tasks and identifies crucial network structures. A self-adaptive knowledge transfer strategy, informed by historical data and incorporating a dormancy mechanism, mitigates potential negative transfer and conserves computational resources. The experimental validation demonstrates the method's effectiveness in compressing neural networks while maintaining a high classification accuracy, even with limited labeled data, across various hyperspectral datasets.
In the contribution by Lu et al. on "Adversarial Robustness Enhancement of UAV-Oriented Automatic Image Recognition Based on Deep Ensemble Models", the authors explore the vulnerability of unmanned aerial vehicles (UAVs) equipped with deep neural network (DNN)-based visual recognition systems to adversarial attacks, including camouflaged patterns and imperceptible image perturbations (paper 8 in the list of contributions). The authors propose two ensemble defense strategies, combining convolutional neural networks (CNNs) and transformers for proactive model robustness and reactive adversarial detection, which are tailored to the resource constraints of UAVs. These strategies employ a mix of output probability distributions and integrated scoring functions to counteract attacks. The authors' experimental evaluation across optical and synthetic aperture radar (SAR) datasets demonstrates that these ensemble approaches significantly enhance the defense against adversarial threats, surpassing the efficacy of single models without requiring additional training. Furthermore, they introduce AREP-RSIs, a comprehensive platform for assessing and fortifying the adversarial robustness of remote sensing image recognition models, contributing to ongoing research in the field.
In the contribution by Asanjan et al., "Probabilistic Wildfire Segmentation Using Supervised Deep Generative Model from Satellite Imagery", the authors introduce a supervised deep generative machine learning model designed for stochastic wildfire detection, facilitating rapid and comprehensive uncertainty quantification for both individual and collective wildfire events (paper 9 in the list of contributions). The model is specifically tailored to address the challenges posed by the patchy and discontinuous data from the Moderate Resolution Imaging Spectroradiometer (MODIS), using both raw and combined band data to improve the fire detection accuracy. By generating varied yet plausible wildfire boundary segmentations, their approach simulates expert disagreement on wildfire delineation. With dual streams for learning stochastic latent distributions and visual features, the model forms a stochastic image-to-image wildfire detection system. Compared with two stochastic baseline models employing permanent dropout and Stochastic ReLU activations, their proposed Probabilistic U-Net demonstrates superior alignment with the ground truth data and an enhanced grasp of wildfire dynamics, as evidenced by its visual and statistical metrics across multiple evaluation scenarios.
In the contribution by Zuo et al., "A Pattern Classification Distribution Method for Geostatistical Modeling Evaluation and Uncertainty Quantification", the authors explore the application of a Pattern Classification Distribution (PCD) method to quantitatively assess geostatistical modeling, which is pivotal for generating reliable geological model realizations (paper 10 in the list of contributions). First, they introduce a correlation-driven template approach to identify geological patterns, employing region growing and elbow-point detection to devise an adaptive template based on the spatial dependencies of the training image (TI). Next, they recommend a hybrid approach of clustering and classification for delineating geological structures, using hierarchical clustering and decision trees for straightforward parameter specification. Additionally, they propose a multi-grid analysis stacking framework, determining the contribution of each grid via the morphological characteristics of the TI. A comprehensive evaluation across various models, including channel models, 2D nonstationary flume systems, Antarctic subglacial bed topographies, and 3D sandstone models, confirms the PCD's efficacy in capturing diverse geological categories, continuous variables, and complex high-dimensional structures.
In the contribution by Peng et al., "Spectral-Swin Transformer with Spatial Feature Extraction Enhancement for Hyperspectral Image Classification", the authors introduce a Spectral Shifted Window Self-Attention-based Transformer (SSWT) network, designed to enhance hyperspectral image (HSI) classification by effectively capturing the inherent long-range dependencies in HSI spectral dimensions (paper 11 in the list of contributions). Unlike conventional convolutional neural network (CNN)-based approaches, which struggle with long-range dependencies, their proposed SSWT network, coupled with a spatial feature extraction (SFE) module and Spatial Position Encoding (SPE), significantly improves local feature extraction and addresses the transformers' limitations in spatial feature representation. The SFE module is specifically developed to bolster the transformer's spatial feature extraction capabilities, while the SPE technique compensates for the loss of spatial structure in HSI data. Extensive testing on three public datasets, comparing their model against several state-of-the-art deep learning models, reveals that their approach not only exhibits superior efficiency but also outperforms its contemporaries in HSI classification tasks.
In the contribution by Gao et al., "Moving Point Target Detection Based on Temporal Transient Disturbance Learning in Low SNR", the authors introduce a framework for detecting moving point targets in optical remote sensing under low signal-to-noise ratios (SNRs). It employs an end-to-end network, specifically a 1D-ResNet, to assess the distribution features of the transient disturbances that are generated within the temporal profile (TP) as a target traverses a pixel (paper 12 in the list of contributions). This approach translates image-based point target detection into transient disturbance detection within the TP, for which mathematical models are developed. Using these models, the authors create a simulated TP dataset to train the 1D-ResNet, incorporating a CBR-1D structure for initial feature extraction and dual LBR modules for advanced classification and disturbance location identification. A multi-task weighted loss function is introduced to ensure effective training. Extensive testing confirms their method's superior ability to detect low-SNR moving point targets, outperforming benchmarks in detection rate, false alarm minimization, and overall efficiency.
In the contribution by Wang et al., "SSANet: An Adaptive Spectral-Spatial Attention Autoencoder Network for Hyperspectral Unmixing", the authors introduce SSANet, an adaptive spectral-spatial attention autoencoder network designed to address the challenge of mixed pixels in hyperspectral imagery (paper 13 in the list of contributions). The network employs an adaptive spectral-spatial attention module that refines features sequentially, applying spectral attention to select pertinent spectral bands and then spatial attention to isolate relevant spatial information. Further, SSANet capitalizes on the hyperspectral image's geometric properties and abundance sparsity, incorporating minimum volume and sparsity regularization terms into its loss function to optimize the endmember extraction and abundance estimation. Evaluations conducted on a synthetic dataset and four real hyperspectral scenes (Samson, Jasper Ridge, Houston, and Urban) demonstrate SSANet's superior performance in unmixing accuracy, evidenced by its competitive root mean square error and spectral angle distance metrics against established and contemporary unmixing methodologies.
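The spectral angle distance used to score extracted endmembers is the angle between a reference spectrum and its estimate, so it is insensitive to overall brightness scaling. A minimal sketch of that standard definition follows; the function name and the sample spectra are illustrative assumptions.

```python
import math


def spectral_angle_distance(reference, estimate):
    """Angle in radians between two spectra given as equal-length lists."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("spectra must be non-empty and of equal length")
    dot = sum(a * b for a, b in zip(reference, estimate))
    norm_ref = math.sqrt(sum(a * a for a in reference))
    norm_est = math.sqrt(sum(b * b for b in estimate))
    if norm_ref == 0 or norm_est == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_ref * norm_est)))
    return math.acos(cosine)


if __name__ == "__main__":
    true_endmember = [0.1, 0.4, 0.8]
    brighter_copy = [0.2, 0.8, 1.6]   # same shape, twice the amplitude
    print(spectral_angle_distance(true_endmember, brighter_copy))
```

A value near zero means the estimated endmember has the same spectral shape as the reference, whereas abundance maps are typically scored with the root mean square error.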
In the contribution by Guo et al., "3D-UNet-LSTM: A Deep Learning-Based Radar Echo Extrapolation Model for Convective Nowcasting", the authors propose a 3D-UNet-LSTM model to address the challenges in radar echo extrapolation for convective nowcasting, which aims to predict the short-term evolution of convective systems (paper 14 in the list of contributions). The model employs a 3D-UNet for comprehensive spatiotemporal feature extraction from radar imagery, coupled with a Seq2Seq network in the forecaster module that uses ConvLSTM layers to iteratively predict future radar images across different timestamps. In experiments on the public MeteoNet dataset for 0-1 h forecasting, quantitative assessments affirm the efficacy of the 3D-UNet extractor and the Seq2Seq forecaster. Case studies further illustrate the model's superior capability in capturing the complex nonlinear dynamics of convective processes, a significant advancement in spatiotemporal modeling for weather nowcasting.
In the contribution by Gao et al., "Center-Ness and Repulsion: Constraints to Improve Remote Sensing Object Detection via RepPoints", the authors address the complexities of remote sensing object detection, particularly the challenges posed by densely packed objects, arbitrary orientations, and intricate backgrounds (paper 15 in the list of contributions). Unlike prevailing methods, which overlook the spatial relationships among objects, their study introduces a shape-adaptive repulsion constraint for point representation to precisely delineate the geometric attributes of densely situated remote sensing objects. Their approach encompasses (1) a shape-adaptive center-ness quality assessment strategy to refine the bounding box accuracy by penalizing significant deviations from the center point and (2) a focused repulsion regression loss, designed to enhance target discrimination in densely clustered environments. Comprehensive evaluations conducted across four demanding datasets, DOTA, HRSC2016, UCAS-AOD, and WHU-RSONE-OBB, affirm the superiority of their method in handling the complexities of remote sensing imagery.
In the contribution by Li et al., "SquconvNet: Deep Sequencer Convolutional Network for Hyperspectral Image Classification", the authors introduce SquconvNet, an architecture that combines convolutional neural networks (CNNs) with the Sequencer block, aiming to enhance performance in hyperspectral image (HSI) classification (paper 16 in the list of contributions). This work is set against the transformative impact that transformer models have had in computer vision over the past five years, despite their relatively underwhelming results in HSI classification to date. The Sequencer structure, notable for replacing the transformer's self-attention layer with a BiLSTM2D layer, has shown promising outcomes in image classification. The authors rigorously test SquconvNet on three benchmark datasets for HSI classification, and their findings demonstrate distinct advantages in classification accuracy and stability over existing methods, marking a significant advancement in the field.

Conclusions
This Special Issue presents 16 papers on advanced machine learning and deep learning techniques for remote sensing. The insights shared herein are expected to foster further advancements and research in the domain of artificial intelligence-based remote sensing.