Search Results (141)

Search Parameters:
Keywords = multimode radar

29 pages, 482 KiB  
Review
AI in Maritime Security: Applications, Challenges, Future Directions, and Key Data Sources
by Kashif Talpur, Raza Hasan, Ismet Gocer, Shakeel Ahmad and Zakirul Bhuiyan
Information 2025, 16(8), 658; https://doi.org/10.3390/info16080658 (registering DOI) - 31 Jul 2025
Abstract
The growth and sustainability of today’s global economy rely heavily on smooth maritime operations. Marine environments face increasingly complex security challenges, such as smuggling, illegal fishing, human trafficking, and environmental threats, that exceed the capabilities of traditional surveillance methods. Artificial intelligence (AI), particularly deep learning, has offered strong capabilities for automating object detection, anomaly identification, and situational awareness in maritime environments. In this paper, we have reviewed the state-of-the-art deep learning models mainly proposed in the recent literature (2020–2025), including convolutional neural networks, recurrent neural networks, Transformers, and multimodal fusion architectures. We have highlighted their success in processing diverse data sources such as satellite imagery, AIS, SAR, radar, and sensor inputs from UxVs. Additionally, multimodal data fusion techniques enhance robustness by integrating complementary data, yielding higher detection accuracy. Challenges remain in detecting small or occluded objects, handling cluttered scenes, and interpreting unusual vessel behaviours, especially under adverse sea conditions. Explainability and real-time deployment of AI models in operational settings are also open research areas. Overall, the review of the existing maritime literature suggests that deep learning is rapidly transforming maritime domain awareness and response, with significant potential to improve global maritime security and operational efficiency. We have also provided key datasets for deep learning models in the maritime security domain. Full article
(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

20 pages, 1776 KiB  
Review
Bridging Theory and Practice: A Review of AI-Driven Techniques for Ground Penetrating Radar Interpretation
by Lilong Zou, Ying Li, Kevin Munisami and Amir M. Alani
Appl. Sci. 2025, 15(15), 8177; https://doi.org/10.3390/app15158177 - 23 Jul 2025
Viewed by 247
Abstract
Artificial intelligence (AI) has emerged as a powerful tool for advancing the interpretation of ground penetrating radar (GPR) data, offering solutions to long-standing challenges in manual analysis, such as subjectivity, inefficiency, and limited scalability. This review investigates recent developments in AI-driven techniques for GPR interpretation, with a focus on machine learning, deep learning, and hybrid approaches that incorporate physical modeling or multimodal data fusion. We systematically analyze the application of these techniques across various domains, including utility detection, infrastructure monitoring, archeology, and environmental studies. Key findings highlight the success of convolutional neural networks in hyperbola detection, the use of segmentation models for stratigraphic analysis, and the integration of AI with robotic and real-time systems. However, challenges remain with generalization, data scarcity, model interpretability, and operational deployment. We identify promising directions, such as domain adaptation, explainable AI, and edge-compatible solutions for practical implementation. By synthesizing current progress and limitations, this review aims to bridge the gap between theoretical advancements in AI and the practical needs of GPR practitioners, guiding future research towards more reliable, transparent, and field-ready systems. Full article

26 pages, 6798 KiB  
Article
Robust Optical and SAR Image Matching via Attention-Guided Structural Encoding and Confidence-Aware Filtering
by Qi Kang, Jixian Zhang, Guoman Huang and Fei Liu
Remote Sens. 2025, 17(14), 2501; https://doi.org/10.3390/rs17142501 - 18 Jul 2025
Viewed by 371
Abstract
Accurate feature matching between optical and synthetic aperture radar (SAR) images remains a significant challenge in remote sensing due to substantial modality discrepancies in texture, intensity, and geometric structure. In this study, we proposed an attention-context-aware deep learning framework (ACAMatch) for robust and efficient optical–SAR image registration. The proposed method integrates a structure-enhanced feature extractor, RS2FNet, which combines dual-stage Res2Net modules with a bi-level routing attention mechanism to capture multi-scale local textures and global structural semantics. A context-aware matching module refines correspondences through self- and cross-attention, coupled with a confidence-driven early-exit pruning strategy to reduce computational cost while maintaining accuracy. Additionally, a match-aware multi-task loss function jointly enforces spatial consistency, affine invariance, and structural coherence for end-to-end optimization. Experiments on public datasets (SEN1-2 and WHU-OPT-SAR) and a self-collected Gaofen (GF) dataset demonstrated that ACAMatch significantly outperformed existing state-of-the-art methods in terms of the number of correct matches, matching accuracy, and inference speed, especially under challenging conditions such as resolution differences and severe structural distortions. These results indicate the effectiveness and generalizability of the proposed approach for multimodal image registration, making ACAMatch a promising solution for remote sensing applications such as change detection and multi-sensor data fusion. Full article
(This article belongs to the Special Issue Advancements of Vision-Language Models (VLMs) in Remote Sensing)
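
The abstract mentions a confidence-driven early-exit pruning strategy inside the matching module but does not spell it out. As a rough, hypothetical illustration of that idea only (the dual-softmax scoring, thresholds, and shapes below are assumptions, not the published ACAMatch design), one can score candidate correspondences, drop low-confidence ones, and stop refining once most survivors are confident:

```python
import torch

def prune_and_maybe_exit(desc_a, desc_b, keep_thresh=0.2, exit_thresh=0.9):
    """Score mutual matches between two descriptor sets and decide whether to stop refining.

    desc_a: (N, D) optical keypoint descriptors, desc_b: (M, D) SAR descriptors,
    both assumed L2-normalized. Illustrative sketch only; thresholds are assumptions.
    """
    sim = torch.einsum("nd,md->nm", desc_a, desc_b)                # similarity matrix
    conf = torch.softmax(sim, dim=1) * torch.softmax(sim, dim=0)   # dual-softmax confidence
    best_conf, best_j = conf.max(dim=1)                            # best SAR match per optical point
    keep = best_conf > keep_thresh                                 # prune weak correspondences
    pairs = torch.stack([keep.nonzero(as_tuple=True)[0], best_j[keep]], dim=1)
    early_exit = bool(keep.any()) and bool((best_conf[keep] > exit_thresh).float().mean() > 0.8)
    return pairs, best_conf[keep], early_exit
```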

18 pages, 2702 KiB  
Article
How to Talk to Your Classifier: Conditional Text Generation with Radar–Visual Latent Space
by Julius Ott, Huawei Sun, Lorenzo Servadei and Robert Wille
Sensors 2025, 25(14), 4467; https://doi.org/10.3390/s25144467 - 17 Jul 2025
Viewed by 353
Abstract
Many radar applications rely primarily on visual classification for their evaluations. However, new research is integrating textual descriptions alongside visual input and showing that such multimodal fusion improves contextual understanding. A critical issue in this area is the effective alignment of coded text with corresponding images. To this end, our paper presents an adversarial training framework that generates descriptive text from the latent space of a visual radar classifier. Our quantitative evaluations show that this dual-task approach maintains a robust classification accuracy of 98.3% despite the inclusion of Gaussian-distributed latent spaces. Beyond these numerical validations, we conduct a qualitative study of the text output in relation to the classifier’s predictions. This analysis highlights the correlation between the generated descriptions and the assigned categories and provides insight into the classifier’s visual interpretation processes, particularly in the context of normally uninterpretable radar data. Full article
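
The core mechanism described above is a text decoder that generates descriptions from the latent space of a visual radar classifier. The fragment below is a minimal, hypothetical PyTorch sketch of that coupling; the GRU decoder, vocabulary size, and dimensions are assumptions, and the adversarial alignment training from the paper is omitted.

```python
import torch
import torch.nn as nn

class LatentToText(nn.Module):
    """Decode description tokens from a classifier's latent vector (illustrative sketch)."""
    def __init__(self, latent_dim=128, vocab=1000, emb=64, hidden=128):
        super().__init__()
        self.init_h = nn.Linear(latent_dim, hidden)        # latent -> initial decoder state
        self.embed = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, latent, tokens):                     # tokens: (batch, seq), teacher forcing
        h0 = torch.tanh(self.init_h(latent)).unsqueeze(0)  # (1, batch, hidden)
        y, _ = self.gru(self.embed(tokens), h0)
        return self.out(y)                                 # (batch, seq, vocab) next-token logits

z = torch.randn(4, 128)                                    # latent taken from the radar-visual classifier
logits = LatentToText()(z, torch.randint(0, 1000, (4, 12)))
print(logits.shape)  # torch.Size([4, 12, 1000])
```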

21 pages, 5313 KiB  
Article
MixtureRS: A Mixture of Expert Network Based Remote Sensing Land Classification
by Yimei Liu, Changyuan Wu, Minglei Guan and Jingzhe Wang
Remote Sens. 2025, 17(14), 2494; https://doi.org/10.3390/rs17142494 - 17 Jul 2025
Viewed by 326
Abstract
Accurate land-use classification is critical for urban planning and environmental monitoring, yet effectively integrating heterogeneous data sources such as hyperspectral imagery and laser radar (LiDAR) remains challenging. To address this, we propose MixtureRS, a compact multimodal network that effectively integrates hyperspectral imagery and LiDAR data for land-use classification. Our approach employs a 3-D plus heterogeneous convolutional stack to extract rich spectral–spatial features, which are then tokenized and fused via a cross-modality transformer. To enhance model capacity without incurring significant computational overhead, we replace conventional dense feed-forward blocks with a sparse Mixture-of-Experts (MoE) layer that selectively activates the most relevant experts for each token. Evaluated on a 15-class urban benchmark, MixtureRS achieves an overall accuracy of 88.6%, an average accuracy of 90.2%, and a Kappa coefficient of 0.877, outperforming the best homogeneous transformer by over 12 percentage points. Notably, the largest improvements are observed in water, railway, and parking categories, highlighting the advantages of incorporating height information and conditional computation. These results demonstrate that conditional, expert-guided fusion is a promising and efficient strategy for advancing multimodal remote sensing models. Full article
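
The efficiency claim above rests on replacing dense feed-forward blocks with a sparse Mixture-of-Experts layer that activates only the most relevant experts per token. The paper's exact configuration is not given here, so the following PyTorch sketch shows generic top-k token routing; the expert count, k, and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Minimal top-k Mixture-of-Experts feed-forward block (illustrative sketch)."""
    def __init__(self, dim=256, hidden=512, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts)            # router producing expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, tokens):                              # tokens: (batch, seq, dim)
        scores = self.gate(tokens)                          # (B, S, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)      # keep only the k best experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., slot] == e                  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(tokens[mask])
        return out

# Example: fused hyperspectral/LiDAR tokens of width 256
y = SparseMoELayer()(torch.randn(2, 49, 256))
print(y.shape)  # torch.Size([2, 49, 256])
```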

14 pages, 2247 KiB  
Article
Design and Simulation of Optical Waveguide Digital Adjustable Delay Lines Based on Optical Switches and Archimedean Spiral Structures
by Ting An, Limin Liu, Guizhou Lv, Chunhui Han, Yafeng Meng, Sai Zhu, Yuandong Niu and Yunfeng Jiang
Photonics 2025, 12(7), 679; https://doi.org/10.3390/photonics12070679 - 5 Jul 2025
Viewed by 277
Abstract
In modern optical communication, radar signal processing, and optical sensing, true time delay technology is a key signal-processing technique that enables accurate control of the time delay of optical signals. This study presents a novel design that integrates a 2 × 2 Multi-Mode Interference (MMI) structure with a Mach–Zehnder modulator on a silicon nitride–lithium niobate (SiN-LiNbO3) heterogeneous integrated optical platform. This configuration enables the selective interruption of optical wave paths. The upper path passes through an ultralow-loss Archimedes’ spiral waveguide delay line made of silicon nitride, whose five spiral structures provide delays of 10 ps, 20 ps, 40 ps, 80 ps, and 160 ps, respectively. In contrast, the lower path is a straight pass-through that introduces no additional delay. By applying an electrical voltage, the state of the SiN-LiNbO3 switch can be altered, facilitating the switching and reconfiguration of optical paths and ultimately enabling the combination of various delay values. Simulation results demonstrate that the proposed optical true delay line achieves a discrete, adjustable delay ranging from 10 ps to 310 ps with a step size of 10 ps. The delay loss is less than 0.013 dB/ps, the response speed reaches the order of ns, and the 3 dB-EO bandwidth is broader than 67 GHz. Compared with other optical-switch-based true delay lines in terms of delay range, minimum adjustable delay, and delay loss, the proposed optical waveguide digital adjustable true delay line, which is based on an optical switch and an Archimedes’ spiral structure, offers outstanding advantages in response speed and delay loss. Full article
(This article belongs to the Special Issue Recent Advances in Micro/Nano-Optics and Photonics)
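
The delay arithmetic in the abstract follows from binary switching of the five spiral sections: each 2 × 2 switch either routes the signal through one of the 10/20/40/80/160 ps spirals or bypasses it, so the reachable delays are the subset sums of those values. The short Python check below (an illustrative calculation, not code from the paper) confirms the claimed 10 ps to 310 ps range in 10 ps steps.

```python
from itertools import product

spiral_delays_ps = [10, 20, 40, 80, 160]   # delay contributed by each Archimedean spiral section

# Each switch either inserts its spiral (1) or bypasses it (0); collect all nonzero subset sums.
delays = sorted({sum(d for d, on in zip(spiral_delays_ps, bits) if on)
                 for bits in product((0, 1), repeat=len(spiral_delays_ps))} - {0})

print(len(delays), min(delays), max(delays))                  # 31 settings, 10 ps ... 310 ps
print(all(b - a == 10 for a, b in zip(delays, delays[1:])))   # uniform 10 ps step -> True
```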

23 pages, 3791 KiB  
Article
A Method for Few-Shot Radar Target Recognition Based on Multimodal Feature Fusion
by Yongjing Zhou, Yonggang Li and Weigang Zhu
Sensors 2025, 25(13), 4162; https://doi.org/10.3390/s25134162 - 4 Jul 2025
Viewed by 370
Abstract
Enhancing generalization capabilities and robustness in scenarios with limited sample sizes, while simultaneously decreasing reliance on extensive and high-quality datasets, represents a significant area of inquiry within the domain of radar target recognition. This study introduces a few-shot learning framework that leverages multimodal feature fusion. We develop a cross-modal representation optimization mechanism tailored for the target recognition task by incorporating natural resonance frequency features that elucidate the target’s scattering characteristics. Furthermore, we establish a multimodal fusion classification network that integrates bi-directional long short-term memory and residual neural network architectures, facilitating deep bimodal fusion through an encoding-decoding framework augmented by an energy embedding strategy. To optimize the model, we propose a cross-modal equilibrium loss function that amalgamates similarity metrics from diverse features with cross-entropy loss, thereby guiding the optimization process towards enhancing metric spatial discrimination and balancing classification performance. Empirical results derived from simulated datasets indicate that the proposed methodology achieves a recognition accuracy of 95.36% in the 5-way 1-shot task, surpassing traditional unimodal image and concatenation fusion feature approaches by 2.26% and 8.73%, respectively. Additionally, the inter-class feature separation is improved by 18.37%, thereby substantiating the efficacy of the proposed method. Full article
(This article belongs to the Section Radar Sensors)
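
The abstract's cross-modal equilibrium loss combines feature-similarity metrics with cross-entropy, but its exact formulation is not reproduced here. The PyTorch fragment below is therefore only a plausible sketch of such a combination; the cosine-similarity term and the weighting alpha are assumptions, not the authors' definition.

```python
import torch
import torch.nn.functional as F

def equilibrium_loss(img_feat, nrf_feat, logits, labels, alpha=0.5):
    """Blend a cross-modal similarity term with cross-entropy (illustrative sketch).

    img_feat / nrf_feat: (batch, dim) image-branch and natural-resonance-frequency features.
    """
    ce = F.cross_entropy(logits, labels)
    # Pull the two modality embeddings of the same target together.
    sim = F.cosine_similarity(img_feat, nrf_feat, dim=1).mean()
    return ce + alpha * (1.0 - sim)

logits = torch.randn(8, 5)                  # 5-way episode, 8 query samples
labels = torch.randint(0, 5, (8,))
loss = equilibrium_loss(torch.randn(8, 64), torch.randn(8, 64), logits, labels)
```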

15 pages, 7157 KiB  
Article
RADAR: Reasoning AI-Generated Image Detection for Semantic Fakes
by Haochen Wang, Xuhui Liu, Ziqian Lu, Cilin Yan, Xiaolong Jiang, Runqi Wang and Efstratios Gavves
Technologies 2025, 13(7), 280; https://doi.org/10.3390/technologies13070280 - 2 Jul 2025
Viewed by 491
Abstract
As modern generative models advance rapidly, AI-generated images exhibit higher resolution and lifelike details. However, the generated images may not adhere to world knowledge and common sense, as there is no such awareness and supervision in the generative models. For instance, the generated images could feature a penguin walking in the desert or a man with three arms, scenarios that are highly unlikely to occur in real life. Current AI-generated image detection methods mainly focus on low-level features, such as detailed texture patterns and frequency domain inconsistency, which are specific to certain generative models, making it challenging to identify the above-mentioned general semantic fakes. In this work, (1) we propose a new task, reasoning AI-generated image detection, which focuses on identifying semantic fakes in generative images that violate world knowledge and common sense. (2) To benchmark the new task, we collect a new dataset Spot the Semantic Fake (STSF). STSF contains 358 images with clear semantic fakes generated by three different modern diffusion models and provides bounding boxes as well as text annotations to locate the fakes. (3) We propose RADAR, a reasoning AI-generated image detection assistor, to locate semantic fakes in the generative images and output corresponding text explanations. Specifically, RADAR contains a specialized multimodal LLM to process given images and detect semantic fakes. To improve the generalization ability, we further incorporate ChatGPT as an assistor to detect unrealistic components in grounded text descriptions. The experiments on the STSF dataset show that RADAR effectively detects semantic fakes in modern generative images. Full article
(This article belongs to the Special Issue Image Analysis and Processing)

73 pages, 2833 KiB  
Article
A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities
by Jungpil Shin, Najmul Hassan, Abu Saleh Musa Miah and Satoshi Nishimura
Sensors 2025, 25(13), 4028; https://doi.org/10.3390/s25134028 - 27 Jun 2025
Cited by 1 | Viewed by 1393
Abstract
Human Activity Recognition (HAR) systems aim to understand human behavior and assign a label to each action, attracting significant attention in computer vision due to their wide range of applications. HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals. Each modality provides unique and complementary information suited to different application scenarios. Consequently, numerous studies have investigated diverse approaches for HAR using these modalities. This survey includes only peer-reviewed research papers published in English to ensure linguistic consistency and academic integrity. This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2025, focusing on Machine Learning (ML) and Deep Learning (DL) approaches categorized by input data modalities. We review both single-modality and multi-modality techniques, highlighting fusion-based and co-learning frameworks. Additionally, we cover advancements in hand-crafted action features, methods for recognizing human–object interactions, and activity detection. Our survey includes a detailed dataset description for each modality, as well as a summary of the latest HAR systems, accompanied by a mathematical derivation for evaluating the deep learning model for each modality, and it also provides comparative results on benchmark datasets. Finally, we provide insightful observations and propose effective future research directions in HAR. Full article
(This article belongs to the Special Issue Computer Vision and Sensors-Based Application for Intelligent Systems)

15 pages, 3945 KiB  
Technical Note
Joint SAR–Optical Image Compression with Tunable Progressive Attentive Fusion
by Diego Valsesia and Tiziano Bianchi
Remote Sens. 2025, 17(13), 2189; https://doi.org/10.3390/rs17132189 - 25 Jun 2025
Viewed by 340
Abstract
Remote sensing tasks, such as land cover classification, are increasingly becoming multimodal problems, where information from multiple imaging devices, complementing each other, can be fused. In particular, synergies between optical and synthetic aperture radar (SAR) are widely recognized to be beneficial in a variety of tasks. At the same time, archival of multimodal imagery for global coverage imposes significant storage requirements due to the multitude of available sensors and their increasingly high resolutions. In this paper, we exploit redundancies between SAR and optical imaging modalities to create a joint encoding that improves storage efficiency. A novel neural network design with progressive attentive fusion modules is proposed for joint compression. The model is also promptable at test time with a desired tradeoff between the input modalities, to enable flexibility in the fidelity of the joint representation to each of them. Moreover, we show how end-to-end optimization of the joint compression model, including its modality tradeoff prompt, allows for better accuracy on downstream tasks leveraging multimodal inference when a constraint on the rate is to be met. Full article
(This article belongs to the Section AI Remote Sensing)
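
The "modality tradeoff prompt" described above amounts to training and running the codec under a rate-distortion objective whose distortion term is a tunable blend of the SAR and optical reconstruction errors. The sketch below is a hedged, generic formulation of such an objective; the symbols, MSE distortions, and weighting are assumptions rather than the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def joint_rd_loss(rate_bits, sar_rec, sar_gt, opt_rec, opt_gt, alpha=0.5, lam=0.01):
    """Rate-distortion loss with a tunable SAR/optical fidelity tradeoff alpha in [0, 1].

    rate_bits: estimated bitrate of the joint latent; alpha -> 1 favours SAR fidelity.
    Illustrative formulation only, not the paper's exact objective.
    """
    d_sar = F.mse_loss(sar_rec, sar_gt)
    d_opt = F.mse_loss(opt_rec, opt_gt)
    return alpha * d_sar + (1.0 - alpha) * d_opt + lam * rate_bits

loss = joint_rd_loss(torch.tensor(0.3),
                     torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64),   # SAR reconstruction / target
                     torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64),   # optical reconstruction / target
                     alpha=0.7)
```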

22 pages, 11790 KiB  
Article
Layered Soil Moisture Retrieval and Agricultural Application Based on Multi-Source Remote Sensing and Vegetation Suppression Technology: A Case Study of Youyi Farm, China
by Zhonghe Zhao, Yuyang Li, Kun Liu, Chunsheng Wu, Bowei Yu, Gaohuan Liu and Youxiao Wang
Remote Sens. 2025, 17(13), 2130; https://doi.org/10.3390/rs17132130 - 21 Jun 2025
Viewed by 460
Abstract
Soil moisture dynamics are a key parameter in regulating agricultural productivity and ecosystem functioning. The accurate monitoring and quantitative retrieval of soil moisture play a crucial role in optimizing agricultural water resource management. In recent years, the development of multi-source remote sensing technologies—such as high spatiotemporal resolution optical, radar, and thermal infrared sensors—has opened new avenues for efficient soil moisture retrieval. However, the accuracy of soil moisture retrieval decreases significantly when the soil is covered by vegetation. This study proposes a multi-modal remote sensing collaborative retrieval framework that integrates UAV-based multispectral imagery, Sentinel-1 radar data, and in situ ground sampling. By incorporating a vegetation suppression technique, a random-forest-based quantitative soil moisture model was constructed to specifically address the interference caused by dense vegetation during crop growing seasons. The results demonstrate that the retrieval performance of the model was significantly improved across different soil depths (0–5 cm, 5–10 cm, 10–15 cm, 15–20 cm). After vegetation suppression, the coefficient of determination (R2) exceeded 0.8 for all soil layers, while the mean absolute error (MAE) decreased by 35.1% to 49.8%. This research innovatively integrates optical–radar–thermal multi-source data and a physically driven vegetation suppression strategy to achieve high-accuracy, meter-scale dynamic mapping of soil moisture in vegetated areas. The proposed method provides a reliable technical foundation for precision irrigation and drought early warning. Full article
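
The retrieval model described above is a random-forest regressor trained per depth layer on fused optical/SAR predictors after vegetation suppression. The scikit-learn sketch below mirrors that setup in miniature; the feature matrix, moisture values, and hyperparameters are placeholders, not the study's data or configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# X: per-sample predictors, e.g. vegetation-suppressed multispectral bands + Sentinel-1 backscatter.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))                                        # placeholder feature matrix
depth_layers = ["0-5 cm", "5-10 cm", "10-15 cm", "15-20 cm"]
y = {d: rng.uniform(0.05, 0.40, size=300) for d in depth_layers}     # placeholder soil moisture (vol/vol)

for depth in depth_layers:                                           # one regressor per soil layer
    X_tr, X_te, y_tr, y_te = train_test_split(X, y[depth], test_size=0.3, random_state=0)
    rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    pred = rf.predict(X_te)
    print(depth, round(r2_score(y_te, pred), 3), round(mean_absolute_error(y_te, pred), 3))
```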

21 pages, 4072 KiB  
Article
ST-YOLOv8: Small-Target Ship Detection in SAR Images Targeting Specific Marine Environments
by Fei Gao, Yang Tian, Yongliang Wu and Yunxia Zhang
Appl. Sci. 2025, 15(12), 6666; https://doi.org/10.3390/app15126666 - 13 Jun 2025
Viewed by 361
Abstract
Synthetic Aperture Radar (SAR) image ship detection faces challenges such as distinguishing ships from other terrains and structures, especially in specific, complex marine environments. The motivation behind this work is to enhance detection accuracy while minimizing false positives, which is crucial for applications like defense vessel monitoring and civilian search and rescue operations. To achieve this goal, we propose several architectural improvements to You Only Look Once version 8 Nano (YOLOv8n) and present Small Target-YOLOv8 (ST-YOLOv8), a novel lightweight SAR ship detection model based on the enhanced YOLOv8n framework. The C2f module in the backbone’s transition sections is replaced by the Conv_Online Reparameterized Convolution (C_OREPA) module, reducing convolutional complexity and improving efficiency. The Atrous Spatial Pyramid Pooling (ASPP) module is added to the end of the backbone to extract finer features from smaller and more complex ship targets. In the neck network, the Shuffle Attention (SA) module is employed before each upsampling step to improve upsampling quality. Additionally, we replace the Complete Intersection over Union (C-IoU) loss function with the Wise Intersection over Union (W-IoU) loss function, which enhances bounding box precision. We conducted ablation experiments on two widely used multimodal SAR datasets. The proposed model significantly outperforms the YOLOv8n baseline, achieving 94.1% accuracy, 82% recall, and 87.6% F1 score on the SAR Ship Detection Dataset (SSDD), and 92.7% accuracy, 84.5% recall, and 88.1% F1 score on the SAR Ship Dataset_v0 dataset (SSDv0). Furthermore, the ST-YOLOv8 model outperforms several state-of-the-art multi-scale ship detection algorithms on both datasets. In summary, the ST-YOLOv8 model, by integrating advanced neural network architectures and optimization techniques, significantly improves detection accuracy and reduces false detection rates. This makes it highly suitable for complex backgrounds and multi-scale ship detection. Future work will focus on lightweight model optimization for deployment on mobile platforms to broaden its applicability across different scenarios. Full article
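
One of the named additions, the Atrous Spatial Pyramid Pooling (ASPP) block appended to the backbone, is a standard component. The sketch below is a generic PyTorch version for reference; the channel counts, dilation rates, and activation are assumptions for illustration, not the exact ST-YOLOv8 configuration.

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Generic Atrous Spatial Pyramid Pooling: parallel dilated convs plus global context."""
    def __init__(self, in_ch=256, out_ch=256, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch), nn.SiLU())
            for r in rates
        )
        self.pool = nn.Sequential(                          # image-level context branch
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1, bias=False), nn.SiLU())
        self.project = nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1)

    def forward(self, x):
        feats = [b(x) for b in self.branches]
        g = self.pool(x).expand(-1, -1, x.shape[2], x.shape[3])
        return self.project(torch.cat(feats + [g], dim=1))

# Example on a backbone feature map
y = ASPP()(torch.randn(1, 256, 20, 20))
print(y.shape)  # torch.Size([1, 256, 20, 20])
```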

31 pages, 8699 KiB  
Article
Transformer-Based Dual-Branch Spatial–Temporal–Spectral Feature Fusion Network for Paddy Rice Mapping
by Xinxin Zhang, Hongwei Wei, Yuzhou Shao, Haijun Luan and Da-Han Wang
Remote Sens. 2025, 17(12), 1999; https://doi.org/10.3390/rs17121999 - 10 Jun 2025
Viewed by 416
Abstract
Deep neural network fusion approaches utilizing multimodal remote sensing are essential for crop mapping. However, challenges such as insufficient spatiotemporal feature extraction and ineffective fusion strategies still exist, leading to a decrease in mapping accuracy and robustness when these approaches are applied across spatial‒temporal regions. In this study, we propose a novel rice mapping approach based on dual-branch transformer fusion networks, named RDTFNet. Specifically, we implemented a dual-branch encoder that is based on two improved transformer architectures. One is a multiscale transformer block used to extract spatial–spectral features from a single-phase optical image, and the other is a Restormer block used to extract spatial–temporal features from time-series synthetic aperture radar (SAR) images. Both extracted features were then combined into a feature fusion module (FFM) to generate fully fused spatial–temporal–spectral (STS) features, which were finally fed into the decoder of the U-Net structure for rice mapping. The model’s performance was evaluated through experiments with the Sentinel-1 and Sentinel-2 datasets from the United States. Compared with conventional models, the RDTFNet model achieved the best performance, and the overall accuracy (OA), intersection over union (IoU), precision, recall and F1-score were 96.95%, 88.12%, 95.14%, 92.27% and 93.68%, respectively. The comparative results show that the OA, IoU, accuracy, recall and F1-score improved by 1.61%, 5.37%, 5.16%, 1.12% and 2.53%, respectively, over those of the baseline model, demonstrating its superior performance for rice mapping. Furthermore, in subsequent cross-regional and cross-temporal tests, RDTFNet outperformed other classical models, achieving improvements of 7.11% and 12.10% in F1-score, and 11.55% and 18.18% in IoU, respectively. These results further confirm the robustness of the proposed model. Therefore, the proposed RDTFNet model can effectively fuse STS features from multimodal images and exhibit strong generalization capabilities, providing valuable information for governments in agricultural management. Full article

24 pages, 1264 KiB  
Review
Indoor Abnormal Behavior Detection for the Elderly: A Review
by Tianxiao Gu and Min Tang
Sensors 2025, 25(11), 3313; https://doi.org/10.3390/s25113313 - 24 May 2025
Viewed by 836
Abstract
As the global population ages, the proportion of elderly people continues to rise, and the safety of the elderly living alone is becoming an increasingly prominent concern. They often miss timely treatment due to undetected falls or illnesses, which pose risks to their lives. To address this challenge, indoor abnormal behavior detection has become a research hotspot. This paper systematically reviews detection methods based on sensors, video, infrared, Wi-Fi, radar, depth, and multimodal fusion, and analyzes the technical principles, advantages, and limitations of each. It further explores the characteristics of relevant datasets and their applicable scenarios and summarizes the challenges facing current research, including multimodal data scarcity, the risk of privacy leakage, insufficient adaptability to complex environments, and limited user adoption of wearable devices. Finally, the paper proposes future research directions, such as combining generative models, federated learning to protect privacy, multi-sensor fusion for robustness, and abnormal behavior detection in Internet of Things environments. This paper aims to provide a systematic reference for academic research and practical application in the field of indoor abnormal behavior detection. Full article
(This article belongs to the Section Wearables)

18 pages, 2972 KiB  
Article
Research on Cross-Scene Human Activity Recognition Based on Radar and Wi-Fi Multimodal Fusion
by Zhiyu Chen, Yanpeng Sun and Lele Qu
Electronics 2025, 14(8), 1518; https://doi.org/10.3390/electronics14081518 - 9 Apr 2025
Viewed by 832
Abstract
Radar-based human behavior recognition has significant value in IoT application scenarios such as smart healthcare and intelligent security. However, the existing unimodal perception architecture is susceptible to multipath effects, which can lead to feature drift, and the issue of limited cross-scenario generalization ability has not been effectively addressed. Although Wi-Fi sensing technology has emerged as a promising research direction due to its widespread device applicability and privacy protection, its drawbacks, such as low signal resolution and weak anti-interference ability, limit behavior recognition accuracy. To address these challenges, this paper proposes a dynamic adaptive behavior recognition method based on the complementary fusion of radar and Wi-Fi signals. By constructing a cross-modal spatiotemporal feature alignment module, the method achieves heterogeneous signal representation space mapping. A dynamic weight allocation strategy guided by attention is adopted to effectively suppress environmental interference and improve feature discriminability. Experimental results show that, on a cross-environment behavior dataset, the proposed method achieves an average recognition accuracy of 94.8%, which is a significant improvement compared to the radar unimodal domain adaptation method. Full article
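
The "attention-guided dynamic weight allocation" above can be pictured as a small gating network that scores each modality's feature vector per sample and mixes the radar and Wi-Fi representations accordingly. The PyTorch fragment below is a minimal sketch of that idea under assumed feature sizes and class count; it is not the authors' actual network.

```python
import torch
import torch.nn as nn

class GatedModalityFusion(nn.Module):
    """Score radar and Wi-Fi features per sample and fuse them with softmax weights."""
    def __init__(self, dim=128, num_classes=6):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim // 2), nn.ReLU(), nn.Linear(dim // 2, 1))
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, radar_feat, wifi_feat):              # both (batch, dim), already aligned
        s = torch.cat([self.score(radar_feat), self.score(wifi_feat)], dim=1)   # (batch, 2)
        w = torch.softmax(s, dim=1)                         # dynamic per-sample modality weights
        fused = w[:, :1] * radar_feat + w[:, 1:] * wifi_feat
        return self.classifier(fused), w

logits, weights = GatedModalityFusion()(torch.randn(4, 128), torch.randn(4, 128))
print(logits.shape, weights[0])                             # torch.Size([4, 6]) and the two modality weights
```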
