MDPI - Publisher of Open Access Journals

30 pages, 26441 KB

Open Access Article

SARM: Scene-Aware Retinex Mamba for Underwater Image Enhancement

by Zhanbo Fu, Shuang Yang, Aiguo Sun, Rongjun Xiong and Nengcheng Chen

Remote Sens. 2026, 18(10), 1652; https://doi.org/10.3390/rs18101652 - 20 May 2026

Underwater image enhancement is essential for marine visual perception tasks. However, the highly heterogeneous optical degradations in real-world waters, the scarcity of paired training data, and the inherent dilemma for existing models in balancing long-range dependency modeling with computational overhead pose significant challenges. To address these issues, this paper proposes a prior-guided, self-supervised underwater image enhancement framework called Scene-Aware Retinex Mamba (SARM). This framework seamlessly integrates Retinex theoretical priors with state space models (SSMs) and operates without paired supervision by employing a prior-guided pseudo-labeling strategy to guide network optimization. Architecturally, SARM deeply couples the physical Retinex prior with SSM. Its core module integrates multi-color space features and leverages a 2D selective scan mechanism to achieve global context modeling with linear complexity O(HW), effectively removing complex color casts and suppressing non-uniform scattering noise. To further overcome the generalization bottlenecks in cross-domain underwater testing, this paper introduces a Scene-Aware Adapter (SAA), which facilitates dynamic loss scheduling and adaptive feature gating by quantifying scene-specific degradation characteristics. Comprehensive evaluations on multiple benchmark datasets, including UIEB, EUVP, and UCCS, demonstrate that SARM achieves state-of-the-art subjective and objective enhancement quality (e.g., yielding a URanker score of 2.491 and a CCF score of 35.76), while maintaining an ultra-fast inference speed of 136.52 FPS on the UIEB dataset. Furthermore, extended experiments reveal that SARM can significantly boost the performance of downstream vision tasks, validating its potential as a robust preprocessing module for various practical marine vision applications.
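
As background for the Retinex prior and the linear-complexity claim in this abstract, the classical Retinex model and the two cost regimes can be written as below. This is the textbook formulation, not necessarily the exact one used in SARM. The symbols I (observed image), R (reflectance), L (illumination), and x (pixel location) are notation assumed here, and H and W are taken to be the image height and width.

```latex
% Classical Retinex decomposition (element-wise product per pixel)
I(x) = R(x) \odot L(x)

% Cost of global context modeling over an H x W image
\text{selective scan: } O(HW)
\qquad \text{vs.} \qquad
\text{full self-attention: } O\big((HW)^2\big)
```

Under this reading, "linear complexity" means the cost grows in proportion to the number of pixels rather than to its square.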

(This article belongs to the Section AI Remote Sensing)

30 pages, 7567 KB

Open Access Article

Drone-Assisted Lightweight Authentication Protocol for Unmanned eVTOL Emergency Rescue

by Qi Xie and Huai Chen

Drones 2026, 10(5), 391; https://doi.org/10.3390/drones10050391 - 20 May 2026

Abstract

While drones play important roles in areas such as communication and logistics delivery, they have certain limitations in emergency rescue scenarios due to their inability to carry passengers. Building on mature drone technologies such as autonomous flight and environmental perception, unmanned passenger Electric Vertical Take-off and Landing (eVTOL) aircraft are designed with a manned cabin, enabling them to operate without an onboard pilot while rapidly transporting injured people. Consequently, eVTOLs can play a significant role in emergency rescue that cargo-only drones cannot fulfill, as they are capable of rapidly reaching emergency scenes, effectively overcoming the delays caused by traditional ground traffic congestion. Despite their potential, eVTOLs still face several critical obstacles, including signal disruption, limited coverage of dispatching centers, mutual authentication among entities, and concerns related to security and privacy preservation. As a remedy, this paper presents a lightweight authentication protocol leveraging drone assistance to overcome these challenges for unmanned eVTOL emergency rescue. In scenarios where an unmanned eVTOL experiences signal blockage due to dense urban high-rise structures, neighboring drones can serve as a transmission relay to assist the unmanned eVTOL and the dispatch center (DC) in completing mutual authentication and session key negotiation, thereby enabling the unmanned eVTOL to safely complete its mission. To enhance security, physical unclonable functions (PUFs) are integrated into unmanned eVTOLs, drones, and the DC, safeguarding sensitive data against side-channel and physical capture attacks while preserving the confidentiality of unmanned eVTOL identities to mitigate privacy risks. Our protocol achieves provable security in the random oracle model while exhibiting strong resistance to various well-known attacks. 
Comparative analysis with the existing drone authentication and drone-assisted emergency rescue authentication protocols reveals that our protocol not only provides stronger security guarantees but also maintains a low computational overhead.

(This article belongs to the Section Drone Communications)

40 pages, 5773 KB

Open Access Article

A Multilayer Decision-Making Method for UAV Formation Cooperative Flight in Complex Urban Environments

by Junjie Wang, Dongyu Yan, Yongping Hao and Han Miao

Sensors 2026, 26(10), 3245; https://doi.org/10.3390/s26103245 - 20 May 2026

Abstract

To address the challenges of dynamic obstacles, limited perception, and multi-UAV coordination constraints in complex urban environments, a hierarchical control framework based on a virtual leader-follower architecture is proposed, covering global planning, local obstacle avoidance, and formation coordination. In the global planning layer, a dynamic adaptive strategy rapidly exploring random tree star (DASRRT*) algorithm is proposed. To address the low sampling efficiency and limited path extension in dense environments that affect traditional RRT*, a hybrid guided sampling strategy, inefficient node optimization strategy, and perception-based adaptive step size strategy are designed. Additionally, a multi-objective cost function is introduced to provide smoother trajectories that better comply with dynamic constraints for trajectory tracking. In the local obstacle-avoidance layer, a distributed controller is constructed based on an improved artificial potential field method, integrating collision avoidance control laws derived from a spring-damper model, dynamic obstacle-avoidance laws that account for obstacle velocities, and formation coordination control laws grounded in consensus theory. In the coordination control layer, a real-time local target selection strategy is established to guide the virtual leader to precisely track the global path, and a dual-mode switching mechanism based on environmental complexity is constructed to dynamically adjust the priority between formation maintenance and autonomous obstacle-avoidance tasks. Comparative experimental results show that the proposed DASRRT* algorithm reduces path planning time by an average of 34.78% and shortens path length by 1.15%. 
Simulation results for formation flight demonstrate that the proposed hierarchical control framework can adaptively adjust control modes in response to changes in environmental complexity, exhibiting strong adaptability to complex environments and a good ability to generalize to various scenes.

(This article belongs to the Section Navigation and Positioning)

19 pages, 1765 KB

Open Access Article

SetConv++: Point Cloud Scene Flow Estimation Constrained by Feature Self-Supervision

by Fei Zhang, Yinghui Wang, Yang Xi and Chunhao Hua

Mathematics 2026, 14(10), 1748; https://doi.org/10.3390/math14101748 - 19 May 2026

Abstract

Point cloud scene flow estimation aims to capture the three-dimensional motion of each point in a sequence of point clouds. Although progress has occurred in this field, existing methods often face significant challenges. In particular, two key issues persist: the absence of corresponding local information from the source point cloud to the target, preventing correct feature matching, and the presence of highly similar adjacent structures in target regions, which leads to ambiguous correspondences due to indistinguishable point features. To address these problems, this paper introduces a novel self-supervised method for point cloud scene flow estimation. Theoretically, we establish a new framework that integrates discriminative feature learning with probabilistic flow refinement. A new network architecture, SetConv++, is designed to learn more discriminative point feature representations, enhancing differentiation in similar structures. Additionally, a refinement module uses the random walk algorithm to adjust initial flow estimates. This approach reconstructs low-confidence flows with high-confidence surrounding ones, reducing missing correspondence issues. Crucially, a new flow smoothing loss term ensures local consistency while suppressing error propagation—a fundamental limitation in existing methods. Through comprehensive experiments on the KITTI Scene Flow dataset, our method demonstrates superior performance. It significantly outperforms existing self-supervised approaches across multiple standard evaluation metrics. Specifically, on the KITTI Scene Flow dataset, our method reduces the Endpoint Error (EPE) by 13.6% (from 0.0411 to 0.0355) and improves Accuracy Strict (AS) by 2.43 percentage points (from 92.68% to 95.11%) compared to baseline self-supervised approaches, while also reducing the outlier rate (Out) by 1.5 percentage points. 
This advancement not only provides a robust theoretical framework for handling ambiguous correspondences but also enables more reliable and efficient downstream applications—such as autonomous driving perception systems requiring real-time motion accuracy in complex scenes.
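
The headline numbers in this abstract mix a relative reduction (EPE) with an absolute gain in percentage points (AS). The short check below recomputes both from the values quoted in the abstract; the function names are illustrative and not taken from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Relative reduction in percent (for lower-is-better metrics such as EPE)."""
    return (before - after) / before * 100.0


def point_gain(before_pct: float, after_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return after_pct - before_pct


# Values quoted in the abstract.
epe_drop = relative_reduction(0.0411, 0.0355)
as_gain = point_gain(92.68, 95.11)

print(f"EPE reduction: {epe_drop:.1f}%")              # prints "EPE reduction: 13.6%"
print(f"AS gain: {as_gain:.2f} percentage points")    # prints "AS gain: 2.43 percentage points"
```

Both figures agree with the abstract: 0.0056 / 0.0411 is roughly 13.6%, and 95.11 minus 92.68 is 2.43 points.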

17 pages, 6738 KB

Open Access Article

Seeing the Unborn: Artificial Intelligence and the Iconographic Visibility of Pregnancy in Early Modern Iberian Religious Art

by Jose Luis Bartha

Arts 2026, 15(5), 106; https://doi.org/10.3390/arts15050106 - 18 May 2026

Viewed by 107

Abstract

This article examines the visibility and invisibility of pregnancy in early modern Iberian religious art through artificial intelligence. While sacred imagery in Catholic Spain and Portugal between the 14th and 17th centuries includes representations of the pregnant Virgin Mary and scenes related to maternity, such depictions are often symbolically coded and devoid of embodied detail. Using the Astica Vision AI system, a two-phase methodology was applied to a total of fifty-two artworks depicting the Virgin of Expectation, the Visitation, and the Nativity of the Virgin. In the first phase, images were submitted without context to observe whether the AI could independently identify pregnancy or maternal affect. In the second phase, the same images were reanalyzed with iconographic metadata. Findings show that AI frequently fails to detect gestation in sacred images, and even when context is provided, rarely describes bodily signs or relational affect. These findings reflect the visual logic inherent to sacred art, which tends to prioritize theological meaning over biological or emotional realities. The inclusion of a modern secular photograph of pregnancy highlights the contrast: there, AI readily identifies maternal embodiment, emotion, and connection. This contrast reveals how cultural and doctrinal frameworks shape visual codes of legibility. Rather than a neutral observer, the AI becomes a diagnostic tool, amplifying iconographic silences and revealing how sacred art disciplines perception. The article proposes a new methodological role for machine vision in the humanities: not to mimic human reading, but to uncover what remains unseen in visual culture.

45 pages, 46439 KB

Open Access Review

Review of Humanoid Robotic Astronauts for Space Missions

by Liping Fang, Jun Zhang, Liang Tang and Quan Hu

Appl. Sci. 2026, 16(10), 5032; https://doi.org/10.3390/app16105032 - 18 May 2026

Viewed by 233

Abstract

As human space missions become longer and more autonomous, robots are expected to assume broader responsibilities in inspection, maintenance, logistics, scientific support, and crew assistance. Among available robot forms, humanoid robotic astronauts are especially relevant because their anthropomorphic embodiment is compatible with human-centered habitats, tools, interfaces, and procedures. Their deployment in orbital and planetary environments, however, introduces challenges that differ from those of terrestrial humanoids, including floating-base dynamics, intermittent contact, whole-body coordination, constrained perception, and delayed supervision. This review contributes a mission-oriented and astronaut-centered synthesis of humanoid robotic astronauts, distinguishing itself from platform-by-platform or morphology-only surveys. It treats these systems as mission-compatible embodied agents whose feasibility depends on the coupling among mission context, morphology, contact behavior, perception, autonomy, and validation evidence. The primary goals are threefold: to classify representative platforms according to mission context, to synthesize the core technical foundations required for mission-compatible operation, and to identify cross-cutting deployment bottlenecks and benchmarking priorities for future development. Representative systems are organized into intravehicular assistance, extravehicular operations and on-orbit servicing, and surface exploration or transitional scenarios, showing how mission demands shape embodiment, mobility, manipulation, autonomy, and validation strategies. This review further summarizes recent progress in microgravity dynamics and contact mechanics, multimodal perception and scene understanding, whole-body motion planning and control, teleoperation and supervised autonomy, and evaluation and benchmarking methods. 
The analysis indicates that humanoid robotic astronauts are not simple extensions of terrestrial humanoids but astronaut-oriented embodied systems for mission-constrained environments. Three priorities are identified for future development: contact-rich whole-body intelligence under support transitions, delay-tolerant supervised autonomy with explicit authority handoff, and systematic benchmarking pipelines that connect simulation, ground analogs, short-duration microgravity tests, human-in-the-loop trials, and mission-context demonstrations.

(This article belongs to the Topic Advances in Robot Vision Perception and Control Technology)

37 pages, 10460 KB

Open Access Article

Research on Visual Recognition and Harvesting Point Localization System for Grape-Picking Robots in Smart Agriculture

by Tao Lin, Qiurong Lv, Fuchun Sun, Wei Ma and Xiaoxiao Li

Agriculture 2026, 16(10), 1073; https://doi.org/10.3390/agriculture16101073 - 14 May 2026

Viewed by 156

Abstract

To improve grape target perception and picking-point positioning for intelligent harvesting robots, this study develops a vision-based method for orchard grape detection and harvesting-point localization. The method is intended to address missed detections, insufficient recognition accuracy, and unsatisfactory peduncle segmentation caused by illumination variation, occlusion, and interference from branches and leaves in complex orchard scenes. For grape cluster and peduncle detection, a lightweight YOLOv7-derived model, termed YOLO-FES, was established. In this model, FasterNet and SCConv were introduced to refine the backbone and neck structures, and the EMA mechanism was incorporated to lower parameter complexity and computational cost while improving detection performance. For suspended grape structure association and peduncle extraction, the GJK algorithm was combined with nearest-neighbor rectangular discrimination, and an improved YOLACT-based peduncle segmentation network, named M-YOLACT, was constructed. With the integration of the MLCA mechanism and the Mish activation function, accurate peduncle segmentation was achieved. In addition, a stereo depth camera was employed to obtain two-dimensional picking-point information and further recover the corresponding three-dimensional spatial coordinates. Experimental results showed that the mAP@0.5 of YOLO-FES for grape clusters and peduncles reached 95.37%. For grape peduncle segmentation, the mAP@0.5 values of the bounding boxes and masks produced by M-YOLACT reached 95.73% and 94.36%, respectively. The proposed method achieved an overall harvesting success rate of 89.2%, with an average time consumption of 11 s for a single harvesting operation. By integrating deep-learning-based detection and segmentation with binocular-vision localization, this study provides a practical technical solution and useful reference for the visual system design of grape-harvesting robots.
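
The step from a two-dimensional picking point to three-dimensional coordinates can be illustrated with a standard pinhole back-projection. The abstract does not state which camera model or calibration the authors use, so the pinhole model, the function name `backproject`, and all intrinsic values below are assumptions chosen only for illustration.

```python
from typing import Tuple


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float
                ) -> Tuple[float, float, float]:
    """Map pixel (u, v) with measured depth to camera-frame (X, Y, Z).

    Assumes an ideal pinhole camera with focal lengths (fx, fy) in pixels
    and principal point (cx, cy); lens distortion is ignored.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Hypothetical picking point at pixel (400, 300), 0.5 m from the camera,
# with made-up intrinsics for a 640x480 sensor.
point = backproject(u=400.0, v=300.0, depth=0.5,
                    fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(point)
```

A real system would also need to transform this camera-frame point into the robot arm's base frame, which requires a hand-eye calibration not covered here.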

(This article belongs to the Special Issue Key Technology Research and Applications of Agricultural Inspection Robots Based on Machine Vision and Artificial Intelligence)

25 pages, 3056 KB

Open Access Review

Artificial Intelligence in Smart Agriculture Across the Production-to-Postharvest Continuum: Progress, Challenges, and Future Directions

by Junhao Sun, Quanjin Wang, Qinghua Li, Guangfei Xu, Bowen Liang, Chuanzhe Ma, Shiao Tian and Qimin Gao

Sustainability 2026, 18(10), 4908; https://doi.org/10.3390/su18104908 - 14 May 2026

Viewed by 172

Abstract

Artificial intelligence is transforming agriculture from a mechanized, labor-intensive sector into a data-driven, perception-enabled, and increasingly autonomous production system. In this review, AI serves as an umbrella term encompassing machine learning, computer vision, and robotic control, among other technologies. We synthesize recent advances across the tillage–sowing–management–harvesting (TSMH) workflow, covering intelligent tillage, precision sowing, field management, and robotic harvesting. The literature shows that AI has significantly improved agricultural perception, prediction, and task-level decision-making. However, large-scale adoption remains constrained by data heterogeneity, limited cross-scene generalization, environmental uncertainty, and insufficient integration across operational stages. Future progress will depend on multimodal data fusion, lightweight and interpretable models, cloud-edge collaboration, and full-chain decision architectures. By framing current research within the TSMH pipeline, this review highlights both technical advances and the critical bottlenecks that must be addressed to move smart agriculture from stage-specific intelligence toward system-level autonomy. Representative studies indicate that AI models can improve soil-property prediction and reduce sowing miss-detection rates to below 3% under controlled or bench-top conditions. However, field deployment may be affected by environmental variability, including illumination changes, dust, vibration, occlusion, and hardware constraints. These limitations highlight the need for robust and edge-compatible architectures.

(This article belongs to the Special Issue Smart Agricultural Technologies, Sustainable Livestock Production and Environmental Sustainability)

19 pages, 8033 KB

Open Access Article

Parameter-Efficient Domain Adaptation and Lightweight Decoding for Agricultural Monocular Depth Estimation

by Yanliang Mao, Wenhao Zhao and Liping Chen

Agronomy 2026, 16(10), 972; https://doi.org/10.3390/agronomy16100972 (registering DOI) - 13 May 2026

Viewed by 78

Abstract

Reliable monocular depth estimation (MDE) is essential for agricultural robots and unmanned platforms, where low-cost visual perception is required for safe navigation and scene understanding in complex field environments. However, general-purpose depth foundation models remain limited by substantial domain gaps in agriculture, while full fine-tuning of large backbones is computationally expensive and less suitable for deployment on resource-constrained platforms. In this paper, an efficient agricultural MDE framework, termed AgriLoRA-DA, is proposed based on Depth-Anything-V2. Specifically, the pretrained DINOv2 encoder is kept frozen and adapted using LoRA in selected attention projections, while the original Dense Prediction Transformer (DPT) decoder is replaced with a lightweight Lite-FPNHead to reduce decoding overhead and improve deployment efficiency. Experiments conducted on the WE3DS dataset indicate that, although Depth-Anything-V3 provides the strongest zero-shot generalization among the evaluated baselines, target-domain adaptation is still necessary for WE3DS agricultural scenes. After adaptation, AgriLoRA-DA achieves the best overall performance with AbsRel = 0.0133, SqRel = 3.518, RMSE = 132.264, log10 = 0.0057, and delta1 = 0.9990, while requiring only 0.19 M (0.87%) trainable parameters. These results suggest that parameter-efficient adaptation and lightweight decoding provide a practical direction for deployable depth estimation in crop-row scenes similar to WE3DS, while broader cross-dataset validation remains an important direction for future work.
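
To put the "0.19 M (0.87%)" figure in context, the sketch below shows two small calculations. The first is the standard LoRA parameter count for one adapted projection, where a frozen d_out x d_in weight gains two low-rank factors with rank * (d_in + d_out) trainable entries. The second is the total model size implied if the 0.87% is measured against all parameters, which the abstract does not specify. The layer width and rank in the example are hypothetical, not the paper's settings.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable entries added by LoRA to one linear layer.

    LoRA keeps the d_out x d_in weight frozen and learns a rank-r update
    B @ A, with A of shape (rank, d_in) and B of shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def implied_total_millions(trainable_millions: float, percent: float) -> float:
    """Total parameter count (in millions) implied by a trainable share."""
    return trainable_millions / (percent / 100.0)


# Hypothetical example: one 384-wide attention projection adapted at rank 4.
per_layer = lora_trainable_params(d_in=384, d_out=384, rank=4)
print(per_layer)  # prints 3072

# Figures quoted in the abstract, assuming the share is of all parameters.
total = implied_total_millions(0.19, 0.87)
print(f"Implied total: about {total:.1f} M parameters")
```

Under that assumption the full model would hold roughly 21.8 M parameters, so almost all weights stay frozen during adaptation.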

(This article belongs to the Special Issue Enhancing Generalization in Agricultural AI: Bridging Data Gaps and Boosting Model Robustness)

29 pages, 19640 KB

Open Access Article

Target-Aware Fusion: A Diffusion Model for Infrared and Visible Image Integration to Enhance Object Detection

by Jinyong Chen, Tingyu Zhu and Gang Wang

Remote Sens. 2026, 18(10), 1545; https://doi.org/10.3390/rs18101545 - 13 May 2026

Viewed by 136

Abstract

Infrared and visible light images differ in their imaging characteristics: visible light images provide rich texture and color information, but imaging is limited in harsh weather conditions. Infrared images are based on the target's thermal radiation characteristics and resist environmental interference but lack details and background information. Effectively integrating the two can significantly enhance scene understanding and improve environmental perception and target recognition performance in applications such as intelligent driving. However, existing fusion methods still face challenges, especially in complex scenes where it is difficult to balance the full preservation of target information with the complete presentation of background details, often resulting in difficulties in extracting differentiated features from different modalities. This article proposes a target detection method based on a visible-infrared fusion diffusion model. The method introduces the Stable Diffusion architecture and designs a target perception spatial fusion weight module that can adaptively generate a spatial fusion weight map based on modal differences. By implementing a multi-stage dynamic fusion strategy, the fusion ratio is automatically adjusted at different diffusion stages. A full-step multi-step prediction mechanism is adopted to improve fusion quality and stability. Experiments on multiple publicly available datasets show that this method outperforms existing mainstream methods in key metrics such as Peak Signal-to-Noise Ratio (PSNR), Mean Square Error (MSE), and Mean Absolute Error (MAE) and also demonstrates good detection performance in downstream object detection tasks.
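
The three metrics named in this abstract have standard definitions, sketched below for flat lists of pixel intensities. How the paper chooses the reference image for a fused result is not stated in the abstract, so the `reference` and `fused` values here are made-up sample data, and an 8-bit peak value of 255 is assumed.

```python
import math
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally long pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mae(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute error between two equally long pixel sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def psnr(a: Sequence[float], b: Sequence[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Made-up 4-pixel example (8-bit intensities).
reference = [0, 64, 128, 255]
fused = [0, 60, 130, 250]

print(mse(reference, fused))   # prints 11.25
print(mae(reference, fused))   # prints 2.75
print(f"{psnr(reference, fused):.2f} dB")
```

Lower MSE and MAE and higher PSNR indicate a fused image closer to the reference.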

30 pages, 3787 KB

Open Access Article

HyperNCMD: A Scene-Adaptive Clutter Measurement Density Estimator for Radar Tracking via Hypernetworks and Normalizing Flows

by Zongqing Cao, Jianchao Yang, Wang Sun, Xingyu Lu, Ke Tan, Zheng Dai, Wenchao Yu and Hong Gu

Remote Sens. 2026, 18(10), 1541; https://doi.org/10.3390/rs18101541 - 13 May 2026

Viewed by 125

Abstract

Accurate estimation of clutter measurement density (CMD) is crucial for radar-based multi-target tracking (MTT), especially under spatially non-uniform and temporally varying environments. Existing methods, including finite mixture models, kernel density estimation, and normalizing flows, often require scene-specific tuning and exhibit limited generalization. To address these limitations, we propose HyperNCMD, a scene-adaptive CMD estimator that employs hypernetworks to dynamically generate the parameters of normalizing flows. To capture spatial variability, radar measurements are first embedded using Random Fourier Features (RFFs), and then processed by a spatio-temporal encoder that jointly models spatial structures and temporal clutter dynamics. The hypernetwork leverages the encoded embedding to adaptively produce flow parameters, enabling flexible CMD estimation across diverse environments. Lightweight data augmentation is further applied to make the estimator more robust across diverse environments, while a Feature-wise Linear Modulation (FiLM)-based fine-tuning scheme enhances test-time adaptation. Experiments on both synthetic and real radar datasets demonstrate that HyperNCMD achieves superior accuracy and robustness, with up to a 10.5% reduction in per-point negative log-likelihood under dynamically varying conditions. These results highlight the potential of hypernetwork-driven CMD modeling for reliable radar perception in complex sensing environments.
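
The Random Fourier Feature embedding mentioned in this abstract can be sketched in its common textbook form, which maps a point x to sqrt(2/D) * cos(Wx + b) with Gaussian W and uniform phases b. The abstract does not give the paper's feature count, bandwidth, or input layout, so `make_rff`, its parameters, and the sample measurement below are assumptions for illustration only.

```python
import math
import random
from typing import Callable, List, Sequence


def make_rff(input_dim: int, num_features: int,
             bandwidth: float = 1.0, seed: int = 0
             ) -> Callable[[Sequence[float]], List[float]]:
    """Build a Random Fourier Feature map approximating an RBF kernel."""
    rng = random.Random(seed)
    # Frequencies drawn from N(0, 1/bandwidth^2), phases from U(0, 2*pi).
    weights = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(input_dim)]
               for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def embed(x: Sequence[float]) -> List[float]:
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, phases)]

    return embed


embed = make_rff(input_dim=2, num_features=64)
z = embed([0.3, -1.2])  # hypothetical 2-D measurement, e.g. a position
print(len(z))  # prints 64
```

The fixed seed keeps the random frequencies identical across calls, which matters because every measurement must be embedded with the same feature map.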

(This article belongs to the Topic Radar Signal and Data Processing with Applications, 2nd Edition)

15 pages, 2289 KB

Open Access Article

FOR: Point Cloud Outlier Removal Based on Fuzzy Theory and Informativeness and Its Application to 3D Object Detection

by Lili Gan, Zhengyi Yang, Yiyi Liu, Yaqi Wang and Xinyan An

Sensors 2026, 26(10), 3070; https://doi.org/10.3390/s26103070 - 13 May 2026

Viewed by 305

Abstract

LiDAR is widely used in autonomous driving. Although LiDAR point cloud data can provide stable and reliable information about the environment, it also faces the problem of a huge amount of data. One of the reasons is that point cloud data contains a large amount of noise and outliers. Outlier removal of point clouds can reduce the impact of these disturbances and improve the quality of the point cloud, but it will inevitably eliminate some valid points, which affects subsequent perception tasks. To overcome this limitation, this paper proposes a fuzzy outlier removal (FOR) method based on fuzzy theory and informativeness. It uses fuzzy theory to model the uncertainty of the membership degree of each point in each dimension, calculates the informativeness sum of each point based on membership degree, and filters points according to the informativeness. FOR is characterized by filtering the point cloud in the edge region on the premise of retaining the point cloud in the center region, so as to preserve the environmental information in the center region and reduce the impact of outlier removal on subsequent perception tasks. The experiments focus on the contradictory relationship between outlier removal and perception accuracy, and verify the effectiveness of FOR with multiple object detection models on the autonomous driving datasets KITTI and nuScenes. The experimental results indicate that, compared with other point cloud outlier removal methods, FOR has the advantage of reducing inference time while retaining detection accuracy, demonstrating balanced high performance across different datasets and detection models.

(This article belongs to the Special Issue Recent Progress in 3D Computer Vision and Robotics)

26 pages, 10781 KB

Open Access Article

Explicit Illumination Modeling for Object Detection in Low-Light Environments

by Wenkang Cao, Peng Yang and Wensheng Lyu

Electronics 2026, 15(10), 2057; https://doi.org/10.3390/electronics15102057 - 12 May 2026

Viewed by 234

Abstract

Under complex lighting conditions, particularly in low-light environments, general object detectors often suffer from degraded detection performance due to insufficient brightness, severe noise, and loss of discriminative details. This issue is especially critical in underground mining scenarios, where weak illumination, complex backgrounds, dust interference, and frequent small or partially occluded targets make reliable visual perception highly challenging. To address this issue, we propose an Illumination-Aware Detection Network (IADNet) for object detection in low-light environments. Specifically, an Illumination Modeling Subnetwork (IMS) is designed to extract illumination-aware and degradation-aware auxiliary features from low-light images. Within the IMS, an Adaptive Weighted Downsampling (AWD) layer is introduced to reduce noise interference during feature downsampling and enhance illumination-aware representation learning. Furthermore, a Global Feature Enhancement Module (GFEM) is incorporated to strengthen global context modeling and improve feature representation capability in complex scenes. In addition, an extra contrastive loss is introduced to constrain the optimization of the IMS, and weighting factors are employed to balance the detection loss and the contrastive loss during training. Extensive experiments conducted on multiple datasets demonstrate the effectiveness of the proposed method. On the public ExDark dataset, IADNet achieves an mAP@50 of 80.3%, outperforming the baseline YOLO11m by 3.4 percentage points. On the self-constructed mining low-light dataset Lowlight_Mine, the proposed method achieves 92.3% Precision, 82.0% Recall, 89.3% mAP@50, and 57.8% mAP@50:95, showing favorable performance in object detection tasks under mining-related low-light scenarios. On the DARK FACE dataset, IADNet achieves 54.6% mAP@50 and 31.2% mAP@50:95, further indicating its robustness under real low-light conditions. 
On the synthetic low-light Dark_VOC dataset, IADNet attains an mAP@50 of 91.6%, and on the normal-light VOC dataset, it achieves an mAP@50 of 93.0%, suggesting that the proposed method maintains stable detection performance under the evaluated illumination conditions. These results indicate that IADNet improves low-light object detection performance and provides a useful experimental reference for object detection tasks in mining-related low-light scenarios.
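
The abstract states that weighting factors balance the detection loss and the contrastive loss during training. One common way to write such a balance is a weighted sum; the symbols below are illustrative names chosen here, and neither the exact form nor the weight values are given in the abstract.

```latex
% Assumed form: a weighted sum of the two losses named in the abstract.
% \lambda_{\text{det}} and \lambda_{\text{con}} are hypothetical weighting factors.
\mathcal{L}_{\text{total}}
  = \lambda_{\text{det}}\,\mathcal{L}_{\text{det}}
  + \lambda_{\text{con}}\,\mathcal{L}_{\text{con}}
```

Under this reading, raising the contrastive weight puts more pressure on the Illumination Modeling Subnetwork, while raising the detection weight favors the detection objective.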

49 pages, 8417 KB

Open Access Feature Paper Article

Ontology Neural Network and ORTSF: A Framework for Topological Reasoning and Delay-Robust Control

by Jaehong Oh

Int. J. Topol. 2026, 3(2), 9; https://doi.org/10.3390/ijt3020009 (registering DOI) - 12 May 2026

Abstract

Autonomous robotic systems have advanced significantly in perception, localization, mapping, and control, yet a critical challenge remains: representing and preserving the relational semantics, contextual reasoning, and cognitive transparency that are essential for collaboration in dynamic, human-centric environments. This paper introduces a unified architecture comprising the Ontology Neural Network (ONN) and the Ontological Real-Time Semantic Fabric (ORTSF) to address this challenge. The ONN formalizes relational semantic reasoning as a dynamic topological process by embedding Forman–Ricci curvature, persistent homology, and semantic tensor structures within a unified loss formulation, aiming to maintain relational integrity as scenes evolve. Building upon the ONN, the ORTSF transforms reasoning traces into actionable control commands while compensating for system delays through predictive operators designed to preserve phase margins. Theoretical analysis and extensive simulations demonstrate that ORTSF maintains designed phase margins, offering advantages over classical delay compensation methods. Empirical studies indicate the framework’s effectiveness in unifying semantic cognition and robust control, providing a mathematically principled solution for cognitive robotics.
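
To see why delay threatens phase margin in the first place, recall the standard result that a pure time delay leaves loop gain unchanged and adds a phase lag equal to frequency times delay. The sketch below computes the margin left without any compensation; it is a textbook baseline, not the ORTSF predictive operator, and the function names and numeric values are hypothetical.

```python
import math

def delayed_phase_margin_deg(nominal_margin_deg, crossover_rad_s, delay_s):
    """Phase margin remaining after an uncompensated pure delay exp(-s * delay_s).

    The delay has unit gain, so the gain crossover frequency is unchanged and
    the margin shrinks by the lag crossover_rad_s * delay_s (in radians).
    """
    lag_deg = math.degrees(crossover_rad_s * delay_s)
    return nominal_margin_deg - lag_deg

def max_tolerable_delay_s(nominal_margin_deg, crossover_rad_s):
    """Delay at which the uncompensated phase margin drops to zero."""
    return math.radians(nominal_margin_deg) / crossover_rad_s

# Hypothetical loop: 60 degree nominal margin, 10 rad/s crossover, 50 ms delay.
margin = delayed_phase_margin_deg(60.0, 10.0, 0.05)
limit = max_tolerable_delay_s(60.0, 10.0)
```

For these assumed numbers the 50 ms delay costs roughly 28.6 degrees of margin, and the loop would lose all margin at a delay of about 105 ms, which is the kind of erosion a delay-compensation scheme is meant to counter.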

(This article belongs to the Topic Topological, Quantum, and Molecular Information Approaches to Computation and Intelligence)

21 pages, 30038 KB

Open Access Article

DGS-Net: A Lightweight Deformable and Occlusion-Aware Network for Paddy Weed Detection on Edge Devices

by Yu Zhuang, Zhanpeng Luo, Shiyu Cao, Jiayuan Zhu, Le Zheng, Xinhua Ma and Yijia Wang

Agriculture 2026, 16(10), 1039; https://doi.org/10.3390/agriculture16101039 - 11 May 2026

Abstract

Precision weed management operations, such as targeted spraying and robotic weeding, face two challenges: discriminating weeds from rice seedlings in complex paddy-field scenes, and deploying high-precision models on resource-limited edge devices. To address both, we propose DGS-Net, a lightweight network based on YOLOv11n that combines deformable attention, GSConv-based feature fusion, and SEAM enhancement. The backbone incorporates a convolutional block with parallel split attention and deformable attention transformer (C2PSA_DAT) module to improve the extraction of irregular and fine-grained weed features; the neck integrates a VoV-GSCSP module to enable lightweight multi-scale feature fusion for small and densely distributed targets; and a separated and enhancement attention module (SEAM) is placed before the detection head to enhance robustness under leaf occlusion and complex paddy-field background interference. In comparative experiments conducted on the paddy-field dataset under unified training and evaluation settings, DGS-Net achieved 91.7% precision, 86.8% recall, and 92.4% mean average precision (mAP), with a model size of 5.8 MB and a computational cost of 6.2 giga floating-point operations (GFLOPs). Compared with representative lightweight baseline detectors, DGS-Net showed a more favorable balance between detection accuracy and deployment efficiency. In additional edge-device deployment tests using the test set, the model sustained real-time inference at 32.5 FPS and achieved mAP@0.5, precision, and recall of approximately 0.928, 0.919, and 0.867, respectively. Overall, DGS-Net improves irregular feature extraction, enables lightweight multi-scale feature fusion, and increases robustness to occlusion while retaining strong deployability. The method therefore provides practical visual-perception support for precise, real-time crop–weed discrimination and precision weed management in complex paddy-field environments.

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

Search Results (815)
