Search Results (880)

Search Parameters:
Keywords = visual illumination

25 pages, 15438 KB  
Article
Day–Night All-Sky Scene Classification with an Attention-Enhanced EfficientNet
by Wuttichai Boonpook, Peerapong Torteeka, Kritanai Torsri, Daroonwan Kamthonkiat, Yumin Tan, Asamaporn Sitthi, Patcharin Kamsing, Chomchanok Arunplod, Utane Sawangwit, Thanachot Ngamcharoensuktavorn and Kijnaphat Suksod
ISPRS Int. J. Geo-Inf. 2026, 15(2), 66; https://doi.org/10.3390/ijgi15020066 - 3 Feb 2026
Abstract
All-sky cameras provide continuous hemispherical observations essential for atmospheric monitoring and observatory operations; however, automated classification of sky conditions in tropical environments remains challenging due to strong illumination variability, atmospheric scattering, and overlapping thin-cloud structures. This study proposes EfficientNet-Attention-SPP Multi-scale Network (EASMNet), a physics-aware deep learning framework for robust all-sky scene classification using hemispherical imagery acquired at the Thai National Observatory. The proposed architecture integrates Squeeze-and-Excitation (SE) blocks for radiometric channel stabilization, the Convolutional Block Attention Module (CBAM) for spatial–semantic refinement, and Spatial Pyramid Pooling (SPP) for hemispherical multi-scale context aggregation within a fully fine-tuned EfficientNetB7 backbone, forming a domain-aware atmospheric representation framework. A large-scale dataset comprising 122,660 RGB images across 13 day–night sky-scene categories was curated, capturing diverse tropical atmospheric conditions including humidity, haze, illumination transitions, and sensor noise. Extensive experimental evaluations demonstrate that the EASMNet achieves 93% overall accuracy, outperforming representative convolutional (VGG16, ResNet50, DenseNet121) and transformer-based architectures (Swin Transformer, Vision Transformer). Ablation analyses confirm the complementary contributions of hierarchical attention and multi-scale aggregation, while class-wise evaluation yields F1-scores exceeding 0.95 for visually distinctive categories such as Day Humid, Night Clear Sky, and Night Noise. Residual errors are primarily confined to physically transitional and low-contrast atmospheric regimes. These results validate the EASMNet as a reliable, interpretable, and computationally feasible framework for real-time observatory dome automation, astronomical scheduling, and continuous atmospheric monitoring, and provide a scalable foundation for autonomous sky-observation systems deployable across diverse climatic regions. Full article
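
The attention pipeline summarized above (SE channel recalibration, CBAM-style spatial attention, and SPP multi-scale pooling on an EfficientNet backbone) can be sketched roughly as follows. This is a minimal PyTorch illustration of the general pattern only, not the published EASMNet code; the module sizes, pooling bins, and class names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import efficientnet_b7

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: re-weight channels from globally pooled statistics."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))              # squeeze: global average pool
        return x * w[:, :, None, None]               # excite: channel re-weighting

class SpatialGate(nn.Module):
    """CBAM-style spatial attention built from mean- and max-pooled channel maps."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)
    def forward(self, x):
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(s))

class SPP(nn.Module):
    """Spatial pyramid pooling over a few fixed grid sizes."""
    def __init__(self, bins=(1, 2, 4)):
        super().__init__()
        self.bins = bins
    def forward(self, x):
        return torch.cat([F.adaptive_avg_pool2d(x, b).flatten(1) for b in self.bins], dim=1)

class SkySceneNet(nn.Module):
    def __init__(self, num_classes=13):
        super().__init__()
        self.backbone = efficientnet_b7(weights=None).features   # 2560-channel feature map
        self.se, self.gate, self.spp = SEBlock(2560), SpatialGate(), SPP()
        self.head = nn.Linear(2560 * (1 + 4 + 16), num_classes)  # SPP bins 1+4+16 cells
    def forward(self, x):
        return self.head(self.spp(self.gate(self.se(self.backbone(x)))))

logits = SkySceneNet()(torch.randn(1, 3, 224, 224))   # -> shape (1, 13)
```
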

14 pages, 2196 KB  
Article
Toward Realistic Autonomous Driving Dataset Augmentation: A Real–Virtual Fusion Approach with Inconsistency Mitigation
by Sukwoo Jung, Myeongseop Kim, Jean Oh, Jonghwa Kim and Kyung-Taek Lee
Sensors 2026, 26(3), 987; https://doi.org/10.3390/s26030987 - 3 Feb 2026
Abstract
Autonomous driving systems rely on vast and diverse datasets for robust object recognition. However, acquiring real-world data, especially for rare and hazardous scenarios, is prohibitively expensive and risky. While purely synthetic data offers flexibility, it often suffers from a significant reality gap due to discrepancies in visual fidelity and physics. To address these challenges, this paper proposes a novel real–virtual fusion framework for efficiently generating highly realistic augmented image datasets for autonomous driving. Our methodology leverages real-world driving data from South Korea’s K-City, synchronizing it with a digital twin environment in Morai Sim (v24.R2) through a robust look-up table and fine-tuned localization approach. We then seamlessly inject diverse virtual objects (e.g., pedestrians, vehicles, traffic lights) into real image backgrounds. A critical contribution is our focus on inconsistency mitigation, employing advanced techniques such as illumination matching during virtual object injection to minimize visual discrepancies. We evaluate the proposed approach through experiments. Our results show that this real–virtual fusion strategy significantly bridges the reality gap, providing a cost-effective and safe solution for enriching autonomous driving datasets and improving the generalization capabilities of perception models. Full article
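
One simple way to realize the illumination matching mentioned above is to shift and scale the virtual object's lightness statistics toward those of the surrounding real background before compositing. The sketch below is an illustrative stand-in under that assumption, not the paper's actual injection pipeline; the function name and the LAB-space choice are ours.

```python
# Hedged sketch of a basic illumination-matching step for real-virtual fusion:
# align the virtual object's L-channel mean/std with the local real background.
import cv2
import numpy as np

def match_illumination(virtual_rgb, background_patch_rgb):
    """Return the virtual object re-lit to match the background patch.
    Both inputs are expected to be uint8 RGB arrays."""
    obj_lab = cv2.cvtColor(virtual_rgb, cv2.COLOR_RGB2LAB).astype(np.float32)
    bg_lab = cv2.cvtColor(background_patch_rgb, cv2.COLOR_RGB2LAB).astype(np.float32)
    l_obj, l_bg = obj_lab[..., 0], bg_lab[..., 0]
    # Match mean and standard deviation of the L (lightness) channel only.
    scale = (l_bg.std() + 1e-6) / (l_obj.std() + 1e-6)
    obj_lab[..., 0] = np.clip((l_obj - l_obj.mean()) * scale + l_bg.mean(), 0, 255)
    return cv2.cvtColor(obj_lab.astype(np.uint8), cv2.COLOR_LAB2RGB)

# Usage: blend the re-lit object into the real frame at the target location, e.g.
# frame[y0:y1, x0:x1] = np.where(mask[..., None] > 0, relit_obj, frame[y0:y1, x0:x1])
```
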

23 pages, 10699 KB  
Article
YOLOv11-IMP: Anchor-Free Multiscale Detection Model for Accurate Grape Yield Estimation in Precision Viticulture
by Shaoxiong Zheng, Xiaopei Yang, Peng Gao, Qingwen Guo, Jiahong Zhang, Shihong Chen and Yunchao Tang
Agronomy 2026, 16(3), 370; https://doi.org/10.3390/agronomy16030370 - 2 Feb 2026
Abstract
Estimating grape yields in viticulture is hindered by persistent challenges, including strong occlusion between grapes, irregular cluster morphologies, and fluctuating illumination throughout the growing season. This study introduces YOLOv11-IMP, an improved multiscale anchor-free detection framework extending YOLOv11, tailored to vineyard environments. Its architecture comprises five specialized components: (i) a viticulture-oriented backbone employing cross-stage partial fusion with depthwise convolutions for enriched feature extraction, (ii) a bifurcated neck enhanced by large-kernel attention to expand the receptive field coverage, (iii) a scale-adaptive anchor-free detection head for robust multiscale localization, (iv) a cross-modal processing module integrating visual features with auxiliary textual descriptors to enable fine-grained cluster-level yield estimation, and (v) an augmented spatial pyramid pooling module that aggregates contextual information across multiple scales. This work evaluated YOLOv11-IMP on five grape varieties collected under diverse environmental conditions. The framework achieved 94.3% precision and 93.5% recall for cluster detection, with a mean absolute error (MAE) of 0.46 kg per vine. The robustness tests found less than 3.4% variation in accuracy across lighting and weather conditions. These results demonstrate that YOLOv11-IMP can deliver high-fidelity, real-time yield data, supporting decision-making for precision viticulture and sustainable agricultural management. Full article
(This article belongs to the Special Issue Innovations in Agriculture for Sustainable Agro-Systems)
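
For context on the reported error metric (MAE of 0.46 kg per vine), the bookkeeping from per-cluster weight estimates to a per-vine MAE is straightforward; the sketch below uses made-up numbers and a hypothetical aggregation, not the authors' pipeline.

```python
# Illustrative bookkeeping only: sum per-cluster weight estimates to vine level
# and compute the mean absolute error (MAE) against ground-truth vine weights.
import numpy as np

def vine_yield_mae(cluster_weights_per_vine, true_vine_weights):
    predicted = np.array([sum(w) for w in cluster_weights_per_vine])  # kg per vine
    return float(np.mean(np.abs(predicted - np.asarray(true_vine_weights))))

# Hypothetical example: two vines, detected clusters with estimated weights in kg.
print(vine_yield_mae([[0.8, 1.1, 0.9], [1.2, 1.0]], [3.1, 2.0]))  # -> 0.25
```
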

45 pages, 5418 KB  
Review
Visual and Visual–Inertial SLAM for UGV Navigation in Unstructured Natural Environments: A Survey of Challenges and Deep Learning Advances
by Tiago Pereira, Carlos Viegas, Salviano Soares and Nuno Ferreira
Robotics 2026, 15(2), 35; https://doi.org/10.3390/robotics15020035 - 2 Feb 2026
Abstract
Localization and mapping remain critical challenges for Unmanned Ground Vehicles (UGVs) operating in unstructured natural environments, such as forests and agricultural fields. While Visual SLAM (VSLAM) and Visual–Inertial SLAM (VI-SLAM) have matured significantly in structured and urban scenarios, their extension to outdoor natural domains introduces severe challenges, including dynamic vegetation, illumination variations, a lack of distinctive features, and degraded GNSS availability. Recent advances in Deep Learning have brought promising developments to VSLAM- and VI-SLAM-based pipelines, ranging from learned feature extraction and matching to self-supervised monocular depth prediction and differentiable end-to-end SLAM frameworks. Furthermore, emerging methods for adaptive sensor fusion, leveraging attention mechanisms and reinforcement learning, open new opportunities to improve robustness by dynamically weighting the contributions of camera and IMU measurements. This review provides a comprehensive overview of Visual and Visual–Inertial SLAM for UGVs in unstructured environments, highlighting the challenges posed by natural contexts and the limitations of current pipelines. Classic VI-SLAM frameworks and recent Deep-Learning-based approaches were systematically reviewed. Special attention is given to field robotics applications in agriculture and forestry, where low-cost sensors and robustness against environmental variability are essential. Finally, open research directions are discussed, including self-supervised representation learning, adaptive sensor confidence models, and scalable low-cost alternatives. By identifying key gaps and opportunities, this work aims to guide future research toward resilient, adaptive, and economically viable VSLAM and VI-SLAM pipelines, tailored for UGV navigation in unstructured natural environments. Full article
(This article belongs to the Special Issue Localization and 3D Mapping of Intelligent Robotics)

20 pages, 1202 KB  
Article
Adaptive ORB Accelerator on FPGA: High Throughput, Power Consumption, and More Efficient Vision for UAVs
by Hussam Rostum and József Vásárhelyi
Signals 2026, 7(1), 13; https://doi.org/10.3390/signals7010013 - 2 Feb 2026
Abstract
Feature extraction and description are fundamental components of visual perception systems used in applications such as visual odometry, Simultaneous Localization and Mapping (SLAM), and autonomous navigation. In resource-constrained platforms, such as Unmanned Aerial Vehicles (UAVs), achieving real-time hardware acceleration on Field-Programmable Gate Arrays (FPGAs) is challenging. This work demonstrates an FPGA-based implementation of an adaptive ORB (Oriented FAST and Rotated BRIEF) feature extraction pipeline designed for high-throughput and energy-efficient embedded vision. The proposed architecture is a completely new design for the main algorithmic blocks of ORB, including the FAST (Features from Accelerated Segment Test) feature detector, Gaussian image filtering, moment computation, and descriptor generation. Adaptive mechanisms are introduced to dynamically adjust thresholds and filtering behavior, improving robustness under varying illumination conditions. The design is developed using a High-Level Synthesis (HLS) approach, where all processing modules are implemented as reusable hardware IP cores and integrated at the system level. The architecture is deployed and evaluated on two FPGA platforms, PYNQ-Z2 and KRIA KR260, and its performance is compared against CPU and GPU implementations using a dedicated C++ testbench based on OpenCV. Experimental results demonstrate significant improvements in throughput and energy efficiency while maintaining stable and scalable performance, making the proposed solution suitable for real-time embedded vision applications on UAVs and similar platforms. Notably, the FPGA implementation increases DSP utilization from 11% to 29% compared to the previous designs implemented by other researchers, effectively offloading computational tasks from general purpose logic (LUTs and FFs), reducing LUT usage by 6% and FF usage by 13%, while maintaining overall design stability, scalability, and acceptable thermal margins at 2.387 W. This work establishes a robust foundation for integrating the optimized ORB pipeline into larger drone systems and opens the door for future system-level enhancements. Full article
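
The adaptive thresholding idea described above can be illustrated behaviorally on a CPU with OpenCV, as below; the scaling rule and constants are assumptions, and this sketch does not model the paper's FPGA/HLS implementation.

```python
# Behavioral sketch of an adaptive FAST threshold (illustration only; the paper
# implements this logic as FPGA/HLS IP cores, which this CPU code does not model).
import cv2
import numpy as np

def detect_adaptive(gray, base_thresh=20, target_keypoints=500):
    # Scale the threshold with image contrast so detection stays stable when
    # illumination changes; the numbers here are assumed, not from the paper.
    contrast = float(gray.std())
    thresh = int(np.clip(base_thresh * contrast / 40.0, 5, 80))
    fast = cv2.FastFeatureDetector_create(threshold=thresh, nonmaxSuppression=True)
    keypoints = fast.detect(gray, None)
    # Keep only the strongest responses to bound the descriptor workload.
    keypoints = sorted(keypoints, key=lambda k: k.response, reverse=True)[:target_keypoints]
    return keypoints, thresh

# Usage: gray = cv2.cvtColor(cv2.imread("frame.png"), cv2.COLOR_BGR2GRAY)
```
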

20 pages, 1165 KB  
Article
Dynamic Attitude Estimation Method Based on LSTM-Enhanced Extended Kalman Filter
by Zhengnan Guo, Zhi Xiong, Ziyue Zhao, Haosen Han, Fan Wang, Shufang Jia and Zhongsheng Zhai
Appl. Sci. 2026, 16(3), 1466; https://doi.org/10.3390/app16031466 - 31 Jan 2026
Abstract
Visual–inertial attitude estimation systems often suffer from accuracy degradation and instability when visual measurements are intermittently lost due to occlusion or illumination changes. To address this issue, this paper proposes an LSTM-EKF framework for dynamic attitude estimation under visual information loss. In the proposed method, an LSTM-based vision prediction network is designed to learn the temporal evolution of visual attitude measurements and to provide reliable pseudo-observations when camera data are unavailable, thereby maintaining continuous EKF updates. The algorithm is validated through turntable experiments, including long-term reciprocating rotation tests, continuous visual occlusion scanning experiments, and attitude accuracy evaluation experiments over an extended angular range. Experimental results show that the proposed LSTM-EKF effectively suppresses IMU error accumulation during visual outages and achieves lower RMSE compared with conventional EKF and AKF methods. In particular, the LSTM-EKF maintains stable estimation performance under a certain degree of visual occlusion and extends the effective attitude measurement range beyond the camera’s observable limits. These results demonstrate that the proposed method improves robustness and accuracy of visual–inertial attitude estimation in environments with intermittent visual degradation. Full article
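
The core switch described above, substituting an LSTM pseudo-observation when the camera attitude drops out so that the EKF measurement update never stalls, can be sketched as follows. The matrices, noise settings, and the lstm_predict callable are placeholders, not the authors' formulation.

```python
# Minimal sketch (assumptions throughout) of the LSTM-EKF update switch: when
# vision is available, use the camera attitude; otherwise use an LSTM-predicted
# pseudo-observation with inflated measurement noise.
import numpy as np

def ekf_update(x, P, z, H, R):
    """Standard EKF measurement update for state x with covariance P."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

def fused_update(x, P, camera_attitude, lstm_predict, H, R_cam, R_pseudo):
    if camera_attitude is not None:                 # vision available
        return ekf_update(x, P, camera_attitude, H, R_cam)
    z_pseudo = lstm_predict()                       # pseudo-observation from the LSTM
    return ekf_update(x, P, z_pseudo, H, R_pseudo)  # larger R reflects lower trust
```
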
27 pages, 20812 KB  
Article
A Lightweight Radar–Camera Fusion Deep Learning Model for Human Activity Recognition
by Minkyung Jeon and Sungmin Woo
Sensors 2026, 26(3), 894; https://doi.org/10.3390/s26030894 - 29 Jan 2026
Abstract
Human activity recognition in privacy-sensitive indoor environments requires sensing modalities that remain robust under illumination variation and background clutter while preserving user anonymity. To this end, this study proposes a lightweight radar–camera fusion deep learning model that integrates motion signatures from FMCW radar with coarse spatial cues from ultra-low-resolution camera frames. The radar stream is processed as a Range–Doppler–Time cube, where each frame is flattened and sequentially encoded using a Transformer-based temporal model to capture fine-grained micro-Doppler patterns. The visual stream employs a privacy-preserving 4×5-pixel camera input, from which a temporal sequence of difference frames is extracted and modeled with a dedicated camera Transformer encoder. The two modality-specific feature vectors—each representing the temporal dynamics of motion—are concatenated and passed through a lightweight fully connected classifier to predict human activity categories. A multimodal dataset of synchronized radar cubes and ultra-low-resolution camera sequences across 15 activity classes was constructed for evaluation. Experimental results show that the proposed fusion model achieves 98.74% classification accuracy, significantly outperforming single-modality baselines (single-radar and single-camera). Despite its performance, the entire model requires only 11 million floating-point operations (11 MFLOPs), making it highly efficient for deployment on embedded or edge devices. Full article
(This article belongs to the Special Issue AI-Based Computer Vision Sensors & Systems—2nd Edition)
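
A rough PyTorch sketch of the two-stream design described above (separate Transformer encoders over flattened radar frames and 4×5-pixel difference frames, concatenated into a small classifier) is given below; all dimensions and layer counts are assumptions.

```python
import torch
import torch.nn as nn

class TwoStreamFusion(nn.Module):
    def __init__(self, radar_dim, cam_dim=20, d_model=64, num_classes=15):
        super().__init__()
        self.radar_proj = nn.Linear(radar_dim, d_model)    # flattened Range-Doppler frame
        self.cam_proj = nn.Linear(cam_dim, d_model)        # flattened 4x5 difference frame
        make_encoder = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
        self.radar_enc, self.cam_enc = make_encoder(), make_encoder()
        self.classifier = nn.Linear(2 * d_model, num_classes)

    def forward(self, radar_seq, cam_seq):                 # (B, T, radar_dim), (B, T, 20)
        r = self.radar_enc(self.radar_proj(radar_seq)).mean(dim=1)   # temporal pooling
        c = self.cam_enc(self.cam_proj(cam_seq)).mean(dim=1)
        return self.classifier(torch.cat([r, c], dim=-1))

model = TwoStreamFusion(radar_dim=32 * 16)   # e.g. 32 range x 16 Doppler bins (assumed)
logits = model(torch.randn(2, 30, 512), torch.randn(2, 30, 20))   # -> (2, 15)
```
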

26 pages, 2114 KB  
Article
Foreign Object Detection on Conveyor Belts in Coal Mines Based on RTA-YOLOv11
by Liwen Wang, Kehan Hu, Xiaonan Shi and Junhe Chen
Appl. Sci. 2026, 16(3), 1375; https://doi.org/10.3390/app16031375 - 29 Jan 2026
Abstract
To address the challenges of limited detection accuracy and the difficulty of deployment on edge devices caused by dust obstruction, low illumination, and complex background interference in coal mine conveyor belt foreign object detection, this paper proposes an improved algorithm model, RTA-YOLOv11, based on the YOLOv11 framework. First, a Receptive Field Enhancement Module (RFEM) is utilized to expand the field of view by fusing multi-scale perception paths, strengthening the network’s semantic capture capability for subtle targets. Second, a Triplet Attention mechanism is introduced to suppress environmental noise and enhance the saliency of low-contrast foreign objects through cross-dimensional joint modeling of spatial and channel information. Finally, a lightweight detection head based on MBConv is designed, utilizing inverted bottleneck structures and re-parameterization strategies to compress redundant parameters and improve deployment efficiency on edge devices. Experimental results indicate that the mAP@0.5 of the improved RTA-YOLOv11 model is 4.0 percentage points higher than that of the original YOLOv11, with an inference speed of 79 FPS and a reduction in parameters of approximately 22%. Compared with algorithms such as Faster R-CNN, SSD, and YOLOv8, this model demonstrates a superior balance between accuracy and speed, providing an efficient and practical solution for intelligent mine visual perception systems. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
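
As one illustration of the lightweight head mentioned above, a generic MBConv-style inverted bottleneck (expand, depthwise, project) looks roughly like the sketch below; this is a textbook block with assumed sizes, not the exact RTA-YOLOv11 head.

```python
import torch
import torch.nn as nn

class MBConv(nn.Module):
    """Generic inverted-bottleneck block: 1x1 expand -> 3x3 depthwise -> 1x1 project."""
    def __init__(self, channels, expand=4):
        super().__init__()
        hidden = channels * expand
        self.block = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),                           # expand
            nn.BatchNorm2d(hidden), nn.SiLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),   # depthwise
            nn.BatchNorm2d(hidden), nn.SiLU(),
            nn.Conv2d(hidden, channels, 1, bias=False),                           # project
            nn.BatchNorm2d(channels))
    def forward(self, x):
        return x + self.block(x)    # residual connection; in/out channel counts match

y = MBConv(64)(torch.randn(1, 64, 40, 40))   # shape preserved: (1, 64, 40, 40)
```
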
21 pages, 358 KB  
Article
Plato’s Tragicomic Ascent
by Louis A. Ruprecht
Religions 2026, 17(2), 156; https://doi.org/10.3390/rel17020156 - 29 Jan 2026
Abstract
This article explores the richly visual vocabulary characteristic of the Platonic corpus. Focusing on Plato’s linkage of seeing and knowing, it will explore a two-fold paradox: first, that the soul’s ascent is consistently depicted as a painful matter by Plato; and second, that it customarily involves some emphatically bodily mechanics. These textual and rhetorical details may, in their turn, call for a significant re-thinking of several truisms regarding Platonic spirituality and “Platonic love.” Four revisions follow. First, Platonic philosophy was not radically dualistic. Second, it was not aggressively rationalist, and secularist, informed by a blanket opposition to myth, to poetry, and to religious images. Third, it aspired to illumination without breezily claiming to bathe in that light. And fourth, it embraced and ennobled the ecstatic transport vouchsafed to embodied creatures by eros, that subtle species of desire that was, if not divine, then surely sublime. Full article
20 pages, 9487 KB  
Article
YOLO-DFBL: An Improved YOLOv11n-Based Method for Pressure-Relief Borehole Detection in Coal Mine Roadways
by Xiaofei An, Zhongbin Wang, Dong Wei, Jinheng Gu, Futao Li, Cong Zhang and Gangdong Xia
Machines 2026, 14(2), 150; https://doi.org/10.3390/machines14020150 - 29 Jan 2026
Abstract
Accurate detection of pressure-relief boreholes is crucial for evaluating drilling quality and monitoring safety in coal mine roadways. Nevertheless, the highly challenging underground environment—characterized by insufficient lighting, severe dust and water mist disturbances, and frequent occlusions—poses substantial difficulties for current object detection approaches, particularly in identifying small-scale and low-visibility targets. To effectively tackle these issues, a lightweight and robust detection framework, referred to as YOLO-DFBL, is developed using the YOLOv11n architecture. The proposed approach incorporates a DualConv-based lightweight convolution module to optimize the efficiency of feature extraction, a Frequency Spectrum Dynamic Aggregation (FSDA) module for noise-robust enhancement, and a Biformer (Bi-level Routing Transformer)-based routing attention mechanism for improved long-range dependency modeling. In addition, a Lightweight Shared Convolution Head (LSCH) is incorporated to effectively decrease the overall model complexity. Experimental results on a real coal mine roadway dataset demonstrate that YOLO-DFBL achieves an mAP@50:95 of 78.9%, with a compact model size of 1.94 M parameters, a computational complexity of 4.7 GFLOPs, and an inference speed of 157.3 FPS, demonstrating superior accuracy–efficiency trade-offs compared with representative lightweight YOLO variants and classical detectors. Field experiments under challenging low-illumination and occlusion environments confirm the robustness of the proposed approach in real mining scenarios. The developed method enables reliable visual perception for underground drilling equipment and facilitates safer and more intelligent operations in coal mine engineering. Full article
(This article belongs to the Section Robotics, Mechatronics and Intelligent Machines)
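
For readers unfamiliar with DualConv-style modules, one common formulation runs a 3×3 group convolution and a 1×1 pointwise convolution in parallel and sums them; the sketch below follows that formulation with assumed channel counts and may differ from the exact YOLO-DFBL module.

```python
import torch
import torch.nn as nn

class DualConv(nn.Module):
    """Lightweight dual-branch convolution: parallel 3x3 group conv and 1x1 conv, summed."""
    def __init__(self, in_ch, out_ch, groups=4):
        super().__init__()
        self.conv3x3 = nn.Conv2d(in_ch, out_ch, 3, padding=1, groups=groups, bias=False)
        self.conv1x1 = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn, self.act = nn.BatchNorm2d(out_ch), nn.SiLU()
    def forward(self, x):
        return self.act(self.bn(self.conv3x3(x) + self.conv1x1(x)))

y = DualConv(64, 128)(torch.randn(1, 64, 80, 80))   # -> (1, 128, 80, 80)
```
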

45 pages, 827 KB  
Article
Real-Time Visual Anomaly Detection in High-Speed Motorsport: An Entropy-Driven Hybrid Retrieval- and Cache-Augmented Architecture
by Rubén Juárez Cádiz and Fernando Rodríguez-Sela
J. Imaging 2026, 12(2), 60; https://doi.org/10.3390/jimaging12020060 - 28 Jan 2026
Abstract
At 300 km/h, an end-to-end vision delay of 100 ms corresponds to 8.3 m of unobserved travel; therefore, real-time anomaly monitoring must balance sensitivity with strict tail-latency constraints at the edge. We propose a hybrid cache–retrieval inference architecture for visual anomaly detection in high-speed motorsport that exploits lap-to-lap spatiotemporal redundancy while reserving local similarity retrieval for genuinely uncertain events. The system combines a hierarchical visual encoder (a lightweight backbone with selective refinement via a Nested U-Net for texture-level cues) and an uncertainty-driven router that selects between two memory pathways: (i) a static cache of precomputed scene embeddings for track/background context and (ii) local similarity retrieval over historical telemetry–vision patterns to ground ambiguous frames, improve interpretability, and stabilize decisions under high uncertainty. Routing is governed by an entropy signal computed from prediction and embedding uncertainty: low-entropy frames follow a cache-first path, whereas high-entropy frames trigger retrieval and refinement to preserve decision stability without sacrificing latency. On a high-fidelity closed-circuit benchmark with synchronized onboard video and telemetry and controlled anomaly injections (tire degradation, suspension chatter, and illumination shifts), the proposed approach reduces mean end-to-end latency to 21.7 ms versus 48.6 ms for a retrieval-only baseline (55.3% reduction) while achieving Macro-F1 = 0.89 at safety-oriented operating points. The framework is designed for passive monitoring and decision support, producing advisory outputs without actuating ECU control strategies. Full article
(This article belongs to the Special Issue AI-Driven Image and Video Understanding)
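
Two small illustrations of the quantities discussed above: the distance traveled during a vision delay (300 km/h for 100 ms is about 8.3 m), and an entropy-gated choice between the cache-first and retrieval paths. The threshold and probabilities below are assumptions, not the authors' configuration.

```python
import math

def blind_distance_m(speed_kmh, delay_ms):
    """Distance covered while the vision pipeline is still processing the frame."""
    return speed_kmh / 3.6 * delay_ms / 1000.0

print(blind_distance_m(300, 100))   # ~8.33 m, matching the figure in the abstract

def route(prob_dist, entropy_threshold=1.0):
    """Pick the memory pathway from the prediction entropy (threshold assumed)."""
    h = -sum(p * math.log(p + 1e-12) for p in prob_dist)
    return "cache" if h < entropy_threshold else "retrieval"

print(route([0.97, 0.01, 0.01, 0.01]))   # confident frame -> "cache"
print(route([0.25, 0.25, 0.25, 0.25]))   # uncertain frame -> "retrieval"
```
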
19 pages, 1710 KB  
Article
Bacterial Colony Counting and Classification System Based on Deep Learning Model
by Chuchart Pintavirooj, Manao Bunkum, Naphatsawan Vongmanee, Jindapa Nampeng and Sarinporn Visitsattapongse
Appl. Sci. 2026, 16(3), 1313; https://doi.org/10.3390/app16031313 - 28 Jan 2026
Abstract
Microbiological analysis is crucial for identifying species, assessing infections, and diagnosing infectious diseases, thereby supporting both research studies and medical diagnosis. In response to these needs, accurate and efficient identification of bacterial colonies is essential. Conventionally, this process is performed through manual counting and visual inspection of colonies on agar plates. However, this approach is prone to several limitations arising from human error and external factors such as lighting conditions, surface reflections, and image resolution. To overcome these limitations, an automated bacterial colony counting and classification system was developed by integrating a custom-designed imaging device with advanced deep learning models. The imaging device incorporates controlled illumination, matte-coated surfaces, and a high-resolution camera to minimize reflections and external noise, thereby ensuring consistent and reliable image acquisition. Image-processing algorithms implemented in MATLAB were employed to detect bacterial colonies, remove background artifacts, and generate cropped colony images for subsequent classification. A dataset comprising nine bacterial species was compiled and systematically evaluated using five deep learning architectures: ResNet-18, ResNet-50, Inception V3, GoogLeNet, and the state-of-the-art EfficientNet-B0. Experimental results demonstrated high colony-counting accuracy, with a mean accuracy of 90.79% ± 5.25% compared to manual counting. The coefficient of determination (R2 = 0.9083) indicated a strong correlation between automated and manual counting results. For colony classification, EfficientNet-B0 achieved the best performance, with an accuracy of 99.78% and a macro-F1 score of 0.99, demonstrating strong capability in distinguishing morphologically distinct colonies such as Serratia marcescens. Compared with previous studies, this research provides a time-efficient and scalable solution that balances high accuracy with computational efficiency. Overall, the findings highlight the potential of combining optimized imaging systems with modern lightweight deep learning models to advance microbiological diagnostics and improve routine laboratory workflows. Full article
(This article belongs to the Special Issue AI-Based Biomedical Signal and Image Processing)
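
As a minimal illustration of the classification stage, adapting a pretrained EfficientNet-B0 head to the nine species classes looks like the sketch below; the training procedure, preprocessing, and data handling of the actual study are not reproduced here.

```python
# Minimal sketch (not the authors' training code) of swapping the EfficientNet-B0
# classifier head for the nine bacterial species classes.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b0, EfficientNet_B0_Weights

model = efficientnet_b0(weights=EfficientNet_B0_Weights.IMAGENET1K_V1)  # ImageNet start
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 9)     # 9 species

# Cropped colony images (random tensors here as placeholders) would pass through
# the usual ImageNet-style preprocessing before classification.
logits = model(torch.randn(4, 3, 224, 224))   # -> shape (4, 9)
print(logits.argmax(dim=1))
```
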

16 pages, 3367 KB  
Article
Utilizing Multimodal Logic Fusion to Identify the Types of Food Waste Sources
by Dong-Ming Gao, Jia-Qi Song, Zong-Qiang Fu, Zhi Liu and Gang Li
Sensors 2026, 26(3), 851; https://doi.org/10.3390/s26030851 - 28 Jan 2026
Abstract
It is a challenge to identify food waste sources in all-weather industrial environments, as variable lighting conditions can compromise the effectiveness of visual recognition models. This study proposes and validates a robust, interpretable, and adaptive multimodal logic fusion method in which sensor dominance is dynamically assigned based on real-time illuminance intensity. The method comprises two foundational components: (1) a lightweight MobileNetV3 + EMA model for image recognition; and (2) an audio model employing Fast Fourier Transform (FFT) for feature extraction and Support Vector Machine (SVM) for classification. The key contribution of this system lies in its environment-aware conditional logic. The image model MobileNetV3 + EMA achieves an accuracy of 99.46% within the optimal brightness range (120–240 cd m−2), significantly outperforming the audio model. However, its performance degrades significantly outside the optimal range, while the audio model maintains an illumination-independent accuracy of 0.80, a recall of 0.78, and an F1 score of 0.80. When light intensity falls below the threshold of 84 cd m−2, the audio recognition results take precedence. This strategy ensures robust classification accuracy under variable environmental conditions, preventing model failure. Validated on an independent test set, the fusion method achieves an overall accuracy of 90.25%, providing an interpretable and resilient solution for real-world industrial deployment. Full article
(This article belongs to the Special Issue Multi-Sensor Data Fusion)
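
The environment-aware gating described above reduces to a simple rule: below the reported 84 cd/m² threshold the audio (FFT + SVM) prediction takes precedence, otherwise the image model's prediction is used. The class labels in the sketch below are placeholders, not the study's actual categories.

```python
# Sketch of the illuminance-gated fusion logic; model calls are placeholders.
ILLUMINANCE_THRESHOLD = 84.0   # cd/m^2, threshold reported in the abstract

def fuse(illuminance, image_prediction, audio_prediction):
    """Return the class label chosen by the illuminance-gated logic."""
    if illuminance < ILLUMINANCE_THRESHOLD:
        return audio_prediction    # low light: trust the audio (FFT + SVM) pipeline
    return image_prediction        # adequate light: trust MobileNetV3 + EMA

print(fuse(150.0, "source_A", "source_B"))   # -> "source_A"
print(fuse(40.0, "source_A", "source_B"))    # -> "source_B"
```
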

22 pages, 3743 KB  
Review
A Science Mapping Analysis of Computational Methods and Exploration of Electrical Transport Studies in Solar Cells
by Noor ul ain Ahmed, Patrizia Lamberti and Vincenzo Tucci
Materials 2026, 19(3), 452; https://doi.org/10.3390/ma19030452 - 23 Jan 2026
Abstract
This study investigates the state of the art related to the computational methods for solar cells. Numerical modeling is a basic pillar that is used to ensure the robust design of any device. In this paper, the results of a detailed science mapping-based analysis on the publications that focus on the “numerical modelling of solar cells” are presented. The query was conducted on the Web of Science for 2014–2024, and a subsequent filtering was performed. The results of this analysis provided the answers to the five research questions posed. The paper has been divided into two parts. In the first part, the literature search began with a broad examination, and 3259 studies were included in the analysis. To present the results in a visual form, graphs created using VOS viewer software have been used to identify the pattern of co-authorship, the geographical distribution of the authors, and the keywords most frequently used. In the second part, the analysis focused on three main aspects: (i) the influence of absorber layer thickness on optical absorption and device efficiency, (ii) the role of different ETL/HTL materials in charge transport, and (iii) the effect of illumination conditions on carrier dynamics and photovoltaic performance. By integrating the results across these dimensions, the study provides a comprehensive understanding of how these parameters collectively determine the efficiency and reliability of perovskite solar cells. Full article

36 pages, 4183 KB  
Article
Distinguishing a Drone from Birds Based on Trajectory Movement and Deep Learning
by Andrii Nesteruk, Valerii Nikitin, Yosyp Albrekht, Łukasz Ścisło, Damian Grela and Paweł Król
Sensors 2026, 26(3), 755; https://doi.org/10.3390/s26030755 - 23 Jan 2026
Abstract
Unmanned aerial vehicles (UAVs) increasingly share low-altitude airspace with birds, making early discrimination between drones and biological targets critical for safety and security. This work addresses long-range scenarios where objects occupy only a few pixels and appearance-based recognition becomes unreliable. We develop a model-driven simulation pipeline that generates synthetic data with a controlled camera model, atmospheric background and realistic motion of three aerial target types: multicopter, fixed-wing UAV and bird. From these sequences, each track is encoded as a time series of image-plane coordinates and apparent size, and a bidirectional long short-term memory (LSTM) network is trained to classify trajectories as drone-like or bird-like. The model learns characteristic differences in smoothness, turning behavior and velocity fluctuations, and achieves reliable separation between drone and bird motion patterns on synthetic test data. Motion-trajectory cues alone can support early discrimination of drones from birds when visual details are scarce, providing a complementary signal to conventional image-based detection. The proposed synthetic data and sequence classification pipeline forms a reproducible testbed that can be extended with real trajectories from radar or video tracking systems and used to prototype and benchmark trajectory-based recognizers for integrated surveillance solutions. The proposed method is designed to generalize naturally to real surveillance systems, as it relies on trajectory-level motion patterns rather than appearance-based features that are sensitive to sensor quality, illumination, or weather conditions. Full article
(This article belongs to the Section Industrial Sensors)
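
The trajectory classifier described above can be sketched roughly as below: each track is a sequence of image-plane coordinates and apparent size fed to a bidirectional LSTM whose final hidden states drive a small classifier. Hidden size, depth, and sequence length are assumptions.

```python
import torch
import torch.nn as nn

class TrajectoryClassifier(nn.Module):
    """Bidirectional LSTM over (x, y, apparent_size) samples, drone vs. bird output."""
    def __init__(self, in_features=3, hidden=64, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(in_features, hidden, num_layers=2,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)
    def forward(self, tracks):                 # (batch, time, 3)
        _, (h_n, _) = self.lstm(tracks)
        last = torch.cat([h_n[-2], h_n[-1]], dim=-1)   # last-layer forward + backward states
        return self.head(last)

logits = TrajectoryClassifier()(torch.randn(8, 120, 3))   # 120-step tracks -> (8, 2)
```
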
