Search Results (700)

Search Parameters:
Keywords = occlusion training

46 pages, 22936 KB  
Article
A 3D Gaussian Splatting Method with Deterministic Structure-Sensitive Adaptive Density Control for UAV Orthophoto Generation
by Ke Yan, Hui Wang, Zhuxin Li, Yuting Wang, Shuo Li and Liyong Wang
Remote Sens. 2026, 18(9), 1400; https://doi.org/10.3390/rs18091400 - 1 May 2026
Abstract
Unmanned Aerial Vehicle (UAV) orthophoto generation in complex environments remains challenging because weak textures, reflective surfaces, occlusions, and large scene extents can cause incomplete reconstruction, ghosting, and seam artifacts. Although 3D Gaussian Splatting (3DGS) offers an efficient explicit scene representation, its use in large-scale UAV orthophoto generation is limited by high memory consumption, unstable densification, and insufficient support for mapping-oriented orthographic rendering. This paper proposes a single-GPU 3DGS framework for UAV orthophoto generation by integrating adaptive spatial block partitioning, deterministic structure-sensitive adaptive density control, and core–buffer tiled orthographic rendering with weighted blending. The proposed framework decomposes large scenes into resource-bounded subregions, guides Gaussian densification using fixed multi-view neighborhoods and edge-enhanced dynamic consistency, and generates large-format orthophotos with reduced boundary and seam artifacts. Experiments on MatrixCity-S and multiple UAV photogrammetric datasets show that the method achieves competitive reconstruction quality and improved resource efficiency. On MatrixCity-S, it reaches 29.01 dB PSNR and 0.901 SSIM, while completing training in 1 h 49 min on a single NVIDIA RTX 3090 GPU. Compared with BlockGS, peak VRAM consumption is reduced by more than 38% across datasets. Under geo-aligned comparison conditions, line-measurement comparisons with MetaShape and Pix4DMapper yield RMSE values of 0.099 m and 0.087 m, respectively. These results demonstrate the potential of the proposed framework for memory-efficient 3DGS-based UAV orthophoto generation under constrained hardware resources, while further control-point-based validation is still needed for rigorous surveying-grade applications. Full article
(This article belongs to the Special Issue 3D Scene Perception and Reconstruction of Remote Sensing Imagery)
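The core–buffer tiled orthographic rendering with weighted blending mentioned in this abstract can be illustrated with a minimal sketch. The weighting scheme below (weights that decay toward each tile's edge, normalized by the accumulated weight sum) is an assumption for illustration; the abstract does not specify the paper's exact blending function:

```python
import numpy as np

def edge_falloff_weight(h, w):
    """Weight map that peaks at the tile center and decays toward the edges,
    so overlapping buffer regions blend smoothly across seams."""
    y = np.minimum(np.arange(h), np.arange(h)[::-1]) + 1
    x = np.minimum(np.arange(w), np.arange(w)[::-1]) + 1
    return np.outer(y, x).astype(float)

def blend_tiles(tiles, origins, canvas_shape):
    """Accumulate weighted tile renders, then normalize by the weight sum."""
    acc = np.zeros(canvas_shape)
    wsum = np.zeros(canvas_shape)
    for tile, (r, c) in zip(tiles, origins):
        h, w = tile.shape
        wmap = edge_falloff_weight(h, w)
        acc[r:r + h, c:c + w] += wmap * tile
        wsum[r:r + h, c:c + w] += wmap
    return acc / np.maximum(wsum, 1e-12)

# Two 4x4 tiles overlapping by two columns; constant tile values
# interpolate smoothly inside the overlap instead of forming a hard seam.
t1 = np.full((4, 4), 1.0)
t2 = np.full((4, 4), 3.0)
canvas = blend_tiles([t1, t2], [(0, 0), (0, 2)], (4, 6))
```

In the overlap columns the output lies strictly between the two tile values, which is what suppresses the boundary and seam artifacts the abstract refers to.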
29 pages, 4742 KB  
Article
DistSense: A Distributed P2P System for Privacy-Preserving and Robust Audiovisual Activity Recognition in Smart Homes
by José Manuel Torres, Luis P. Mota, Rui S. Moreira, Christophe Soares and Pedro Sobral
Appl. Sci. 2026, 16(9), 4407; https://doi.org/10.3390/app16094407 - 30 Apr 2026
Abstract
Ambient Assisted Living (AAL) systems have become increasingly relevant as aging populations intensify the demand for technologies that promote autonomy, safety, and quality of life. However, the widespread adoption of audiovisual sensing in smart homes raises critical concerns regarding data protection, privacy, and user trust. Ensuring secure processing while maintaining accurate activity recognition remains a key challenge. This work introduces DistSense, a distributed Peer-to-Peer (P2P) system designed to enhance activity detection in domestic environments through collaborative inference among intelligent audiovisual sensors. DistSense prioritizes privacy by performing local processing, sharing only high-level events, and leveraging distributed ledger mechanisms to ensure data integrity and auditability and support cross-device validation. This collaborative strategy reduces false positives caused by occlusions, illumination variability, and acoustic noise. To assess the system, functional tests were conducted for each module, followed by two use cases evaluated in both simulated and real edge hardware environments. The trained models achieved 88% accuracy for audio and 80% for video, and the system demonstrated effective performance in detecting daily activities and domestic hazards under varying noise conditions. Results indicate that DistSense successfully balances security, user acceptance, and inference robustness, positioning it as a viable solution for privacy-preserving activity monitoring in smart home contexts. Full article
44 pages, 2726 KB  
Article
A Tiny Vision-Based Model for Real-Time Student Attention Detection in Online Classes
by Chaymae Yahyati, Ismail Lamaakal, Yassine Maleh, Khalid El Makkaoui and Ibrahim Ouahbi
Mach. Learn. Knowl. Extr. 2026, 8(5), 116; https://doi.org/10.3390/make8050116 - 28 Apr 2026
Abstract
Online and blended classrooms widen access but remove the in-person cues instructors use to gauge attention. Prior work typically relies on heavy, cloud-bound or multimodal models that are hard to deploy on commodity laptops, treats attention as an unordered label without calibrated probabilities, and evaluates on subject-overlapping splits with limited robustness analysis. This creates a gap in tiny, deployable, calibration-aware methods validated under realistic protocols. We address this gap with a TinyML, vision-only pipeline that estimates four attention levels (Very Low, Low, High, Very High) from short webcam clips under strict on-device budgets. Each clip of T=30 frames at 224×224 is processed by a compact hybrid encoder: a CNN extracts per-frame spatial features, a BiLSTM models temporal context, and a lightweight GRU refines dynamics; three parallel branches with staggered widths encourage feature diversity before fusion. We apply structured pruning of convolutional channels and recurrent units, post-training INT8 quantization, and temperature scaling for calibrated probabilities; models are exported as ONNX. On DAiSEE with subject-independent splits, the baseline attains 99.86% accuracy and 0.998 macro-F1, with strong ordinal agreement (QWK = 0.998, ordinal MAE = 0.03). The compressed model preserves reliability (macro-F1 = 0.995, QWK = 0.995), remains robust to low light, partial occlusion, and head yaw, and yields ∼4× smaller size and ∼2.3× CPU speedups. These results indicate a deployable, privacy-preserving approach to fine-grained, on-device attention analytics. Full article
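Temperature scaling, the calibration step named in this abstract, divides logits by a scalar temperature T fitted on held-out data; T greater than 1 softens overconfident probabilities without changing the predicted class. A minimal sketch (T = 2.0 is an arbitrary illustrative value, not the paper's fitted temperature):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def temperature_scale(logits, T):
    """Calibrated probabilities: divide logits by a temperature T fitted
    on a validation set; the argmax (predicted level) is unchanged."""
    return softmax(logits / T)

# One clip, four attention levels (Very Low, Low, High, Very High).
logits = np.array([[4.0, 1.0, 0.5, 0.0]])
raw = softmax(logits)
cal = temperature_scale(logits, T=2.0)
```

The calibrated top-class probability is lower than the raw one, which is the intended effect when a model is systematically overconfident.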
19 pages, 1314 KB  
Review
Blood Flow Restriction in Athletic Populations—Part 2: Applications in Resistance Training Across the Loading Spectrum
by Chris Gaviglio, Christian J. Cook and Stephen P. Bird
J. Funct. Morphol. Kinesiol. 2026, 11(2), 176; https://doi.org/10.3390/jfmk11020176 - 27 Apr 2026
Abstract
Background: Blood flow restriction (BFR) resistance exercise has emerged as a training methodology capable of inducing muscular adaptations comparable to traditional high-load training despite substantially lower mechanical loads. While low-load BFR protocols (20–50% 1RM) are well-established, emerging evidence supports applications across the full loading spectrum, including moderate-to-high loads (>50–90% 1RM), contralateral training effects, and proximal–distal adaptations. In this second installment of the Blood Flow Restriction in Athletic Populations series, we review current evidence on BFR resistance exercise in athletic populations, with emphasis on morphological, neuromuscular, and functional adaptations across diverse application contexts. Methods: A narrative review of research examining BFR resistance exercise in trained and athletic populations was conducted via a PubMed/MEDLINE search. Search terms: (“blood flow restriction” OR “BFR” OR “occlusion training” OR “KAATSU”) AND (“resistance training” OR “resistance exercise” OR “strength training”) AND (“athletes” OR “athletic” OR “trained” OR “elite” OR “sport”) AND (“cross-education” OR “contralateral” OR “cross transfer” OR “proximal” OR “distal”). Studies investigating low-load (20–50% 1RM) and moderate-to-high load (>50% 1RM) protocols, contralateral cross-education effects, and proximal–distal adaptations were evaluated. Primary outcomes included muscle hypertrophy, strength, power, and sport-specific performance measures. Results: Low-load BFR resistance exercise has been shown to produce significant improvements in muscle hypertrophy and strength gains over 4–12 week interventions compared to low-load control conditions. Moderate-to-high load BFR enhanced barbell velocity and power output, particularly at loads > 80% 1RM with intermittent inflation protocols. Contralateral and cross-transfer effects of BFR training demonstrate variable efficacy across muscle groups, with the most consistent evidence supporting cross-transfer enhancement of training adaptations when BFR is applied to one body region while exercising another. Proximal BFR application induced adaptations in both proximal and distal musculature, suggesting systemic mechanisms beyond local vascular restriction. Conclusions: BFR resistance exercise represents a versatile training modality producing meaningful morphological and neuromuscular adaptations across the loading spectrum. Contralateral and proximal–distal effects expand practical applications for injury rehabilitation and targeted adaptation. These findings support BFR integration within periodized training programs when mechanical load management is prioritized. Full article
13 pages, 1341 KB  
Review
Blood Flow Restriction in Athletic Populations—Part 1: Safety Considerations, and Methodological Frameworks
by Chris Gaviglio, Christian J. Cook and Stephen P. Bird
J. Funct. Morphol. Kinesiol. 2026, 11(2), 175; https://doi.org/10.3390/jfmk11020175 - 27 Apr 2026
Abstract
Background: Blood flow restriction (BFR) training induces morphological and neuromuscular adaptations using low-intensity exercise (20–40% 1RM), offering a reduced mechanical load alternative to traditional high-load resistance training. Safe and effective implementation, however, requires a clear understanding of physiological mechanisms, contraindications, and pressure determination methodologies. In this three-part series, we provide a comprehensive review of BFR for athletic populations and provide strength and conditioning coaches with a structured framework for screening, safety, and methodological considerations to support BFR integration in high-performance settings. Methods: A narrative review of the literature examining BFR safety, contraindication screening, adverse event reporting, and occlusion pressure determination was conducted using a PubMed and MEDLINE search. Search terms included combinations of (“blood flow restriction” OR “BFR” OR “occlusion training” OR “KAATSU”) AND (“safety” OR “contraindications” OR “risk stratification”) AND (“arterial occlusion pressure” OR “limb occlusion pressure” OR “occlusion pressure” OR “Doppler” OR “handheld Doppler” OR “pulse oximetry” OR “cuff width” OR “capillary refill time” OR “monitoring”). Studies examining contraindication screening systems, arterial occlusion pressure calculation methods, and real-time monitoring protocols were evaluated. Primary considerations included risk stratification frameworks, pressure determination accuracy, and control parameter validation for ensuring vascular safety during application. Results: Risk stratification systems can effectively identify absolute and relative contraindications requiring medical clearance prior to BFR use. Epidemiological data indicate that most adverse events are transient and non-serious, while serious events appear rare when evidence-informed protocols are applied. Doppler-based assessment remains a criterion approach for determining inflation pressure, although validated estimation methods using limb circumference and systolic blood pressure offer a pragmatic and comparable alternative for applied environments. Inflation pressures of 50–80% arterial occlusion, adjusted for cuff width, produce an effective and safe stimulus. Real-time monitoring through capillary refill time, pulse strength palpation, and skin coloration can support iterative pressure optimization and help identify excessive restriction pressures. Conclusions: BFR implementation in athletic populations requires systematic screening protocols, individualized inflation pressure determination using validated methods, and real-time monitoring parameters. These foundations provide the essential safety infrastructure required before progressing to specific training applications across resistance, cardiovascular, and other performance and rehabilitation modalities. Full article
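The 50–80% arterial occlusion pressure prescription discussed above reduces to simple arithmetic once AOP has been measured. A hedged sketch (the function and parameter names are hypothetical; the validated limb-circumference and systolic-blood-pressure estimation equations the review cites are not reproduced here):

```python
def target_cuff_pressure(aop_mmhg, fraction):
    """Cuff inflation pressure as a fraction of the individually measured
    arterial occlusion pressure (AOP); the review describes 50-80% AOP
    as the effective and safe range, adjusted for cuff width."""
    if not 0.5 <= fraction <= 0.8:
        raise ValueError("fraction outside the 50-80% AOP range")
    return aop_mmhg * fraction

# Example: Doppler-measured AOP of 180 mmHg with a 60% prescription.
pressure = target_cuff_pressure(180.0, 0.60)
```

The point of anchoring to measured AOP rather than a fixed pressure is individualization: the same 60% prescription yields different absolute pressures for different athletes and cuff widths.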
30 pages, 10578 KB  
Article
IMAU-Net: A Hybrid Multi-Scale Deep Learning Framework for Liver Segmentation from Laparoscopic Images
by Syeda Sitara Waseem, Sarang Shaikh and Syed Rizwan Hassan
Sensors 2026, 26(9), 2695; https://doi.org/10.3390/s26092695 - 27 Apr 2026
Abstract
Accurate liver segmentation in laparoscopic surgery is critical but remains challenging due to low contrast, occlusion, and irregular organ boundaries. While deep learning has advanced medical image segmentation, existing models often trade off between accuracy, computational efficiency, and boundary precision. We propose IMAU-Net, a hybrid architecture integrating a pre-trained InceptionV3 encoder with a novel bottleneck combining Multi-Core Pooling (MCP) and enhanced Atrous Spatial Pyramid Pooling (ASPP). The MCP module captures fine-to-medium spatial details through parallel multi-kernel pooling, while ASPP extracts multi-scale contextual information via dilated convolutions. Evaluated on the M2CAI dataset with 5-fold cross-validation, IMAU-Net achieves a mean Dice coefficient of 0.9179 ± 0.012 and IoU of 0.8483 ± 0.015. Furthermore, external validation on the independent CholecSeg8K dataset (250 test samples) demonstrates generalizability across different laparoscopic procedures, achieving a Dice coefficient of 0.8745 ± 0.0312 and AUC of 0.9542, with a performance degradation of only 4.3% despite domain shift between liver surgery and cholecystectomy. Comparative analysis with state-of-the-art methods demonstrates superior performance, with computational efficiency suitable for real-time applications (45 FPS, 42.3 M parameters). The proposed architecture provides an optimal balance between accuracy and efficiency for intraoperative guidance systems. While evaluated on retrospective laparoscopic image datasets rather than real-time intraoperative workflows, the model demonstrates potential for integration into surgical guidance systems pending prospective validation. Full article
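The Dice coefficient and IoU reported above are standard overlap metrics for binary masks: Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B|. A minimal sketch:

```python
import numpy as np

def dice_iou(pred, truth):
    """Overlap metrics for binary segmentation masks.
    Dice = 2|A∩B| / (|A|+|B|); IoU = |A∩B| / |A∪B|."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    dice = 2.0 * inter / (pred.sum() + truth.sum())
    return dice, inter / union

# Tiny illustrative masks (not from the paper's datasets).
pred = np.array([[1, 1, 0], [1, 0, 0]])
truth = np.array([[1, 1, 0], [0, 1, 0]])
dice, iou = dice_iou(pred, truth)
```

The two metrics are monotonically related (Dice = 2·IoU / (1 + IoU)), which is why Dice scores always read higher than IoU on the same prediction, as in the 0.9179 vs. 0.8483 pair above.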
31 pages, 6114 KB  
Article
A Multi-Stage YOLOv11-Based Deep Learning Framework for Robust Instance Segmentation and Material Quantification of Mixed Plastic Waste
by Andrew N. Shafik, Mohamed H. Khafagy, Alber S. Aziz and Shereen A. Hussein
Computers 2026, 15(5), 271; https://doi.org/10.3390/computers15050271 - 24 Apr 2026
Abstract
Instance segmentation in heterogeneous waste scenes remains challenging due to object variability, deformable shapes, partial occlusion, and large appearance differences across packaging types. This study presents a YOLOv11-based deep learning framework for mixed plastic waste instance segmentation, developed to connect visual perception with reliable material quantification. The framework integrates curated instance-level annotations, strict split isolation, multi-stage optimization, training strategy ablation, and seed-robustness analysis to support reproducible model selection. Experimental results on a held-out test set show that the optimized model achieves a mask mAP@50:95 of 0.9337, indicating strong segmentation performance under heterogeneous waste-scene conditions. To extend the analysis beyond standard vision metrics, the framework incorporates a physics-informed mask-to-mass module that converts predicted masks into class-specific mass estimates using geometric calibration and material priors. Applied to a representative stream of 1253 detected objects, the system estimated a total plastic mass of 15.48 ± 1.08 kg, corresponding to a theoretical H2 potential of 0.41 ± 0.04 kg and a greenhouse-gas avoidance of 34.57 ± 4.15 kg CO2e. Overall, the proposed framework extends waste-scene understanding beyond vision-level assessment toward physically grounded, data-driven decision support for smart material recovery systems. Full article
(This article belongs to the Special Issue Machine Learning: Innovation, Implementation, and Impact)
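The physics-informed mask-to-mass module described above converts a predicted mask's pixel area into a mass estimate via geometric calibration and per-class material priors. A sketch of that conversion chain (all numbers below are illustrative assumptions, not the paper's calibration; PET density of about 1.38 g/cm³ is a standard material prior):

```python
def mask_to_mass_kg(area_px, mm_per_px, thickness_mm, density_g_cm3):
    """Pixel area -> physical area (geometric calibration) -> volume
    (assumed material thickness) -> mass (class-specific density prior)."""
    area_mm2 = area_px * mm_per_px ** 2
    volume_cm3 = area_mm2 * thickness_mm / 1000.0  # mm^3 -> cm^3
    return volume_cm3 * density_g_cm3 / 1000.0     # g -> kg

# Hypothetical PET fragment: 50,000 mask pixels at 0.5 mm/px,
# 0.3 mm wall thickness, density ~1.38 g/cm^3.
mass = mask_to_mass_kg(50_000, 0.5, 0.3, 1.38)
```

Summing such per-object estimates over a detected stream is what yields stream-level totals like the 15.48 ± 1.08 kg figure quoted in the abstract, with the uncertainty driven by the calibration and prior assumptions.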
30 pages, 1431 KB  
Article
Feasibility Analysis of Static-Image-Based Traffic Accident Detection Under Domain Shift for Edge-AI Surveillance Systems
by Chien-Chung Wu and Wei-Cheng Chen
Electronics 2026, 15(9), 1803; https://doi.org/10.3390/electronics15091803 - 23 Apr 2026
Abstract
Traffic accident detection is a critical component of intelligent transportation systems (ITS), enabling timely incident response and traffic management. While most existing approaches rely on temporal information from video sequences, such methods are not always applicable in resource-constrained surveillance environments. This study investigates the feasibility of detecting traffic accidents from single static images by formulating the task as a binary classification problem. Representative architectures, including Vision Transformer (ViT), Swin Transformer, and ResNet-50, are systematically evaluated on the Car Crash Dataset (CCD) under multiple training configurations. To assess generalization capability, cross-domain evaluation is conducted using an external crash video dataset (ECVD) constructed to approximate real-world deployment conditions. Experimental results show that all models achieve strong performance under in-domain evaluation. However, cross-domain testing reveals substantial performance degradation, particularly in recall, indicating limited generalization capability under domain shift. Qualitative analysis further shows that missed detections are associated with weak visual cues, occlusion, and complex traffic environments, while false positives are caused by visually ambiguous patterns resembling accident scenarios. Unlike prior studies that primarily report performance improvements, this work provides empirical evidence that model behavior in static-image-based accident detection is governed by dataset composition rather than architectural design. Therefore, static-image-based accident detection should be interpreted as a coarse-level screening tool rather than a fully reliable decision-making system. This study highlights the importance of data-centric design and cross-domain evaluation for improving real-world applicability. Full article
(This article belongs to the Section Computer Science & Engineering)
18 pages, 1019 KB  
Article
Pose-Driven Cow Behavior Recognition in Complex Barn Environments: A Method Combining Knowledge Distillation and Deployment Optimization
by Jie Hu, Xuan Li, Ruyue Ren, Shujie Wang, Mingkai Yang, Jianing Zhao, Juan Liu and Fuzhong Li
Animals 2026, 16(9), 1301; https://doi.org/10.3390/ani16091301 - 23 Apr 2026
Abstract
Cattle behavior constitutes important phenotypic information reflecting animals’ health status, activity level, and welfare condition, and is therefore of considerable significance for automated monitoring and precision management in smart livestock farming. However, under complex barn conditions, cattle behavior recognition is easily affected by factors such as illumination variation, partial occlusion, background interference, and individual differences, thereby reducing recognition stability and generalization capability. To address these challenges, this study proposes a pose-driven method for cattle behavior recognition in complex barn environments. First, a 16-keypoint annotation scheme suitable for describing bovine posture, termed cow16, was constructed. Based on this scheme, OpenPose was employed to extract heatmaps (HMs) and part affinity fields (PAFs), which were then used to build an intermediate HM/PAF posture representation. Subsequently, this representation was taken as the input to a lightweight convolutional neural network for classifying three behavioral categories: stand, walk, and lying. On this basis, class-imbalance correction during training and a multi-random-seed logits ensemble strategy during inference were further introduced. In addition, knowledge distillation was adopted to transfer knowledge from a high-performance teacher model to a lightweight student model. Experimental results demonstrate that training-stage class-imbalance correction and inference-stage multi-random-seed logits ensembling exhibit strong complementarity; when combined, the AB configuration improves the test-set Macro-F1 by 3.83 percentage points. Moreover, the distilled student model still achieves competitive recognition performance while maintaining 1× inference cost, indicating a favorable trade-off between accuracy and efficiency. This study provides a useful reference for deployment-oriented cattle behavior recognition in smart farming scenarios and offers a lightweight technical basis for subsequent practical applications. Full article
(This article belongs to the Section Cattle)
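The multi-random-seed logits ensemble used at inference in the work above reduces to averaging logits across seed-specific models before the argmax. A minimal sketch with made-up logits (the number of seeds and values are illustrative, not the paper's):

```python
import numpy as np

def ensemble_logits(per_seed_logits):
    """Average logits from models trained with different random seeds;
    the class decision is then taken on the averaged logits."""
    return np.mean(per_seed_logits, axis=0)

classes = ["stand", "walk", "lying"]
# Three seeds, one sample: two seeds favor "walk", one favors "stand".
seeds = np.array([
    [[2.0, 2.5, 0.1]],
    [[1.0, 3.0, 0.2]],
    [[3.0, 1.0, 0.3]],
])
avg = ensemble_logits(seeds)
pred = classes[int(np.argmax(avg[0]))]
```

Averaging logits (rather than hard votes) lets a confident minority model shift the decision smoothly, which is one reason such ensembles complement training-stage class-imbalance correction.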
22 pages, 3855 KB  
Article
Application of Improved Genetic Algorithm Based on Voronoi Partitioning in Pseudolite Deployment for Tunnel Positioning Systems
by Kun Xie, Chenglin Cai, Zhouwang Yang and Jundao Pan
Sensors 2026, 26(9), 2596; https://doi.org/10.3390/s26092596 - 23 Apr 2026
Abstract
Reliable high-precision positioning in railway tunnels is essential for intelligent train operation and safety monitoring, yet GNSS signals are severely degraded by blockage and multipath. This paper proposes a deployment-oriented numerical framework to optimize pseudolite layouts in tunnels by explicitly modeling visibility obstruction and controlling worst-case geometry along the train trajectory. A high-fidelity 3D tunnel–train model is established, in which line-of-sight (LoS) availability is screened under vehicle occlusion and trajectory-level geometric quality is evaluated accordingly. Instead of optimizing only the average PDOP, the proposed framework minimizes the trajectory 90th-percentile PDOP (qPDOP) to suppress tail-risk geometric degradation, while interpreting PDOP as an error amplification factor that directly affects positioning reliability under measurement noise and local multipath. The core contribution is a Voronoi-partition-constrained improved genetic algorithm (IGA) for tunnel pseudolite deployment. Voronoi partitioning enforces segment-wise coverage by requiring at least one pseudolite in each partition cell and avoids clustering-induced blind zones. Meanwhile, the IGA incorporates improved search and constraint-handling mechanisms to satisfy practical engineering requirements, including feasible installation regions, minimum spacing, mounting-face balance (ceiling/side walls), communication range, and continuous satellite visibility. Comparative simulations and ablation studies demonstrate that the proposed method achieves more uniform coverage and significantly improves full-trajectory geometric stability, reducing high-quantile PDOP and mitigating local spikes in occlusion-sensitive sections under cost-constrained sparse deployments. The proposed framework provides a practical and flexible toolchain for designing positioning-oriented pseudolite infrastructures in underground transportation environments. Full article
(This article belongs to the Section Navigation and Positioning)
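The trajectory 90th-percentile PDOP (qPDOP) objective above combines the standard PDOP formula with a percentile over trajectory epochs. A sketch under simplified assumptions (random unit line-of-sight geometries stand in for the paper's occlusion-screened LoS sets; six visible pseudolites per epoch is an arbitrary choice):

```python
import numpy as np

def pdop(unit_los):
    """PDOP from unit line-of-sight vectors to the visible pseudolites:
    G = [u | 1], Q = (G^T G)^-1, PDOP = sqrt of the position-block trace."""
    G = np.hstack([unit_los, np.ones((len(unit_los), 1))])
    Q = np.linalg.inv(G.T @ G)
    return float(np.sqrt(np.trace(Q[:3, :3])))

def qpdop(trajectory_pdops, q=90):
    """Tail-risk objective: the q-th percentile of PDOP along the trajectory,
    rather than the mean, to suppress worst-case geometric degradation."""
    return float(np.percentile(trajectory_pdops, q))

rng = np.random.default_rng(0)
samples = []
for _ in range(50):                              # 50 trajectory epochs
    u = rng.normal(size=(6, 3))                  # 6 visible pseudolites
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    samples.append(pdop(u))
val = qpdop(samples)
```

Minimizing the 90th percentile instead of the mean targets exactly the occlusion-sensitive sections where a few epochs with poor geometry would otherwise dominate positioning error.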
18 pages, 24765 KB  
Article
Field-Transformation-Based Light-Field Hologram Generation from a Single RGB Image
by Xiaoming Chen, Xiaoyu Jiang, Yingqing Huang, Xi Wang and Chaoqun Ma
Photonics 2026, 13(5), 407; https://doi.org/10.3390/photonics13050407 - 22 Apr 2026
Abstract
We propose a field-transformation-based framework for generating phase-only light-field holograms from a single RGB image. The method establishes an explicit pipeline from monocular scene inference to holographic wavefront synthesis, without requiring multi-view capture or task-specific hologram-network training. First, we construct a layered occlusion RGB-D model from the input image using monocular depth estimation, connectivity-based layer decomposition, and occlusion-aware inpainting, which provides a lightweight 3D prior for sparse-view rendering in the small-parallax regime. Second, we transform the rendered sparse RGB-D light field into a target complex wavefront on the recording plane through local frequency mapping, thereby bridging explicit scene geometry and wave-optical field construction. Third, we optimize the phase-only hologram under multi-plane amplitude constraints using a geometrically consistent initial phase and an error-driven adaptive depth-sampling strategy, which improves convergence stability and reconstruction quality under a limited computational budget. Numerical experiments show that the proposed method achieves better depth continuity, occlusion fidelity, and lower speckle noise than representative layer-based and point-based methods, and improves the average PSNR and SSIM by approximately 3 dB and 0.15, respectively, over Hogel-Free Holography. Optical experiments further confirm the physical feasibility and robustness of the proposed framework. Full article
22 pages, 481 KB  
Article
PrivAgriVolt: Privacy-Preserving Shadow-Aware Vision for Crop Stress Diagnosis in Agrivoltaic Photovoltaic Systems
by Zuoming Yin, Yifei Zhang, Qiangqiang Lei and Fang Feng
Electronics 2026, 15(8), 1762; https://doi.org/10.3390/electronics15081762 - 21 Apr 2026
Abstract
Agrivoltaic systems co-locate photovoltaic (PV) arrays and crops, offering land-use efficiency and potential microclimate benefits, yet they introduce new challenges for computer-vision-based crop monitoring. PV structures produce strong, spatially varying shadows, specular reflections, and periodic occlusions that confound visual cues for diagnosing crop diseases and abiotic stresses. Meanwhile, agrivoltaic deployments are often distributed across farms and operators, making centralized data collection impractical due to privacy, ownership, and regulatory concerns. This paper proposes PrivAgriVolt, a novel privacy-preserving learning framework for agrivoltaic crop issue recognition that explicitly models PV-induced illumination and enables collaborative training without sharing raw images. The core algorithm integrates (i) a PV-geometry-conditioned shadow normalization module that fuses estimated array layout and sun-angle priors into a shadow-aware appearance canonization network, reducing illumination-induced domain shift across times and sites; (ii) a federated contrastive stress learner that aligns stress semantics across farms via prototype-based contrastive objectives while remaining robust to heterogeneous sensors and crop stages; and (iii) an adaptive privacy layer that combines secure aggregation with budget-aware gradient perturbation and client-level clipping to provide formal privacy guarantees while preserving fine-grained diagnostic performance. Extensive experiments on real agricultural vision benchmarks and agrivoltaic shadow variants demonstrate that PrivAgriVolt improves stress recognition and segmentation under PV shading while maintaining strong privacy–utility trade-offs. Full article
(This article belongs to the Special Issue Deep/Machine Learning in Visual Recognition and Anomaly Detection)

17 pages, 1650 KB  
Article
Safe Fall: Use of Predictive Modeling and Machine Vision Techniques for Fall Analysis and Fall Quality
by O. DelCastillo-Andrés, R. Fernández-García, J. C. Pastor-Vicedo, M. A. Lira, M. C. Campos-Mesa, C. Castañeda-Vázquez, E. Genovesi, S. Krstulović, G. Kuvačić, K. Morvay-Sey and R. Sánchez-Reolid
Sensors 2026, 26(8), 2491; https://doi.org/10.3390/s26082491 - 17 Apr 2026
Abstract
Falls are a leading cause of paediatric injuries, yet school-based prevention relies heavily on subjective observation rather than objective biomechanical assessment. This paper introduces the Safe Fall framework, integrating a judo-inspired educational programme with an occlusion-robust computer vision pipeline to quantify safe falling strategies. We analysed video recordings of 285 schoolchildren using a multi-stage architecture combining YOLOv8 for detection, SAM 2 for segmentation, and MMPose for skeletal tracking. The intervention yielded significant improvements in 60% of kinematic metrics (p<0.05), most notably a +61.4% increase in descent rate and expanded rolling ranges, indicating a shift from hazardous “freezing” behaviours to controlled energy dissipation. Unsupervised clustering confirmed a migration of students towards safe motor profiles, while a Random Forest classifier achieved an accuracy of 98.3% and an AUC of 0.998 in distinguishing fall quality. These findings demonstrate that integrating pedagogical training with automated vision modelling provides a scalable and evidence-based approach for reducing injury risk in real-world school environments. Full article
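One of the kinematic metrics reported above, descent rate, can be recovered directly from a skeletal track such as the MMPose output. A minimal sketch, assuming the hip keypoint's vertical pixel coordinate per frame has already been extracted (the function name `descent_rate` and the pixel-per-second units are illustrative; the study's exact metric definition may differ):

```python
def descent_rate(hip_y, fps=30.0):
    """Peak downward velocity (px/s) of the hip keypoint across a fall clip.

    hip_y: vertical pixel coordinate per frame (image y grows downward).
    A controlled fall shows a smooth, sustained descent; hazardous
    'freezing' shows a delayed, abrupt drop.
    """
    dt = 1.0 / fps
    velocities = [(b - a) / dt for a, b in zip(hip_y, hip_y[1:])]
    return max(velocities) if velocities else 0.0
```

Per-frame features like this one would then feed the clustering and Random Forest stages described in the abstract.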

17 pages, 2603 KB  
Article
Detection of Pediatric Dental Caries in Panoramic Radiograph Using Deep Learning: A Benchmark Study on MD-OPG
by Hadi Rahimi, Seyed Mohammadrasoul Naeimi, Shayan Darvish, Bahareh Nazemi Salman, Parvin Razzaghi, Ionut Luchian and Dana Gabriela Budala
Sensors 2026, 26(8), 2481; https://doi.org/10.3390/s26082481 - 17 Apr 2026
Abstract
Early detection of dental caries in children is critical to prevent irreversible tooth damage and ensure optimal oral health outcomes. However, interpreting pediatric panoramic radiographs during the mixed dentition stage remains particularly challenging due to overlapping anatomical structures and developmental variability. This complexity underscores the need for well-curated, representative datasets that enable the development of reliable computer-aided diagnostic models. This study introduces the Mixed Dentition Orthopantomogram Dataset, a newly developed, publicly available dataset of children aged 3–12 years, carefully labeled by dental specialists to identify proximal and occlusal caries regions. To evaluate the dataset’s applicability for artificial intelligence research, we benchmarked it using both classification and segmentation models. A patch-based classifier achieved an average AUC of 0.89 and a recall of 0.85 in distinguishing healthy and carious regions. For segmentation, we evaluated U-Net and Attention U-Net with multiple loss functions; the Attention U-Net trained with Focal loss achieved the best Dice score of 0.94. Collectively, these findings support the dataset’s utility for pediatric caries analysis and demonstrate the viability of deep learning approaches for mixed dentition panoramic imaging. Full article
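The Focal loss behind the best Dice score above down-weights well-classified pixels so training concentrates on hard, typically small carious regions. A per-pixel binary sketch, assuming the standard formulation with the commonly used `gamma` and `alpha` defaults (the paper's exact hyperparameters are not stated here):

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss for a single pixel.

    p: predicted foreground probability; y: ground-truth label (0 or 1).
    The (1 - pt)**gamma factor shrinks the loss for confident, correct
    predictions, so gradients come mostly from hard pixels.
    """
    p = min(max(p, eps), 1.0 - eps)
    pt = p if y == 1 else 1.0 - p
    w = alpha if y == 1 else 1.0 - alpha
    return -w * (1.0 - pt) ** gamma * math.log(pt)
```

Setting `gamma=0` recovers plain (alpha-weighted) cross-entropy, which makes the down-weighting effect easy to verify.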

28 pages, 7973 KB  
Article
Quantifying the Impact of Data Augmentation on Cross-Domain Building Extraction from High-Resolution Imagery
by Dung Trung Pham, Thuong Van Tran, Nguyen Quang Minh, Jinghan Li and Xuan Zhu
Remote Sens. 2026, 18(8), 1176; https://doi.org/10.3390/rs18081176 - 15 Apr 2026
Abstract
Automatic building extraction from high-resolution imagery remains constrained by limited training data and domain shifts across geographic regions and spatial resolutions. Although data augmentation is widely applied in semantic segmentation, its capacity to compensate for scarce labeled samples under varying domain conditions remains insufficiently quantified in remotely sensed data. Here, we present a controlled data-centric evaluation to quantify how explicit, label-preserving augmentation influences model generalization under varying domain shifts, rather than proposing a new augmentation algorithm. The experimental design integrates DeepLabV3+ (CNN) and SegFormer (transformer) architectures to assess whether augmentation effects persist across distinct feature-learning paradigms. Four scenarios are constructed, including two intra-domain settings, a resolution shift (0.3 m to 0.1 m), and a geographic shift across heterogeneous urban environments. Training subsets are progressively sampled from 20% to 100% to isolate the interaction between data volume and distributional variability. Geometric, radiometric, and occlusion-based transformations are evaluated individually and in combination. Under cross-domain and low-data regimes, augmentation substantially increases predictive performance. Combined transformations increase mIoU from 0.572 to 0.688 at 20% training data in the resolution shift scenario, while geometric augmentation improves mIoU from 0.444 to 0.533 under geographic transfer. Models trained on 20% augmented data exceed the performance of 100% non-augmented configurations under pronounced domain discrepancies, establishing an operational threshold of data efficiency. Computational analysis indicates negligible overhead (approximately 1 s per epoch) through asynchronous data pipelines. Augmentation functions as a regularization mechanism in intra-domain settings and transitions to a distribution-bridging mechanism under cross-domain conditions. Geometric invariance and engineered data diversity partially substitute for manual annotation, enabling improved cross-domain building extraction performance. Full article
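The three augmentation families evaluated above (geometric, radiometric, occlusion-based) can be combined in a single label-preserving transform. A minimal sketch on pure-Python image and mask grids, assuming float pixels in [0, 1] (the function name `augment`, the jitter range, and the cutout size are illustrative choices, not the study's settings):

```python
import random

def augment(image, mask, rng=None):
    """Label-preserving augmentation of an (image, mask) pair.

    Geometric transforms (horizontal flip) are applied to both image and
    mask so labels stay aligned; radiometric jitter and the occlusion
    (cutout) patch touch the image only.
    """
    rng = rng or random.Random(0)
    if rng.random() < 0.5:                      # geometric: flip both
        image = [row[::-1] for row in image]
        mask = [row[::-1] for row in mask]
    gain = rng.uniform(0.8, 1.2)                # radiometric: image only
    image = [[min(1.0, v * gain) for v in row] for row in image]
    h, w = len(image), len(image[0])            # occlusion: cutout patch
    size = max(1, h // 4)
    top = rng.randrange(h - size + 1)
    left = rng.randrange(w - size + 1)
    for r in range(top, top + size):
        for c in range(left, left + size):
            image[r][c] = 0.0
    return image, mask
```

Keeping geometric transforms synchronized between image and mask is what makes the pipeline label-preserving; the same structure carries over to array-based frameworks.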
(This article belongs to the Special Issue Urban Land Use Mapping Using Deep Learning)
