Search Results (11,472)

Search Parameters:
Keywords = feature fusion

23 pages, 47800 KB  
Article
AIGC-Driven Short Video Generation Based on the Controllable Multimodal Fusion Architecture
by Yan Zhu, Wei Li, Caixia Fan and Lu Yu
Electronics 2026, 15(9), 1783; https://doi.org/10.3390/electronics15091783 - 22 Apr 2026
Abstract
The utilization of Artificial Intelligence-Generated Content (AIGC) has attracted widespread attention in video content creation. To generate high-quality videos, this paper presents a controllable multimodal fusion architecture for AIGC-driven short-video production. This architecture employs hierarchical constraint mechanisms and a multimodal attention fusion mechanism to enhance video content coherence and user controllability. Specifically, a scene coherence scheme is first designed to construct graph-based global and transition-level constraints by integrating text descriptions, reference images, and audio features. By leveraging the extracted style vector data, preliminary video clips are then generated through a combination of the cross-modal fusion unit and the spatio-temporal consistency unit. Finally, a fine-grained adjustment mechanism is implemented to ensure logical consistency and stylistic uniformity in the AIGC-generated videos. Experimental results indicate that the proposed architecture improves generation quality, controllability, and cross-segment coherence under the adopted evaluation settings. Full article
20 pages, 3665 KB  
Article
SDS-Former: A Transformer-Based Method for Semantic Segmentation of Arid Land Remote Sensing Imagery
by Yujie Du, Junfu Fan, Kuan Li and Yongrui Li
Algorithms 2026, 19(5), 325; https://doi.org/10.3390/a19050325 - 22 Apr 2026
Abstract
Semantic segmentation of land use and land cover (LULC) in arid regions remains challenging due to severe class imbalance, fragmented spatial distributions, and high spectral similarity among different land cover types. These characteristics often lead to an information bottleneck in deep segmentation networks and hinder the extraction of discriminative semantic representations. To address these issues, we propose SDS-Former, a lightweight semantic segmentation network specifically designed for remote sensing imagery in arid environments. SDS-Former incorporates an SSM-inspired Lightweight Semantic Enhancement (LSE) module to strengthen contextual modeling and alleviate the loss of discriminative information in deep features. To tackle scale variations, a Dynamic Selective Feature Fusion (DSFF) module is employed in the decoder to adaptively weight and fuse high-level semantics with low-level spatial details. Furthermore, a Feature Refinement Head (FRH) is introduced to enhance boundary localization and improve the recognition of small-scale and sparsely distributed land cover objects. Extensive ablation and comparative experiments demonstrate that SDS-Former consistently outperforms representative semantic segmentation methods across multiple evaluation metrics. On the Tarim Basin dataset, the proposed network achieves a mean Intersection over Union (mIoU) of 82.51% and an F1 score of 86.47%, indicating its superior effectiveness and robustness. Qualitative results further verify that SDS-Former exhibits clear advantages in distinguishing spectrally similar land cover types and preserving the spatial continuity of ground objects in complex arid-region scenes. Full article
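The mIoU and F1 figures quoted above follow the standard definitions. As a quick reference, a minimal per-class mIoU computation over flat label arrays might look like the sketch below; the example labels are hypothetical, not from the paper:

```python
def miou(pred, target, num_classes):
    """Mean Intersection over Union across classes.

    pred, target: flat lists of integer class labels.
    Classes absent from both prediction and ground truth are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Hypothetical two-class segmentation over six pixels
pred   = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
```

Skipping classes with an empty union keeps the average meaningful on tiles where some classes never occur, a common situation in the class-imbalanced arid-region imagery the abstract describes.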
29 pages, 2502 KB  
Article
An Enhanced KNN–ConvLSTM Framework for Short-Term Bus Travel Time Prediction on Signalized Urban Arterials
by Jili Zhang, Wei Quan, Chunjiang Liu, Yuchen Yan, Baicheng Jiang and Hua Wang
Appl. Sci. 2026, 16(9), 4090; https://doi.org/10.3390/app16094090 - 22 Apr 2026
Abstract
Reliable short-term prediction of bus travel time on signalized urban arterials is essential for improving service reliability and may provide a useful forecasting basis for prediction-informed transit signal priority (TSP) and arterial coordination applications. However, bus operations on urban arterials are highly variable due to stop dwell times, signal delays, and interactions with mixed traffic, leading to nonlinear and nonstationary travel time patterns with strong spatiotemporal dependence. This study proposes a hybrid KNN–ConvLSTM framework for short-term arterial bus travel time prediction using real-world field data. A K-nearest neighbors (KNNs) module is first employed to retrieve historical operation sequences that are most similar to the current corridor state, thereby reducing interference from mismatched traffic regimes and improving robustness. Smart-card (IC card) transaction data are incorporated as demand-related features to represent passenger activity and its impact on dwell time and travel time variability. The selected sequences are then organized into a corridor-ordered spatiotemporal representation and further refined by lightweight temporal enhancement operations, including relevance gating, multi-scale aggregation, adaptive feature fusion, and residual enhancement, before being fed into the convolutional long short-term memory (ConvLSTM) predictor. The proposed approach is evaluated using weekday service-hour data extracted from 30 days of real-world bus operation records collected from a typical urban arterial corridor in Changchun, China, and is compared with several benchmark models, including ARIMA, KNN, LSTM, CNN, ConvLSTM, Transformer, and DCRNN. The results indicate that the proposed KNN–ConvLSTM framework achieves an MAE of 40.1 s, an RMSE of 55.8 s, a SMAPE of 10.7%, and an R2 of 0.878, outperforming all benchmark models. 
Specifically, compared with the Transformer baseline, the proposed framework reduces MAE by 1.5%, RMSE by 5.1%, and SMAPE by 7.0%, while increasing R2 by 0.014. Compared with the DCRNN baseline, it reduces MAE by 10.7%, RMSE by 1.9%, and SMAPE by 2.7%, while increasing R2 by 0.008. These findings demonstrate that similarity-aware retrieval combined with spatiotemporal deep learning can substantially enhance short-term bus travel time prediction on signalized urban arterials. More accurate short-term forecasts may support prediction-informed transit signal priority and arterial coordination by providing more reliable downstream arrival-time estimates. However, the generalizability of the reported results is still constrained by the relatively short 30-day observation period and the single-corridor case setting, and the operational and environmental effects of downstream applications remain to be validated through dedicated closed-loop control evaluation in future work. Full article
(This article belongs to the Special Issue Smart Transportation Systems and Logistics Technology)
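The KNN retrieval step described above selects the historical operation sequences most similar to the current corridor state. A minimal sketch of that idea, assuming plain Euclidean distance over fixed-length travel-time sequences (the paper's actual similarity measure and feature set are richer):

```python
import math

def knn_retrieve(query, history, k):
    """Return the k historical sequences closest to the query
    under Euclidean distance (a simplified stand-in for the
    corridor-state similarity used in the paper)."""
    def dist(seq):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, seq)))
    return sorted(history, key=dist)[:k]

# Hypothetical segment travel-time sequences (seconds per segment)
history = [[40, 42, 45], [80, 85, 90], [41, 43, 44], [60, 65, 70]]
query = [42, 44, 46]
nearest = knn_retrieve(query, history, 2)
```

The retrieved neighbors would then be stacked into the corridor-ordered spatiotemporal tensor fed to the ConvLSTM predictor; filtering out mismatched traffic regimes at this stage is what gives the hybrid its robustness.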
15 pages, 5165 KB  
Article
Intelligent Defect Identification in Girth Welds of Phased Array Ultrasonic Testing Images Using Median Filtering, Spatial Enrichment, and YOLOv8
by Mingzhe Bu, Shengyuan Niu, Xueda Li and Bin Han
Metals 2026, 16(5), 458; https://doi.org/10.3390/met16050458 - 22 Apr 2026
Abstract
Girth welds are susceptible to defects under high internal pressure and stress. While phased array ultrasonic testing (PAUT) is widely used for non-destructive evaluation, manual inspection remains inefficient and highly dependent on expertise. Furthermore, existing deep learning models often struggle with low accuracy and high complexity. This paper proposes a PAUT defect classification method based on YOLOv8. First, median filtering is employed for denoising, and the results show that noise is effectively reduced while preserving key features, achieving PSNR values of 35.132, 35.938, and 36.138 for slag inclusion, pores, and lack of fusion (LOF), respectively. Subsequently, the spatial enrichment algorithm (SEA) is applied to enhance image details without amplifying noise, yielding a PSNR of 33.71 and an SSIM of 0.96. Finally, the YOLOv8 model is implemented for defect recognition. Experimental results demonstrate that the proposed approach achieves a superior balance between precision and recall with high reliability. This method offers a robust and efficient solution for automated PAUT evaluation in practical engineering applications. Full article
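The PSNR values reported for the denoising step follow the standard definition, 10·log10(MAX²/MSE). A minimal sketch over flat pixel lists; the example intensities are hypothetical:

```python
import math

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized
    images given as flat lists of pixel intensities."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(peak ** 2 / mse)

clean = [100, 120, 130, 140]
noisy = [101, 119, 131, 139]
```

Values in the mid-30s dB, like those quoted for slag inclusion, pores, and LOF, indicate strong noise suppression with little distortion of the underlying defect signatures.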
21 pages, 3370 KB  
Article
Deep6DHead: A 6D Head Pose Estimation Method Based on Deep Feature Enhancement
by Fake Jiang, Shucheng Huang and Mingxing Li
Symmetry 2026, 18(5), 705; https://doi.org/10.3390/sym18050705 - 22 Apr 2026
Abstract
To address the bottlenecks of accuracy in head pose estimation caused by occlusion and rotational representation ambiguities, we propose Deep6DHead, a 6-degree-of-freedom (6DoF) head pose estimation method based on deep feature enhancement. This method innovatively integrates RGB and depth information to construct a four-channel input and achieves feature fusion of RGB-D through a dual-branch network. First, a Squeeze-and-Excitation (SE) module adaptively weights the depth geometric features of key anatomical regions to achieve channel recalibration. Second, based on the 6DoF rotation representation framework, we introduce an anatomical constraint loss using the nasal bridge normal. This constraint corrects rotation deviations caused by noise by enforcing consistency in local geometric orientation. Finally, the model outputs the rotation matrix end-to-end for final pose estimation. Experiments on the 300W-LP, BIWI, and AFLW2000 datasets demonstrate that our method significantly improves robustness and accuracy, particularly under extreme head poses. Notably, it achieves state-of-the-art performance on the roll axis (lowest error: 2.05) and a competitive overall MAE of 3.45, providing an effective solution for head pose estimation in complex real-world scenarios including extreme viewing angles. Full article
(This article belongs to the Section Computer)
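The Squeeze-and-Excitation recalibration mentioned above pools each channel to a scalar, passes the pooled vector through a small gating network, and rescales the channels by the resulting gates. A toy sketch with scalar per-channel weights standing in for the two learned fully connected layers of a real SE block; all weights here are illustrative:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def se_recalibrate(channels, w1, w2):
    """Toy Squeeze-and-Excitation gating over a list of channels.

    channels: list of feature maps, each a flat list of activations.
    w1, w2: illustrative per-channel weights replacing the learned
    bottleneck FC layers of a real SE module.
    """
    # Squeeze: global average pooling per channel
    z = [sum(ch) / len(ch) for ch in channels]
    # Excitation: per-channel "bottleneck" (ReLU) then sigmoid gate
    s = [sigmoid(w2[i] * max(0.0, w1[i] * z[i])) for i in range(len(z))]
    # Scale: reweight each channel by its gate
    return [[s[i] * v for v in channels[i]] for i in range(len(channels))]

channels = [[1.0, 3.0], [2.0, 2.0]]
out = se_recalibrate(channels, w1=[1.0, 1.0], w2=[0.0, 10.0])
```

With a zero weight the first channel is gated to 0.5 of its input, while the strongly weighted second channel passes nearly unchanged, which is exactly the adaptive emphasis of informative depth channels the abstract describes.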
29 pages, 8989 KB  
Article
Real-Field-Ready and Digitally Sustainable Plant Disease Recognition via Federated Multimodal Edge Learning and Few-Shot Domain Adaptation
by Muhammad Irfan Sharif, Yong Zhong, Muhammad Zaheer Sajid and Francesco Marinello
Agriculture 2026, 16(9), 918; https://doi.org/10.3390/agriculture16090918 - 22 Apr 2026
Abstract
Plant disease diagnosis in real-world agricultural environments is challenged by data scarcity, domain shift, privacy constraints, and limited edge-device resources. This paper proposes FMEL-FSDA, a Federated Multimodal Edge Learning framework with Few-Shot Domain Adaptation for robust field-based plant disease recognition. The framework integrates attention-based RGB–text feature fusion, privacy-preserving federated learning, rapid few-shot personalization, and uncertainty-aware inference within an edge-efficient architecture. Federated training enables collaborative learning across distributed farms without sharing raw data, while few-shot adaptation allows fast deployment to new regions using only 1–10 labeled samples per class. Experiments on the PlantWild in-the-wild dataset show that FMEL-FSDA outperforms centralized, federated, and few-shot baselines, achieving 93.78% accuracy, 93.33% F1-score, and 0.97 AUC. The model maintains strong performance under privacy mechanisms such as gradient perturbation and secure aggregation, reduces communication overhead by up to [...], and supports low-latency edge inference. Uncertainty estimation and Grad-CAM-based explainability further enhance reliability by identifying low-confidence cases and highlighting disease-relevant regions. Overall, FMEL-FSDA offers a scalable, privacy-aware, and field-ready solution for intelligent plant disease diagnosis in precision agriculture. Full article
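Federated training without sharing raw data is typically realized by aggregating model weights on a server, as in FedAvg. A minimal sketch of that weighted-averaging idea; the abstract does not specify the paper's exact aggregation rule or privacy mechanisms, so everything below is illustrative:

```python
def fedavg(client_weights, client_sizes):
    """Federated averaging: combine client model weights into a
    global model, weighting each client by its local sample count.
    Raw data never leaves the clients; only weights are shared."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]

# Two hypothetical farms with different amounts of local data
global_w = fedavg([[1.0, 0.0], [3.0, 2.0]], client_sizes=[100, 300])
```

The size weighting lets data-rich clients contribute proportionally more, which matters when farms hold very unequal numbers of labeled images.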
25 pages, 19124 KB  
Article
Multi-Scale Fractional-Order Image Fusion Algorithm Based on Polarization Spectral Images
by Zhenduo Zhang, Xueying Cao and Zhen Wang
Appl. Sci. 2026, 16(9), 4087; https://doi.org/10.3390/app16094087 - 22 Apr 2026
Abstract
With the continuous advancement of polarization spectral sensing technology, multi-band polarization image fusion has emerged as a novel approach to image fusion. By integrating spectral and polarization information, this method overcomes the limitations of relying on a single information source and significantly improves overall image quality. Building on this idea, this paper proposes a new polarization spectral fusion algorithm. First, feature matching is employed to achieve pixel-level spatial alignment of multi-band polarization images. Then, a fusion strategy based on multi-scale decomposition and singular value decomposition is adopted to preserve structural information and fine details. Subsequently, fractional-order processing and guided filtering are applied to enhance details and suppress noise. Finally, a progressive reconstruction from low to high scales is performed to ensure hierarchical consistency and information integrity throughout the fusion process. In addition, spectral information is utilized for color restoration, enabling the final image to achieve high spatial resolution while maintaining natural and rich color representation. Experimental results demonstrate that the proposed method effectively integrates features from different spectral bands and polarization information while preserving maximum similarity, leading to significant improvements in both image quality and detail representation. Full article
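The multi-scale decomposition-and-fusion strategy can be illustrated with a simplified two-scale variant: split each input into a smoothed base layer and a detail layer, average the bases, and keep the stronger detail coefficient at each position. The paper additionally uses singular value decomposition, fractional-order processing, and guided filtering, all of which this 1-D sketch omits:

```python
def box_blur(signal, radius=1):
    """Simple moving-average 'base layer' (edge-clamped window)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def fuse_two_scale(a, b):
    """Two-scale fusion of two 1-D 'images': average the base layers,
    keep the larger-magnitude detail coefficient at each position."""
    base_a, base_b = box_blur(a), box_blur(b)
    det_a = [x - y for x, y in zip(a, base_a)]
    det_b = [x - y for x, y in zip(b, base_b)]
    base = [(x + y) / 2 for x, y in zip(base_a, base_b)]
    det = [x if abs(x) >= abs(y) else y for x, y in zip(det_a, det_b)]
    return [x + y for x, y in zip(base, det)]

# Flat input fused with a detailed one: the details survive fusion
fused = fuse_two_scale([1.0, 1.0, 1.0, 1.0], [0.0, 4.0, 0.0, 4.0])
same = fuse_two_scale([1.0, 5.0, 2.0, 8.0], [1.0, 5.0, 2.0, 8.0])
```

Two sanity properties hold by construction: fusing a signal with itself returns it unchanged, and high-frequency structure from either input is preserved rather than averaged away.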
17 pages, 5236 KB  
Article
Two Non-Learning Filters for the Enhancement of Images Obtained from a Fluorescence Imaging System, a Near-Infrared Camera, and Low-Light Condition
by Jun Hong, Xi He, Haoru Ning, Zhonghuan Su, Ling Zhang, Yingcheng Lin and Ye Wu
Electronics 2026, 15(9), 1777; https://doi.org/10.3390/electronics15091777 - 22 Apr 2026
Abstract
Images obtained from imaging instruments can suffer from issues such as severe degradation, color distortion, and weak brightness. Effective methods for enhancing these images are therefore critically needed. To improve image quality, we propose two filters based on simple functions, including cosine, sine, hyperbolic secant, and the inverse hyperbolic cosecant. These filters are used to enhance images obtained from a fluorescence imaging system, a near-infrared camera, and low-light conditions. They increase contrast while improving overall image quality, and they outperform a matched filter. Moreover, combining our filters with a watershed-based filter or the matched filter can extract marginal features from images captured in underwater environments. Furthermore, their application in image fusion is explored. The designed filters may potentially be applied to future target identification and tracking tasks. Full article
34 pages, 1939 KB  
Article
AutoUAVFormer: Neural Architecture Search with Implicit Super-Resolution for Real-Time UAV Aerial Object Detection
by Li Pan, Huiyao Wan, Pazlat Nurmamat, Jie Chen, Long Sun, Yice Cao, Shuai Wang, Yingsong Li and Zhixiang Huang
Remote Sens. 2026, 18(9), 1268; https://doi.org/10.3390/rs18091268 - 22 Apr 2026
Abstract
The widespread deployment of unmanned aerial vehicles (UAVs) in civil and commercial airspace has raised significant safety concerns, driving the demand for reliable and real-time Anti-UAV visual detection systems. However, existing deep learning-based detectors face substantial challenges in complex low-altitude environments, including drastic scale variations, severe background clutter, and weak feature representation of small UAV targets. Moreover, handcrafted Transformer-based architectures often lack adaptability across diverse scenarios and struggle to balance detection accuracy with computational efficiency. To address these limitations, this paper proposes AutoUAVFormer, a super-resolution guided neural architecture search framework for Anti-UAV detection. In contrast to conventional manually designed approaches, AutoUAVFormer leverages joint optimization of a Transformer-based detection objective and a super-resolution reconstruction objective to automatically identify a task-specific optimal network architecture for detecting UAV targets. Specifically, a unified search space is formulated by jointly embedding Transformer hyperparameters and Feature Pyramid Network (FPN) structures, facilitating end-to-end co-optimization of multi-scale feature fusion and global context modeling. To efficiently locate architectures that balance accuracy and computational cost, a three-stage pipeline, combining supernetwork training with evolutionary search, is employed. Additionally, we design a super-resolution auxiliary branch that operates only during training to enhance the model’s ability to learn fine-grained textures and sharpen edge representations of small targets, without introducing any inference overhead. 
Extensive experiments on three challenging Anti-UAV detection benchmarks, namely DetFly, DUT Anti-UAV, and UAV Swarm, confirm the superiority of AutoUAVFormer over current state-of-the-art methods, with mAP@0.5 scores reaching 98.6%, 95.5%, and 89.9% on the respective datasets while sustaining real-time inference speed. These results demonstrate that AutoUAVFormer achieves strong generalization and maintains robust Anti-UAV detection performance under challenging low-altitude conditions. Full article
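The evolutionary search stage of the three-stage pipeline can be sketched as mutate-and-select over discrete architecture hyperparameters. Below is a deterministic toy version with a hypothetical fitness function; a real NAS run would instead score each candidate subnetwork on validation accuracy under a compute budget, using weights inherited from the trained supernetwork:

```python
def evolve(init_pop, fitness, generations=10, top_k=2):
    """Tiny deterministic evolutionary search: mutate each surviving
    architecture (here a tuple of integer hyperparameters) by +/-1,
    keep the fittest top_k candidates, and iterate."""
    pop = list(init_pop)
    for _ in range(generations):
        children = []
        for cand in pop:
            for i in range(len(cand)):
                for delta in (-1, 1):
                    child = list(cand)
                    child[i] += delta
                    children.append(tuple(child))
        # Deduplicate (keeping first occurrence) and keep the fittest
        pool = list(dict.fromkeys(pop + children))
        pop = sorted(pool, key=fitness, reverse=True)[:top_k]
    return pop[0]

# Hypothetical fitness: prefer (heads, depth) close to (4, 6)
def fitness(arch):
    heads, depth = arch
    return -((heads - 4) ** 2 + (depth - 6) ** 2)

best = evolve([(1, 1)], fitness)
```

Because surviving parents stay in the candidate pool, the best fitness is non-decreasing across generations, so the search converges to the optimum of this toy objective.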
25 pages, 2360 KB  
Article
ACF-YOLO: Feature Enhancement and Multi-Scale Alignment for Sustainable Crop Small Object Detection
by Chuanxiang Li, Yihang Li, Wenzhong Yang and Danny Chen
Sustainability 2026, 18(9), 4168; https://doi.org/10.3390/su18094168 - 22 Apr 2026
Abstract
Sustainable precision agriculture is crucial for optimizing resource utilization, reducing chemical inputs, and ensuring global food security. High-precision automatic recognition and monitoring of key crop organs (e.g., wheat heads and flower clusters) serve as the technological foundation for sustainable agricultural management decisions. However, visual perception in natural field environments is highly susceptible to external conditions. To address the challenges of severe background interference and feature dilution in crop small object detection within complex agricultural scenarios, this paper proposes an enhanced detection network, ACF-YOLO, based on YOLO11. First, an Aggregated Multi-scale Local-Global Attention (AMLGA) module is designed to enhance the feature representation of weak targets by fusing local details with global semantics. Second, a Context-Guided Fusion Module (CGFM) and a Soft-Neighbor Interpolation (SNI) strategy are introduced. Their synergy alleviates feature aliasing effects and ensures the precise alignment of deep semantic information with shallow spatial details. Furthermore, the Inner-MPDIoU loss function is employed to optimize the bounding box regression accuracy for non-rigid targets by incorporating geometric constraints and auxiliary scale factors. To verify the detection capability of the proposed method, we constructed a UAV Wheat Head Dataset (UWHD) and conducted extensive experiments on the UWHD, GWHD2021, and RFRB datasets. The experimental results demonstrate that ACF-YOLO outperforms other comparative methods, confirming its stable detection performance and contributing to the sustainable development of agriculture. Full article
(This article belongs to the Section Sustainable Agriculture)
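IoU-based regression losses such as the Inner-MPDIoU mentioned above all build on plain box IoU. A minimal IoU computation for axis-aligned boxes; the paper's loss adds geometric constraints and auxiliary scale factors not shown here:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A loss of the form 1 − IoU vanishes only at perfect overlap; the auxiliary terms in variants like MPDIoU exist mainly to keep gradients informative when boxes are small or barely overlapping, as with wheat heads.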
23 pages, 2414 KB  
Article
Semantic-Guided Multi-Level Collaborative Fusion Network for Visible and Infrared Images
by Lijun Yuan, Chuanjiang Xie, Ming Yang, Xiaoguang Tu, Qiqin Li and Xinyu Zhu
Sensors 2026, 26(9), 2577; https://doi.org/10.3390/s26092577 - 22 Apr 2026
Abstract
The paramount value of image fusion is manifested in effectively enhancing downstream tasks. However, compatibility with subsequent tasks is compromised due to the semantic deficiency of fusion representations generated by current approaches. To mitigate this limitation, a semantic-guided multi-level collaborative fusion network is proposed, termed DSIFuse. By leveraging semantic priors and global context extracted from auxiliary segmentation branches, a multi-level interaction space is constructed to explicitly refine cross-modal features. Specifically, a cross-modal feature correction mechanism is designed to enhance semantic alignment by injecting complementary visible–infrared information at each layer, while a three-level interaction strategy gradually integrates unimodal features and semantic maps to generate semantically enriched representations. To mitigate semantic information loss during image reconstruction, a semantic compensation block is employed, incorporating interactive representations from prior layers and global semantic maps into the multi-scale decoder. Finally, the overall loss integrates semantic supervision, gradient, and intensity loss. Experiments conducted on public datasets indicate that clear fusion images are generated by DSIFuse, with improved structural consistency and reduced artifacts. Under a unified benchmark, the fused representations subsequently yield improved performance in downstream object detection tasks. Full article
(This article belongs to the Section Sensing and Imaging)
40 pages, 8223 KB  
Article
An Interpretable Fuzzy Distance-Based Ensemble Framework with SHAP Analysis for Clinically Transparent Prediction of Diabetes
by Asif Hassan Syed, Altyeb Altaher Taha, Ahmed Hamza Osman, Yakubu Suleiman Baguda, Hani Moaiteq Aljahdali and Arda Yunianta
Diagnostics 2026, 16(9), 1254; https://doi.org/10.3390/diagnostics16091254 - 22 Apr 2026
Abstract
Background/Objectives: Diabetes is a chronic metabolic disorder affecting global health, where early prediction can significantly reduce disease severity. Methods: This research proposes an interpretable multi-metric fuzzy distance-based ensemble (MMFDE) that integrates multi-variant gradient-boosting classifiers (GBM, LightGBM, XGBoost, and AdaBoost) through a novel fuzzy fusion mechanism designed for intrinsic interpretability. Unlike conventional ensembles relying on opaque averaging or voting, MMFDE transforms base classifier predictions into a high-dimensional fuzzy space quantified via a weighted hybrid distance incorporating Euclidean, Manhattan, Chebyshev, and cosine metrics against ideal diabetic and non-diabetic reference vectors. These distances are translated into membership degrees with the help of exponentially decaying functions, which give clinicians calibrated confidence scores for every prediction. Comprehensive SHAP analysis identifies important clinical risk factors (glucose, BMI, and diabetes pedigree function), which show concordance with the medical literature, thereby fostering greater clinical trust. Results: Experimental evaluations on two publicly available datasets, Hospital Frankfurt Germany Diabetes Dataset (HFGDD) and Pima Indians Diabetes Dataset (PIDD), show that MMFDE outperforms all base models with a significant accuracy of 94.83% and Area Under the Curve (AUC) of 97.66% on HFGDD, and three different levels of interpretability: geometric transparency via distance-based decisions, confidence-calibrated uncertainty estimates, and feature-level explanations via SHAP. The confidence thresholds enabled in the framework support risk-stratified clinical workflows, with high-confidence predictions used for automated screening and moderate/low-confidence cases flagged for clinician review. 
Conclusions: By demonstrating that high performance and interpretability need not be mutually exclusive, MMFDE advances trustworthy AI for clinical decision support, addressing the critical need for transparent and clinically actionable diabetes prediction systems. Full article
(This article belongs to the Special Issue Explainable Machine Learning in Clinical Diagnostics)
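The weighted hybrid distance and exponentially decaying membership described above can be sketched directly. The weights and decay rate below are illustrative placeholders, not the paper's tuned values, and the reference vectors are toy four-dimensional examples:

```python
import math

def hybrid_distance(x, ref, w=(0.25, 0.25, 0.25, 0.25)):
    """Weighted combination of Euclidean, Manhattan, Chebyshev, and
    cosine distances between a prediction vector and a reference
    vector (equal weights here are illustrative only)."""
    diff = [a - b for a, b in zip(x, ref)]
    euclid = math.sqrt(sum(d * d for d in diff))
    manhat = sum(abs(d) for d in diff)
    cheby = max(abs(d) for d in diff)
    dot = sum(a * b for a, b in zip(x, ref))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in ref))
    cosine = 1 - dot / norm if norm else 1.0  # max distance for a zero vector
    return w[0] * euclid + w[1] * manhat + w[2] * cheby + w[3] * cosine

def membership(dist, decay=1.0):
    """Exponentially decaying membership degree in (0, 1]."""
    return math.exp(-decay * dist)

# Toy base-classifier scores vs. ideal diabetic / non-diabetic references
scores = [0.9, 0.8, 0.85, 0.9]
d_pos = hybrid_distance(scores, [1.0, 1.0, 1.0, 1.0])
d_neg = hybrid_distance(scores, [0.0, 0.0, 0.0, 0.0])
```

A prediction close to the diabetic reference gets a small distance and hence a membership near 1, which is the calibrated confidence score the clinician sees.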
28 pages, 5345 KB  
Article
Integrated Molecular, Genomic, and Clinical Characterization of Pediatric and Adolescent Translocation Renal Cell Carcinoma: A Report from the Children’s Oncology Group
by Alissa Groenendijk, Bruce J. Aronow, Nicholas Cost, Mariana Cajaiba, Lindsay A. Renfro, Elizabeth J. Perlman, Lisa Dyer, Teresa A. Smolarek, Elizabeth A. Mullen, Sameed Pervaiz, Somak Roy, Phillip J. Dexheimer, Peixin Lu, Peter F. Ehrlich, M. M. van den Heuvel-Eibrink, Jeffrey S. Dome, James I. Geller and on behalf of the COG Renal Tumor Committee
Biomedicines 2026, 14(5), 955; https://doi.org/10.3390/biomedicines14050955 - 22 Apr 2026
Abstract
Background: Translocation morphology renal cell carcinoma (tRCC) accounts for nearly half of all pediatric RCC cases. Biological study AREN14B4-Q aimed to characterize the molecular landscape of tRCC using samples acquired from patients enrolled in the Children’s Oncology Group Risk Classification and Biobanking study AREN03B2. Methods: From 2006 to 2014, patients <30 yr old with renal tumors were prospectively enrolled in AREN03B2, a Central IRB-approved biobanking study. All pediatric RCC cases underwent detailed central pathology review and molecular diagnostics to accurately classify RCC subtypes. Samples with confirmed tRCC and appropriate informed consent, with adequate tumor tissue for RNA and DNA extraction along with germline DNA, were identified for whole-genome sequencing (WGS), RNA sequencing, and DNA methylation analyses. Results: From 41 patients, high-quality samples allowed 18 tumor/non-tumor DNA pairs to be analyzed via WGS, 19 tumors via DNA methylation, and 36 RNA samples via transcriptome sequencing. Consistent with, and extending, clinical cytogenetic findings, WGS and fusion transcript analyses confirmed very few additional mutations beyond the tRCC translocation. No recurrent genomic copy number gains or losses were found. RNA and WGS analyses enabled sub-classification of tRCC that aligned closely with the different TFE3 fusion partners. DNA methylation analyses demonstrated less tRCC sub-stratification than RNA analyses. Pathways activated in tRCC involved epithelial differentiation, extracellular matrix organization, apoptosis, immune regulation, signal transduction, and angiogenesis. Conclusions: Arrested epithelial differentiation is the overarching driver in tRCC and is strongly correlated with the specific fusion-transcript subclasses generated by the TFE3 fusion partner of the genetic translocation. Negative regulation of apoptosis, increased M2 macrophage expression, and enhanced angiogenesis also appear to be functional features of tRCCs, as are increased expression of matrix metalloproteinases, PI3K-AKT/mTOR/MAPK signaling, and mitochondrial metabolism, highlighting potential therapeutic options beyond direct targeting of the oncogenic driver fusions. Full article
24 pages, 3206 KB  
Article
Edge-Based Multi-Scale Predator Detection for Stingless Bee Protection Using Attention-Integrated YOLOv11
by Ashan Milinda Bandara Ratnayake, Marha Sahirah Majid, Hartini Yasin, Abdul Ghani Naim and Pg Emeroylariffion Abas
Technologies 2026, 14(5), 246; https://doi.org/10.3390/technologies14050246 - 22 Apr 2026
Abstract
Stingless bee colonies are vulnerable to predators of widely varying sizes, and repeated intrusions can cause stress, reduce productivity, and trigger colony absconding. Existing automated surveillance systems detect only a limited range of predators and often struggle with multi-scale object detection in high-resolution images. This study proposes a real-time predator monitoring system that integrates a Multi-Scale Attention module into the YOLOv11-nano architecture (MSYOLO11) to enhance detection performance across both small and large predators. The proposed model combines convolutional features with an attention mechanism to improve global–local feature fusion. Experimental evaluation shows that MSYOLO11 increases overall Recall from 0.830 to 0.853 compared to YOLOv11-nano, with substantial improvements for small-object classes such as ants (+0.096), humans (+0.083), and H. itama (+0.026), while maintaining comparable Precision (0.868 vs 0.842) and mAP50 (0.898 vs 0.896) at a nearly identical computational cost (6.3 GFLOPs). The system operates at 5 FPS on a Jetson Orin Nano, with an end-to-end latency of 181 ms. A Firebase-integrated mobile application delivers instant push notifications, displays detected predators with bounding boxes, and provides real-time data synchronization. The results demonstrate that MSYOLO11 offers a practical and efficient solution for multi-scale predator detection, supporting continuous hive surveillance and timely beekeeper intervention. Full article
(This article belongs to the Special Issue AI-Driven Optimization in Robotics and Precision Agriculture)
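The abstract describes combining convolutional features with an attention mechanism to fuse fine-grained (small-object) and semantic (large-object) feature maps. The following is a minimal NumPy sketch of that general idea, channel attention gating applied to two feature scales before fusion; the function names, shapes, and the specific gating scheme are illustrative assumptions, not the paper's MSYOLO11 implementation.

```python
import numpy as np

def channel_attention(feat):
    # feat: (C, H, W) convolutional feature map.
    # Squeeze: global average pooling gives a per-channel descriptor (C,).
    desc = feat.mean(axis=(1, 2))
    # Excitation: a sigmoid gate re-weights each channel by its importance.
    gate = 1.0 / (1.0 + np.exp(-desc))
    return feat * gate[:, None, None]

def multi_scale_fuse(shallow, deep):
    # shallow: (C, 2H, 2W) fine-grained features, useful for small objects.
    # deep: (C, H, W) semantic features, useful for large objects.
    # Upsample the deep map by nearest-neighbour to match the shallow scale.
    up = deep.repeat(2, axis=1).repeat(2, axis=2)
    # Attention-gated sum fuses local detail with global context.
    return channel_attention(shallow) + channel_attention(up)

fused = multi_scale_fuse(np.ones((8, 16, 16)), np.ones((8, 8, 8)))
print(fused.shape)  # (8, 16, 16)
```

In a real detector these maps would come from different backbone stages, and the gate would be a learned layer rather than a fixed sigmoid of the pooled activations.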
25 pages, 17631 KB  
Article
HRM-Net: Hybrid Road Mapping Network for Automated Mine Haul Road Extraction from Remote Sensing Imagery
by Loghman Moradi and Kamran Esmaeili
Remote Sens. 2026, 18(9), 1264; https://doi.org/10.3390/rs18091264 - 22 Apr 2026
Abstract
Haul roads in surface mining are critical infrastructure directly influencing operational productivity, safety, and costs. However, these networks change frequently due to ongoing mining activities, making traditional mapping methods impractical for large-scale or rapidly evolving sites. Remote sensing imagery offers a scalable alternative, yet complex backgrounds, variable road widths, and spectral similarities between roads and surrounding surfaces make accurate extraction challenging. This study proposes HRM-Net, a hybrid transformer–CNN autoencoder framework for automated extraction of mine haul roads from remote sensing imagery. HRM-Net introduces inception-like patch embedding to capture local contextual information and employs a manifold-constrained hyper-connection strategy in the attention and fusion blocks to enhance information flow across the architecture. This hierarchical design enables progressive learning of discriminative semantic representations across multiple spatial resolutions, critical for road extraction in cluttered mining environments. Trained and evaluated on diverse mine sites, HRM-Net achieved 92.53% overall accuracy, 85.12% F1-score, 75.57% mIoU, 83.57% precision, and 86.94% recall, outperforming state-of-the-art transformer-based and CNN-based segmentation models. Furthermore, model interpretability was analyzed through linear probing and boundary alignment evaluations. Results demonstrate that discriminative features emerge at early network stages and are effectively preserved throughout the architecture, while boundary predictions exhibit superior consistency compared to existing approaches. Full article
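The abstract's "inception-like patch embedding" refers to tokenizing an image through parallel branches with different receptive fields before feeding a transformer. The following is a minimal NumPy sketch of that structure using pooling branches in place of learned convolutions; the branch design, patch size, and function names are illustrative assumptions, not HRM-Net's actual layers.

```python
import numpy as np

def avg_pool(x, k):
    # x: (C, H, W); non-overlapping k x k average pooling with stride k.
    C, H, W = x.shape
    return x.reshape(C, H // k, k, W // k, k).mean(axis=(2, 4))

def inception_patch_embed(img, patch=4):
    # Parallel branches with different effective receptive fields capture
    # local context at multiple granularities before tokenization.
    coarse = avg_pool(img, patch)                  # single-stage patch pooling
    fine = avg_pool(avg_pool(img, patch // 2), 2)  # two-stage pooling branch
    # Concatenate branch outputs along the channel axis, then flatten each
    # spatial location into one token vector.
    feat = np.concatenate([coarse, fine], axis=0)  # (2C, H/patch, W/patch)
    C2, Hp, Wp = feat.shape
    return feat.reshape(C2, Hp * Wp).T             # (num_tokens, token_dim)

tokens = inception_patch_embed(np.ones((3, 16, 16)), patch=4)
print(tokens.shape)  # (16, 6)
```

In a trained network each branch would be a convolution with its own kernel size and weights, so the branches extract genuinely different features rather than the identical averages this sketch produces on constant input.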