Search Results (507)

Search Parameters:
Keywords = thermal image dataset

10 pages, 4337 KB  
Proceeding Paper
Next-Day Forest Fire Risk Prediction Using Machine Learning and Multimodal Satellite Data
by Prajwal Mohapatra, Swayam Subhankar Sahoo, Adyasha Das and Rururaj Pradhan
Eng. Proc. 2026, 124(1), 120; https://doi.org/10.3390/engproc2026124120 (registering DOI) - 17 Jun 2026
Abstract
Predicting forest fire occurrence is essential for proactive disaster preparedness and environmental protection. We introduce a machine learning-based system that forecasts next-day fire probability at high spatial resolution using satellite-derived, multi-modal geospatial data. In contrast to existing reactive systems that rely on thermal anomaly detection (e.g., MODIS or VIIRS-SNPP), our approach is fully predictive, generating pixel-wise fire risk maps a day in advance. Our study focuses on Uttarakhand, India, which is an ecologically sensitive region that experiences frequent and severe forest fires. We curated a domain-specific geospatial dataset spanning 1 April to 29 May 2016. It includes daily 30 m GeoTIFF images with 10 bands comprising weather (e.g., temperature, wind, precipitation), topography (slope, aspect), fuel map, and fire mask. We constructed this dataset from diverse sources and aligned all bands spatially and temporally. To demonstrate the usefulness of this dataset, we implement a deep convolutional neural network (CNN) using the ResUNet-A architecture, chosen for its robust performance in the semantic segmentation of high-resolution remote sensing data. Our model is trained from scratch to produce high-resolution fire probability maps and classify fire/no-fire pixels. Our solution helps with planning and decision-making for early intervention, especially in areas with high risk. It supports the UN’s SDG 13 (Climate Action) and SDG 15 (Life on Land) by enhancing resilience and conserving ecosystems. The presented dataset and methodology can serve as a benchmark for future research on wildfire risk prediction using Earth observation data. Full article
(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)

21 pages, 3582 KB  
Article
An Improved YOLOv8n Method for Small Thermal Defect Detection of Photovoltaic Modules in UAV Infrared Inspection
by Tengfei He, Zhongyuan Mao and Yuanchang Zhong
Remote Sens. 2026, 18(12), 1986; https://doi.org/10.3390/rs18121986 - 15 Jun 2026
Viewed by 189
Abstract
To address missed detections, false alarms, and deployment limitations in thermal defect detection of photovoltaic modules from unmanned aerial vehicle (UAV) infrared images, this paper proposes an improved detection method based on You Only Look Once version 8 nano (YOLOv8n). The proposed method is optimized according to the characteristics of UAV infrared photovoltaic inspection, including small thermal targets, weak and diffuse thermal responses, complex backgrounds, and lightweight deployment requirements. Specifically, a P2 shallow feature layer is introduced to enhance fine-grained feature perception for small thermal defects, while Ghost Convolution (GhostConv) is incorporated into the backbone to reduce model complexity. In addition, C2f-Large Separable Kernel Attention (C2f-LSKA) is embedded in the neck to strengthen contextual and spatial feature modeling under complex infrared backgrounds, and Wise-IoU version 3 (WIoUv3) is adopted to improve bounding box regression and localization stability for boundary-ambiguous thermal anomalies. Experiments are conducted on a self-constructed UAV infrared thermal imaging dataset. From nearly 10,000 inspection images, 3000 representative images are selected and manually annotated, covering typical challenges such as small hot spots, low-contrast defects, complex background interference, and diffuse abnormal temperature-rise regions. Compared with the baseline YOLOv8n, the proposed method improves Precision, Recall, mean average precision at an IoU threshold of 0.5 (mAP@0.5), and mean average precision averaged over IoU thresholds from 0.5 to 0.95 (mAP@0.5:0.95) by 5.1, 11.4, 9.6, and 13.2 percentage points, respectively, while reducing the number of parameters and model size by 65.8% and 61.9%, respectively. These results indicate that the proposed method improves detection accuracy and localization quality under the evaluated UAV infrared inspection setting while maintaining lightweight characteristics. Full article
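
The mAP@0.5 and mAP@0.5:0.95 figures quoted above both rest on the intersection over union (IoU) between a predicted box and a ground-truth box. The following is a minimal sketch of that building block, not the authors' evaluation code; the function names and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_thresholds():
    """The ten thresholds 0.50, 0.55, ..., 0.95 averaged over in mAP@0.5:0.95."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matches_at(pred, gt, threshold=0.5):
    """A prediction counts as a match when its IoU reaches the threshold."""
    return box_iou(pred, gt) >= threshold
```

Small thermal defects are hard on this metric: a shift of a few pixels on a tiny box lowers IoU sharply, which is one reason gains at mAP@0.5:0.95 are harder to obtain than at mAP@0.5.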

23 pages, 15463 KB  
Article
Layer-Resolved Grain Morphology and Recrystallized Crystal Evolution in FSP-Assisted Wire Arc Additive Manufacturing of Aluminum Alloy 4043
by Ahmed Nabil Elalem and Xin Wu
Metals 2026, 16(6), 645; https://doi.org/10.3390/met16060645 - 11 Jun 2026
Viewed by 240
Abstract
Wire arc additive manufacturing of aluminum generates coarse, anisotropic solidification microstructures that limit mechanical performance, and interlayer friction stir processing (FSP) is increasingly applied to refine them. This study reports the layer-resolved grain morphology and the recrystallized crystal evolution in MIG + FSP-fabricated aluminum alloy 4043 walls, pairing the FSP spindle torque recorded from the CNC controller with multi-descriptor grain morphology in a coupling that, to the authors’ knowledge, has not been previously reported in the WAAM + FSP literature. Methodologically, two four-bead, three-layer walls were co-fabricated under identical deposition conditions on a HAAS VF-3 CNC platform, one by MIG deposition alone and one by the complete MIG + FSP route; the FSP spindle torque was measured at three positions per layer (118 ± 6 N·m at 600 RPM for L1, and 19.1 ± 1.0 and 26.6 ± 1.3 N·m at 1200 RPM for L2 and L3), and quantitative image analysis of 10,091 grains provided the layer-resolved mean grain area, equivalent diameter, aspect ratio, perimeter-to-area ratio, and circularity. The results show that the mean grain area increased from 8.55 μm² (L1) to 12.96 μm² (L3) while the aspect ratio decreased monotonically (1.389 to 1.323), indicating progressive grain equiaxiality with build height; the P/A ratio followed a non-monotonic layer dependence (2.54 to 2.11 to 2.50 μm⁻¹), with the L2 minimum consistent with reduced boundary line density under the combined thermal influence of two adjacent FSP events. The MIG + FSP route produced grain areas 29–48× smaller per layer than the MIG wall and a 45.8% higher hardness (75.8 ± 7.7 versus 52.0 ± 1.3 HV; n = 6; p = 0.0027). In conclusion, the L3 torque exceeds the L2 torque at equal 1200 RPM, qualitatively consistent with the d^p term in the grain-size-explicit creep framework γ̇ = C·(τ^n/d^p)·exp(−Q/RT), although temperature, strain rate, and grain size cannot be fully decoupled from the present three-layer dataset. The morphology and the distributional evidence are consistent with dynamic recrystallization (DRX); discrimination between continuous and discontinuous DRX requires EBSD. Full article
(This article belongs to the Special Issue Advances in the Study of Metal Crystals)
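
The abstract's torque argument can be made explicit by solving the quoted creep relation for the shear stress. This is a sketch under assumptions the abstract does not state: that the exponents n and p are positive, that spindle torque scales with the flow stress τ, and that strain rate and temperature are comparable between layers L2 and L3 (the abstract itself notes these cannot be fully decoupled).

```latex
\dot{\gamma} = C\,\frac{\tau^{n}}{d^{p}}\,\exp\!\left(-\frac{Q}{RT}\right)
\quad\Longrightarrow\quad
\tau = \left(\frac{\dot{\gamma}\,d^{p}}{C}\right)^{1/n}\exp\!\left(\frac{Q}{nRT}\right)

% At equal strain rate and temperature, the prefactors cancel:
\frac{\tau_{\mathrm{L3}}}{\tau_{\mathrm{L2}}}
  = \left(\frac{d_{\mathrm{L3}}}{d_{\mathrm{L2}}}\right)^{p/n}
```

Under these assumptions a larger grain size d implies a larger τ, which is the direction of the reported L3 versus L2 torque difference.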

24 pages, 62342 KB  
Article
DCAFuse: A Differential Cross-Attention Transformer Network for Infrared and Visible Image Fusion in UAV-Based Wilderness Search and Rescue
by Yu Jing, Yili Yan, Zhao Li, Fugui Qi, Tao Lei, Jianqi Wang and Guohua Lu
Drones 2026, 10(6), 449; https://doi.org/10.3390/drones10060449 - 9 Jun 2026
Viewed by 269
Abstract
Infrared and visible image fusion is critical for unmanned aerial vehicle (UAV) wilderness search and rescue. By integrating thermal radiation of the targets and texture details of the scenario, it enables accurate search for the wounded and comprehensive perception of disaster areas, thereby significantly improving emergency rescue efficiency. To alleviate data scarcity, we construct UAV-MSR, an infrared-visible dataset for casualty search, comprising 3889 paired images captured under diverse weather, illumination, and scenarios. Existing Transformer-based fusion methods mainly focus on high-intensity pixels while inadequately modeling low-intensity complementary features, resulting in blurred details and degraded target contrast in fused images. To this end, we propose a novel differential cross-attention Transformer network to address the issue of complementary information loss. Specifically, the encoder integrates convolution operations for local detail extraction and self-attention mechanisms for global context modeling. Then, we design a differential cross-attention guided feature fusion module to enhance the representation and preservation of detailed complementary features. Furthermore, a pixel loss function with a segmentation strategy is employed to improve the saliency of the target, enabling the fused image to facilitate subsequent target detection tasks. Experimental results and ablation studies demonstrate that the proposed method achieves notable performance and generalization ability. In summary, this work delivers a multimodal dataset and an efficient infrared-visible image fusion network to enable comprehensive perception for UAVs in wilderness search and rescue scenarios. Full article

29 pages, 15618 KB  
Article
Automated Mapping of Periglacial Landforms on Mars’ Utopia Planitia Using a Multi-Scale Texture-Enhanced U-Net
by Xiaoyi Chang, Shuanggen Jin and Yanchao Zheng
Sensors 2026, 26(12), 3653; https://doi.org/10.3390/s26123653 - 8 Jun 2026
Viewed by 348
Abstract
Martian periglacial landforms are among the clearest surface clues for investigating ground-ice occurrence, climate evolution, and potential habitability on Mars. Utopia Planitia contains abundant ice-related landforms and is therefore well suited to regional-scale mapping of periglacial features. However, most existing identifications still rely heavily on manual interpretation, which is time-consuming and difficult to keep consistent across large image mosaics. In this paper, using Context Camera (CTX) imagery, a dataset of four representative landform types in Utopia Planitia, namely flat-floored depressions, thermal contraction cracks, scalloped depressions, and brain terrain, was built. A Multi-scale Texture-enhanced U-Net (MTU-Net) was then developed as an automated and standardized mapping solution for semantic segmentation of these landforms. The model incorporates hierarchical attention and multi-scale texture enhancement modules, enabling recognition under complex backgrounds where fine-scale landforms such as thermal contraction cracks and brain terrain exhibit only weak textural details, alongside large scale variations. On the held-out test set, MTU-Net reaches a mean intersection over union (mIoU) of 89.55%, a mean F1-score of 94.71%, and a Kappa coefficient of 91.21%, outperforming the baseline U-Net under the same evaluation protocol. The resulting regional maps show marked spatial heterogeneity in the occurrence of the four landform types across Utopia Planitia. This study provides a methodological basis for automated periglacial landform mapping on Mars. Full article
(This article belongs to the Section Environmental Sensing)
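
The three scores reported above (mIoU, mean F1, and Cohen's Kappa) can all be derived from one pixel-level confusion matrix. The sketch below uses the standard definitions and is not the authors' evaluation script; the function name and the convention that rows are ground truth and columns are predictions are assumptions.

```python
def segmentation_metrics(cm):
    """Return mIoU, mean F1 and Cohen's kappa from a square confusion matrix.

    cm[i][j] = number of pixels whose true class is i and predicted class is j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    true_counts = [sum(cm[i]) for i in range(k)]                        # row sums
    pred_counts = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # column sums

    ious, f1s = [], []
    for c in range(k):
        tp = cm[c][c]
        fp = pred_counts[c] - tp
        fn = true_counts[c] - tp
        ious.append(tp / (tp + fp + fn) if (tp + fp + fn) else 0.0)
        f1s.append(2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0)

    # Kappa compares observed agreement with the agreement expected by chance.
    observed = sum(cm[c][c] for c in range(k)) / total
    expected = sum(true_counts[c] * pred_counts[c] for c in range(k)) / (total * total)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0

    return {"mIoU": sum(ious) / k, "mF1": sum(f1s) / k, "kappa": kappa}
```

For a two-class matrix `[[40, 10], [10, 40]]` this gives an mIoU of about 0.667, a mean F1 of 0.8, and a Kappa of 0.6, which shows how the same predictions can score quite differently across the three metrics.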

23 pages, 20700 KB  
Article
Edge-Deployable RGB–Thermal UAV Monitoring for Wildfires in Power Transmission Corridors
by Biao Wang, Daochun Huang, Yifeng Lin, Xu He, Zhengxian Guo and Bo Hong
Remote Sens. 2026, 18(12), 1869; https://doi.org/10.3390/rs18121869 - 6 Jun 2026
Viewed by 377
Abstract
Early wildfire monitoring in power transmission corridors requires reliable detection of weak fire and smoke cues under complex field conditions and strict edge-computing constraints. To address these issues, this paper proposes an edge-deployable RGB–thermal framework based on visible and thermal infrared (TIR) imaging for unmanned aerial vehicle (UAV)-based corridor monitoring, including a spatial detector, YOLO-MMSC, and a temporal-enhanced version, YOLO-MMSC-T. The study also establishes a self-collected corridor-oriented RGB–thermal (RGB–T) dataset to complement public wildfire data. Unlike existing RGB–thermal wildfire datasets that mainly focus on forest or wildland fire scenes, the proposed dataset is specifically organized for complex-background power transmission-corridor monitoring, including continuous UAV sequences, nighttime conditions, smoke/vegetation occlusion, long-range small targets, and hard-negative interference. To the best of our knowledge, this is the first self-collected RGB–thermal wildfire dataset designed for this specific application scenario. The framework integrates a mobile inverted bottleneck convolution (MBConv) lightweight backbone, a Shallow Detail Fusion Module (SDFM) for shallow cross-modal alignment and denoising, a Content-Guided Attention (CGA) module for adaptive fusion, and normalized Wasserstein distance (NWD)-based box regression for long-range small-target localization. Experiments on public and self-collected datasets show that YOLO-MMSC achieves 94.6% mAP@0.5, 95.0% precision, and 93.9% recall while running at 60 FPS on Jetson Orin NX. With temporal fine-tuning, YOLO-MMSC-T reaches a continuous detection rate (CDR) of 95.6% with a jitter index of 2.8 × 10^3. Field experiments using a DJI Matrice 4T further indicate a practical operating altitude of 120–180 m. These results support lightweight RGB–thermal remote sensing for real-time wildfire monitoring in complex transmission-corridor environments. Full article
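
The normalized Wasserstein distance (NWD) mentioned above treats each bounding box as a 2D Gaussian and compares the two distributions rather than their overlap, so the score stays informative even when small boxes do not intersect. The sketch below follows the commonly published form of NWD and is not taken from this paper; the `(cx, cy, w, h)` box layout and the value of the normalizing constant `c` are assumptions (in practice `c` is tuned to the typical object size of the dataset).

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between boxes given as (cx, cy, w, h).

    Each box is modelled as a Gaussian with mean (cx, cy) and standard
    deviations (w / 2, h / 2); the squared 2-Wasserstein distance between
    two such Gaussians has the closed form used below.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    w2_squared = (
        (ax - bx) ** 2
        + (ay - by) ** 2
        + ((aw - bw) / 2) ** 2
        + ((ah - bh) / 2) ** 2
    )
    # Map the distance into (0, 1]; identical boxes give exactly 1.
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Regression loss: 0 for a perfect match, approaching 1 for distant boxes."""
    return 1.0 - nwd(pred_box, target_box, c)
```

Unlike IoU, this score still varies smoothly for two non-overlapping boxes, which gives a usable gradient for long-range small targets.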

36 pages, 10912 KB  
Article
Waterbody Extraction from the Perspective of RGB+X Semantic Segmentation
by Zhechen Yang, Wangrui Zhang, Qi Zhang, Zongbao Hong, Danjie Cheng, Qiao Xu, Yan Meng, Yangjie Sun and Yuxuan Liu
Remote Sens. 2026, 18(11), 1824; https://doi.org/10.3390/rs18111824 - 3 Jun 2026
Viewed by 397
Abstract
Waterbody extraction is of great significance for water resource investigation and monitoring. In addition to RGB bands, most common satellite images have a near-infrared (NIR) band. By combining these RGB-NIR bands, certain water, vegetation, and shadow indices can be calculated. The near-infrared band and these indices are very similar to the X modality in RGB+X data (common examples include RGB-D and RGB-Thermal). However, at present, no studies have thoroughly examined multimodal feature fusion from the RGB+X perspective in order to extract waterbodies with high precision. As a result, existing algorithms do not fully utilize satellite image information and have limited generalization ability. To overcome this limitation, we propose a dual-complexity backbone for waterbody extraction from the perspective of RGB+X data semantic segmentation. Its complex Transformer branch is used to extract RGB modality features, while its simple CNN branch is used to extract X modality features. This network structure can effectively capture multimodal, global, and local features in remote sensing images. It can also fully leverage the fact that the scale of RGB image datasets in computer vision is significantly larger than that of remote sensing waterbody extraction datasets. If a large pretrained model is used in the RGB branch, it is unnecessary to freeze the weights. Instead, both branches can be trained jointly, allowing the RGB branch to better adapt to the remote sensing waterbody extraction task without raising concerns that fine-tuning might undermine the pretrained model’s strong representation capability. We also propose two X modality configurations with strong generalization performance. To fully fuse multimodal features, we design a hybrid fusion module combining a CNN and a cross-attention mechanism. To integrate the multi-scale features, we employ a multi-scale Transformer structure in the RGB branch and design a multi-scale decoder. 
Our algorithm achieves state-of-the-art performance on the GID-5 dataset and competitive performance on the S1S2-Water dataset. Furthermore, it significantly outperforms existing methods in cross-dataset zero-shot transfer between the two datasets, with IoU/F1-score gains of 26.08%/27.33% on GID-5 and 38.74%/31.37% on S1S2-Water over previous SOTA methods. Our processing paradigm of modeling RGB-NIR remote sensing images as RGB+X data shows potential for generalization to other multi-modal remote sensing tasks. The dual-complexity backbone we design also has potential to be extended to other tasks that transfer large pretrained RGB models to remote sensing imagery with RGB-NIR four bands or even more spectral bands. We have open-sourced the code and trained models used in this research. Full article
(This article belongs to the Special Issue Foundation Model-Based Multi-Modal Data Fusion in Remote Sensing)
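
The abstract describes building the X modality from the NIR band and from indices computed over the RGB-NIR bands, without naming the indices. As an illustration only, the sketch below derives two widely used ones, NDWI (green versus NIR) and NDVI (NIR versus red); the choice of these two indices, the function names, and the per-pixel `(r, g, b, nir)` reflectance layout are assumptions, not the paper's configuration.

```python
def normalized_difference(a, b, eps=1e-6):
    """(a - b) / (a + b), with a small eps to avoid division by zero."""
    return (a - b) / (a + b + eps)


def rgbnir_to_x(pixel):
    """Build candidate X-modality channels from one (r, g, b, nir) pixel."""
    r, g, b, nir = pixel
    return {
        "nir": nir,
        # NDWI: open water reflects green but absorbs NIR, so water is positive.
        "ndwi": normalized_difference(g, nir),
        # NDVI: vegetation reflects NIR strongly, so vegetation is positive.
        "ndvi": normalized_difference(nir, r),
    }


# Hypothetical reflectance values, chosen only to show the sign behaviour.
water_pixel = (0.05, 0.10, 0.12, 0.02)
vegetation_pixel = (0.05, 0.08, 0.04, 0.40)
```

Stacking such channels gives the simpler branch an input that already separates water from vegetation and shadow to some degree, which is consistent with assigning it the lighter CNN branch.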

23 pages, 17347 KB  
Article
A Two-Stage Deep Learning Method for Non-Invasive Sow Body Temperature Prediction Fusing Thermal Imaging and Environmental Parameters
by Shengyong Xu, Ziyi Qin, Qiao Huang, Chen Tan, Xuewen Xu and Xuan Li
Animals 2026, 16(11), 1692; https://doi.org/10.3390/ani16111692 - 31 May 2026
Viewed by 284
Abstract
Traditional rectal temperature measurement in pigs induces stress in animals, imposes a heavy labor burden on staff, and increases the risk of cross-infection. This study proposes a non-invasive deep learning approach to predict porcine rectal temperature by combining infrared thermal images of thermal windows with environmental parameters. A multimodal dataset is constructed by synchronously collecting thermal images, environmental parameters, and actual rectal temperatures. Mask Region-based Convolutional Neural Network (Mask R-CNN), You Only Look Once version 8 small (YOLOv8s), and YOLOv11s are employed to automatically detect or segment thermal window regions, from which the maximum temperature of each region is extracted. To enhance model generalization under varying environmental conditions, a two-stage hybrid regression framework is established. In this framework, a Convolutional Neural Network (CNN) extracts spatial features from thermal images, a fully connected network (FCNN) encodes regional surface temperatures and environmental parameters, and a Transformer module captures cross-modal dependencies to generate a preliminary prediction. Subsequently, a Random Forest (RF) regressor is applied for residual correction and final output optimization. Comparative experiments on single-region, dual-region, and triple-region combinations demonstrate that the “eye + vulva” dual-region scheme yields the optimal performance, with a mean absolute error (MAE) of 0.1796 °C and a coefficient of determination (R²) of 0.8212. The prediction error of this scheme is reduced by 42.3% compared with the best-performing unimodal model. The proposed method provides a fast, accurate, and stress-free solution for porcine body temperature monitoring, thereby supporting the development of intelligent health management in livestock farming. Full article
(This article belongs to the Section Pigs)
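
The two scores used to rank the region combinations above are the mean absolute error and the coefficient of determination. A minimal sketch of both follows, using the standard definitions; the function names are illustrative and the sample temperatures are invented for the example, not taken from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented rectal temperatures in degrees Celsius, for illustration only.
measured = [38.0, 38.5, 39.0, 39.5]
predicted = [38.1, 38.4, 39.2, 39.4]
```

Because body temperature varies over a narrow range, the total sum of squares is small, so R² can look modest even when the MAE is only a fraction of a degree; reporting both, as the abstract does, avoids over-reading either one.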

22 pages, 20012 KB  
Article
A Detail-Preserving Multi-Scale Cascaded Network for Infrared Rotary Kiln Shell Temperature Recognition and Refractory Lining Assessment
by Jie Li, Jianxin He, Hao Liu, Yunhan Hou, Zhiming Dong and Qian Zhang
Metals 2026, 16(6), 597; https://doi.org/10.3390/met16060597 - 29 May 2026
Viewed by 172
Abstract
Rotary kiln shell temperature monitoring is essential for metallic shell protection and refractory lining maintenance in high-temperature industrial processes, while smoke, dust, thermal diffusion and non-kiln heat sources make valid shell temperature extraction difficult. This study develops a multi-scale cascaded network with low-resolution space-to-depth downsampling (MSC-LSTD) for infrared kiln shell segmentation and temperature recognition. Global infrared thermal images and local laser temperature measurements are used to construct a calibrated rotary kiln infrared dataset, and predicted kiln shell masks are mapped to temperature matrices for valid shell temperature analysis. MSC-LSTD achieves 99.82% aAcc, 99.14% mAcc and 97.03% mIoU on the rotary kiln infrared dataset, showing robust segmentation performance under complex thermal interference. The proposed framework provides a practical image-based solution for kiln shell overheating warning and refractory lining degradation assessment. Full article
(This article belongs to the Section Computation and Simulation on Metals)

28 pages, 5603 KB  
Article
The Thermodynamics of Attention: First Law and Landauer Limit Analogues for Learning and Explainability
by Roberto C. Sotero and Jose M. Sanchez-Bornot
AI 2026, 7(6), 194; https://doi.org/10.3390/ai7060194 - 26 May 2026
Viewed by 403
Abstract
The Transformer architecture drives modern Artificial Intelligence (AI), yet the physical principles that may constrain self-attention training remain poorly characterized. We develop a thermodynamic framework for attention training, drawing on the established Boltzmann correspondence between softmax attention and equilibrium statistical mechanics, and we propose a First Law analogue that decomposes the training energy budget into a heat term (the entropic cost of ordering attention) and a work term (the gain in mutual information about the target). From this framework we derive a Landauer-type bound on learning, which states that the loss reduction during training is bounded below by the entropic cost of structuring attention against thermal noise. The bound is satisfied across all configurations tested: 625 grid points spanning three datasets on a compact Vision Transformer trained from scratch (MNIST, CIFAR-10, and OrganAMNIST), and ten temperatures on a pretrained ViT-Small fine-tuned on Food-101. Reusing the same physical principles at inference time, we show that the thermodynamic work performed by each input patch provides a quantitative, energy-based measure of feature importance that outperforms standard attention weights and Integrated Gradients on ImageNet across pretrained ViT-Small, ViT-Base, and ViT-Large (22M to 304M parameters). The result is an integrated diagnostic framework that links phase structure, training-time bounds, and inference-time attribution within a single empirically falsifiable thermodynamic apparatus. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning and Emerging Applications)
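
The Boltzmann correspondence the abstract builds on is that softmax attention over scores s_i at temperature T has the same form as a Boltzmann distribution over energies E_i = −s_i. The sketch below illustrates only that correspondence and the resulting attention entropy; it does not implement the paper's First Law decomposition or Landauer-type bound, and the function names are assumptions.

```python
import math


def attention_weights(scores, temperature=1.0):
    """Softmax over scores, i.e. a Boltzmann distribution with E_i = -score_i."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    z = sum(exps)    # partition function (up to the constant factored out above)
    return [e / z for e in exps]


def shannon_entropy(probs):
    """Entropy in nats; terms with zero probability contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


scores = [2.0, 1.0, 0.0]
cold = shannon_entropy(attention_weights(scores, temperature=0.1))
hot = shannon_entropy(attention_weights(scores, temperature=10.0))
```

Lowering the temperature concentrates attention on the highest-scoring token and drives the entropy toward zero, while raising it pushes the weights toward uniform and the entropy toward log(n); this ordering of attention is the kind of entropic change the abstract's heat term refers to.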

17 pages, 3604 KB  
Article
A Method for Down Quality Inspection: YOLO-Based Impurity Detection and Quality Quantification
by Shaowen Jing, Ruoyi Mai, Xiaofeng Gao, Weiyi Du, Ruipu Zhao, Chengran Luo and Zhihui Fan
Appl. Sci. 2026, 16(10), 5086; https://doi.org/10.3390/app16105086 - 20 May 2026
Viewed by 297
Abstract
Down quality is the core evaluation indicator of thermal insulation products, and its grade determination strictly complies with the down content index specified in the national standard GB/T 17685-2016 Feather and Down. Traditional down quality inspection adopts manual sorting and weighing methods, which are plagued by low efficiency, strong subjectivity and high error rates, thereby restricting the intelligent upgrading of the down industry. This study aims to develop an automatic down detection and quantitative grading method conforming to national standards based on deep learning. A down dataset consisting of 632 RGB images is constructed, with each image containing 5–10 individual down samples and covering five categories: mature down clusters, immature down clusters, down filaments, feathers, and yellow-tail down. Three mainstream frameworks including YOLOv8, YOLOv11 and YOLOv26 are trained for performance comparison. Precision, recall, mAP@50 and mAP@50-95 are adopted as evaluation metrics. In addition, this paper proposes a research idea for down content calculation and automatic classification and grading of down quality in accordance with relevant national standards. The experimental results demonstrate that the latest models do not necessarily achieve the optimal performance. The newly released YOLOv26n and YOLOv26m exhibit relatively low accuracy in the down detection task, with mAP@50 values of only 0.98556 and 0.99077, and recall rates of 0.95032 and 0.97848, respectively, failing to outperform their previous-generation counterparts. In contrast, YOLOv11n achieves the best comprehensive performance, with an mAP@50 of 0.99416, a precision of 0.99544, a recall of 0.99722, and an mAP@50-95 of 0.63464. Meanwhile, the model has only 2.58 M parameters, a computational complexity of 6.3 GFLOPs, and a single training time of approximately 6.7 min, achieving an optimal balance between detection accuracy and computational efficiency. 
All models show the highest detection accuracy for mature down clusters and yellow-tailed down, while slight confusion exists between immature down clusters and down filaments. This study verifies the feasibility of the YOLO series models in down quality inspection in accordance with national standards, and reveals that model architecture iteration does not necessarily lead to performance improvement on specific industrial datasets. The lightweight and robustly designed YOLOv11n presents greater practical value. The intelligent detection scheme proposed in this paper can assist in optimizing the traditional manual quality inspection workflow, alleviating the burden of manual counting and reducing subjective errors. It provides new ideas and technical references for the rapid screening and objective determination of down quality. Furthermore, the proposed research framework for automatic classification and grading of down quality is expected to promote the development of down quality inspection toward standardization, intelligence, and automation in the future. Full article

15 pages, 3297 KB  
Article
A Weakly Supervised Multi-Scale Cross-Modal Information Fusion Method for Wildfire Detection
by Dawei Wen, Zhoujiang Peng and Yuan Tian
Computers 2026, 15(5), 311; https://doi.org/10.3390/computers15050311 - 14 May 2026
Viewed by 301
Abstract
In recent years, wildfires have occurred with increasing frequency. Pixel-level annotation of high-resolution remote sensing wildfire imagery is costly and labor-intensive. Therefore, there is an urgent need for a weakly supervised wildfire detection method that balances detection accuracy and annotation efficiency. To address the key limitations of existing weakly supervised approaches based on class activation maps (CAMs), including imprecise delineation of fire boundaries, insufficient utilization of cross-modal information, and limited capability in modeling temporal characteristics, this paper proposes a dual-branch multi-scale feature fusion framework for weakly supervised wildfire detection. The proposed framework consists of a multispectral branch and a shortwave infrared (SWIR) temporal branch, which are designed to capture the spatial structural information of fire regions and the temporal variation of thermal anomalies, respectively. Attention-guided feature fusion modules are introduced at each network stage to enable complementary integration of cross-modal information. In addition, a multi-scale CAM-weighted fusion strategy is designed to jointly enhance region localization accuracy and semantic discrimination capability. Experimental evaluations are conducted on a high-resolution wildfire dataset covering 29 regions and consisting of 2206 images. The results demonstrate that the proposed method achieves an IoU of 58.7% and an F1-score of 73.5%, outperforming the state-of-the-art methods by 4.6% and 3.2%, respectively. Ablation and comparative experiments further verify that the dual-branch architecture and feature fusion strategy significantly improve fire localization accuracy and effectively reduce the missed detection rate. Full article

34 pages, 2306 KB  
Review
A Review of Explainable Machine Learning in Medical Thermography for Interpretable Thermal Feature Analysis and Biomarker Discovery
by Muhammad Sohail, Hikmat Yar and Heung Soo Kim
Mathematics 2026, 14(10), 1666; https://doi.org/10.3390/math14101666 - 13 May 2026
Viewed by 365
Abstract
Medical thermography is a noninvasive, contactless imaging technique that captures spatial temperature distributions across the human body, providing insights into vascular function, inflammation, metabolism, physiological regulation, and aging. Recently, machine learning has been increasingly utilized to analyze thermographic data for disease screening, functional assessment, and biomarker identification. However, the existing literature is fragmented, with varied clinical applications, feature-engineering strategies, and predictive modeling frameworks, often lacking a focus on interpretability and the reliable identification of clinically relevant thermal markers. This review offers a structured overview of explainable machine learning in medical thermography, emphasizing thermal feature representation, model interpretability, and biomarker discovery. It categorizes thermographic features into pixel-based representations, region-wise statistical descriptors, texture measures, and deep latent features. Additionally, it evaluates conventional machine learning and deep learning methods for classification, regression, and risk assessment tasks. The review pays special attention to interpretable learning strategies, such as feature importance analysis, surrogate explanation models, saliency-based visualization, and Shapley-value-based methods, which can enhance transparency and confidence in model outputs. Key challenges are critically discussed, including imaging variability, limited dataset sizes, weak protocol standardization, class imbalance, generalizability, and the gap between predictive performance and clinical trust. Overall, this review synthesizes current advancements, identifies major research gaps, and outlines future directions for developing trustworthy machine learning frameworks in medical thermography and enhancing interpretable thermal biomarker discovery.
(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Systems)

29 pages, 19640 KB  
Article
Target-Aware Fusion: A Diffusion Model for Infrared and Visible Image Integration to Enhance Object Detection
by Jinyong Chen, Tingyu Zhu and Gang Wang
Remote Sens. 2026, 18(10), 1545; https://doi.org/10.3390/rs18101545 - 13 May 2026
Viewed by 285
Abstract
Infrared and visible-light images differ in their imaging characteristics. Visible-light images provide rich texture and color information, but imaging degrades in harsh weather. Infrared images capture the target's thermal radiation and resist environmental interference, but they lack detail and background information. Effectively integrating the two can significantly enhance scene understanding and improve environmental perception and target recognition in applications such as intelligent driving. However, existing fusion methods still face challenges, especially in complex scenes, where it is difficult to balance full preservation of target information with complete presentation of background details and to extract differentiated features from the different modalities. This article proposes a target detection method based on a visible-infrared fusion diffusion model. The method builds on the Stable Diffusion architecture and introduces a target-aware spatial fusion weight module that adaptively generates a spatial fusion weight map from modal differences. A multi-stage dynamic fusion strategy automatically adjusts the fusion ratio at different diffusion stages, and a full-step multi-step prediction mechanism is adopted to improve fusion quality and stability. Experiments on multiple publicly available datasets show that the method outperforms existing mainstream methods on key metrics such as Peak Signal-to-Noise Ratio (PSNR), Mean Square Error (MSE), and Mean Absolute Error (MAE), and that it also performs well in downstream object detection.
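As a reference for the three image-quality metrics named in this abstract, here is a minimal sketch of their standard definitions. It is not taken from the article: the function names, the flat-list pixel representation, the 8-bit peak value of 255, and the sample values are all assumptions for illustration.

```python
import math


def mse(a, b):
    """Mean Square Error between two equally long pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mae(a, b):
    """Mean Absolute Error between two equally long pixel sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB; max_value=255 assumes 8-bit images."""
    err = mse(a, b)
    if err == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / err)


# Hypothetical flattened 2x2 images (fused result vs. reference).
fused = [100, 150, 200, 250]
reference = [110, 140, 200, 250]
print(mse(fused, reference), mae(fused, reference), psnr(fused, reference))
```

Lower MSE and MAE indicate a closer match to the reference, whereas higher PSNR does, so "outperforming" means opposite directions for these metrics.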

13 pages, 2993 KB  
Article
Enhancing Catheter-Assisted C-Arm CT-Guided Ablation with PET/CT Fusion: A Pictorial Overview of Multimodal Synergy for Improving Local Tumor Control in Liver Metastasis
by Laurens Hermie, Charlotte Harth, Kathia De Man, Alexander Decruyenaere, Celine Jacobs and Karen Geboes
Cancers 2026, 18(10), 1584; https://doi.org/10.3390/cancers18101584 - 13 May 2026
Viewed by 421
Abstract
Background/Objectives: Image-guided percutaneous thermal ablation is an established local treatment for selected patients with liver metastases, provided that accurate tumor targeting and adequate ablation margins can be achieved. However, lesion detection, target delineation, and intraprocedural margin verification remain challenging in post-chemotherapy or previously treated lesions that may become morphologically inconspicuous or radiologically occult. Catheter-assisted C-arm (cone-beam) CT hepatic arteriography (CBCT-HA) improves intraprocedural visualization of tumor vascularity and supports streamlined workflows within the angiography suite, yet it may underestimate tumor extent in lesions with limited or absent angiographic conspicuity. This pictorial essay illustrates the feasibility and added value of integrating preprocedural PET/CT with intraprocedural CBCT-HA for liver tumor ablation. Methods: Representative clinical cases of percutaneous liver tumor ablation guided by PET–CBCT-HA fusion are presented. Preprocedural PET/CT datasets were rigidly registered and fused with intraprocedural CBCT-HA to support tumor detection, target delineation, ablation planning, and real-time intraprocedural margin assessment. The complementary roles of metabolic and angiographic imaging were evaluated qualitatively across different clinical scenarios. Results: PET–CBCT-HA fusion improved detection and delineation of viable tumor components that were occult or insufficiently defined on CBCT-HA alone, particularly in post-chemotherapy or previously treated lesions. Conversely, CBCT-HA identified angiographically evident lesions not apparent on PET/CT. The combined approach enabled confident target definition, biologically informed ablation planning, and immediate post-ablation verification of metabolic and angiographic coverage, supporting margin-oriented intraprocedural decision-making. 
Conclusions: By integrating complementary metabolic and vascular information into a single-session workflow, PET–CBCT-HA fusion represents a multimodal guidance strategy that enhances lesion visualization and intraprocedural margin assessment. This approach may improve local tumor control in complex post-treatment and oligometastatic liver disease.
(This article belongs to the Special Issue Image-Guided Treatment of Liver Tumors)