Search Results (1,107)

Search Parameters:
Keywords = Unet architecture

27 pages, 2292 KB  
Article
Source Camera Identification via Explicit Content–Fingerprint Decoupling with a Dual-Branch Deep Learning Framework
by Zijuan Han, Yang Yang, Jiaxuan Lu, Jian Sun, Yunxia Liu and Ngai-Fong Bonnie Law
Appl. Sci. 2026, 16(3), 1245; https://doi.org/10.3390/app16031245 - 26 Jan 2026
Abstract
In this paper, we propose a source camera identification method based on disentangled feature modeling, aiming to achieve robust extraction of camera fingerprint features under complex imaging and post-processing conditions. To address the severe coupling between image content and camera fingerprint features in existing methods, which makes content interference difficult to suppress, we develop a dual-branch deep learning framework guided by imaging physics. By introducing physical consistency constraints, the proposed framework explicitly separates image content representations from device-related fingerprint features in the feature space, thereby enhancing the stability and robustness of source camera identification. The proposed method adopts two parallel branches: a content modeling branch and a fingerprint feature extraction branch. The content branch is built upon an improved U-Net architecture to reconstruct scene and color information, and further incorporates texture refinement and multi-scale feature fusion to reduce residual content interference in fingerprint modeling. The fingerprint branch employs ResNet-50 as the backbone network to learn discriminative global features associated with the camera imaging pipeline. Based on these branches, fingerprint information dominated by sensor noise is explicitly extracted by computing the residual between the input image and the reconstructed content, and is further encoded through noise analysis and feature fusion for joint camera model classification. Experimental results on multiple public-source camera forensics datasets demonstrate that the proposed method achieves stable and competitive identification performance in same-brand camera discrimination, complex imaging conditions, and post-processing scenarios, validating the effectiveness of the proposed disentangled modeling and physical consistency constraint strategy for source camera identification. Full article
(This article belongs to the Special Issue New Development in Machine Learning in Image and Video Forensics)
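The core decoupling step described above, computing the residual between the input image and the reconstructed content so that sensor-noise-dominated fingerprint information remains, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation; the function name and toy data are hypothetical.

```python
import numpy as np

def fingerprint_residual(image: np.ndarray, reconstructed_content: np.ndarray) -> np.ndarray:
    """Residual dominated by sensor noise: the input minus the content
    branch's reconstruction (a stand-in for the U-Net output in the paper)."""
    return image.astype(np.float64) - reconstructed_content.astype(np.float64)

# Toy example: scene content plus additive sensor-like noise.
rng = np.random.default_rng(0)
content = rng.uniform(0, 255, (8, 8))       # "reconstructed" scene content
noise = rng.normal(0, 2.0, (8, 8))          # device-related noise pattern
observed = content + noise
residual = fingerprint_residual(observed, content)  # recovers the noise here
```

In the actual framework this residual would then be encoded by the noise-analysis and fusion modules before classification; here a perfect content reconstruction makes the residual equal the injected noise.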

26 pages, 4765 KB  
Article
Hybrid ConvLSTM U-Net Deep Neural Network for Land Use and Land Cover Classification from Multi-Temporal Sentinel-2 Images: Application to Yaoundé, Cameroon
by Ange Gabriel Belinga, Stéphane Cédric Tékouabou Koumetio and Mohammed El Haziti
Math. Comput. Appl. 2026, 31(1), 18; https://doi.org/10.3390/mca31010018 - 26 Jan 2026
Abstract
Accurate mapping of land use and land cover (LULC) is crucial for applications such as urban planning, environmental management, and sustainable development, particularly in rapidly growing urban areas. African cities such as Yaoundé, Cameroon, are especially affected by this rapid and often uncontrolled urban growth, which exhibits complex spatio-temporal dynamics. Effective modeling of LULC indicators in such areas requires robust algorithms for high-resolution image segmentation and classification, as well as reliable data with broad spatio-temporal coverage. Among the most suitable data sources for such studies, Sentinel-2 image time series are particularly valuable thanks to their high spatial (10 m) and temporal (5-day) resolution. However, for effective LULC modeling in such dynamic areas, many challenges remain, including spectral confusion between certain classes, seasonal variability, and spatial heterogeneity. This study proposes a hybrid deep learning architecture combining U-Net and Convolutional Long Short-Term Memory (ConvLSTM) layers, allowing the spatial structures and temporal dynamics of the Sentinel-2 series to be exploited jointly. Applied to the Yaoundé region (Cameroon) over the period 2018–2025, the hybrid model significantly outperforms the U-Net and ConvLSTM models alone. It achieves a macro-average F1 score of 0.893, an accuracy of 0.912, and an average IoU of 0.811 on the test set. These segmentation performances reached up to 0.948, 0.953, and 0.910 for precision, F1-score, and IoU, respectively, on the built-up areas class. Moreover, despite its better performance, the complexity figures confirm that the hybrid model does not significantly penalize evaluation speed. These results demonstrate the relevance of jointly integrating space and time for robust LULC classification from multi-temporal satellite images. Full article
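The macro-average F1 and mean IoU scores reported for the hybrid model can be computed from a per-class confusion matrix. Below is a small NumPy sketch of those two metrics; the confusion matrix values are illustrative, not from the study.

```python
import numpy as np

def macro_f1_and_miou(cm: np.ndarray):
    """Per-class F1 and IoU from a square confusion matrix
    (rows = ground truth, columns = prediction), macro-averaged."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp          # predicted as class but wrong
    fn = cm.sum(axis=1) - tp          # true class missed
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return f1.mean(), iou.mean()

# Hypothetical 3-class LULC confusion matrix.
cm = np.array([[50, 2, 1],
               [3, 40, 2],
               [1, 1, 45]])
macro_f1, miou = macro_f1_and_miou(cm)
```

Since IoU = F1 / (2 − F1) per class, mean IoU is always at most the macro F1, which matches the pattern in the reported scores (0.811 vs. 0.893).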

18 pages, 4674 KB  
Article
AI Correction of Smartphone Thermal Images: Application to Diabetic Plantar Foot
by Hafid Elfahimi, Rachid Harba, Asma Aferhane, Hassan Douzi and Ikram Damoune
J. Sens. Actuator Netw. 2026, 15(1), 13; https://doi.org/10.3390/jsan15010013 - 26 Jan 2026
Abstract
Prevention of complications related to diabetic foot (DF) can now be performed using smartphone-connected thermal cameras. However, the absolute error associated with these devices remains particularly high, compromising measurement reliability, especially under variable environmental conditions. To address this, we introduce a physiologically motivated two-region segmentation task (forehead + plantar foot) to enable stable temperature correction. First, we developed a fully automated joint method for this task, building upon a new multimodal thermal–RGB dataset constructed with detailed annotation procedures. Five deep learning methods (U-Net, U-Net++, SegNet, DE-ResUnet, and DE-ResUnet++) were evaluated and compared to traditional baselines (Adaptive Thresholding and Region Growing), demonstrating the clear advantage of data-driven approaches. The best performance was achieved by the DE-ResUnet++ architecture (Dice score: 98.46%). Second, we validated the correction approach through a clinical study. Results showed that the variance of corrected temperatures was reduced by half compared to absolute values (p < 0.01), highlighting the effectiveness of the correction approach. Furthermore, corrected temperatures successfully distinguished DF patients from healthy controls (p < 0.01), unlike absolute temperatures. These findings suggest that our approach could enhance the performance of smartphone-connected thermal devices and contribute to the early prevention of DF complications. Full article
(This article belongs to the Special Issue IoT and Networking Technologies for Smart Mobile Systems)
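The Dice score used above to rank the segmentation networks (98.46% for DE-ResUnet++) is a standard overlap measure for binary masks. A minimal NumPy version, with toy masks standing in for forehead/plantar-foot segmentations:

```python
import numpy as np

def dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient for binary masks: 2*|A∩B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float(2 * inter / (pred.sum() + target.sum() + eps))

# Toy masks: 4-pixel square vs. overlapping 6-pixel rectangle.
a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1
score = dice(a, b)   # 2*4 / (4 + 6) = 0.8
```

The `eps` term only guards against empty masks; for reporting, Dice is usually averaged over test images, as the percentages above suggest.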

45 pages, 2071 KB  
Systematic Review
Artificial Intelligence Techniques for Thyroid Cancer Classification: A Systematic Review
by Yanche Ari Kustiawan, Khairil Imran Ghauth, Sakina Ghauth, Liew Yew Toong and Sien Hui Tan
Mach. Learn. Knowl. Extr. 2026, 8(2), 27; https://doi.org/10.3390/make8020027 - 23 Jan 2026
Abstract
Artificial intelligence (AI), particularly machine learning and deep learning architectures, has been widely applied to support thyroid cancer diagnosis, but existing evidence on its performance and limitations remains scattered across techniques, tasks, and data types. This systematic review synthesizes recent work on knowledge extraction from heterogeneous imaging and clinical data for thyroid cancer diagnosis and detection published between 2021 and 2025. We searched eight major databases, applied predefined inclusion and exclusion criteria, and assessed study quality using the Newcastle–Ottawa Scale. A total of 150 primary studies were included and analyzed with respect to AI techniques, diagnostic tasks, imaging and non-imaging modalities, model generalization, explainable AI, and recommended future directions. We found that deep learning, particularly convolutional neural networks, U-Net variants, and transformer-based models, dominated recent work, mainly for ultrasound-based benign–malignant classification, nodule detection, and segmentation, while classical machine learning, ensembles, and advanced paradigms remained important in specific structured-data settings. Ultrasound was the primary modality, complemented by cytology, histopathology, cross-sectional imaging, molecular data, and multimodal combinations. Key limitations included diagnostic ambiguity, small and imbalanced datasets, limited external validation, gaps in model generalization, and the use of largely non-interpretable black-box models with only partial use of explainable AI techniques. This review provides a structured, machine learning-oriented evidence map that highlights opportunities for more robust representation learning, workflow-ready automation, and trustworthy AI systems for thyroid oncology. Full article
(This article belongs to the Section Thematic Reviews)

18 pages, 10969 KB  
Article
Simulation Data-Based Dual Domain Network (Sim-DDNet) for Motion Artifact Reduction in MR Images
by Seong-Hyeon Kang, Jun-Young Chung, Youngjin Lee and for The Alzheimer’s Disease Neuroimaging Initiative
Magnetochemistry 2026, 12(1), 14; https://doi.org/10.3390/magnetochemistry12010014 - 20 Jan 2026
Abstract
Brain magnetic resonance imaging (MRI) is highly susceptible to motion artifacts that degrade fine structural details and undermine quantitative analysis. Conventional U-Net-based deep learning approaches for motion artifact reduction typically operate only in the image domain and are often trained on data with simplified motion patterns, thereby limiting physical plausibility and generalization. We propose Sim-DDNet, a simulation-data-based dual-domain network that combines k-space-based motion simulation with a joint image-k-space reconstruction architecture. Motion-corrupted data were generated from T2-weighted Alzheimer’s Disease Neuroimaging Initiative brain MR scans using a k-space replacement scheme with three to five random rotational and translational events per volume, yielding 69,283 paired samples (49,852/6969/12,462 for training/validation/testing). Sim-DDNet integrates a real-valued U-Net-like image branch and a complex-valued k-space branch using cross attention, FiLM-based feature modulation, soft data consistency, and composite loss comprising L1, structural similarity index measure (SSIM), perceptual, and k-space-weighted terms. On the independent test set, Sim-DDNet achieved a peak signal-to-noise ratio of 31.05 dB, SSIM of 0.85, and gradient magnitude similarity deviation of 0.077, consistently outperforming U-Net and U-Net++ across all three metrics while producing less blurring, fewer residual ghost/streak artifacts, and reduced hallucination of non-existent structures. These results indicate that dual-domain, data-consistency-aware learning, which explicitly exploits k-space information, is a promising approach for physically plausible motion artifact correction in brain MRI. Full article
(This article belongs to the Special Issue Magnetic Resonances: Current Applications and Future Perspectives)
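The k-space replacement scheme described above, in which segments of phase-encode lines are overwritten with the k-space of a transformed copy of the image, can be illustrated in a few lines of NumPy. This is a simplified toy (single 2D slice, one translational event); the paper uses three to five random rotational and translational events per volume.

```python
import numpy as np

def simulate_motion_kspace(img: np.ndarray, shift: tuple, corrupt_rows: slice) -> np.ndarray:
    """Toy k-space replacement: overwrite a band of phase-encode rows in the
    still image's k-space with rows from a translated copy, then reconstruct."""
    k_still = np.fft.fft2(img)
    moved = np.roll(img, shift, axis=(0, 1))       # rigid translation = one motion event
    k_moved = np.fft.fft2(moved)
    k = k_still.copy()
    k[corrupt_rows, :] = k_moved[corrupt_rows, :]  # rows "acquired" during motion
    return np.abs(np.fft.ifft2(k))

rng = np.random.default_rng(1)
img = rng.uniform(0, 1, (32, 32))
corrupted = simulate_motion_kspace(img, (3, 0), slice(8, 16))
```

Because the corruption lives in k-space, the resulting artifacts are global ghosting/streaking rather than a local shift, which is why dual-domain (image + k-space) correction is attractive.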

32 pages, 3054 KB  
Article
Identification of Cholesterol in Plaques of Atherosclerotic Using Magnetic Resonance Spectroscopy and 1D U-Net Architecture
by Angelika Myśliwiec, Dawid Leksa, Avijit Paul, Marvin Xavierselvan, Adrian Truszkiewicz, Dorota Bartusik-Aebisher and David Aebisher
Molecules 2026, 31(2), 352; https://doi.org/10.3390/molecules31020352 - 19 Jan 2026
Abstract
Cholesterol plays a fundamental role in the human body—it stabilizes cell membranes, modulates gene expression, and is a precursor to steroid hormones, vitamin D, and bile salts. Its correct level is crucial for homeostasis, while both excess and deficiency are associated with serious metabolic and health consequences. Excessive accumulation of cholesterol leads to the development of atherosclerosis, while its deficiency disrupts the transport of fat-soluble vitamins. Magnetic resonance spectroscopy (MRS) enables the detection of cholesterol esters and the differentiation between their liquid and crystalline phases, but the technical limitations of clinical MRI systems require the use of dedicated coils and sequence modifications. This study demonstrates the feasibility of using MRS to identify cholesterol-specific spectral signatures in atherosclerotic plaque through ex vivo analysis. Using a custom-designed experimental coil adapted for small-volume samples, we successfully detected characteristic cholesterol peaks from plaque material dissolved in chloroform, with spectral signatures corresponding to established NMR databases. To further enhance spectral quality, a deep-learning denoising framework based on a 1D U-Net architecture was implemented, enabling the recovery of low-intensity cholesterol peaks that would otherwise be obscured by noise. The trained U-Net was applied to experimental MRS data from atherosclerotic plaques, where it significantly outperformed traditional denoising methods (Gaussian, Savitzky–Golay, wavelet, median) across six quantitative metrics (SNR, PSNR, SSIM, RMSE, MAE, correlation), enhancing low-amplitude cholesteryl ester detection. This approach substantially improved signal clarity and the interpretability of cholesterol-related resonances, supporting more accurate downstream spectral assessment. 
The integration of MRS with NMR-based lipidomic analysis, which allows the identification of lipid signatures associated with plaque progression and destabilization, is becoming increasingly important. At the same time, the development of high-resolution techniques such as μOCT provides evidence for the presence of cholesterol crystals and their potential involvement in the destabilization of atherosclerotic lesions. In summary, nanotechnology-assisted MRI has the potential to become an advanced tool in the proof-of-concept of atherosclerosis, enabling not only the identification of cholesterol and its derivatives, but also the monitoring of treatment efficacy. However, further clinical studies are necessary to confirm the practical usefulness of these solutions and their prognostic value in assessing cardiovascular risk. Full article
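The SNR metric used above to compare denoisers on 1D spectra can be sketched as follows. This is not the paper's 1D U-Net; it only shows the reference-based SNR-in-dB computation, with a crude moving-average filter standing in for a denoiser and a synthetic Gaussian peak standing in for a cholesterol resonance.

```python
import numpy as np

def snr_db(clean: np.ndarray, estimate: np.ndarray) -> float:
    """SNR (dB) of an estimate against a clean reference spectrum:
    10*log10(signal energy / residual-error energy)."""
    err = estimate - clean
    return float(10 * np.log10(np.sum(clean ** 2) / np.sum(err ** 2)))

rng = np.random.default_rng(2)
x = np.linspace(-1, 1, 512)
clean = np.exp(-(x / 0.1) ** 2)                      # synthetic spectral peak
noisy = clean + rng.normal(0, 0.05, x.size)
smoothed = np.convolve(noisy, np.ones(5) / 5, mode="same")  # toy denoiser
gain = snr_db(clean, smoothed) - snr_db(clean, noisy)       # positive = improvement
```

A learned denoiser is evaluated the same way, alongside PSNR, SSIM, RMSE, MAE, and correlation, by comparing its output against the clean reference.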

20 pages, 4501 KB  
Article
Improving Prostate Cancer Segmentation on T2-Weighted MRI Using Prostate Detection and Cascaded Networks
by Nikolay Nefediev, Nikolay Staroverov and Roman Davydov
Algorithms 2026, 19(1), 85; https://doi.org/10.3390/a19010085 - 19 Jan 2026
Abstract
Prostate cancer is one of the most lethal cancers in the male population, and accurate localization of intraprostatic lesions on MRI remains challenging. In this study, we investigated methods for improving prostate cancer segmentation on T2-weighted pelvic MRI using cascaded neural networks. We used an anonymized dataset of 400 multiparametric MRI scans from two centers, in which experienced radiologists had delineated the prostate and clinically significant cancer on the T2 series. Our baseline approach applies 2D and 3D segmentation networks (UNETR, UNET++, Swin-UNETR, SegResNetDS, and SegResNetVAE) directly to full MRI volumes. We then introduce additional stages that filter slices using DenseNet-201 classifiers (cancer/no-cancer and prostate/no-prostate) and localize the prostate via a YOLO-based detector to crop the 3D region of interest before segmentation. Using Swin-UNETR as the backbone, the prostate segmentation Dice score increased from 71.37% for direct 3D segmentation to 76.09% when using prostate detection and cropped 3D inputs. For cancer segmentation, the final cascaded pipeline—prostate detection, 3D prostate segmentation, and 3D cancer segmentation within the prostate—improved the Dice score from 55.03% for direct 3D segmentation to 67.11%, with an ROC AUC of 0.89 on the test set. These results suggest that cascaded detection- and segmentation-based preprocessing of the prostate region can substantially improve automatic prostate cancer segmentation on MRI while remaining compatible with standard segmentation architectures. Full article
(This article belongs to the Special Issue AI-Powered Biomedical Image Analysis)
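The key preprocessing idea above, detecting the prostate and cropping the 3D region of interest before segmentation, amounts to a bounding-box crop with a safety margin. A minimal NumPy sketch (the function name, margin, and box coordinates are illustrative, not from the paper):

```python
import numpy as np

def crop_roi(volume: np.ndarray, bbox: tuple, margin: int = 4) -> np.ndarray:
    """Crop a 3D volume to a detected bounding box plus a margin,
    clamped to the volume bounds. bbox = (z0, z1, y0, y1, x0, x1)."""
    z0, z1, y0, y1, x0, x1 = bbox
    Z, Y, X = volume.shape
    return volume[max(z0 - margin, 0):min(z1 + margin, Z),
                  max(y0 - margin, 0):min(y1 + margin, Y),
                  max(x0 - margin, 0):min(x1 + margin, X)]

vol = np.zeros((32, 64, 64))                      # toy T2 volume
roi = crop_roi(vol, (10, 20, 16, 48, 16, 48))     # detector output (hypothetical)
```

Feeding the segmentation network the cropped ROI instead of the full pelvic volume concentrates its capacity on the prostate, which is consistent with the Dice gains reported above.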

16 pages, 1725 KB  
Article
A Lightweight Modified Adaptive UNet for Nucleus Segmentation
by Md Rahat Kader Khan, Tamador Mohaidat and Kasem Khalil
Sensors 2026, 26(2), 665; https://doi.org/10.3390/s26020665 - 19 Jan 2026
Abstract
Cell nucleus segmentation in microscopy images is an initial step in the quantitative analysis of imaging data, which is crucial for diverse biological and biomedical applications. While traditional machine learning methodologies have demonstrated limitations, recent advances in U-Net models have yielded promising improvements. However, it is noteworthy that these models perform well on balanced datasets, where the ratio of background to foreground pixels is equal. Within the realm of microscopy image segmentation, state-of-the-art models often encounter challenges in accurately predicting small foreground entities such as nuclei. Moreover, the majority of these models exhibit large parameter sizes, predisposing them to overfitting issues. To overcome these challenges, this study introduces a novel architecture, called mA-UNet, designed to excel in predicting small foreground elements. Additionally, a data preprocessing strategy inspired by road segmentation approaches is employed to address dataset imbalance issues. The experimental results show that the MIoU score attained by the mA-UNet model stands at 95.50%, surpassing the nearest competitor, UNet++, on the 2018 Data Science Bowl dataset. Ultimately, our proposed methodology surpasses all other state-of-the-art models in terms of both quantitative and qualitative evaluations. The mA-UNet model is also implemented in VHDL on the Zynq UltraScale+ FPGA, demonstrating its ability to perform complex computations with minimal hardware resources, as well as its efficiency and scalability on advanced FPGA platforms. Full article
(This article belongs to the Special Issue Sensing and Processing for Medical Imaging: Methods and Applications)

23 pages, 8167 KB  
Article
MRMAFusion: A Multi-Scale Restormer and Multi-Dimensional Attention Network for Infrared and Visible Image Fusion
by Liang Dong, Guiling Sun, Haicheng Zhang and Wenxuan Luo
Appl. Sci. 2026, 16(2), 946; https://doi.org/10.3390/app16020946 - 16 Jan 2026
Abstract
Infrared and visible image fusion improves the visual representation of scenes. Current deep learning-based fusion methods typically rely on either convolution operations for local feature extraction or Transformers for global feature extraction, often neglecting the contribution of multi-scale features to fusion performance. To address this limitation, we propose MRMAFusion, a nested connection model that relies on the multi-scale restoration-Transformer (Restormer) and multi-dimensional attention. We construct an encoder–decoder architecture on UNet++ network with multi-scale local and global feature extraction using convolution blocks and Restormer. Restormer can provide global dependency and more comprehensive attention to texture details of the target region along the vertical dimension, compared to extracting features by convolution operations. Along the horizontal dimension, we enhance MRMAFusion’s multi-scale feature extraction and reconstruction capability by incorporating multi-dimensional attention into the encoder’s convolutional blocks. We perform extensive experiments on the public datasets TNO, NIR and RoadScene and compare with other state-of-the-art methods for both objective and subjective evaluation. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

19 pages, 9385 KB  
Article
YOLOv11-MDD: YOLOv11 in an Encoder–Decoder Architecture for Multi-Label Post-Wildfire Damage Detection—A Case Study of the 2023 US and Canada Wildfires
by Masoomeh Gomroki, Negar Zahedi, Majid Jahangiri, Bahareh Kalantar and Husam Al-Najjar
Remote Sens. 2026, 18(2), 280; https://doi.org/10.3390/rs18020280 - 15 Jan 2026
Abstract
Natural disasters occur worldwide and cause significant financial and human losses. Wildfires are among the most important natural disasters, occurring more frequently in recent years due to global warming. Fast and accurate post-disaster damage detection could play an essential role in swift rescue planning and operations. Remote sensing (RS) data is an important source for tracking damage detection. Deep learning (DL) methods, as efficient tools, can extract valuable information from RS data to generate an accurate damage map for future operations. The present study proposes an encoder–decoder architecture composed of pre-trained Yolov11 blocks as the encoder path and Modified UNet (MUNet) blocks as the decoder path. The proposed network includes three main steps: (1) pre-processing, (2) network training, (3) prediction multilabel damage map and accuracy evaluation. To evaluate the network’s performance, the US and Canada datasets were considered. The datasets are satellite images of the 2023 wildfires in the US and Canada. The proposed method reaches the Overall Accuracy (OA) of 97.36, 97.47, and Kappa Coefficient (KC) of 0.96, 0.87 for the US and Canada 2023 wildfire datasets, respectively. Regarding the high OA and KC, an accurate final burnt map can be generated to assist in rescue and recovery efforts after the wildfire. The proposed YOLOv11–MUNet framework introduces an efficient and accurate post-event-only approach for wildfire damage detection. By overcoming the dependency on pre-event imagery and reducing model complexity, this method enhances the applicability of DL in rapid post-disaster assessment and management. Full article
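The Overall Accuracy (OA) and Kappa Coefficient (KC) reported above are both derived from a confusion matrix of the predicted versus reference damage labels. A small NumPy sketch with an illustrative 2-class matrix:

```python
import numpy as np

def oa_and_kappa(cm: np.ndarray):
    """Overall Accuracy and Cohen's kappa from a confusion matrix
    (rows = reference, columns = prediction)."""
    n = cm.sum()
    po = np.trace(cm) / n                         # observed agreement (= OA)
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2  # chance agreement
    return float(po), float((po - pe) / (1 - pe))

# Hypothetical burnt / not-burnt confusion matrix.
cm = np.array([[90, 5],
               [5, 100]])
oa, kappa = oa_and_kappa(cm)   # OA = 0.95; kappa ≈ 0.90
```

Kappa discounts agreement expected by chance, which is why it can differ substantially between the two datasets (0.96 vs. 0.87) even when OA is nearly identical.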

29 pages, 7092 KB  
Article
Dual-Branch Attention Photovoltaic Power Forecasting Model Integrating Ground-Based Cloud Image Features
by Lianglin Zou, Hongyang Quan, Jinguo He, Shuai Zhang, Ping Tang, Xiaoshi Xu and Jifeng Song
Energies 2026, 19(2), 409; https://doi.org/10.3390/en19020409 - 14 Jan 2026
Abstract
The photovoltaic field has seen significant development in recent years, with continuously expanding installation capacity and increasing grid integration. However, due to the intermittency of solar energy and meteorological variability, PV output power poses serious challenges to grid security and dispatch reliability. Traditional forecasting methods largely rely on modeling historical power and meteorological data, often neglecting the consideration of cloud movement, which constrains further improvement in prediction accuracy. To enhance prediction accuracy and model interpretability, this paper proposes a dual-branch attention-based PV power prediction model that integrates physical features from ground-based cloud images. Regarding input features, a cloud segmentation model is constructed based on the vision foundation model DINO encoder and an improved U-Net decoder to obtain cloud cover information. Based on deep feature point detection and an attention matching mechanism, cloud motion vectors are calculated to extract cloud motion speed and direction features. For feature processing, feature attention and temporal attention mechanisms are introduced, enabling the model to learn key meteorological factors and critical historical time steps. Structurally, a parallel architecture consisting of a linear branch and a nonlinear branch is adopted. A context-aware fusion module adaptively combines the prediction results from both branches, achieving collaborative modeling of linear trends and nonlinear fluctuations. Comparative experiments were conducted using two years of engineering data. Experimental results demonstrate that the proposed model outperforms the benchmarks across multiple metrics, validating the predictive advantages of the dual-branch structure that integrates physical features under complex weather conditions. Full article
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)
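The cloud-motion feature extraction above, turning matched feature points between consecutive cloud images into speed and direction inputs, reduces to averaging point displacements. A toy NumPy stand-in for the attention-based matching step (point coordinates and time step are hypothetical):

```python
import numpy as np

def motion_speed_direction(pts_t0: np.ndarray, pts_t1: np.ndarray, dt: float):
    """Mean cloud motion vector from matched (x, y) feature points in two
    frames: speed in pixels/s and direction in degrees [0, 360)."""
    v = (pts_t1 - pts_t0).mean(axis=0) / dt
    speed = float(np.hypot(*v))
    direction = float(np.degrees(np.arctan2(v[1], v[0])) % 360)
    return speed, direction

p0 = np.array([[10.0, 10.0], [20.0, 15.0]])
p1 = p0 + np.array([3.0, 4.0])   # uniform cloud drift over dt = 1 s
speed, direction = motion_speed_direction(p0, p1, dt=1.0)
```

These two scalars (plus cloud cover from the segmentation model) would then join the historical power and meteorological series as model inputs.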

14 pages, 623 KB  
Article
Improved Multisource Image-Based Diagnostic for Thyroid Cancer Detection: ANTHEM National Complementary Plan Research Project
by Domenico Parmeggiani, Alessio Cece, Massimo Agresti, Francesco Miele, Pasquale Luongo, Giancarlo Moccia, Francesco Torelli, Rossella Sperlongano, Paola Bassi, Mehrdad Savabi Far, Shima Tajabadi, Agostino Fernicola, Marina Di Domenico, Federica Colapietra, Paola Della Monica, Stefano Avenia and Ludovico Docimo
Appl. Sci. 2026, 16(2), 830; https://doi.org/10.3390/app16020830 - 13 Jan 2026
Abstract
Thyroid nodule evaluation relies heavily on ultrasound imaging, yet it suffers from significant inter-operator variability. To address this, we present a preliminary validation of the Synergy-Net platform, an AI-driven Computer-Aided Diagnosis (CAD) system designed to standardize acquisition and improve diagnostic accuracy. The system integrates a U-Net architecture for anatomical segmentation and a ResNet-50 classifier for lesion characterization within a Human-in-the-Loop (HITL) workflow. The study enrolled 110 patients (71 benign, 39 malignant) undergoing surgery. Performance was evaluated against histopathological ground truth. The system achieved an Accuracy of 90.35% (95% CI: 88.2–92.5%), Sensitivity of 90.64% (95% CI: 87.9–93.4%), and an AUC of 0.90. Furthermore, the framework introduces a multimodal approach, performing late fusion of imaging features with genomic profiles (TruSight One panel). While current results validate the 2D diagnostic pipeline, the discussion outlines the transition to the ANTHEM framework, incorporating future 3D volumetric analysis and digital pathology integration. These findings suggest that AI-assisted standardization can significantly enhance diagnostic precision, though multi-center validation remains necessary. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

24 pages, 5237 KB  
Article
DCA-UNet: A Cross-Modal Ginkgo Crown Recognition Method Based on Multi-Source Data
by Yunzhi Guo, Yang Yu, Yan Li, Mengyuan Chen, Wenwen Kong, Yunpeng Zhao and Fei Liu
Plants 2026, 15(2), 249; https://doi.org/10.3390/plants15020249 - 13 Jan 2026
Abstract
Wild ginkgo, as an endangered species, holds significant value for genetic resource conservation, yet its practical applications face numerous challenges. Traditional field surveys are inefficient in mountainous mixed forests, while satellite remote sensing is limited by spatial resolution. Current deep learning approaches relying on single-source data or merely simple multi-source fusion fail to fully exploit information, leading to suboptimal recognition performance. This study presents a multimodal ginkgo crown dataset, comprising RGB and multispectral images acquired by an UAV platform. To achieve precise crown segmentation with this data, we propose a novel dual-branch dynamic weighting fusion network, termed dual-branch cross-modal attention-enhanced UNet (DCA-UNet). We design a dual-branch encoder (DBE) with a two-stream architecture for independent feature extraction from each modality. We further develop a cross-modal interaction fusion module (CIF), employing cross-modal attention and learnable dynamic weights to boost multi-source information fusion. Additionally, we introduce an attention-enhanced decoder (AED) that combines progressive upsampling with a hybrid channel-spatial attention mechanism, thereby effectively utilizing multi-scale features and enhancing boundary semantic consistency. Evaluation on the ginkgo dataset demonstrates that DCA-UNet achieves a segmentation performance of 93.42% IoU (Intersection over Union), 96.82% PA (Pixel Accuracy), 96.38% Precision, and 96.60% F1-score. These results outperform differential feature attention fusion network (DFAFNet) by 12.19%, 6.37%, 4.62%, and 6.95%, respectively, and surpasses the single-modality baselines (RGB or multispectral) in all metrics. Superior performance on cross-flight-altitude data further validates the model’s strong generalization capability and robustness in complex scenarios. 
These results demonstrate the superiority of DCA-UNet in UAV-based multimodal ginkgo crown recognition, offering a reliable and efficient solution for monitoring wild endangered tree species. Full article
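The cross-modal interaction fusion idea described in the abstract can be illustrated with a minimal numpy sketch. This is not the paper's DCA-UNet implementation: the function names, token shapes, and the fixed fusion weights are all illustrative assumptions (in the actual CIF module the weights would be learnable parameters inside a trained network).

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feat, context_feat):
    # tokens of one modality attend over tokens of the other;
    # shapes: (n_tokens, d) for both inputs
    d = query_feat.shape[-1]
    scores = query_feat @ context_feat.T / np.sqrt(d)
    return softmax(scores) @ context_feat

def dynamic_fusion(rgb_feat, ms_feat, w_rgb=0.6, w_ms=0.4):
    # hypothetical dynamic weighting: in training, w_rgb/w_ms would be
    # learnable scalars normalized to sum to 1
    rgb_att = cross_modal_attention(rgb_feat, ms_feat)  # RGB enriched by MS
    ms_att = cross_modal_attention(ms_feat, rgb_feat)   # MS enriched by RGB
    return w_rgb * rgb_att + w_ms * ms_att

rng = np.random.default_rng(0)
rgb = rng.normal(size=(16, 32))  # 16 spatial tokens, 32 channels
ms = rng.normal(size=(16, 32))   # multispectral branch features
fused = dynamic_fusion(rgb, ms)
print(fused.shape)  # (16, 32)
```

The fused feature map keeps the per-token shape of either branch, so it can feed a shared decoder, which is the property the dual-branch design relies on.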
(This article belongs to the Special Issue Advanced Remote Sensing and AI Techniques in Agriculture and Forestry)
25 pages, 4064 KB  
Article
Application of CNN and Vision Transformer Models for Classifying Crowns in Pine Plantations Affected by Diplodia Shoot Blight
by Mingzhu Wang, Christine Stone and Angus J. Carnegie
Forests 2026, 17(1), 108; https://doi.org/10.3390/f17010108 - 13 Jan 2026
Abstract
Diplodia shoot blight is caused by an opportunistic fungal pathogen that infects many conifer species and has a global distribution. Depending on the duration and severity of the disease, affected needles appear yellow (chlorotic) for a brief period before becoming red or brown in colour. These symptoms can occur on individual branches or over the entire crown. Aerial sketch-mapping and the manual interpretation of aerial photography for tree health surveys are labour-intensive and subjective. Recently, however, the application of deep learning (DL) techniques to detect and classify tree crowns in high-spatial-resolution imagery has gained significant attention. This study evaluated two complementary DL approaches for the detection and classification of Pinus radiata trees infected with diplodia shoot blight across five geographically dispersed sites with varying topographies over two acquisition years: (1) object detection using YOLOv12 combined with the Segment Anything Model (SAM) and (2) pixel-level semantic segmentation using U-Net, SegFormer, and EVitNet. The three damage classes for the object detection approach were 'yellow', 'red-brown' (both whole-crown discolouration) and 'dead tops' (partially discoloured crowns), while for the semantic segmentation the three classes were yellow, red-brown, and background. The YOLOv12m model achieved an overall mAP50 score of 0.766 and mAP50–95 of 0.447 across all three classes, with red-brown crowns demonstrating the highest detection accuracy (mAP50: 0.918, F1 score: 0.851). For the semantic segmentation models, SegFormer showed the strongest performance (IoU of 0.662 for red-brown and 0.542 for yellow) but at the cost of the longest training time, while EVitNet offered the most cost-effective solution, achieving accuracy comparable to SegFormer with superior training efficiency thanks to its lighter architecture.
The accurate identification and classification of crown damage symptoms support the calibration and validation of satellite-based monitoring systems and assist in prioritising ground-based diagnosis or management interventions. Full article
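The per-class IoU values reported for the segmentation models can be computed in a few lines. The sketch below is a generic illustration of the metric, not the study's evaluation code; the toy label maps are invented for demonstration.

```python
import numpy as np

def per_class_iou(pred, target, n_classes):
    # pred, target: integer label maps of identical shape;
    # IoU = |intersection| / |union| per class, NaN if the class is absent
    ious = {}
    for c in range(n_classes):
        p, t = pred == c, target == c
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        ious[c] = inter / union if union else float("nan")
    return ious

# toy 2x3 label maps: 0 = background, 1 = yellow, 2 = red-brown
pred = np.array([[0, 1, 1], [2, 2, 0]])
target = np.array([[0, 1, 2], [2, 2, 0]])
ious = per_class_iou(pred, target, 3)
print(ious)  # {0: 1.0, 1: 0.5, 2: 0.666...}
```

Reporting IoU per class rather than averaged is what lets the abstract distinguish the harder 'yellow' class (0.542) from the easier 'red-brown' class (0.662).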
(This article belongs to the Section Forest Health)
20 pages, 2070 KB  
Article
Automated Detection of Normal, Atrial, and Ventricular Premature Beats from Single-Lead ECG Using Convolutional Neural Networks
by Dimitri Kraft and Peter Rumm
Sensors 2026, 26(2), 513; https://doi.org/10.3390/s26020513 - 12 Jan 2026
Abstract
Accurate detection of premature atrial contractions (PACs) and premature ventricular contractions (PVCs) in single-lead electrocardiograms (ECGs) is crucial for early identification of patients at risk for atrial fibrillation, cardiomyopathy, and other adverse outcomes. In this work, we present a fully convolutional one-dimensional U-Net that reframes beat classification as a segmentation task and directly detects normal beats, PACs, and PVCs from raw ECG signals. The architecture employs a ConvNeXt V2 encoder with simple decoder blocks and does not rely on explicit R-peak detection, handcrafted features, or fixed-length input windows. The model is trained on the Icentia11k database and an in-house single-lead ECG dataset that emphasizes challenging, noisy recordings, and is validated on the CPSC2020 database. Generalization is assessed across several benchmark and clinical datasets, including MIT-BIH Arrhythmia (ADB), MIT 11, AHA, NST, SVDB, CST STRIPS, and CPSC2020. The proposed method achieves near-perfect QRS detection (sensitivity and precision up to 0.999) and competitive PVC performance, with sensitivity ranging from 0.820 (AHA) to 0.986 (MIT 11) and precision up to 0.993 (MIT 11). PAC detection is more variable, with sensitivities between 0.539 and 0.797 and precisions between 0.751 and 0.910, yet the resulting F1-score of 0.72 on SVDB exceeds that of previously published approaches. Model interpretability is addressed using Layer-wise Gradient-weighted Class Activation Mapping (LayerGradCAM), which confirms physiologically plausible attention to QRS complexes for PVCs and to P-waves for PACs. Overall, the proposed framework provides a robust, interpretable, and hardware-efficient solution for joint PAC and PVC detection in noisy, single-lead ECG recordings, suitable for integration into Holter and wearable monitoring systems. Full article
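Reframing beat classification as segmentation, as this abstract describes, means the network emits a per-sample label mask that must be collapsed back into discrete beat events. The following numpy sketch shows one plausible post-processing step under that framing; the function, its parameters, and the toy label sequence are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

def mask_to_beats(labels, beat_class, min_len=3):
    # collapse a per-sample label mask into discrete beat locations:
    # each connected run of `beat_class` at least `min_len` samples long
    # contributes one event at its centre index
    is_beat = np.concatenate(([0], (labels == beat_class).astype(int), [0]))
    edges = np.diff(is_beat)
    starts = np.flatnonzero(edges == 1)   # inclusive run starts
    ends = np.flatnonzero(edges == -1)    # exclusive run ends
    return [(s + e - 1) // 2 for s, e in zip(starts, ends) if e - s >= min_len]

# toy mask: 0 = background, 1 = normal beat, 2 = PVC
labels = np.array([0, 0, 1, 1, 1, 0, 0, 2, 2, 2, 2, 0, 1, 0])
print(mask_to_beats(labels, 1))  # [3]  (the lone 1 at index 12 is too short)
print(mask_to_beats(labels, 2))  # [8]
```

The `min_len` threshold plays the role of noise suppression: isolated mislabelled samples never become beat events, which matters for the noisy single-lead recordings the study targets.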