- Gated Attention-Augmented Double U-Net for White Blood Cell Segmentation
- Symbolic Regression for Interpretable Camera Calibration
- GATF-PCQA: A Graph Attention Transformer Fusion Network for Point Cloud Quality Assessment
- Multi-Channel Spectro-Temporal Representations for Parkinson’s Detection
- Image Matching: Foundations, State of the Art, and Future Directions
Journal Description
Journal of Imaging is an international, multi/interdisciplinary, peer-reviewed, open access journal of imaging techniques, published online monthly by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), PubMed, PMC, dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: JCR - Q2 (Imaging Science and Photographic Technology) / CiteScore - Q1 (Radiology, Nuclear Medicine and Imaging)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 15.3 days after submission; the time from acceptance to publication is 3.5 days (median values for papers published in this journal in the first half of 2025).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 3.3 (2024); 5-Year Impact Factor: 3.3 (2024)
Latest Articles
Development of a Multispectral Image Database in Visible–Near–Infrared for Demosaicking and Machine Learning Applications
J. Imaging 2026, 12(1), 2; https://doi.org/10.3390/jimaging12010002 - 20 Dec 2025
Abstract
The use of Multispectral (MS) imaging is growing rapidly across many research fields. However, one obstacle researchers face is the limited availability of multispectral image databases, which arises from two factors: multispectral cameras are a relatively recent technology, and they are not widely available. The development of an image database is therefore crucial for research on multispectral images. This study uses two high-end visible and near-infrared MS cameras based on filter array technology, developed on the PImRob platform at the University of Burgundy, to provide a freely accessible database. The database includes high-resolution MS images of different plants and weeds, along with annotated images and masks; both the original raw images and the demosaicked images are provided. The database is intended for research on demosaicking techniques, segmentation algorithms, and deep learning for crop/weed discrimination.
Full article
(This article belongs to the Special Issue Imaging Applications in Agriculture)
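Editorial note: the abstract above centers on demosaicking raw data from filter-array MS cameras. As a rough illustration of the kind of processing such a database supports, the sketch below performs naive per-band demosaicking of a mosaic with a hypothetical 4x4 periodic filter-array layout; the actual camera layouts, band counts, and the database's own tooling are not specified here and everything below is an assumption for illustration.

```python
import numpy as np
from scipy.ndimage import convolve

def demosaick_msfa(raw, pattern):
    """Naive per-band demosaicking of a multispectral filter array (MSFA) mosaic.

    raw     : 2-D float array, the raw mosaic image.
    pattern : 2-D int array (e.g. 4x4) giving the band index of each pixel in the
              periodic MSFA tile (the layout here is hypothetical).
    Returns an (H, W, n_bands) cube via normalized-convolution interpolation.
    """
    h, w = raw.shape
    ph, pw = pattern.shape
    n_bands = int(pattern.max()) + 1
    # Band index of every pixel, obtained by tiling the periodic pattern.
    band_map = np.tile(pattern, (h // ph + 1, w // pw + 1))[:h, :w]
    kernel = np.ones((ph + 1, pw + 1), dtype=float)   # local averaging window
    cube = np.empty((h, w, n_bands), dtype=float)
    for b in range(n_bands):
        mask = (band_map == b).astype(float)
        # Normalized convolution: average of the observed samples in the window.
        num = convolve(raw * mask, kernel, mode="mirror")
        den = convolve(mask, kernel, mode="mirror")
        cube[..., b] = num / np.maximum(den, 1e-12)
    return cube

# Example with a hypothetical 4x4 MSFA tile carrying 16 spectral bands.
pattern = np.arange(16).reshape(4, 4)
raw = np.random.rand(256, 256)
cube = demosaick_msfa(raw, pattern)   # -> (256, 256, 16)
```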
Open Access Article
Non-Destructive Mangosteen Volume Estimation via Multi-View Instance Segmentation and Hybrid Geometric Modeling
by
Wattanapong Kurdthongmee, Arsanchai Sukkuea, Md Eshrat E Alahi and Qi Zeng
J. Imaging 2026, 12(1), 1; https://doi.org/10.3390/jimaging12010001 - 19 Dec 2025
Abstract
In precision agriculture, accurate, non-destructive estimation of fruit volume is crucial for quality grading, yield prediction, and post-harvest management. While vision-based methods have proven useful, fruits with complex geometry, such as mangosteen (Garcinia mangostana L.), remain difficult to handle because their large calyx is poorly captured by traditional form-modeling methods. Traditional geometric solutions such as ellipsoid approximations, diameter–height estimation, and shape-from-silhouette reconstruction often fail because the irregular calyx generates asymmetric protrusions that violate their basic form assumptions. We present a framework that combines multi-view instance segmentation with hybrid geometric feature modeling to estimate mangosteen volume from conventional 2D imaging. A You Only Look Once (YOLO)-based segmentation model was employed to explicitly separate the fruit body from the calyx. Calyx inclusion resulted in dense geometric noise and reduced model performance ( ). We trained eight regression models on a curated and augmented 900-image dataset ( , test ). The models included single-view and multi-view geometric regressors ( ), polynomial hybrid configurations, ellipsoid-based approximations, and hybrid feature formulations. Multi-view models consistently outperformed single-view models, and the average predictive accuracy improved from to . The best model is a hybrid linear regression model combining side- and bottom-area features ( , ) with ellipsoid-derived volume estimation ( ), which achieved , a Mean Absolute Percentage Error (MAPE) of 16.04%, and a Root Mean Square Error (RMSE) of 31.9 on the test set. These results establish the proposed approach as a low-cost, interpretable, and flexible solution for real-time fruit volume estimation, ready for incorporation into automated sorting and grading systems in post-harvest processing pipelines.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
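Editorial note: as a hedged illustration of the hybrid idea described above (view-area features combined with an ellipsoid volume term in a linear regression), here is a minimal sketch with made-up numbers; the actual feature set, units, dataset, and trained coefficients come from the paper and are not reproduced.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def ellipsoid_volume(width, depth, height):
    """Ellipsoid approximation V = (pi/6) * width * depth * height (full diameters)."""
    return np.pi / 6.0 * width * depth * height

# Hypothetical per-fruit features: projected areas (cm^2) from side and bottom
# views plus an ellipsoid-derived volume estimate (cm^3), with toy reference volumes.
side_area = np.array([28.0, 31.5, 25.2, 35.1])
bottom_area = np.array([26.3, 30.0, 24.1, 33.8])
vol_ellipsoid = ellipsoid_volume(5.2, 5.0, 4.8) * np.array([0.90, 1.05, 0.82, 1.15])
y_true = np.array([118.0, 139.0, 102.0, 158.0])   # e.g. water-displacement volumes

X = np.column_stack([side_area, bottom_area, vol_ellipsoid])
model = LinearRegression().fit(X, y_true)         # hybrid linear regression
pred = model.predict(X)
mape = np.mean(np.abs(pred - y_true) / y_true) * 100
print(f"MAPE on the toy data: {mape:.2f}%")       # trivially low on 4 toy samples
```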
Open Access Review
A Structured Review and Quantitative Profiling of Public Brain MRI Datasets for Foundation Model Development
by
Minh Sao Khue Luu, Margaret V. Benedichuk, Ekaterina I. Roppert, Roman M. Kenzhin and Bair N. Tuchinov
J. Imaging 2025, 11(12), 454; https://doi.org/10.3390/jimaging11120454 - 18 Dec 2025
Abstract
The development of foundation models for brain MRI depends critically on the scale, diversity, and consistency of available data, yet systematic assessments of these factors remain scarce. In this study, we analyze 54 publicly accessible brain MRI datasets encompassing over 538,031 scans to provide a structured, multi-level overview tailored to foundation model development. At the dataset level, we characterize modality composition, disease coverage, and dataset scale, revealing strong imbalances between large healthy cohorts and smaller clinical populations. At the image level, we quantify voxel spacing, orientation, and intensity distributions across 14 representative datasets, demonstrating substantial heterogeneity that can influence representation learning. We then perform a quantitative evaluation of preprocessing variability, examining how intensity normalization, bias field correction, skull stripping, spatial registration, and interpolation alter voxel statistics and geometry. While these steps improve within-dataset consistency, residual differences persist between datasets. Finally, a feature-space case study using a 3D DenseNet121 shows measurable residual covariate shift after standardized preprocessing, confirming that harmonization alone cannot eliminate inter-dataset bias. Together, these analyses provide a unified characterization of variability in public brain MRI resources and emphasize the need for preprocessing-aware and domain-adaptive strategies in the design of generalizable brain MRI foundation models.
Full article
(This article belongs to the Special Issue Self-Supervised Learning and Multimodal Foundation Models for AI-Driven Medical Imaging)
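Editorial note: a minimal sketch of the image-level profiling described above (voxel spacing, orientation, and intensity statistics), assuming scans are available as NIfTI files readable with nibabel; the datasets, file names, and exact statistics used in the review are not specified here.

```python
import numpy as np
import nibabel as nib   # common NIfTI reader; assumes .nii/.nii.gz inputs

def profile_scan(path):
    """Return voxel spacing, orientation code and intensity statistics of one scan."""
    img = nib.load(path)
    spacing = img.header.get_zooms()[:3]            # voxel size in mm
    orientation = nib.aff2axcodes(img.affine)       # e.g. ('R', 'A', 'S')
    data = np.asanyarray(img.dataobj).astype(np.float32)
    return {
        "spacing_mm": tuple(float(s) for s in spacing),
        "orientation": orientation,
        "mean": float(data.mean()),
        "p01": float(np.percentile(data, 1)),
        "p99": float(np.percentile(data, 99)),
    }

# print(profile_scan("sub-01_T1w.nii.gz"))   # hypothetical file name
```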
Open Access Article
Salient Object Detection in Optical Remote Sensing Images Based on Hierarchical Semantic Interaction
by
Jingfan Xu, Qi Zhang, Jinwen Xing, Mingquan Zhou and Guohua Geng
J. Imaging 2025, 11(12), 453; https://doi.org/10.3390/jimaging11120453 - 17 Dec 2025
Abstract
Existing salient object detection methods for optical remote sensing images still face certain limitations due to complex background variations, significant scale discrepancies among targets, severe background interference, and diverse topological structures. On the one hand, the feature transmission process often neglects the constraints and complementary effects of high-level features on low-level features, leading to insufficient feature interaction and weakened model representation. On the other hand, decoder architectures generally rely on simple cascaded structures, which fail to adequately exploit and utilize contextual information. To address these challenges, this study proposes a Hierarchical Semantic Interaction Module to enhance salient object detection performance in optical remote sensing scenarios. The module introduces foreground content modeling and a hierarchical semantic interaction mechanism within a multi-scale feature space, reinforcing the synergy and complementarity among features at different levels. This effectively highlights multi-scale and multi-type salient regions in complex backgrounds. Extensive experiments on multiple optical remote sensing datasets demonstrate the effectiveness of the proposed method. Specifically, on the EORSSD dataset, our full model integrating both CA and PA modules improves the max F-measure from 0.8826 to 0.9100 (↑2.74%), increases maxE from 0.9603 to 0.9727 (↑1.24%), and enhances the S-measure from 0.9026 to 0.9295 (↑2.69%) compared with the baseline. These results clearly demonstrate the effectiveness of the proposed modules and verify the robustness and strong generalization capability of our method in complex remote sensing scenarios.
Full article
(This article belongs to the Special Issue AI-Driven Remote Sensing Image Processing and Pattern Recognition)
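Editorial note: the max F-measure reported above is conventionally computed by sweeping a binarization threshold over the predicted saliency map; a minimal sketch follows, using the weighting beta^2 = 0.3 that is customary in salient object detection (an assumption, since the paper's exact evaluation code is not given).

```python
import numpy as np

def max_f_measure(pred, gt, beta2=0.3, n_thresholds=255):
    """Max F-measure over thresholds for a saliency map.

    pred : float saliency map in [0, 1];  gt : binary ground-truth mask.
    """
    gt = gt.astype(bool)
    best = 0.0
    for t in np.linspace(0.0, 1.0, n_thresholds):
        binary = pred >= t
        tp = np.logical_and(binary, gt).sum()
        precision = tp / max(binary.sum(), 1)
        recall = tp / max(gt.sum(), 1)
        if precision + recall == 0:
            continue
        f = (1 + beta2) * precision * recall / (beta2 * precision + recall)
        best = max(best, f)
    return best
```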
Open Access Article
SRE-FMaps: A Sinkhorn-Regularized Elastic Functional Map Framework for Non-Isometric 3D Shape Matching
by
Dan Zhang, Yue Zhang, Ning Wang and Dong Zhao
J. Imaging 2025, 11(12), 452; https://doi.org/10.3390/jimaging11120452 - 16 Dec 2025
Abstract
Precise 3D shape correspondence is a fundamental prerequisite for critical applications ranging from medical anatomical modeling to visual recognition. However, non-isometric 3D shape matching remains a challenging task due to the limited sensitivity of traditional Laplace–Beltrami (LB) bases to local geometric deformations such as stretching and bending. To address these limitations, this paper proposes a Sinkhorn-Regularized Elastic Functional Map framework (SRE-FMaps) that integrates entropy-regularized optimal transport with an elastic thin-shell energy basis. First, a sparse Sinkhorn transport plan is adopted to initialize a bijective correspondence with linear computational complexity. Then, a non-orthogonal elastic basis, derived from the Hessian of thin-shell deformation energy, is introduced to enhance high-frequency feature perception. Finally, correspondence stability is quantified through a cosine-based elastic distance metric, enabling retrieval and classification. Experiments on the SHREC2015, McGill, and Face datasets demonstrate that SRE-FMaps reduces the correspondence error by a maximum of 32% and achieves an average of 92.3% classification accuracy (with a peak of 94.74% on the Face dataset). Moreover, the framework exhibits superior robustness, yielding a recall of up to 91.67% and an F1-score of 0.94, effectively handling bending, stretching, and folding deformations compared with conventional LB-based functional map pipelines. The proposed framework provides a scalable solution for non-isometric shape correspondence in medical modeling, 3D reconstruction, and visual recognition.
Full article
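Editorial note: the framework above initializes correspondences with an entropy-regularized (Sinkhorn) transport plan. A generic, self-contained sketch of Sinkhorn iterations between two uniformly weighted descriptor sets is shown below; the paper's sparse variant, cost construction, and elastic basis are not reproduced, and the regularization and iteration count are arbitrary placeholders.

```python
import numpy as np

def sinkhorn_plan(cost, eps=0.05, n_iter=200):
    """Entropy-regularized optimal transport (Sinkhorn) with uniform marginals.

    cost : (n, m) pairwise cost matrix between descriptors of the two shapes.
    Returns the (n, m) transport plan; argmax over rows gives a point-to-point map.
    """
    n, m = cost.shape
    K = np.exp(-cost / eps)                                # Gibbs kernel
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)        # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v + 1e-300)                           # scale rows
        v = b / (K.T @ u + 1e-300)                         # scale columns
    return u[:, None] * K * v[None, :]

# plan = sinkhorn_plan(cost); correspondence = plan.argmax(axis=1)
```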
Open Access Article
Application of Generative Adversarial Networks to Improve COVID-19 Classification on Ultrasound Images
by
Pedro Sérgio Tôrres Figueiredo Silva, Antonio Mauricio Ferreira Leite Miranda de Sá, Wagner Coelho de Albuquerque Pereira, Leonardo Bonato Felix and José Manoel de Seixas
J. Imaging 2025, 11(12), 451; https://doi.org/10.3390/jimaging11120451 - 15 Dec 2025
Abstract
COVID-19 screening is crucial for the early diagnosis and treatment of the disease, with lung ultrasound posing as a cost-effective alternative to other imaging techniques. Given the dependency on medical expertise and experience to accurately identify patterns in ultrasound exams, deep learning techniques have been explored for automatically classifying patients’ conditions. However, the limited availability of public medical databases remains a significant obstacle to the development of more advanced models. To address the data scarcity problem, this study proposes a method that leverages generative adversarial networks (GANs) to generate synthetic lung ultrasound images, which are subsequently used to train frame-based classification models. Two types of GANs are considered: Wasserstein GANs (WGAN) and Pix2Pix. Specific tools are used to show that the synthetic data produced present a distribution close to the original data. The classification models trained with synthetic data achieved a peak accuracy of 96.32% ± 4.17%, significantly outperforming the maximum accuracy of 82.69% ± 10.42% obtained when training only with the original data. Furthermore, the best results are comparable to, and in some cases surpass, those reported in recent related studies.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Applying Radiomics to Predict Outcomes in Patients with High-Grade Retroperitoneal Sarcoma Treated with Preoperative Radiotherapy
by
Adel Shahnam, Nicholas Hardcastle, David E. Gyorki, Katrina M. Ingley, Krystel Tran, Catherine Mitchell, Sarat Chander, Julie Chu, Michael Henderson, Alan Herschtal, Mathias Bressel and Jeremy Lewin
J. Imaging 2025, 11(12), 450; https://doi.org/10.3390/jimaging11120450 - 15 Dec 2025
Abstract
Retroperitoneal sarcomas (RPS) are rare tumours, primarily treated with surgical resection. However, recurrences are frequent. Combining clinical factors with CT-derived radiomic features could enhance treatment stratification and personalization. This study aims to assess whether radiomic features provide additional prognostic value beyond clinicopathological features in patients with high-risk RPS treated with preoperative radiotherapy. This retrospective study included patients aged 18 or older with non-recurrent and non-metastatic RPS treated with preoperative radiotherapy between 2008 and 2016. Hazard ratios (HR) were calculated using Cox proportional hazards regression to assess the impact of clinical and radiomic features on time-to-event outcomes. Predictive accuracy was assessed with c-statistics. Radiomic analysis was performed on the high-risk group (undifferentiated pleomorphic sarcoma, well-differentiated/de-differentiated liposarcoma or grade 2/3 leiomyosarcoma). Seventy-two patients were included; with a median follow-up of 3.7 years, the 5-year overall survival (OS) was 67%. Multivariable analysis showed older age (HR: 1.3 per 5-year increase, p = 0.04), grade 3 (HR: 180.3, p = 0.02), and larger tumours (HR: 4.0 per 10 cm increase, p = 0.02) predicted worse OS. In the higher-risk group, the c-statistic for the clinical model was 0.59 (time to distant metastasis (TDM)) and 0.56 (OS). Among 27 radiomic features, kurtosis improved OS prediction (c-statistic 0.69, p = 0.013), and Neighbourhood Gray-Tone Difference Matrix (NGTDM) busyness improved it to 0.73 (p = 0.036). Kurtosis also improved TDM prediction (c-statistic 0.72, p = 0.023). Radiomic features may complement clinicopathological factors in predicting overall survival and time to distant metastasis in high-risk retroperitoneal sarcoma. These exploratory findings warrant validation in larger, multi-institutional studies.
Full article
(This article belongs to the Section Medical Imaging)
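Editorial note: a minimal sketch of the modelling step described above, fitting a Cox proportional hazards model and reading off the concordance (c-statistic) with the lifelines package; the column names and randomly generated toy cohort below are placeholders, not the study's data.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 80                                              # hypothetical cohort size
df = pd.DataFrame({
    "age":      rng.normal(60, 10, n),
    "size_cm":  rng.normal(15, 5, n),
    "kurtosis": rng.normal(3.0, 0.6, n),            # example radiomic feature
})
# Toy survival times loosely linked to the covariates, plus random censoring.
risk = 0.03 * df["age"] + 0.1 * df["size_cm"] + 0.5 * df["kurtosis"]
df["time"] = rng.exponential(1.0 / np.exp(risk - risk.mean()))
df["event"] = rng.integers(0, 2, n)                 # 1 = event observed, 0 = censored

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
print(cph.summary[["exp(coef)", "p"]])              # hazard ratios and p-values
print("c-statistic:", cph.concordance_index_)       # predictive accuracy, as reported above
```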
Open Access Article
Sensory Representation of Neural Networks Using Sound and Color for Medical Imaging Segmentation
by
Irenel Lopo Da Silva, Nicolas Francisco Lori and José Manuel Ferreira Machado
J. Imaging 2025, 11(12), 449; https://doi.org/10.3390/jimaging11120449 - 15 Dec 2025
Abstract
This paper introduces a novel framework for sensory representation of brain imaging data, combining deep learning-based segmentation with multimodal visual and auditory outputs. Structural magnetic resonance imaging (MRI) predictions are converted into color-coded maps and stereophonic/MIDI sonifications, enabling intuitive interpretation of cortical activation patterns. High-precision U-Net models efficiently generate these outputs, supporting clinical decision-making, cognitive research, and creative applications. Spatial, intensity, and anomalous features are encoded into perceivable visual and auditory cues, facilitating early detection and introducing the concept of “auditory biomarkers” for potential pathological identification. Despite current limitations, including dataset size, absence of clinical validation, and heuristic-based sonification, the pipeline demonstrates technical feasibility and robustness. Future work will focus on clinical user studies, the application of functional MRI (fMRI) time-series for dynamic sonification, and the integration of real-time emotional feedback in cinematic contexts. This multisensory approach offers a promising avenue for enhancing the interpretability of complex neuroimaging data across medical, research, and artistic domains.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Dual-Path Convolutional Neural Network with Squeeze-and-Excitation Attention for Lung and Colon Histopathology Classification
by
Helala AlShehri
J. Imaging 2025, 11(12), 448; https://doi.org/10.3390/jimaging11120448 - 14 Dec 2025
Abstract
Lung and colon cancers remain among the leading causes of cancer-related mortality worldwide, underscoring the need for rapid and accurate histopathological diagnosis. Manual examination of biopsy slides is often time-consuming and prone to inter-observer variability, which highlights the importance of developing reliable and explainable automated diagnostic systems. This study presents DPCSE-Net, a lightweight dual-path convolutional neural network enhanced with a squeeze-and-excitation (SE) attention mechanism for lung and colon cancer classification. The dual-path structure captures both fine-grained cellular textures and global contextual information through multiscale feature extraction, while the SE attention module adaptively recalibrates channel responses to emphasize discriminative features. To enhance transparency and interpretability, Gradient-weighted Class Activation Mapping (Grad-CAM), attention heatmaps, and Integrated Gradients are employed to visualize class-specific activation patterns and verify that the model’s focus aligns with diagnostically relevant tissue regions. Evaluated on the publicly available LC25000 dataset, DPCSE-Net achieved state-of-the-art performance with 99.88% accuracy and F1-score, while maintaining low computational complexity. Ablation experiments confirmed the contribution of the dual-path design and SE module, and qualitative analyses demonstrated the model’s strong interpretability. These results establish DPCSE-Net as an accurate, efficient, and explainable framework for computer-aided histopathological diagnosis, supporting the broader goals of explainable AI in computer vision.
Full article
(This article belongs to the Special Issue Explainable AI in Computer Vision)
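Editorial note: the squeeze-and-excitation (SE) channel recalibration used in DPCSE-Net follows the standard SE design; a generic PyTorch sketch of such a block is shown below. The reduction ratio and shapes are arbitrary illustrations, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard squeeze-and-excitation channel attention."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                            # x: (N, C, H, W)
        w = x.mean(dim=(2, 3))                       # squeeze: global average pooling
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)   # excitation: per-channel weights
        return x * w                                 # recalibrate feature channels

# se = SEBlock(64); y = se(torch.randn(2, 64, 32, 32))
```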
Open Access Article
Enhanced Object Detection Algorithms in Complex Environments via Improved CycleGAN Data Augmentation and AS-YOLO Framework
by
Zhen Li, Yuxuan Wang, Lingzhong Meng, Wenjuan Chu and Guang Yang
J. Imaging 2025, 11(12), 447; https://doi.org/10.3390/jimaging11120447 - 12 Dec 2025
Abstract
Object detection in complex environments, such as challenging lighting conditions, adverse weather, and target occlusions, poses significant difficulties for existing algorithms. To address these challenges, this study introduces a collaborative solution integrating improved CycleGAN-based data augmentation and an enhanced object detection framework, AS-YOLO. The improved CycleGAN incorporates a dual self-attention mechanism and spectral normalization to enhance feature capture and training stability. The AS-YOLO framework integrates a channel–spatial parallel attention mechanism, an AFPN structure for improved feature fusion, and the Inner_IoU loss function for better generalization. The experimental results show that compared with YOLOv8n, mAP@0.5 and mAP@0.95 of the AS-YOLO algorithm have increased by 1.5% and 0.6%, respectively. After data augmentation and style transfer, mAP@0.5 and mAP@0.95 have increased by 14.6% and 17.8%, respectively, demonstrating the effectiveness of the proposed method in improving the performance of the model in complex scenarios.
Full article
(This article belongs to the Special Issue Advances in Machine Learning for Computer Vision Applications)
Open Access Article
Pixel-Wise Sky-Obstacle Segmentation in Fisheye Imagery Using Deep Learning and Gradient Boosting
by
Némo Bouillon and Vincent Boitier
J. Imaging 2025, 11(12), 446; https://doi.org/10.3390/jimaging11120446 - 12 Dec 2025
Abstract
Accurate sky–obstacle segmentation in hemispherical fisheye imagery is essential for solar irradiance forecasting, photovoltaic system design, and environmental monitoring. However, existing methods often rely on expensive all-sky imagers and region-specific training data, produce coarse sky–obstacle boundaries, and ignore the optical properties of fisheye lenses. We propose a low-cost segmentation framework designed for fisheye imagery that combines synthetic data generation, lens-aware augmentation, and a hybrid deep-learning pipeline. Synthetic fisheye training images are created from publicly available street-view panoramas to cover diverse environments without dedicated hardware, and lens-aware augmentations model fisheye projection and photometric effects to improve robustness across devices. On this dataset, we train a convolutional neural network (CNN) and refine its output with gradient-boosted decision trees (GBDT) to sharpen sky–obstacle boundaries. The method is evaluated on real fisheye images captured with smartphones and low-cost clip-on lenses across multiple sites, achieving an Intersection over Union (IoU) of 96.63% and an F1 score of 98.29%, along with high boundary accuracy. An additional evaluation on an external panoramic baseline dataset confirms strong cross-dataset generalization. Together, these results show that the proposed framework enables accurate, low-cost, and widely deployable hemispherical sky segmentation for practical solar and environmental imaging applications.
Full article
(This article belongs to the Section AI in Imaging)
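Editorial note: a hedged sketch of the refinement idea described above, where per-pixel CNN probabilities plus simple local cues are re-classified by gradient-boosted trees; the actual CNN, feature set, and GBDT implementation of the paper are not reproduced, and scikit-learn's GradientBoostingClassifier is used here only as a stand-in.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def refine_with_gbdt(cnn_prob, image, labels):
    """Refine a coarse sky-probability map with a pixel-wise GBDT.

    cnn_prob : (H, W) CNN sky probability;  image : (H, W, 3) RGB;
    labels   : (H, W) binary ground truth used to fit the refiner.
    """
    h, w = cnn_prob.shape
    feats = np.column_stack([
        cnn_prob.ravel(),
        image.reshape(-1, 3).astype(float),        # raw colour as extra per-pixel cues
    ])
    gbdt = GradientBoostingClassifier(n_estimators=100, max_depth=3)
    gbdt.fit(feats, labels.ravel())
    return gbdt.predict_proba(feats)[:, 1].reshape(h, w)

# refined = refine_with_gbdt(cnn_prob, rgb_image, gt_mask)
```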
Open Access Article
Research on Augmentation of Wood Microscopic Image Dataset Based on Generative Adversarial Networks
by
Shuo Xu, Hang Su and Lei Zhao
J. Imaging 2025, 11(12), 445; https://doi.org/10.3390/jimaging11120445 - 12 Dec 2025
Abstract
Microscopic wood images are vital in wood analysis and classification research. However, the high cost of acquiring microscopic images and the limitations of experimental conditions have led to a severe problem of insufficient sample data, which significantly restricts the training performance and generalization ability of deep learning models. This study first used basic image processing techniques to perform preliminary augmentation of the original dataset. The augmented data were then input into five GAN models, BGAN, DCGAN, WGAN-GP, LSGAN, and StyleGAN2, for training. The quality of the images generated by each model was assessed by analyzing the fidelity of cellular structures (e.g., earlywood, latewood, and wood rays), image clarity, and image diversity, as well as with the KID, IS, and SSIM metrics. The results showed that images generated by BGAN and WGAN-GP exhibited high quality, with lower KID values and higher IS values, and the generated images were visually close to real images. In contrast, the DCGAN, LSGAN, and StyleGAN2 models experienced mode collapse during training, resulting in lower image clarity and diversity compared to the other models. Through a comparative analysis of different GAN models, this study demonstrates the feasibility and effectiveness of Generative Adversarial Networks in the domain of small-sample image data augmentation, providing an important reference for further research in the field of wood identification.
Full article
(This article belongs to the Section Image and Video Processing)
Open Access Article
AI-Driven Clinical Decision Support System for Automated Ventriculomegaly Classification from Fetal Brain MRI
by
Mannam Subbarao, Simi Surendran, Seena Thomas, Hemanth Lakshman, Vinjanampati Goutham, Keshagani Goud and Suhas Udayakumaran
J. Imaging 2025, 11(12), 444; https://doi.org/10.3390/jimaging11120444 - 12 Dec 2025
Abstract
Fetal ventriculomegaly (VM) is a condition characterized by abnormal enlargement of the cerebral ventricles of the fetal brain that often causes developmental disorders in children. Manual segmentation and classification of ventricular structures from brain MRI scans are time-consuming and require clinical expertise. To address this challenge, we develop an automated pipeline for ventricle segmentation, ventricular width estimation, and VM severity classification using a publicly available dataset. An adaptive slice selection strategy converts 3D MRI volumes into the most informative 2D slices, which are then segmented to isolate the lateral ventricles and deep gray matter. Ventricular width is automatically estimated to assign severity levels based on clinical thresholds, generating labeled data for training a deep learning classifier. Finally, an explainability module using a large language model integrates the MRI slices, segmentation masks, and predicted severity to provide interpretable clinical reasoning. Experimental results demonstrate that the proposed decision support system delivers robust performance, achieving Dice scores of 89% and 87.5% for the 2D and 3D segmentation models, respectively. Also, the classification network attains an accuracy of 86% and an F1-score of 0.84 in VM analysis.
Full article
(This article belongs to the Section AI in Imaging)
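Editorial note: as a hedged illustration of the width-to-severity step described above, the snippet below applies commonly cited ventriculomegaly thresholds (around 10, 12, and 15 mm); the thresholds actually used in the paper and its width-measurement procedure are not reproduced here.

```python
def vm_severity(atrial_width_mm: float) -> str:
    """Map a measured ventricular (atrial) width to a severity label.

    Thresholds follow commonly cited clinical conventions (an assumption, not the
    paper's values): <10 mm normal, 10-12 mild, 12-15 moderate, >15 mm severe.
    """
    if atrial_width_mm < 10.0:
        return "normal"
    if atrial_width_mm <= 12.0:
        return "mild"
    if atrial_width_mm <= 15.0:
        return "moderate"
    return "severe"

# Example: vm_severity(13.4) -> "moderate"
```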
Open Access Article
VMPANet: Vision Mamba Skin Lesion Image Segmentation Model Based on Prompt and Attention Mechanism Fusion
by
Zinuo Peng, Shuxian Liu and Chenhao Li
J. Imaging 2025, 11(12), 443; https://doi.org/10.3390/jimaging11120443 - 11 Dec 2025
Abstract
In the realm of medical image processing, the segmentation of dermatological lesions is a pivotal technique for the early detection of skin cancer. However, existing methods for segmenting images of skin lesions often encounter limitations when dealing with intricate boundaries and diverse lesion shapes. To address these challenges, we propose VMPANet, designed to accurately localize critical targets and capture edge structures. VMPANet employs an inverted pyramid convolution to extract multi-scale features while utilizing the visual Mamba module to capture long-range dependencies among image features. Additionally, we leverage previously extracted masks as cues to facilitate efficient feature propagation. Furthermore, VMPANet integrates parallel depthwise separable convolutions to enhance feature extraction and introduces innovative mechanisms for edge enhancement, spatial attention, and channel attention to adaptively extract edge information and complex spatial relationships. Notably, VMPANet refines a novel cross-attention mechanism, which effectively facilitates the interaction between deep semantic cues and shallow texture details, thereby generating comprehensive feature representations while reducing computational load and redundancy. We conducted comparative and ablation experiments on two public skin lesion datasets (ISIC2017 and ISIC2018). The results demonstrate that VMPANet outperforms existing mainstream methods. On the ISIC2017 dataset, its mIoU and DSC metrics are 1.38% and 0.83% higher than those of VM-Unet respectively; on the ISIC2018 dataset, these metrics are 1.10% and 0.67% higher than those of EMCAD, respectively. Moreover, VMPANet boasts a parameter count of only 0.383 M and a computational load of 1.159 GFLOPs.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
HDR Merging of RAW Exposure Series for All-Sky Cameras: A Comparative Study for Circumsolar Radiometry
by
Paul Matteschk, Max Aragón, Jose Gomez, Jacob K. Thorning, Stefanie Meilinger and Sebastian Houben
J. Imaging 2025, 11(12), 442; https://doi.org/10.3390/jimaging11120442 - 11 Dec 2025
Abstract
All-sky imagers (ASIs) used in solar energy meteorology face an extreme intra-image dynamic range, with the circumsolar neighborhood orders of magnitude brighter than the diffuse dome. Many operational ASI pipelines address this gap with high-dynamic-range (HDR) bracketing inside the camera’s image signal processor (ISP), i.e., after demosaicing and color processing in a nonlinear 8-bit RGB domain. Near the Sun, such ISP-domain HDR can down-weight the shortest exposure, retain clipped or near-clipped samples from longer frames, and compress highlight contrast, thereby increasing circumsolar saturation and flattening aureole gradients. A radiance-linear HDR fusion in the sensor/RAW domain (RAW–HDR) is therefore contrasted with the vendor ISP-based HDR mode (ISP–HDR). Solar-based geometric calibration enables Sun-centered analysis. Paired, interleaved acquisitions under clear-sky and broken-cloud conditions are evaluated using two circumsolar performance criteria per RGB channel: (i) saturated-area fraction in concentric rings and (ii) a median-based radial gradient in defined arcs. All quantitative analyses operate on the radiance-linear HDR result; post-merge tone mapping is only used for visualization. Across conditions, ISP–HDR exhibits roughly double the near-saturation within 0– of the Sun and about a three- to fourfold weaker circumsolar radial gradient within 0– relative to RAW–HDR. These findings indicate that radiance-linear fusion in the RAW domain better preserves circumsolar structure than the examined ISP-domain HDR mode and thus provides more suitable input for downstream tasks such as cloud–edge detection, aerosol retrieval, and irradiance estimation.
Full article
(This article belongs to the Special Issue Techniques and Applications of Sky Imagers)
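Editorial note: a minimal sketch of radiance-linear HDR fusion in the RAW domain as contrasted above: each exposure is divided by its exposure time and saturated samples are excluded from a weighted average. Black-level handling, the weighting function, and the saturation level below are simplified assumptions, not the paper's merge.

```python
import numpy as np

def merge_raw_hdr(raw_frames, exposure_times, saturation=0.95, black_level=0.0):
    """Radiance-linear merge of a RAW exposure series.

    raw_frames     : list of (H, W) arrays, linear RAW data scaled to [0, 1].
    exposure_times : matching list of exposure times in seconds.
    Clipped or near-clipped samples are excluded; the rest are averaged in
    radiance units (DN per second), weighted by exposure time.
    """
    num = np.zeros_like(raw_frames[0], dtype=np.float64)
    den = np.zeros_like(raw_frames[0], dtype=np.float64)
    for raw, t in zip(raw_frames, exposure_times):
        valid = raw < saturation                 # drop clipped/near-clipped pixels
        radiance = (raw - black_level) / t       # convert to a radiance-linear scale
        w = valid * t                            # longer exposures carry less noise
        num += w * radiance
        den += w
    # Pixels saturated in every frame fall back to the shortest exposure.
    shortest = int(np.argmin(exposure_times))
    fallback = (raw_frames[shortest] - black_level) / exposure_times[shortest]
    return np.where(den > 0, num / np.maximum(den, 1e-12), fallback)
```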
Open Access Article
WaveletHSI: Direct HSI Classification from Compressed Wavelet Coefficients via Sub-Band Feature Extraction and Fusion
by
Xin Li and Baile Sun
J. Imaging 2025, 11(12), 441; https://doi.org/10.3390/jimaging11120441 - 10 Dec 2025
Abstract
A major computational bottleneck in classifying large-scale hyperspectral images (HSI) is the mandatory data decompression prior to processing. Compressed-domain computing offers a solution by enabling deep learning on partially compressed data. However, existing compressed-domain methods are predominantly tailored for the Discrete Cosine Transform (DCT) used in natural images, while HSIs are typically compressed using the Discrete Wavelet Transform (DWT). The fundamental structural mismatch between the block-based DCT and the hierarchical DWT sub-bands presents two core challenges: how to extract features from multiple wavelet sub-bands, and how to fuse these features effectively? To address these issues, we propose a novel framework that extracts and fuses features from different DWT sub-bands directly. We design a multi-branch feature extractor with sub-band feature alignment loss that processes functionally different sub-bands in parallel, preserving the independence of each frequency feature. We then employ a sub-band cross-attention mechanism that inverts the typical attention paradigm by using the sparse, high-frequency detail sub-bands as queries to adaptively select and enhance salient features from the dense, information-rich low-frequency sub-bands. This enables a targeted fusion of global context and fine-grained structural information without data reconstruction. Experiments on three benchmark datasets demonstrate that our method achieves classification accuracy comparable to state-of-the-art spatial-domain approaches while eliminating at least 56% of the decompression overhead.
Full article
(This article belongs to the Special Issue Multispectral and Hyperspectral Imaging: Progress and Challenges)
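Editorial note: to make the sub-band structure discussed above concrete, the snippet below extracts DWT sub-bands from a single band of a hyperspectral cube with PyWavelets; the wavelet, decomposition level, and any codec-specific packing used in the actual compression pipeline are assumptions.

```python
import numpy as np
import pywt

# Hypothetical single spectral band of an HSI cube.
band = np.random.rand(128, 128)

# Two-level 2-D DWT: coeffs = [cA2, (cH2, cV2, cD2), (cH1, cV1, cD1)]
coeffs = pywt.wavedec2(band, wavelet="db2", level=2)
low_freq = coeffs[0]                      # dense, information-rich approximation
detail_level2 = coeffs[1]                 # sparse high-frequency detail sub-bands
print(low_freq.shape, [d.shape for d in detail_level2])

# In the framework above, detail sub-bands serve as queries that attend to the
# low-frequency sub-band, so classification can run without full reconstruction.
```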
Open Access Article
Hybrid Multi-Scale Neural Network with Attention-Based Fusion for Fruit Crop Disease Identification
by
Shakhmaran Seilov, Akniyet Nurzhaubayev, Marat Baideldinov, Bibinur Zhursinbek, Medet Ashimgaliyev and Ainur Zhumadillayeva
J. Imaging 2025, 11(12), 440; https://doi.org/10.3390/jimaging11120440 - 10 Dec 2025
Abstract
Undetected fruit crop diseases are a major threat to agricultural productivity worldwide and frequently cause farmers to suffer large financial losses. Manual field inspection-based disease detection techniques are time-consuming, unreliable, and unsuitable for extensive monitoring. Deep learning approaches, in particular convolutional neural networks, have shown promise for automated plant disease identification, although they still face significant obstacles. These include poor generalization across complex visual backgrounds, limited robustness to varying symptom scales, and high computational demands that make deployment on resource-constrained edge devices difficult. To overcome these drawbacks, we propose a Hybrid Multi-Scale Neural Network (HMCT-AF with GSAF) architecture for accurate and efficient fruit crop disease identification. HMCT-AF combines a Vision Transformer-based structural branch, which captures long-range dependencies and high-level contextual patterns, with multi-scale convolutional branches that extract fine-grained local information. These complementary features are adaptively fused by the GSAF module, which enhances model interpretability and classification performance. We conduct evaluations on both PlantVillage (controlled environment) and CLD (real-world in-field conditions), observing consistent performance gains that indicate strong resilience to natural lighting variations and background complexity. With an accuracy of up to 93.79%, HMCT-AF with GSAF outperforms vanilla Transformer models, EfficientNet, and traditional CNNs. These findings demonstrate that the model captures scale-variant disease symptoms well and can support real-time agricultural applications on edge-compatible hardware. HMCT-AF with GSAF thus presents a viable basis for intelligent, scalable plant disease monitoring systems in contemporary precision farming.
Full article
(This article belongs to the Special Issue Computer Vision for Food Data Analysis: Methods, Challenges, and Applications)
Open Access Article
Application of Artificial Intelligence and Computer Vision for Measuring and Counting Oysters
by
Julio Antonio Laria Pino, Jesús David Terán Villanueva, Julio Laria Menchaca, Leobardo Garcia Solorio, Salvador Ibarra Martínez, Mirna Patricia Ponce Flores and Aurelio Alejandro Santiago Pineda
J. Imaging 2025, 11(12), 439; https://doi.org/10.3390/jimaging11120439 - 10 Dec 2025
Abstract
One of the most important activities in any oyster farm is the measurement of oyster size; this activity is time-consuming and conducted manually, generally using a caliper, which leads to high measurement variability. This paper proposes a methodology to count and obtain the length and width averages of a sample of oysters from an image, relying on artificial intelligence (AI), which refers to systems capable of learning and decision-making, and computer vision (CV), which enables the extraction of information from digital images. The proposed approach employs the DBScan clustering algorithm, an artificial neural network (ANN), and a random forest classifier to enable automatic oyster classification, counting, and size estimation from images. As a result of the proposed methodology, the speed in measuring the length and width of the oysters was 86.7 times faster than manual measurement. Regarding the counting, the process missed the total count of oysters in two of the ten images. These results demonstrate the feasibility of using the proposed methodology to measure oyster size and count in oyster farms.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
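Editorial note: a hedged sketch of the counting-and-measuring idea described above: foreground pixels are grouped into individual oysters with DBSCAN and each cluster's extent is converted to millimetres via a known scale. The thresholding step, scale factor, and DBSCAN parameters are placeholders; the paper's ANN and random forest classification stages are not shown.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def count_and_measure(binary_mask, mm_per_pixel=0.5, eps=3.0, min_samples=50):
    """Cluster foreground pixels into oysters and report each length/width in mm."""
    ys, xs = np.nonzero(binary_mask)              # foreground pixel coordinates
    pts = np.column_stack([ys, xs])
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts)
    sizes = []
    for lab in set(labels) - {-1}:                # label -1 is DBSCAN noise
        cluster = pts[labels == lab]
        extent_y = (cluster[:, 0].max() - cluster[:, 0].min()) * mm_per_pixel
        extent_x = (cluster[:, 1].max() - cluster[:, 1].min()) * mm_per_pixel
        sizes.append((max(extent_y, extent_x), min(extent_y, extent_x)))  # (length, width)
    return len(sizes), sizes

# count, sizes = count_and_measure(mask)   # mask from any segmentation step
```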
Open Access Article
Texture-Based Preprocessing Framework with nnU-Net Model for Accurate Intracranial Artery Segmentation
by
Kyuseok Kim and Ji-Youn Kim
J. Imaging 2025, 11(12), 438; https://doi.org/10.3390/jimaging11120438 - 9 Dec 2025
Abstract
Accurate intracranial artery segmentation from digital subtraction angiography (DSA) is critical for neurovascular diagnosis and intervention planning. Vessel extraction pipelines that combine preprocessing methods with deep learning models already achieve strong results, but the limitations of existing preprocessing constrain further improvement. We propose a texture-based contrast enhancement preprocessing framework integrated with the nnU-Net model to improve vessel segmentation in time-sequential DSA images. The method generates a combined feature mask by fusing local contrast, local entropy, and brightness threshold maps, which is then used as input for deep learning–based segmentation. Segmentation performance was evaluated using the DIAS dataset with various standard quantitative metrics. The proposed preprocessing significantly improved segmentation across all metrics compared to both the baseline and contrast-limited adaptive histogram equalization (CLAHE). Using nnU-Net, the method achieved a Dice Similarity Coefficient (DICE) of 0.83 ± 0.20 and an Intersection over Union (IoU) of 0.72 ± 0.14, outperforming CLAHE (DICE 0.79 ± 0.41, IoU 0.70 ± 0.23) and the baseline (DICE 0.65 ± 0.15, IoU 0.47 ± 0.20). Most notably, the vessel connectivity (VC) metric dropped by over 65% relative to unprocessed images, indicating marked improvements in connectivity and topological accuracy. This study demonstrates that combining texture-based preprocessing with nnU-Net delivers robust, noise-tolerant, and clinically interpretable segmentation of intracranial arteries from DSA.
Full article
(This article belongs to the Section Medical Imaging)
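Editorial note: a minimal sketch of the fused feature mask described above, combining a local-contrast map, a local-entropy map, and a brightness threshold using scikit-image; the window sizes, thresholds, fusion rule, and the assumption that vessels appear dark in the DSA frames are illustrative choices, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from skimage.filters.rank import entropy
from skimage.morphology import disk
from skimage.util import img_as_ubyte

def texture_feature_mask(frame, win=9, contrast_t=0.02, entropy_t=3.0, bright_t=0.3):
    """Fuse local contrast, local entropy and brightness maps into a binary mask.

    frame : 2-D float DSA frame scaled to [0, 1].
    """
    mean = uniform_filter(frame, win)
    mean_sq = uniform_filter(frame * frame, win)
    local_std = np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))   # local contrast
    local_ent = entropy(img_as_ubyte(frame), disk(win // 2))      # local entropy
    # Brightness bound assumes contrast-filled vessels are darker than background.
    return (local_std > contrast_t) & (local_ent > entropy_t) & (frame < bright_t)

# The fused mask (or the masked frame) is then fed to the segmentation network.
```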
Open Access Article
Domain-Adaptive Segment Anything Model for Cross-Domain Water Body Segmentation in Satellite Imagery
by
Lihong Yang, Pengfei Liu, Guilong Zhang, Huaici Zhao and Chunyang Zhao
J. Imaging 2025, 11(12), 437; https://doi.org/10.3390/jimaging11120437 - 9 Dec 2025
Abstract
Monitoring surface water bodies is crucial for environmental protection and resource management. Existing segmentation methods often struggle with limited generalization across different satellite domains. We propose DASAM, a domain-adaptive Segment Anything Model for cross-domain water body segmentation in satellite imagery. The core innovation of DASAM is a contrastive learning module that aligns features between source and style-augmented images, enabling robust domain generalization without requiring annotations from the target domain. Additionally, DASAM integrates a prompt-enhanced module and an encoder adapter to capture fine-grained spatial details and global context, further improving segmentation accuracy. Experiments on the China GF-2 dataset demonstrate superior performance over existing methods, while cross-domain evaluations on GLH-water and Sentinel-2 water body image datasets verify its strong generalization and robustness. These results highlight DASAM’s potential for large-scale, diverse satellite water body monitoring and accurate environmental analysis.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
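Editorial note: a generic sketch of the contrastive alignment idea described above: features of a source image and its style-augmented counterpart are pulled together with an InfoNCE-style loss, using the other images in the batch as negatives. This is a standard formulation assumed for illustration; DASAM's exact loss, projection head, and augmentation pipeline are not reproduced.

```python
import torch
import torch.nn.functional as F

def info_nce(feat_src, feat_aug, temperature=0.1):
    """InfoNCE loss pulling each source feature toward its style-augmented view.

    feat_src, feat_aug : (N, D) embeddings of the same N images in the two domains.
    """
    z1 = F.normalize(feat_src, dim=1)
    z2 = F.normalize(feat_aug, dim=1)
    logits = z1 @ z2.t() / temperature             # (N, N) cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)        # diagonal pairs are the positives

# loss = info_nce(encoder(src_batch), encoder(stylized_batch))
```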
Topics
Topic in Applied Sciences, Electronics, MAKE, J. Imaging, Sensors:
Applied Computer Vision and Pattern Recognition: 2nd Edition
Topic Editors: Antonio Fernández-Caballero, Byung-Gyu Kim. Deadline: 31 December 2025
Topic in Applied Sciences, Computers, Electronics, Information, J. Imaging:
Visual Computing and Understanding: New Developments and Trends
Topic Editors: Wei Zhou, Guanghui Yue, Wenhan Yang. Deadline: 31 March 2026
Topic in Applied Sciences, Electronics, J. Imaging, MAKE, Information, BDCC, Signals:
Applications of Image and Video Processing in Medical Imaging
Topic Editors: Jyh-Cheng Chen, Kuangyu Shi. Deadline: 30 April 2026
Topic in Diagnostics, Electronics, J. Imaging, Mathematics, Sensors:
Transformer and Deep Learning Applications in Image Processing
Topic Editors: Fengping An, Haitao Xu, Chuyang Ye. Deadline: 31 May 2026
Special Issues
Special Issue in J. Imaging: Novel Approaches to Image Quality Assessment
Guest Editors: Luigi Celona, Hanhe Lin. Deadline: 31 December 2025
Special Issue in J. Imaging: Imaging in Healthcare: Progress and Challenges
Guest Editors: Vasileios Magoulianitis, Pawan Jogi, Spyridon Thermos. Deadline: 31 December 2025
Special Issue in J. Imaging: Underwater Imaging (2nd Edition)
Guest Editor: Yuri Rzhanov. Deadline: 31 December 2025
Special Issue in J. Imaging: Next-Gen Visual Stimulators: Smart Human-Machine Interfaces for Visual Perception Assessment
Guest Editor: Francisco Ávila Gómez. Deadline: 31 December 2025