- Artificial Intelligence in Placental Pathology: New Diagnostic Imaging Tools in Evolution and in Perspective
- Evaluating Super-Resolution Models in Biomedical Imaging: Applications and Performance in Segmentation and Classification
- Optimizing Digital Image Quality for Improved Skin Cancer Detection
- Towards the Performance Characterization of a Robotic Multimodal Diagnostic Imaging System
Journal Description
Journal of Imaging is an international, multi/interdisciplinary, peer-reviewed, open access journal of imaging techniques published online monthly by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), PubMed, PMC, dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: JCR - Q2 (Imaging Science and Photographic Technology) / CiteScore - Q1 (Radiology, Nuclear Medicine and Imaging)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 15.3 days after submission; the time from acceptance to publication is 3.5 days (median values for papers published in this journal in the first half of 2025).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 3.3 (2024); 5-Year Impact Factor: 3.3 (2024)
Latest Articles
Fundus Image-Based Eye Disease Detection Using EfficientNetB3 Architecture
J. Imaging 2025, 11(8), 279; https://doi.org/10.3390/jimaging11080279 - 19 Aug 2025
Abstract
Accurate and early classification of retinal diseases such as diabetic retinopathy, cataract, and glaucoma is essential for preventing vision loss and improving clinical outcomes. Manual diagnosis from fundus images is often time-consuming and error-prone, motivating the development of automated solutions. This study proposes a deep learning-based classification model using a pretrained EfficientNetB3 architecture, fine-tuned on a publicly available Kaggle retinal image dataset. The model categorizes images into four classes: cataract, diabetic retinopathy, glaucoma, and healthy. Key enhancements include transfer learning, data augmentation, and optimization via the Adam optimizer with a cosine annealing scheduler. The proposed model achieved a classification accuracy of 95.12%, with a precision of 95.21%, recall of 94.88%, F1-score of 95.00%, Dice Score of 94.91%, Jaccard Index of 91.2%, and an MCC of 0.925. These results demonstrate the model’s robustness and potential to support automated retinal disease diagnosis in clinical settings.
Full article
(This article belongs to the Section Medical Imaging)
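To make the training recipe summarized above concrete, here is a minimal, hedged sketch of transfer learning with an EfficientNetB3 backbone, the Adam optimizer, and a cosine-annealing learning-rate schedule in Keras; the input size, learning rate, classifier head, and schedule length are illustrative assumptions, not the authors' exact configuration.
```python
# Hedged sketch: fine-tuning EfficientNetB3 for four-class fundus classification.
# Image size, learning rate, dropout, and decay_steps are assumptions for illustration.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 4               # cataract, diabetic retinopathy, glaucoma, healthy
IMG_SIZE = (300, 300)         # assumed input resolution

base = tf.keras.applications.EfficientNetB3(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,))
base.trainable = True         # fine-tune the pretrained backbone end-to-end

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

# Cosine annealing of the learning rate over the course of training.
schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-4, decay_steps=10_000)
model.compile(optimizer=tf.keras.optimizers.Adam(schedule),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=30)  # train_ds/val_ds are hypothetical
```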
Open Access Article
ODDM: Integration of SMOTE Tomek with Deep Learning on Imbalanced Color Fundus Images for Classification of Several Ocular Diseases
by
Afraz Danish Ali Qureshi, Hassaan Malik, Ahmad Naeem, Syeda Nida Hassan, Daesik Jeong and Rizwan Ali Naqvi
J. Imaging 2025, 11(8), 278; https://doi.org/10.3390/jimaging11080278 - 18 Aug 2025
Abstract
Ocular disease (OD) represents a complex medical condition affecting humans. OD diagnosis is a challenging process in the current medical system, and blindness may occur if the disease is not detected at its initial phase. Recent studies showed significant outcomes in the identification of OD using deep learning (DL) models. Thus, this work aims to develop a multi-classification DL-based model for the classification of seven ODs, including normal (NOR), age-related macular degeneration (AMD), diabetic retinopathy (DR), glaucoma (GLU), maculopathy (MAC), non-proliferative diabetic retinopathy (NPDR), and proliferative diabetic retinopathy (PDR), using color fundus images (CFIs). This work proposes a custom model named the ocular disease detection model (ODDM) based on a CNN. The proposed ODDM is trained and tested on a publicly available ocular disease dataset (ODD). Additionally, the SMOTE Tomek (SM-TOM) approach is also used to handle the imbalanced distribution of the OD images in the ODD. The performance of the ODDM is compared with seven baseline models, including DenseNet-201 (R1), EfficientNet-B0 (R2), Inception-V3 (R3), MobileNet (R4), Vgg-16 (R5), Vgg-19 (R6), and ResNet-50 (R7). The proposed ODDM obtained a 98.94% AUC, along with 97.19% accuracy, a recall of 88.74%, a precision of 95.23%, and an F1-score of 88.31% in classifying the seven different types of OD. Furthermore, ANOVA and Tukey HSD (Honestly Significant Difference) post hoc tests are also applied to represent the statistical significance of the proposed ODDM. Thus, this study concludes that the results of the proposed ODDM are superior to those of baseline models and state-of-the-art models.
Full article
(This article belongs to the Special Issue Advances in Machine Learning for Medical Imaging Applications)
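For readers unfamiliar with the SMOTE Tomek re-balancing step mentioned above, the sketch below shows how it is typically applied with the imbalanced-learn library to feature vectors and labels; the array shapes and class counts are invented for illustration and do not reflect the ODD dataset.
```python
# Hedged sketch of SMOTE Tomek re-balancing on an imbalanced seven-class problem.
# X and y are synthetic stand-ins (e.g., image embeddings); shapes are assumptions.
import numpy as np
from imblearn.combine import SMOTETomek

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 256))                               # 500 samples, 256-dim features
y = np.repeat(np.arange(7), [200, 120, 80, 45, 25, 20, 10])   # skewed class counts

X_res, y_res = SMOTETomek(random_state=42).fit_resample(X, y)
print(np.bincount(y))      # counts before re-sampling
print(np.bincount(y_res))  # counts after SMOTE oversampling + Tomek-link cleaning
```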
Open Access Article
Automated Task-Transfer Function Measurement for CT Image Quality Assessment Based on AAPM TG 233
by
Choirul Anam, Riska Amilia, Ariij Naufal, Eko Hidayanto, Heri Sutanto, Lukmanda E. Lubis, Toshioh Fujibuchi and Geoff Dougherty
J. Imaging 2025, 11(8), 277; https://doi.org/10.3390/jimaging11080277 - 18 Aug 2025
Abstract
This study aims to develop and validate software for the automatic measurement of the task-transfer function (TTF) based on the American Association of Physicists in Medicine (AAPM) Task Group (TG) 233. The software consists of two main stages: automatic placement of regions of interest (ROIs) within the circular objects of the phantoms and calculation of the TTF. The software was developed on four CT phantom types: a computational phantom, the ACR 464 CT phantom, the AAPM CT phantom, and the Catphan® 604 phantom. Each phantom was tested with varying parameters, including spatial resolution level, slice thickness, and image reconstruction technique. The TTF results were compared with manual measurements performed using ImQuest version 7.3.01 and iQmetrix-CT version 1.2. The software successfully located ROIs at all circular objects within each phantom and accurately measured the TTF at the various contrast-to-noise ratios (CNRs) of all phantoms. The TTF results were comparable to those obtained with ImQuest and iQmetrix-CT. The TTF curves produced by the software are smoother than those produced by ImQuest. An algorithm for the automated measurement of TTF was successfully developed and validated. TTF measurement with our software is highly user-friendly, requiring only a single click from the user.
Full article
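The circular-insert TTF described above is essentially a task-based MTF estimate. The sketch below outlines one common way to compute it: bin pixels radially around the insert centre into an edge-spread function, differentiate to a line-spread function, and take the FFT magnitude. The ROI handling, centre detection, bin width, and windowing are assumptions and not the validated software's algorithm.
```python
# Hedged numpy sketch of a radial ESF -> LSF -> TTF calculation for a circular insert.
# The caller supplies the ROI, the insert centre (cx, cy) in pixels, and the pixel size.
import numpy as np

def ttf_from_circular_roi(roi, cx, cy, pixel_size_mm, bin_mm=0.1):
    yy, xx = np.indices(roi.shape)
    r = (np.hypot(xx - cx, yy - cy) * pixel_size_mm).ravel()   # radius of each pixel (mm)
    v = roi.astype(float).ravel()
    bins = np.arange(0.0, r.max() + bin_mm, bin_mm)
    idx = np.digitize(r, bins) - 1
    counts = np.bincount(idx, minlength=len(bins))
    sums = np.bincount(idx, weights=v, minlength=len(bins))
    centres = bins + bin_mm / 2.0
    valid = counts > 0
    # oversampled edge-spread function, interpolated onto a uniform radial grid
    esf = np.interp(centres, centres[valid], sums[valid] / counts[valid])
    lsf = np.gradient(esf, bin_mm)               # line-spread function
    lsf *= np.hanning(lsf.size)                  # taper to suppress noise
    ttf = np.abs(np.fft.rfft(lsf))
    freqs = np.fft.rfftfreq(lsf.size, d=bin_mm)  # spatial frequency in cycles/mm
    return freqs, ttf / ttf[0]                   # normalised so TTF(0) = 1
```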

Open Access Article
The Contribution of AIDA (Artificial Intelligence Dystocia Algorithm) to Cesarean Section Within Robson Classification Group
by
Antonio Malvasi, Lorenzo E. Malgieri, Michael Stark, Edoardo Di Naro, Dan Farine, Giorgio Maria Baldini, Miriam Dellino, Murat Yassa, Andrea Tinelli, Antonella Vimercati and Tommaso Difonzo
J. Imaging 2025, 11(8), 276; https://doi.org/10.3390/jimaging11080276 - 16 Aug 2025
Abstract
Global cesarean section (CS) rates continue to rise, with the Robson classification widely used for analysis. However, Robson Group 2A patients (nulliparous women with induced labor) show disproportionately high CS rates that cannot be fully explained by demographic factors alone. This study explored how the Artificial Intelligence Dystocia Algorithm (AIDA) could enhance the Robson system by providing detailed information on geometric dystocia, thereby facilitating better understanding of factors contributing to CS and developing more targeted reduction strategies. The authors conducted a comprehensive literature review analyzing both classification systems across multiple databases and developed a theoretical framework for integration. AIDA categorized labor cases into five classes (0–4) by analyzing four key geometric parameters measured through intrapartum ultrasound: angle of progression (AoP), asynclitism degree (AD), head–symphysis distance (HSD), and midline angle (MLA). Significant asynclitism (AD ≥ 7.0 mm) was strongly associated with CS regardless of other parameters, potentially explaining many “failure to progress” cases in Robson Group 2A patients. The proposed integration created a combined classification providing both population-level and individual geometric risk assessment. The integration of AIDA with the Robson classification represented a potentially valuable advancement in CS risk assessment, combining population-level stratification with individual-level geometric assessment to enable more personalized obstetric care. Future validation studies across diverse settings are needed to establish clinical utility.
Full article
(This article belongs to the Special Issue Clinical and Pathological Imaging in the Era of Artificial Intelligence: New Insights and Perspectives—2nd Edition)
Open Access Article
A Lightweight CNN for Multiclass Retinal Disease Screening with Explainable AI
by
Arjun Kumar Bose Arnob, Muhammad Hasibur Rashid Chayon, Fahmid Al Farid, Mohd Nizam Husen and Firoz Ahmed
J. Imaging 2025, 11(8), 275; https://doi.org/10.3390/jimaging11080275 - 15 Aug 2025
Abstract
Timely, balanced, and transparent detection of retinal diseases is essential to avert irreversible vision loss; however, current deep learning screeners are hampered by class imbalance, large models, and opaque reasoning. This paper presents a lightweight attention-augmented convolutional neural network (CNN) that addresses all three barriers. The network combines depthwise separable convolutions, squeeze-and-excitation, and global-context attention, and it incorporates gradient-based class activation mapping (Grad-CAM) and Grad-CAM++ to ensure that every decision is accompanied by pixel-level evidence. A 5335-image ten-class color-fundus dataset from Bangladeshi clinics, which was severely skewed (17–1509 images per class), was equalized using the synthetic minority oversampling technique (SMOTE) and task-specific augmentations. Images were resized to a fixed resolution and split 70:15:15. Training used the adaptive moment estimation (Adam) optimizer with reduce-on-plateau learning-rate scheduling and early stopping, together with regularization and dual dropout. The 16.6M-parameter network converged in fewer than 50 epochs on a mid-range graphics processing unit (GPU) and reached 87.9% test accuracy, a macro-precision of 0.882, a macro-recall of 0.879, and a macro-F1-score of 0.880, reducing the error by 58% relative to the best ImageNet backbone (Inception-V3, 40.4% accuracy). Eight disorders recorded true-positive rates above 95%; macular scar and central serous chorioretinopathy attained F1-scores of 0.77 and 0.89, respectively. Saliency maps consistently highlighted optic disc margins, subretinal fluid, and other hallmarks. Targeted class re-balancing, lightweight attention, and integrated explainability therefore deliver accurate, transparent, and deployable retinal screening suitable for point-of-care ophthalmic triage on resource-limited hardware.
Full article
(This article belongs to the Section Medical Imaging)
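As a rough illustration of the squeeze-and-excitation channel attention combined with depthwise separable convolutions described above, here is a hedged Keras sketch; the layer widths, reduction ratio, and ten-class head are assumptions rather than the paper's architecture.
```python
# Hedged sketch: squeeze-and-excitation (SE) block on top of separable convolutions.
# Filter counts, reduction ratio, and input/output sizes are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

def se_block(x, ratio=8):
    """Squeeze (global pooling) then excite (two dense layers) to reweight channels."""
    channels = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)
    s = layers.Dense(channels // ratio, activation="relu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)      # per-channel gates in (0, 1)
    s = layers.Reshape((1, 1, channels))(s)
    return layers.Multiply()([x, s])

def sep_conv_se(x, filters):
    x = layers.SeparableConv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.BatchNormalization()(x)
    return se_block(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
x = sep_conv_se(inputs, 32)
x = layers.MaxPooling2D()(x)
x = sep_conv_se(x, 64)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation="softmax")(x)          # ten retinal classes assumed
model = tf.keras.Model(inputs, outputs)
```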
Open Access Article
Deep Learning-Based Nuclei Segmentation and Melanoma Detection in Skin Histopathological Image Using Test Image Augmentation and Ensemble Model
by
Mohammadesmaeil Akbarpour, Hamed Fazlollahiaghamalek, Mahdi Barati, Mehrdad Hashemi Kamangar and Mrinal Mandal
J. Imaging 2025, 11(8), 274; https://doi.org/10.3390/jimaging11080274 - 15 Aug 2025
Abstract
Histopathological images play a crucial role in diagnosing skin cancer. However, due to the very large size of digital histopathological images (typically on the order of a billion pixels), manual image analysis is tedious and time-consuming. Therefore, there has been significant interest in developing Artificial Intelligence (AI)-enabled computer-aided diagnosis (CAD) techniques for skin cancer detection. Due to the diversity of uncertain cell boundaries, automated nuclei segmentation of histopathological images remains challenging. Automating the identification of abnormal cell nuclei and analyzing their distribution across multiple tissue sections can significantly expedite comprehensive diagnostic assessments. In this paper, a deep neural network (DNN)-based technique is proposed to segment nuclei and detect melanoma in histopathological images. To achieve robust performance, a test image is first augmented by various geometric operations. The augmented images are then passed through the DNN, and the individual outputs are combined to obtain the final nuclei-segmented image. A morphological technique is then applied to the nuclei-segmented image to detect the melanoma region in the image. Experimental results show that the proposed technique can achieve Dice scores of 91.61% and 87.9% for nuclei segmentation and melanoma detection, respectively.
Full article
(This article belongs to the Section Medical Imaging)
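The test-image augmentation and output fusion described above can be pictured with the short sketch below, which averages probability maps predicted for several geometric variants of one (square) image after undoing each transform; the transform set, the 0.5 threshold, and the `model.predict` interface are assumptions.
```python
# Hedged sketch of test-time augmentation for segmentation: predict on flipped/rotated
# copies, invert each transform on the probability map, and average. Square inputs assumed.
import numpy as np

def tta_segment(model, image):
    transforms = [                                   # (forward, inverse) pairs
        (lambda im: im,               lambda m: m),
        (np.fliplr,                   np.fliplr),
        (np.flipud,                   np.flipud),
        (lambda im: np.rot90(im, 1),  lambda m: np.rot90(m, -1)),
        (lambda im: np.rot90(im, 3),  lambda m: np.rot90(m, 1)),
    ]
    probs = []
    for fwd, inv in transforms:
        pred = model.predict(fwd(image)[None, ...])[0, ..., 0]  # H x W probability map
        probs.append(inv(pred))
    fused = np.mean(probs, axis=0)
    return (fused > 0.5).astype(np.uint8)            # final binary nuclei mask
```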
Open Access Article
Bangla Speech Emotion Recognition Using Deep Learning-Based Ensemble Learning and Feature Fusion
by
Md. Shahid Ahammed Shakil, Fahmid Al Farid, Nitun Kumar Podder, S. M. Hasan Sazzad Iqbal, Abu Saleh Musa Miah, Md Abdur Rahim and Hezerul Abdul Karim
J. Imaging 2025, 11(8), 273; https://doi.org/10.3390/jimaging11080273 - 14 Aug 2025
Abstract
Emotion recognition in speech is essential for enhancing human–computer interaction (HCI) systems. Despite progress in Bangla speech emotion recognition, challenges remain, including low accuracy, speaker dependency, and poor generalization across emotional expressions. Previous approaches often rely on traditional machine learning or basic deep learning models, struggling with robustness and accuracy in noisy or varied data. In this study, we propose a novel multi-stream deep learning feature fusion approach for Bangla speech emotion recognition, addressing the limitations of existing methods. Our approach begins with various data augmentation techniques applied to the training dataset, enhancing the model’s robustness and generalization. We then extract a comprehensive set of handcrafted features, including Zero-Crossing Rate (ZCR), chromagram, spectral centroid, spectral roll-off, spectral contrast, spectral flatness, Mel-Frequency Cepstral Coefficients (MFCCs), Root Mean Square (RMS) energy, and Mel-spectrogram. Although these features are used as 1D numerical vectors, some of them are computed from time–frequency representations (e.g., chromagram, Mel-spectrogram) that can themselves be depicted as images, which is conceptually close to imaging-based analysis. These features capture key characteristics of the speech signal, providing valuable insights into the emotional content. Sequentially, we utilize a multi-stream deep learning architecture to automatically learn complex, hierarchical representations of the speech signal. This architecture consists of three distinct streams: the first stream uses 1D convolutional neural networks (1D CNNs), the second integrates 1D CNN with Long Short-Term Memory (LSTM), and the third combines 1D CNNs with bidirectional LSTM (Bi-LSTM). These models capture intricate emotional nuances that handcrafted features alone may not fully represent. For each of these models, we generate predicted scores and then employ ensemble learning with a soft voting technique to produce the final prediction. This fusion of handcrafted features, deep learning-derived features, and ensemble voting enhances the accuracy and robustness of emotion identification across multiple datasets. Our method demonstrates the effectiveness of combining various learning models to improve emotion recognition in Bangla speech, providing a more comprehensive solution compared with existing methods. We utilize three primary datasets—SUBESCO, BanglaSER, and a merged version of both—as well as two external datasets, RAVDESS and EMODB, to assess the performance of our models. Our method achieves impressive results with accuracies of 92.90%, 85.20%, 90.63%, 67.71%, and 69.25% for the SUBESCO, BanglaSER, merged SUBESCO and BanglaSER, RAVDESS, and EMODB datasets, respectively. These results demonstrate the effectiveness of combining handcrafted features with deep learning-based features through ensemble learning for robust emotion recognition in Bangla speech.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
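To give a concrete picture of the handcrafted-feature stage and the soft-voting fusion described above, a brief librosa-based sketch follows; the feature settings, the per-coefficient averaging, and the equal-weight vote are assumptions rather than the authors' exact pipeline.
```python
# Hedged sketch: extract the handcrafted audio features listed in the abstract as one
# 1D vector per clip, and fuse three streams' class probabilities by soft voting.
import numpy as np
import librosa

def handcrafted_features(path, sr=22050):
    y, sr = librosa.load(path, sr=sr)
    feats = [
        librosa.feature.zero_crossing_rate(y),
        librosa.feature.chroma_stft(y=y, sr=sr),
        librosa.feature.spectral_centroid(y=y, sr=sr),
        librosa.feature.spectral_rolloff(y=y, sr=sr),
        librosa.feature.spectral_contrast(y=y, sr=sr),
        librosa.feature.spectral_flatness(y=y),
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40),
        librosa.feature.rms(y=y),
        librosa.feature.melspectrogram(y=y, sr=sr),
    ]
    # collapse each time-frequency representation to its per-coefficient mean
    return np.concatenate([f.mean(axis=1) for f in feats])

def soft_vote(p_cnn, p_cnn_lstm, p_cnn_bilstm):
    """Average the class probabilities of the three streams and take the arg-max."""
    return np.argmax((p_cnn + p_cnn_lstm + p_cnn_bilstm) / 3.0, axis=-1)
```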
Open Access Article
Digital Image Processing and Convolutional Neural Network Applied to Detect Mitral Stenosis in Echocardiograms: Clinical Decision Support
by
Genilton de França Barros Filho, José Fernando de Morais Firmino, Israel Solha, Ewerton Freitas de Medeiros, Alex dos Santos Felix, José Carlos de Lima Júnior, Marcelo Dantas Tavares de Melo and Marcelo Cavalcanti Rodrigues
J. Imaging 2025, 11(8), 272; https://doi.org/10.3390/jimaging11080272 - 14 Aug 2025
Abstract
The mitral valve is the heart valve most susceptible to pathological alterations, such as mitral stenosis, which is characterized by failure of the valve to open completely. In this context, the objective of this study was to apply digital image processing (DIP) and develop a convolutional neural network (CNN) to provide decision support for specialists in the diagnosis of mitral stenosis based on transesophageal echocardiography examinations. The following procedures were implemented: acquisition of echocardiogram exams; application of DIP; use of augmentation techniques; and development of a CNN. The DIP classified 26.7% of cases as without stenosis, 26.7% with mild stenosis, 13.3% with moderate stenosis, and 33.3% with severe stenosis. A CNN was initially developed to classify videos into those four categories; however, the number of acquired exams was insufficient to effectively train the model for this purpose, so the final model was trained to differentiate between videos with and without stenosis, achieving an accuracy of 92% with a loss of 0.26. The results demonstrate that both DIP and the CNN are effective in distinguishing between cases with and without stenosis. Moreover, DIP was capable of classifying varying degrees of stenosis severity (mild, moderate, and severe), highlighting its potential as a valuable tool in clinical decision support.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Extract Nutritional Information from Bilingual Food Labels Using Large Language Models
by
Fatmah Y. Assiri, Mohammad D. Alahmadi, Mohammed A. Almuashi and Ayidh M. Almansour
J. Imaging 2025, 11(8), 271; https://doi.org/10.3390/jimaging11080271 - 13 Aug 2025
Abstract
Food product labels serve as a critical source of information, providing details about nutritional content, ingredients, and health implications. These labels enable Food and Drug Authorities (FDA) to ensure compliance and take necessary health-related and logistics actions. Additionally, product labels are essential for online grocery stores to offer reliable nutrition facts and empower customers to make informed dietary decisions. Unfortunately, product labels are typically available in image formats, requiring organizations and online stores to manually transcribe them—a process that is not only time-consuming but also highly prone to human error, especially with multilingual labels that add complexity to the task. Our study investigates the challenges and effectiveness of leveraging large language models (LLMs) to extract nutritional elements and values from multilingual food product labels, with a specific focus on Arabic and English. A comprehensive empirical analysis was conducted using a manually curated dataset of 294 food product labels, comprising 588 transcribed nutritional elements and values in both languages, which served as the ground truth for evaluation. The findings reveal that while LLMs performed better in extracting English elements and values compared to Arabic, our post-processing techniques significantly enhanced their accuracy, with GPT-4o outperforming GPT-4V and Gemini.
Full article
(This article belongs to the Special Issue Computer Vision for Food Data Analysis: Methods, Challenges, and Applications)
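A rough sketch of how a vision-capable LLM such as GPT-4o can be prompted to transcribe a bilingual label image is shown below using the OpenAI Python client; the prompt wording, output format, and any downstream parsing or post-processing are assumptions, not the study's exact protocol.
```python
# Hedged sketch: send a food-label image plus an extraction prompt to a vision LLM.
# The prompt text and the expectation of JSON output are assumptions for illustration.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def extract_nutrition(image_path):
    b64 = base64.b64encode(open(image_path, "rb").read()).decode()
    prompt = ("Extract every nutritional element and its value from this food label. "
              "List Arabic and English entries separately and answer as JSON.")
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content  # raw answer; parse and normalise downstream
```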
Open Access Article
PU-DZMS: Point Cloud Upsampling via Dense Zoom Encoder and Multi-Scale Complementary Regression
by
Shucong Li, Zhenyu Liu, Tianlei Wang and Zhiheng Zhou
J. Imaging 2025, 11(8), 270; https://doi.org/10.3390/jimaging11080270 - 12 Aug 2025
Abstract
Point cloud imaging technology usually faces the problem of point cloud sparsity, which leads to a lack of important geometric detail. Many point cloud upsampling networks have been designed to solve this problem. However, the existing methods have limitations in local–global relation understanding, leading to contour distortion and many local sparse regions. To this end, PU-DZMS is proposed with two components. (1) The Dense Zoom Encoder (DENZE) is designed to capture local–global features by using ZOOM Blocks with a dense connection. The main module in the ZOOM Block is the Zoom Encoder, which embeds a Transformer mechanism into the down–upsampling process to enhance local–global geometric features. DENZE makes the geometric edges of the point cloud clearer. (2) The Multi-Scale Complementary Regression (MSCR) module is designed to expand the features and regress a dense point cloud. MSCR captures the geometric distribution differences of features across scales to ensure geometric continuity, and it regresses new points by adopting cross-scale residual learning. The MSCR module reduces the local sparse regions of the point cloud. The experimental results on the PU-GAN dataset and the PU-Net dataset show that the proposed method performs well on point cloud upsampling tasks.
Full article
(This article belongs to the Topic Applied Computer Vision and Pattern Recognition: 2nd Edition)
Open Access Review
A Review on Deep Learning Methods for Glioma Segmentation, Limitations, and Future Perspectives
by
Cecilia Diana-Albelda, Álvaro García-Martín and Jesus Bescos
J. Imaging 2025, 11(8), 269; https://doi.org/10.3390/jimaging11080269 - 11 Aug 2025
Abstract
Accurate and automated segmentation of gliomas from Magnetic Resonance Imaging (MRI) is crucial for effective diagnosis, treatment planning, and patient monitoring. However, the aggressive nature and morphological complexity of these tumors pose significant challenges that call for advanced segmentation techniques. This review provides a comprehensive analysis of Deep Learning (DL) methods for glioma segmentation, with a specific focus on bridging the gap between research performance and practical clinical deployment. We evaluate over 80 state-of-the-art models published up to 2025, categorizing them into CNN-based, Pure Transformer, and Hybrid CNN-Transformer architectures. The primary objective of this paper is to critically assess these models not only on their segmentation accuracy but also on their computational efficiency and suitability for real-world medical environments by incorporating hardware resource considerations. We present a comparison of model performance on the BraTS datasets benchmark and introduce a suitability analysis for top-performing models based on their robustness, efficiency, and completeness of tumor region delineation. By identifying current trends, limitations, and key trade-offs, this review offers future research directions aimed at optimizing the balance between technical performance and clinical usability to improve diagnostic outcomes for glioma patients.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Research on the Accessibility of Different Colour Schemes for Web Resources for People with Colour Blindness
by
Daiva Sajek, Olena Korotenko and Tetiana Kyrychok
J. Imaging 2025, 11(8), 268; https://doi.org/10.3390/jimaging11080268 - 11 Aug 2025
Abstract
This study is devoted to the analysis of the perception of colour schemes of web resources by users with different types of colour blindness (colour vision deficiency). The purpose of this study is to develop recommendations for choosing the optimal colour scheme for web resource design that will ensure the comfortable perception of content for the broadest possible audience, including users with colour vision deficiency of various types (deuteranopia and deuteranomaly, protanopia and protanomaly, tritanopia, and tritanomaly). This article presents the results of a survey of people with different colour vision deficiencies regarding the accessibility of web resources created using different colour schemes. The colour deviation value ∆E was calculated to objectively assess changes in the perception of different colour groups by people with colour vision impairments. The conclusions of this study emphasise the importance of taking into account the needs of users with colour vision impairments when developing web resources. Specific recommendations for choosing the best colour schemes for websites are also offered, which will help increase the accessibility and effectiveness of web content for users with different types of colour blindness.
Full article
(This article belongs to the Special Issue Image and Video Processing for Blind and Visually Impaired)
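For reference, the colour-deviation value ∆E used above can be computed as in the short sketch below: convert two sRGB colours to CIELAB and take the CIE76 distance. The colour-vision-deficiency simulation that would supply the second colour is assumed to happen in a separate step not shown here.
```python
# Hedged sketch of a CIE76 Delta E between two sRGB colours via CIELAB.
import numpy as np
from skimage.color import rgb2lab, deltaE_cie76

def delta_e(rgb_a, rgb_b):
    """rgb_a, rgb_b: (r, g, b) tuples with components in 0-255."""
    lab_a = rgb2lab(np.array(rgb_a, dtype=float).reshape(1, 1, 3) / 255.0)
    lab_b = rgb2lab(np.array(rgb_b, dtype=float).reshape(1, 1, 3) / 255.0)
    return float(deltaE_cie76(lab_a, lab_b)[0, 0])

# e.g., a palette colour versus a (hypothetical) simulated appearance of the same swatch
print(delta_e((220, 40, 40), (150, 110, 60)))
```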
Open Access Article
Placido Sub-Pixel Edge Detection Algorithm Based on Enhanced Mexican Hat Wavelet Transform and Improved Zernike Moments
by
Yujie Wang, Jinyu Liang, Yating Xiao, Xinfeng Liu, Jiale Li, Guangyu Cui and Quan Zhang
J. Imaging 2025, 11(8), 267; https://doi.org/10.3390/jimaging11080267 - 11 Aug 2025
Abstract
In order to meet the high-precision location requirements of the corneal Placido ring edge in corneal topographic reconstruction, this paper proposes a sub-pixel edge detection algorithm based on multi-scale and multi-position enhanced Mexican Hat Wavelet Transform and improved Zernike moment. Firstly, the image undergoes preliminary processing using a multi-scale and multi-position enhanced Mexican Hat Wavelet Transform function. Subsequently, the preliminary edge information extracted is relocated based on the Zernike moments of a 9 × 9 template. Finally, two improved adaptive edge threshold algorithms are employed to determine the actual sub-pixel edge points of the image, thereby realizing sub-pixel edge detection for corneal Placido ring images. Through comparison and analysis of edge extraction results from real human eye images obtained using the algorithm proposed in this paper and those from other existing algorithms, it is observed that the average sub-pixel edge error of other algorithms is 0.286 pixels, whereas the proposed algorithm achieves an average error of only 0.094 pixels. Furthermore, the proposed algorithm demonstrates strong robustness against noise.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Evaluation of Transfer Learning Efficacy for Surgical Suture Quality Classification on Limited Datasets
by
Roman Ishchenko, Maksim Solopov, Andrey Popandopulo, Elizaveta Chechekhina, Viktor Turchin, Fedor Popivnenko, Aleksandr Ermak, Konstantyn Ladyk, Anton Konyashin, Kirill Golubitskiy, Aleksei Burtsev and Dmitry Filimonov
J. Imaging 2025, 11(8), 266; https://doi.org/10.3390/jimaging11080266 - 8 Aug 2025
Abstract
This study evaluates the effectiveness of transfer learning with pre-trained convolutional neural networks (CNNs) for the automated binary classification of surgical suture quality (high-quality/low-quality) using photographs of three suture types: interrupted open vascular sutures (IOVS), continuous over-and-over open sutures (COOS), and interrupted laparoscopic sutures (ILS). To address the challenge of limited medical data, eight state-of-the-art CNN architectures (EfficientNetB0, ResNet50V2, MobileNetV3Large, VGG16, VGG19, InceptionV3, Xception, and DenseNet121) were trained and validated on small datasets (100–190 images per type) using 5-fold cross-validation. Performance was assessed using the F1-score, AUC-ROC, and a custom weighted stability-aware score (Score_adj). The results demonstrate that transfer learning achieves robust classification (F1 > 0.90 for IOVS/ILS, 0.79 for COOS) despite data scarcity. ResNet50V2, DenseNet121, and Xception were the most stable according to Score_adj, with ResNet50V2 achieving the highest AUC-ROC (0.959 ± 0.008) for IOVS internal view classification. GradCAM visualizations confirmed model focus on clinically relevant features (e.g., stitch uniformity, tissue apposition). These findings validate transfer learning as a powerful approach for developing objective, automated surgical skill assessment tools, reducing reliance on subjective expert evaluations while maintaining accuracy in resource-constrained settings.
Full article
(This article belongs to the Special Issue Advances in Machine Learning for Medical Imaging Applications)
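The evaluation protocol above (pre-trained backbones, small datasets, 5-fold cross-validation) can be sketched as follows with a ResNet50V2 backbone in Keras; the data arrays, image size, and training settings are illustrative assumptions rather than the study's setup.
```python
# Hedged sketch: 5-fold cross-validation of a fine-tuned ResNet50V2 binary classifier.
# X is an (N, 224, 224, 3) image array and y an (N,) array of 0/1 quality labels (assumed).
import numpy as np
import tensorflow as tf
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import f1_score, roc_auc_score

def build_model():
    base = tf.keras.applications.ResNet50V2(include_top=False, weights="imagenet",
                                            input_shape=(224, 224, 3), pooling="avg")
    out = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

def cross_validate(X, y, n_splits=5):
    scores = []
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train_idx, val_idx in skf.split(X, y):
        model = build_model()                       # fresh weights for every fold
        model.fit(X[train_idx], y[train_idx], epochs=10, batch_size=16, verbose=0)
        p = model.predict(X[val_idx]).ravel()
        scores.append((f1_score(y[val_idx], (p > 0.5).astype(int)),
                       roc_auc_score(y[val_idx], p)))
    return np.mean(scores, axis=0)                  # mean F1 and AUC-ROC over folds
```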
Open Access Article
A Vision Method for Detecting Citrus Separation Lines Using Line-Structured Light
by
Qingcang Yu, Song Xue and Yang Zheng
J. Imaging 2025, 11(8), 265; https://doi.org/10.3390/jimaging11080265 - 8 Aug 2025
Abstract
The detection of citrus separation lines is a crucial step in the citrus processing industry. Inspired by the achievements of line-structured light technology in surface defect detection, this paper proposes a method for detecting citrus separation lines based on line-structured light. Firstly, a gamma-corrected Otsu method is employed to extract the laser stripe region from the image. Secondly, an improved skeleton extraction algorithm is employed to mitigate the bifurcation errors inherent in original skeleton extraction algorithms while simultaneously acquiring 3D point cloud data of the citrus surface. Finally, the least squares progressive iterative approximation algorithm is applied to approximate the ideal surface curve; subsequently, principal component analysis is used to derive the normals of this ideally fitted curve. The deviation between each point (along its corresponding normal direction) and the actual geometric characteristic curve is then adopted as a quantitative index for separation lines positioning. The average similarity between the extracted separation lines and the manually defined standard separation lines reaches 92.5%. In total, 95% of the points on the separation lines obtained by this method have an error of less than 4 pixels. Experimental results demonstrate that through quantitative deviation analysis of geometric features, automatic detection and positioning of the separation lines are achieved, satisfying the requirements of high precision and non-destructiveness for automatic citrus splitting.
Full article
(This article belongs to the Topic Image Processing, Signal Processing and Their Applications)
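The gamma-corrected Otsu step that opens the pipeline above can be illustrated with a short OpenCV sketch; the gamma value and the 8-bit grayscale input are assumptions.
```python
# Hedged sketch: gamma correction (via lookup table) followed by Otsu thresholding
# to isolate the bright laser-stripe region in an 8-bit grayscale frame.
import cv2
import numpy as np

def extract_stripe(gray, gamma=0.5):
    # gamma < 1 lifts darker intensities before the global threshold is chosen
    lut = np.array([255.0 * (i / 255.0) ** gamma for i in range(256)], dtype=np.uint8)
    corrected = cv2.LUT(gray, lut)
    _, mask = cv2.threshold(corrected, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask

# usage: mask = extract_stripe(cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE))
```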
Open Access Article
Systematic and Individualized Preparation of External Ear Canal Implants: Development and Validation of an Efficient and Accurate Automated Segmentation System
by
Yanjing Luo, Mohammadtaha Kouchakinezhad, Felix Repp, Verena Scheper, Thomas Lenarz and Farnaz Matin-Mann
J. Imaging 2025, 11(8), 264; https://doi.org/10.3390/jimaging11080264 - 8 Aug 2025
Abstract
External ear canal (EEC) stenosis, often associated with cholesteatoma, carries a high risk of postoperative restenosis despite surgical intervention. While individualized implants offer promise in preventing restenosis, the high morphological variability of EECs and the lack of standardized definitions hinder systematic implant design. This study aimed to characterize individual EEC morphology and to develop a validated automated segmentation system for efficient implant preparation. Reference datasets were first generated by manual segmentation using 3D Slicer™ software version 5.2.2. Based on these, we developed a customized plugin capable of automatically identifying the maximal implantable region within the EEC and measuring its key dimensions. The accuracy of the plugin was assessed by comparing it with manual segmentation results in terms of shape, volume, length, and width. Validation was further performed using three temporal bone implantation experiments with 3D-Bioplotter©-fabricated EEC implants. The automated system demonstrated strong consistency with manual methods and significantly improved segmentation efficiency. The plugin-generated models enabled successful implant fabrication and placement in all validation tests. These results confirm the system's clinical feasibility and support its use for individualized and systematic EEC implant design. The developed tool holds potential to improve surgical planning and reduce postoperative restenosis in EEC stenosis treatment.
Full article
(This article belongs to the Special Issue Current Progress in Medical Image Segmentation)
Open Access Article
Enhancing YOLOv5 for Autonomous Driving: Efficient Attention-Based Object Detection on Edge Devices
by
Mortda A. A. Adam and Jules R. Tapamo
J. Imaging 2025, 11(8), 263; https://doi.org/10.3390/jimaging11080263 - 8 Aug 2025
Abstract
On-road vision-based systems rely on object detection to ensure vehicle safety and efficiency, making it an essential component of autonomous driving. Deep learning methods show high performance; however, they often require special hardware due to their large sizes and computational complexity, which makes real-time deployment on edge devices expensive. This study proposes lightweight object detection models based on the YOLOv5s architecture, known for its speed and accuracy. The models integrate advanced channel attention strategies, specifically the ECA module and SE attention blocks, to enhance feature selection while minimizing computational overhead. Four models were developed and trained on the KITTI dataset. The models were analyzed using key evaluation metrics to assess their effectiveness in real-time autonomous driving scenarios, including precision, recall, and mean average precision (mAP). BaseECAx2 emerged as the most efficient model for edge devices, achieving the lowest GFLOPs (13) and smallest model size (9.1 MB) without sacrificing performance. The BaseSE-ECA model demonstrated outstanding accuracy in vehicle detection, reaching a precision of 96.69% and an mAP of 98.4%, making it ideal for high-precision autonomous driving scenarios. We also assessed the models’ robustness in more challenging environments by training and testing them on the BDD-100K dataset. While the models exhibited reduced performance in complex scenarios involving low-light conditions and motion blur, this evaluation highlights potential areas for improvement in challenging real-world driving conditions. This study bridges the gap between affordability and performance, presenting lightweight, cost-effective solutions for integration into real-time autonomous vehicle systems.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
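As an illustration of the efficient channel attention (ECA) strategy integrated into YOLOv5s above, a minimal PyTorch module follows; the kernel size and where the block is inserted in the detector are assumptions.
```python
# Hedged sketch of an ECA block: global average pooling, a 1D convolution across the
# channel dimension, and a sigmoid gate that reweights each channel of the input.
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, k_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=k_size // 2, bias=False)
        self.gate = nn.Sigmoid()

    def forward(self, x):                                  # x: (B, C, H, W)
        y = self.pool(x)                                   # (B, C, 1, 1)
        y = y.squeeze(-1).transpose(1, 2)                  # (B, 1, C): channels as a sequence
        y = self.conv(y)                                   # local cross-channel interaction
        y = self.gate(y).transpose(1, 2).unsqueeze(-1)     # back to (B, C, 1, 1)
        return x * y                                       # channel-wise reweighting

# usage: out = ECA()(torch.randn(2, 256, 20, 20))
```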
Open Access Article
SABE-YOLO: Structure-Aware and Boundary-Enhanced YOLO for Weld Seam Instance Segmentation
by
Rui Wen, Wu Xie, Yong Fan and Lanlan Shen
J. Imaging 2025, 11(8), 262; https://doi.org/10.3390/jimaging11080262 - 6 Aug 2025
Abstract
Accurate weld seam recognition is essential in automated welding systems, as it directly affects path planning and welding quality. With the rapid advancement of industrial vision, weld seam instance segmentation has emerged as a prominent research focus in both academia and industry. However, existing approaches still face significant challenges in boundary perception and structural representation. Due to the inherently elongated shapes, complex geometries, and blurred edges of weld seams, current segmentation models often struggle to maintain high accuracy in practical applications. To address this issue, a novel structure-aware and boundary-enhanced YOLO (SABE-YOLO) is proposed for weld seam instance segmentation. First, a Structure-Aware Fusion Module (SAFM) is designed to enhance structural feature representation through strip pooling attention and element-wise multiplicative fusion, targeting the difficulty in extracting elongated and complex features. Second, a C2f-based Boundary-Enhanced Aggregation Module (C2f-BEAM) is constructed to improve edge feature sensitivity by integrating multi-scale boundary detail extraction, feature aggregation, and attention mechanisms. Finally, the inner minimum point distance-based intersection over union (Inner-MPDIoU) is introduced to improve localization accuracy for weld seam regions. Experimental results on the self-built weld seam image dataset show that SABE-YOLO outperforms YOLOv8n-Seg by 3 percentage points in the AP(50–95) metric, reaching 46.3%. Meanwhile, it maintains a low computational cost (18.3 GFLOPs) and a small number of parameters (6.6M), while achieving an inference speed of 127 FPS, demonstrating a favorable trade-off between segmentation accuracy and computational efficiency. The proposed method provides an effective solution for high-precision visual perception of complex weld seam structures and demonstrates strong potential for industrial application.
Full article
(This article belongs to the Section Image and Video Processing)
Open Access Article
Quantitative Magnetic Resonance Imaging and Patient-Reported Outcomes in Patients Undergoing Hip Labral Repair or Reconstruction
by
Kyle S. J. Jamar, Adam Peszek, Catherine C. Alder, Trevor J. Wait, Caleb J. Wipf, Carson L. Keeter, Stephanie W. Mayer, Charles P. Ho and James W. Genuario
J. Imaging 2025, 11(8), 261; https://doi.org/10.3390/jimaging11080261 - 5 Aug 2025
Abstract
This study evaluates the relationship between preoperative cartilage quality, measured by T2 mapping, and patient-reported outcomes following labral tear treatment. We retrospectively reviewed patients aged 14–50 who underwent primary hip arthroscopy with either labral repair or reconstruction. Preoperative T2 values of femoral, acetabular, and labral tissue were assessed from MRI by blinded reviewers. International Hip Outcome Tool (iHOT-12) scores were collected preoperatively and up to two years postoperatively. Associations between T2 values and iHOT-12 scores were analyzed using univariate mixed linear models. Twenty-nine patients were included (mean age of 32.5 years, BMI 24 kg/m², 48.3% female, and 22 repairs). Across all patients, higher T2 values were associated with higher iHOT-12 scores at baseline and early postoperative timepoints (three months for cartilage and six months for labrum; p < 0.05). Lower T2 values were associated with higher 12- and 24-month iHOT-12 scores across all structures (p < 0.001). Similar trends were observed within the repair and reconstruction subgroups, with delayed negative associations correlating with worse tissue quality. T2 mapping showed time-dependent correlations with iHOT-12 scores, indicating that worse cartilage or labral quality predicts poorer long-term outcomes. These findings support the utility of T2 mapping as a preoperative tool for prognosis in hip preservation surgery.
Full article
(This article belongs to the Special Issue New Developments in Musculoskeletal Imaging)
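The univariate mixed linear models mentioned above can be set up as in the brief statsmodels sketch below, with a random intercept per patient; the synthetic data, column names, and the exact model formula are assumptions.
```python
# Hedged sketch: a mixed linear model relating a preoperative T2 value and follow-up time
# to iHOT-12 scores, with a per-patient random intercept. All data here are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
patients = np.repeat(np.arange(20), 3)                 # 20 patients, 3 timepoints each
months = np.tile([3, 12, 24], 20)
t2 = np.repeat(rng.normal(40.0, 5.0, 20), 3)           # preoperative T2 per patient (ms)
ihot12 = 90 - 0.8 * t2 + 0.5 * months + rng.normal(0, 5, 60)

df = pd.DataFrame({"patient": patients, "months": months, "t2": t2, "ihot12": ihot12})
result = smf.mixedlm("ihot12 ~ t2 + months", data=df, groups=df["patient"]).fit()
print(result.summary())
```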
Open Access Article
Evaluating the Impact of 2D MRI Slice Orientation and Location on Alzheimer’s Disease Diagnosis Using a Lightweight Convolutional Neural Network
by
Nadia A. Mohsin and Mohammed H. Abdulameer
J. Imaging 2025, 11(8), 260; https://doi.org/10.3390/jimaging11080260 - 5 Aug 2025
Abstract
Accurate detection of Alzheimer’s disease (AD) is critical yet challenging for early medical intervention. Deep learning methods, especially convolutional neural networks (CNNs), have shown promising potential for improving diagnostic accuracy using magnetic resonance imaging (MRI). This study aims to identify the most informative combination of MRI slice orientation and anatomical location for AD classification. We propose an automated framework that first selects the most relevant slices using a feature entropy-based method applied to activation maps from a pretrained CNN model. For classification, we employ a lightweight CNN architecture based on depthwise separable convolutions to efficiently analyze the selected 2D MRI slices extracted from preprocessed 3D brain scans. To further interpret model behavior, an attention mechanism is integrated to analyze which feature level contributes the most to the classification process. The model is evaluated on three binary tasks: AD vs. mild cognitive impairment (MCI), AD vs. cognitively normal (CN), and MCI vs. CN. The experimental results show the highest accuracy (97.4%) in distinguishing AD from CN when utilizing the selected slices from the ninth axial segment, followed by the tenth segment of coronal and sagittal orientations. These findings demonstrate the significance of slice location and orientation in MRI-based AD diagnosis and highlight the potential of lightweight CNNs for clinical use.
Full article
(This article belongs to the Section AI in Imaging)
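The slice-selection idea described above can be pictured with the short sketch below, which scores each 2D slice of a volume by Shannon entropy and keeps the top-ranked slices; scoring raw slice intensities here is a simplification of the paper's activation-map-based criterion, and the axis and top-k values are assumptions.
```python
# Hedged sketch: rank slices of a 3D MRI volume by Shannon entropy and keep the top k.
import numpy as np

def slice_entropy(slice_2d, bins=64):
    hist, _ = np.histogram(slice_2d, bins=bins)
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def select_slices(volume, axis=0, top_k=8):
    """volume: 3D array; returns indices of the top_k most informative slices along axis."""
    scores = [slice_entropy(np.take(volume, i, axis=axis)) for i in range(volume.shape[axis])]
    return np.argsort(scores)[::-1][:top_k]

# usage: idx = select_slices(volume, axis=0)  # 'volume' loaded from a preprocessed scan
```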
News
18 August 2025
Meet Us at the 4th Digital Heritage World Congress & Expo 2025, 8–13 September 2025, Siena, Italy
12 August 2025
Meet Us at the 28th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2025), 23–27 September 2025, Daejeon, Republic of Korea
Topics
Topic in
Applied Sciences, Bioengineering, Diagnostics, J. Imaging, Signals
Signal Analysis and Biomedical Imaging for Precision Medicine
Topic Editors: Surbhi Bhatia Khan, Mo Saraee; Deadline: 31 August 2025
Topic in
Animals, Computers, Information, J. Imaging, Veterinary Sciences
AI, Deep Learning, and Machine Learning in Veterinary Science Imaging
Topic Editors: Vitor Filipe, Lio Gonçalves, Mário Ginja; Deadline: 31 October 2025
Topic in
Applied Sciences, Electronics, MAKE, J. Imaging, Sensors
Applied Computer Vision and Pattern Recognition: 2nd Edition
Topic Editors: Antonio Fernández-Caballero, Byung-Gyu Kim; Deadline: 31 December 2025
Topic in
Applied Sciences, Computers, Electronics, Information, J. Imaging
Visual Computing and Understanding: New Developments and Trends
Topic Editors: Wei Zhou, Guanghui Yue, Wenhan Yang; Deadline: 31 March 2026

Special Issues
Special Issue in
J. Imaging
Current Progress in Medical Image Segmentation
Guest Editor: Krishna Chaitanya; Deadline: 29 August 2025
Special Issue in
J. Imaging
Imaging Applications in Agriculture
Guest Editors: Pierre Gouton, Saeid Minaei, Vahid Mohammadi; Deadline: 31 August 2025
Special Issue in
J. Imaging
Techniques and Applications in Face Image Analysis
Guest Editor: Mohamed Dahmane; Deadline: 31 August 2025
Special Issue in
J. Imaging
Advances in Medical Imaging and Machine Learning
Guest Editors: Ester Bonmati Coll, Barbara Villarini; Deadline: 31 August 2025